Rescuing databases from a broken cluster

Just a few days ago, our team received bad news from our cloud provider: an upgrade of our cluster had gone awry, and despite their best efforts they could not save it. Unfortunately, we had a lot of databases deployed in that cluster, with the actual data stored in persistent volumes.

We panicked a bit: how could we save this very important data when we could no longer connect to the cluster to export it? Within 1-2 days we worked out a solution and were able to save most of the databases, provided a few conditions were met:

  • You have admin access, so you can still see most of the cluster's resources.
  • You can still manage the options and settings of your persistent volumes (PVs) through the cloud provider.


[MySQL]

  • Step 1: Take note of the node pool and the node names of the current cluster.
  • Step 2: Create snapshots of the volumes backing the PVCs (set a friendly name so you remember them 😀).
  • Step 3: Create a droplet/VM and mount a volume created from the snapshots above.
  • [Option that did not work]
    • – Create a PVC from the snapshots and mount it to the current database's pod.
    • – Reason: the volume type and storage class differ, and you also have to install a plugin (PVC from snapshots).
  • Working option
    • – Step 4: Copy the data from the droplet/VM to your local machine (see the mount-and-copy sketch after this list).
    • – Step 5: Create a MySQL pod, or use a docker-compose file, with the same MySQL version as your inaccessible database. (You can check it in data/mysql_upgrade_info.)
    • – Step 6: Restore the data from the data files (More info; a restore sketch also follows this list):
      • – (6a) Copy the MySQL files (ib_logfile0, ib_logfile1, ibdata1, and the database directories: mysql, your-db1, your-db2 with their *.frm and *.ibd files) to the PV, at the mount path (e.g. /var/lib/mysql).
      • – (6b) Log in and create the empty databases first, so that /var/lib/mysql/<database> is created.
      • – (6c) Copy only the *.frm and *.ibd files into that directory. They should have permission 640 (-rw-r-----).
      • – (6d) Update the config file /etc/mysql/my.cnf with a log size equal to the size of ib_logfile0, e.g.:
        • [mysqld]
        • innodb_log_file_size=48M
    • – Step 7: Restart MySQL: service mysql restart
    • Important note: You must know the MySQL root account and password, because when you replace the mysql/ data, the user accounts are replaced as well.
  • Real story: I tried so many times that the whole process took about two days. I made a lot of mistakes along the way:
    • – Chose the wrong version of mysql
    • – Missed the config entry for the log file size
    • – Incomplete data (ibdata1 contains metadata for all databases, so if you copy only one of them, MySQL will complain about missing table info)
    • – The size of the PV/PVC was not enough for the restore.
    • – Every time you restart, you might have to change the hostPath of the PV, so it starts with clean data instead of leftovers from previous tries.
  • Migrate and restore the data to a managed MySQL database.
  • Verify the data to make sure it is up to date.
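
For steps 3-4, here is a minimal sketch of mounting the restored volume on the droplet and copying the data off it. The device name, mount point, and droplet IP are placeholders from my setup; adjust them to yours.

# On the droplet: find and mount the block-storage volume created from the snapshot
lsblk                                    # identify the device backing the restored volume
mkdir -p /mnt/recovered
mount /dev/sda /mnt/recovered            # device name is an example; yours may differ
ls /mnt/recovered                        # you should see ibdata1, ib_logfile0, the database dirs, ...

# From your local machine: pull the data down for safekeeping
rsync -avz root@<droplet-ip>:/mnt/recovered/ ./mysql-recovered-data/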
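
For steps 5-7, a rough sketch of the restore inside the MySQL pod/container, assuming the recovered files were copied to /restore and the data dir is /var/lib/mysql. Database names and paths are placeholders, and the exact order may need tweaking for your case.

# Step 5: check which MySQL version wrote the data files, and use a matching image
cat /restore/mysql_upgrade_info          # e.g. 5.7.31

# Step 6: copy the system files and per-database files into the data dir
cp /restore/ibdata1 /restore/ib_logfile0 /restore/ib_logfile1 /var/lib/mysql/
cp -r /restore/mysql /var/lib/mysql/
mysql -uroot -p -e "CREATE DATABASE IF NOT EXISTS your_db1;"    # creates /var/lib/mysql/your_db1
cp /restore/your_db1/*.frm /restore/your_db1/*.ibd /var/lib/mysql/your_db1/
chown -R mysql:mysql /var/lib/mysql
chmod 640 /var/lib/mysql/your_db1/*.frm /var/lib/mysql/your_db1/*.ibd

# Step 6d: innodb_log_file_size must match the size of the copied ib_logfile0
ls -lh /var/lib/mysql/ib_logfile0        # e.g. 48M -> innodb_log_file_size=48M in /etc/mysql/my.cnf

# Step 7: restart
service mysql restart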

[MongoDB]

  • Create a docker-compose file with a MongoDB image of the same version (you can check the WiredTiger version in the recovered data and map it back to a MongoDB version; see the sketch after the docker-compose file in the appendix).
  • Start docker-compose with an empty folder mounted as the data dir.
  • Notes: Copy all the files in the /data directory into the new volume mount
    • – All *.wt files
    • – storage.bson, WiredTiger*, diagnostic.data/
  • Export all collections to JSON files so they can be imported into the new database (see the loop sketch after the commands below).
  • Useful commands:
# Export mongodb to json file 
mongoexport --collection=<tblname> --db=<dbname> --out=/tmp/backup/<dbname>.<tblname>.json

# To connect to the managed MongoDB. You have to install pymongo[srv] if using the Python client
mongosh "mongodb+srv://<user>:<pwd>@xxxxxx.mongo.ondigitalocean.com/admin?tls=true&authSource=admin" --tls

# Import mongodb collection 
mongoimport --collection <tblname>  "mongodb+srv://<user>:<pwd>@xxxxxx.mongo.ondigitalocean.com/<dbname>?tls=true&authSource=admin" --file collections/<dbname>.<tblname>.json
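
To export every collection of a database in one go, a small loop like this can help (run it where mongoexport can reach the restored instance, e.g. against localhost:27017; <dbname> is a placeholder, and this uses the legacy mongo shell that ships with 3.x images; with newer servers use mongosh):

# List the collections of <dbname> and export each one to its own JSON file
for coll in $(mongo --quiet <dbname> --eval "db.getCollectionNames().join('\n')"); do
  mongoexport --db=<dbname> --collection="$coll" --out="/tmp/backup/<dbname>.$coll.json"
done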

Appendix

Steps to create the k8s resources.

kubectl apply -f pv.yaml
kubectl apply -f pvc.yaml
kubectl apply -f mysql.yaml
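
Before restoring any data, you can check that the claim bound and the pod came up:

kubectl get pv,pvc                       # mysql-pv should be Bound to mysql-pv-claim
kubectl get pods -l app=mysql-test       # wait until the pod is Running
kubectl logs deploy/mysql-test           # watch the MySQL startup logs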

Steps to clean up/delete the k8s resources

kubectl delete -f mysql.yaml
kubectl delete -f pvc.yaml
kubectl delete -f pv.yaml

Manifests that I used to restore MySQL, for reference

pv/persistent-volume

# pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-pv
  labels:
    type: local
spec:
  storageClassName: do-block-storage
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    # You should change this path if reapplying the pv/pvc doesn't work (so you start from clean data).
    path: "/mnt/mysqldata3"

pvc/persistent volume claim

# pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
spec:
  storageClassName: do-block-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi

mysql/deployment

# mysql.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql-test
spec:
  selector:
    matchLabels:
      app: mysql-test
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mysql-test
    spec:
      containers:
        - image: mysql:5.7.31
          name: mysql
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: password123
          ports:
            - containerPort: 3306
              name: mysql
          volumeMounts:
            - name: mysql-persistent-storage
              mountPath: /var/lib/mysql
          resources: 
            limits:
              ephemeral-storage: 50Gi
              memory: 20Gi
            requests:
              cpu: 250m
              ephemeral-storage: 50Gi
              memory: 20Gi
      volumes:
        - name: mysql-persistent-storage
          persistentVolumeClaim:
            claimName: mysql-pv-claim
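
Once the pod is up and the data files have been copied in, a quick sanity check from inside it. Note that after you restore the mysql/ system database, the root password is the original cluster's one, not the MYSQL_ROOT_PASSWORD from this manifest.

# List the databases inside the restore pod (enter the root password when prompted)
kubectl exec -it deploy/mysql-test -- mysql -uroot -p -e "SHOW DATABASES;"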

mongodb docker-compose

# docker-compose.yaml
version: '3'
services:
  database:
    image: 'bitnami/mongodb:3.6.20'
    container_name: "mongodbtest"
    environment:
      - MONGO_INITDB_DATABASE=test
      - MONGO_INITDB_ROOT_USERNAME=root
      - MONGO_INITDB_ROOT_PASSWORD=password123
    volumes:
      - ./mongo-volume1:/bitnami/mongodb/data/db
    ports:
      - '27017-27019:27017-27019'
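
Before picking the image tag, you can peek at the recovered WiredTiger file to see which storage-engine version wrote the data, then map that back to a MongoDB release. A rough usage sketch, assuming the recovered files sit in ./recovered-data; the bitnami image runs as a non-root user (commonly UID 1001), so the copied files may need their ownership adjusted.

head -n 2 ./recovered-data/WiredTiger       # shows the WiredTiger version that wrote the files
cp -a ./recovered-data/. ./mongo-volume1/   # place the data where the compose file mounts it
sudo chown -R 1001:1001 ./mongo-volume1     # assumption: the bitnami container runs as UID 1001
docker-compose up -d
docker-compose logs -f database             # confirm mongod starts cleanly on the restored files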
