
Back up and restore Watson Studio Local

This document walks through a full data backup and restore for Watson Studio Local.

The main scenarios are:

  • Backup for future recovery
  • Backup for move to a new cluster
  • Restore from backup
Note: Before you perform your backup and restore procedures, review the following tasks and warnings to avoid damaging your cluster and having to recover from older backups.
  • Patch 07 includes updated backup and restore tools. If you want those tools and you previously installed patch 04, you must replace patch 04 with the updated backup and restore tools included in patch 07. The updated tools are in part 04 (wsl-x86-v1231-patch07-part04.tar.gz) of the patch.
  • A file or directory that is manually created and assigned the name "libraries" will not be included in the backup. Rename the file or directory and contact support if you need help.
  • Confirm that you do not have the user-home glusterfs volume mounted locally before running the backup scripts. Use this command to check: mount | grep -v docker | grep -v kube | grep -E "pvc|glusterfs". If the mount exists, you must unmount it or reboot the node before trying the backup again.
  • If the backup script is interrupted, then you must reboot the node that you are running the backup scripts from. Before you reboot, use this command: systemctl stop kubelet docker.

    After the reboot is completed, run this command: mount | grep -v docker | grep -v kube | grep -E "pvc|glusterfs" to make sure that you don't have any manual glusterfs mounts.

    You must do this reboot before running the backup script again so that any outstanding processes from the previous backup are cleaned up; a sketch of this recovery sequence follows this note.
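
If the backup script was interrupted, the recovery sequence described in this note looks roughly like the following. This is a minimal sketch that combines the commands above; run it as root on the node where the backup script was started, and adapt it to your environment:

    # Stop Kubernetes and Docker before rebooting the node.
    systemctl stop kubelet docker
    reboot

    # After the node is back up, confirm that no manual glusterfs mounts remain.
    # The command should return no output.
    mount | grep -v docker | grep -v kube | grep -E "pvc|glusterfs"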

Back up and restore images, settings, and user-home volume

Back up and restore custom images, dashboard settings from MongoDB, CloudantDB, and the user-home volume for the same version of Watson Studio Local 1.2.x on clusters that use only GlusterFS.

Note: If you are using InfluxDB and it experiences a storage failure or is not included in a backup, the following information is lost:
  • Machine learning model evaluation history
  • All deployment metrics and history, such as the number of requests.

The following data is not included in this backup and won't be available in the restored cluster:

  • Job runs and their associated job logs.
  • Model evaluation results.

Locate and download wsl-x86-v1231-patch07-part04.tar.gz.

  1. Confirm that all the nodes, pods, and gluster volumes are running well.

    kubectl get nodes (To make sure that all nodes have Ready status).

    kubectl get po --all-namespaces | grep -v Running | grep -v Completed (To make sure that you do not have failing pods.)

    gluster volume status | grep ' N ' (To make sure that all gluster volumes are online; the command should return no output.)

  2. The scripts must be run as root on the master_1 or master_2 node.
  3. The MongoDB, CloudantDB, and custom images scripts require their pods to be in a "Running" state.

    kubectl get po --all-namespaces | grep "cloudant\|mongo\|dash-front"

  4. The backup process requires zero activity on the cluster. Restart the cluster, and then cordon all the nodes to stop scheduling on them.

    Commands to run before backing up the cluster:

    systemctl stop kubelet; systemctl stop docker (on all nodes)

    systemctl start docker; systemctl start kubelet (on all nodes)

    kubectl get po --all-namespaces | grep -v Running | grep -v Completed (To make sure that you do not have failing pods.)

    gluster volume status | grep ' N ' (To make sure that all gluster volumes are online.)

    kubectl get no --no-headers | awk '{system("kubectl cordon "$1)}' (on master node_1)

    Command to run after the backup is completed:

    kubectl get no --no-headers | awk '{system("kubectl uncordon "$1)}' (on master node_1)

    Note that the generated backup files contain all the data and configuration information for the cluster. You must encrypt the backup files to protect the data, both locally and during any network transfer; one possible approach is shown below.
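
    For example, one way to encrypt a backup file is with GPG symmetric encryption. The tool choice and file name here are illustrative assumptions; use whatever encryption mechanism your organization requires:

    # Encrypt the archive with AES-256; you are prompted for a passphrase.
    gpg --symmetric --cipher-algo AES256 /backupdir/user-home-20190508075844.tar.gz

    # Decrypt it again before a restore.
    gpg --output /backupdir/user-home-20190508075844.tar.gz --decrypt /backupdir/user-home-20190508075844.tar.gz.gpg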

The following items are backed up in the following order (a combined example follows the individual script descriptions below):

  1. Custom images
  2. MongoDB
  3. CloudantDB
  4. User-home volume

The following items are restored in the following order:

  1. User-home
  2. CloudantDB
  3. MongoDB
  4. Custom images

Back up and restore custom images

The backup and restore process time depends on the number of images.
  1. To back up and restore custom images, enter: backup-restore-customimages.sh.
  2. During the backup process, the custom images are downloaded from the server and archived into a tar file with the following name format: customImage-${timestamp}.tar.gz.
  3. Specify the path of the backup directory. For example, ./backup-restore-customimages.sh [ -b | --backup ] /backupdir.
  4. During the restore process, the saved custom images are uploaded back to the server. Specify the full path of the tar file. For example, ./backup-restore-customimages.sh [ -r | --restore ] /backupdir/customImage-20190507191947.tar.gz.
Back up and restore MongoDB
  1. To back up and restore the admin dashboard SMTP settings from MongoDB, enter: backup-restore-mongo.sh. Note that the process does not restore the metrics and alerts.
  2. During the backup process, the admin settings are downloaded from MongoDB and archived into a tar file with the following name format: mongo-${timestamp}.tar.gz.
  3. Specify the path of the backup directory. For example, ./backup-restore-mongo.sh [ -b | --backup ] /backupdir/.
  4. During the restore process, the saved admin settings are restored to MongoDB. Specify the full path of the tar file. For example, ./backup-restore-mongo.sh [ -r | --restore ] /backupdir/mongo-20190508075844.tar.gz.
Back up and restore CloudantDB
  1. To back up and restore CloudantDB, enter: backup-restore-cloudant.sh. CloudantDB contains the user login information and other metadata.
  2. During the backup process, CloudantDB is backed up into a tar file, cloudant-backup.tar. The archive contains a directory of JSON files, one for each exported database.
  3. Specify the path of the backup directory. For example, ./backup-restore-cloudant.sh [ -b | --backup ] /backupdir.
  4. During the restore process, the CloudantDB tar files are restored. Specify the full path of the tar file. For example, ./backup-restore-cloudant.sh [ -r | --restore ] /backupdir/cloudant-20190508075844.tar.gz.
Back up and restore user-home
  1. To back up and restore the user-home volume, enter: backup-restore-user-home.sh.
  2. During the backup process, the user-home volume is backed up to a tar file with the following name format: user-home-${timestamp}.tar.gz.
  3. Specify the path of the backup directory. For example, ./backup-restore-user-home.sh [ -b | --backup ] /backupdir/.
  4. During the restore process, the user-home tar is restored over a user-home volume. Specify the full path of the tar file. For example, ./backup-restore-user-home.sh [ -r | --restore ] /backupdir/user-home-20190508075844.tar.gz.
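
The four backup scripts can be run from a single wrapper in the documented order. The following is a minimal sketch under the assumption that the scripts and a /backupdir directory already exist on the node; it is not part of the shipped tooling:

    #!/bin/bash
    # Back up in the documented order: custom images, MongoDB, CloudantDB, user-home.
    set -e
    BACKUP_DIR=/backupdir
    ./backup-restore-customimages.sh --backup "$BACKUP_DIR"
    ./backup-restore-mongo.sh --backup "$BACKUP_DIR"
    ./backup-restore-cloudant.sh --backup "$BACKUP_DIR"
    ./backup-restore-user-home.sh --backup "$BACKUP_DIR"

    # To restore, run the scripts in the reverse order (user-home, CloudantDB,
    # MongoDB, custom images), passing --restore and the full path of each
    # generated tar file.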

Procedures to perform after a restore

All of the pods under the dsx namespace must be restarted after the restore process is complete. To do so, run the following command:

kubectl get po -l 'heritage=Tiller,!job-name' -n dsx --no-headers | awk '{system("kubectl delete po -n dsx " $1)}'
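
After the pods are deleted, Kubernetes reschedules them automatically. You can confirm that everything has returned to a Running state with the same check that is used in the preparation steps:

kubectl get po -n dsx --no-headers | grep -v Running | grep -v Completed (To make sure that you do not have failing pods.)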