
Manage images

In the Admin Console, click the menu icon and then click Image Management to augment the runtime images that DSX Local users can start from. Each image contains a specific runtime (for example, a Jupyter notebook, RStudio, or a Zeppelin notebook) and a set of packages and libraries. After the DSX administrator deploys an image, users can select it from the Environments page of their project without having to upload any of the packages and libraries themselves.

Complete the following steps to customize a runtime image:

  1. Download an existing image
  2. Modify the existing image
  3. Upload the customized image
  4. Validate the customized image
  5. Create a worker environment for the customized image

Want to see image management in action? Watch this short video:

Figure 1. Video: Image Management in IBM Data Science Experience Local
This video shows how you can augment the IBM-provided images in Data Science Experience Local to add your own set of packages and libraries. You can then upload these images so that your DSX Local users can use them to create assets such as Jupyter notebooks, Zeppelin notebooks, and RStudio files.

Download an existing image

In the Image Management page, go to the Image List and click Download next to the runtime image you want to customize. Wait several minutes for DSX Local to prepare and download the TAR.GZ file.

Image Management screencap
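
Before modifying a downloaded archive, it can be worth confirming that the download completed intact. The following is a minimal sketch, not part of DSX Local: the function name check_image_archive is hypothetical, and the script assumes the archive is in the format that docker save produces, which places a manifest.json file at the top level of the tar.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check a downloaded runtime image archive.
# Assumption: the archive is in `docker save` format (top-level manifest.json).

check_image_archive() {
  local archive="$1"

  # The file must exist and be non-empty.
  [ -s "$archive" ] || { echo "missing or empty: $archive" >&2; return 1; }

  # gzip -t verifies the compressed stream without extracting it.
  gzip -t "$archive" 2>/dev/null || { echo "corrupt gzip: $archive" >&2; return 1; }

  # List the tar members and look for the manifest written by docker save.
  if tar -tzf "$archive" | grep -x -e 'manifest.json' -e './manifest.json' >/dev/null; then
    echo "ok: $archive"
  else
    echo "no manifest.json in: $archive" >&2
    return 1
  fi
}

# Usage (file name is a placeholder):
#   check_image_archive <image_name>_<tag>.tar.gz
```

A truncated download typically fails the gzip test, so this check catches the most common problem before docker load is attempted.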

Modify the existing image

Complete the following steps to modify the runtime image to add packages:

  1. Install Docker. See Download Docker Community Edition for details.
  2. Change the working directory to the folder with the downloaded image.
  3. Load the image:
    docker load -i <image_name>_<tag>.tar.gz
    
    You will receive either the image ID or name:tag identifier.
  4. Create a file named Dockerfile that uses the loaded image as the base image. The following examples each install a package:
    • Python package in a Jupyter or Zeppelin notebook, using pip:
      FROM <identifier>
      RUN /opt/conda/bin/pip install arrow
      
    • Python package in a Jupyter or Zeppelin notebook, using conda (requires admin access; the -y flag answers the confirmation prompt, which a non-interactive build cannot do):
      FROM <identifier>
      RUN /opt/conda/bin/conda install -y arrow
      
    • R package in a Jupyter notebook:
      FROM <identifier>
      RUN Rscript -e "install.packages('data.table', repos='http://Rdatatable.github.io/data.table')"
      
    • Scala JAR package in a Jupyter or Zeppelin notebook: use the wget, curl, or apt-get command to download the necessary JAR files, and then move them to /usr/local/spark/jars. The JAR files then become available to Scala programs.
    • R package in RStudio:
      FROM <identifier>
      RUN Rscript -e "install.packages('data.table', repos='http://Rdatatable.github.io/data.table')"
      
    • Installing R packages is not supported for Zeppelin.
  5. Build the image with a new identifier. Jupyter example (do not omit the trailing dot, which sets the build context to the current directory):
    docker build -t modified-jupyter:v1.0 .
    
  6. Save the image and compress it as a TAR.GZ file. Example:
    docker save modified-jupyter:v1.0 | gzip > modified-jupyter-v1.0.tar.gz
    

You can now upload the new TAR.GZ file to the DSX Local cluster.
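
If you customize images regularly, steps 4 through 6 above can be wrapped in a small script. The following is a sketch under stated assumptions, not an IBM-provided tool: the function names make_dockerfile and build_and_save and the kind keywords (pip, conda, r) are hypothetical, the R case assumes the public CRAN mirror at https://cloud.r-project.org is reachable and carries the package, and the conda case adds -y so that the build does not wait for a confirmation prompt.

```shell
#!/usr/bin/env bash
# Sketch of steps 4-6. Run `docker load` (step 3) first to learn the identifier.

# Print a Dockerfile that installs one package on top of a loaded base image.
#   $1 = base image identifier reported by `docker load`
#   $2 = install kind: pip | conda | r   (hypothetical keywords)
#   $3 = package name
make_dockerfile() {
  local base="$1" kind="$2" pkg="$3" run_line
  case "$kind" in
    pip)   run_line="RUN /opt/conda/bin/pip install $pkg" ;;
    conda) run_line="RUN /opt/conda/bin/conda install -y $pkg" ;;
    # Assumption: a CRAN mirror is used instead of a package-specific repository.
    r)     run_line="RUN Rscript -e \"install.packages('$pkg', repos='https://cloud.r-project.org')\"" ;;
    *)     echo "unknown kind: $kind" >&2; return 1 ;;
  esac
  printf 'FROM %s\n%s\n' "$base" "$run_line"
}

# Write the Dockerfile to a scratch directory, build it, and save it as TAR.GZ.
#   $1 = base identifier, $2 = kind, $3 = package, $4 = new name:tag
build_and_save() {
  local base="$1" kind="$2" pkg="$3" new_tag="$4" workdir
  workdir="$(mktemp -d)" || return 1
  make_dockerfile "$base" "$kind" "$pkg" > "$workdir/Dockerfile" || return 1
  docker build -t "$new_tag" "$workdir" || return 1
  # Replace ':' with '-' so that the identifier can be used as a file name.
  docker save "$new_tag" | gzip > "${new_tag//:/-}.tar.gz"
}

# Usage:
#   build_and_save <identifier> pip arrow modified-jupyter:v1.0
#   # writes modified-jupyter-v1.0.tar.gz to the current directory
```

Keeping the Dockerfile generation separate from the Docker calls makes it possible to inspect the generated file before any build runs.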

Upload the customized image

Complete the following steps to upload the modified TAR.GZ file back to DSX Local:

  1. In the Image Management page, click Upload runtime image.

    Image Management screencap

  2. Type in a new image name, tag, and description in the upload page.

    Upload Image screencap

    • The name should contain only lower-case characters, underscores, or hyphens.
    • The tag should contain only lower-case characters, dots, underscores, or hyphens.
    • The name and tag should be unique and not conflict with any previously uploaded images.
    • The description field should not exceed 256 characters.
  3. Click the Browse button.
  4. Select the TAR.GZ file to be uploaded and click the Upload button. Because runtime image files are large, the upload might take a while. Note that by default, the server times out after 60 minutes and then returns a 502 error.
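
To gauge whether an upload is likely to finish inside the 60-minute window, you can compute the minimum sustained transfer rate that the file requires. The following is a rough sketch: the function name min_upload_rate is hypothetical, and the arithmetic ignores protocol overhead and any server-side processing time, so treat the result as a lower bound.

```shell
#!/usr/bin/env bash
# Print the minimum sustained rate (KiB/s, rounded up) needed to upload a file
# within a time limit.
#   $1 = path to the TAR.GZ file
#   $2 = time limit in minutes (defaults to 60, the timeout described above)
min_upload_rate() {
  local file="$1" minutes="${2:-60}" bytes seconds bytes_per_sec
  bytes=$(wc -c < "$file") || return 1
  seconds=$(( minutes * 60 ))
  # Ceiling division: bytes per second, then KiB per second.
  bytes_per_sec=$(( (bytes + seconds - 1) / seconds ))
  echo "$(( (bytes_per_sec + 1023) / 1024 )) KiB/s"
}

# Usage:
#   min_upload_rate modified-jupyter-v1.0.tar.gz
```

If the reported rate is close to what your connection to the cluster can sustain, consider uploading from a machine on the same network as the cluster.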

After the upload completes, all collaborators in DSX Local can select the new image from the Images pull-down menu on their project's Edit Environment page. By default, the image is marked unvalidated.

Runtime environments screencap
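
The naming rules from step 2 can be checked locally before you start a long upload. The following is a minimal sketch with a hypothetical function name, check_upload_fields. It assumes that "lower-case characters" includes digits, because the example tag v1.0 used earlier on this page contains digits; adjust the patterns if your cluster is stricter. The uniqueness rule cannot be checked locally and is not covered.

```shell
#!/usr/bin/env bash
# Return 0 and print "ok" if name, tag, and description satisfy the upload rules.
# Assumption: digits are accepted wherever lowercase characters are.

# Match a single-line value against an extended regex, using the C locale so
# that the range a-z covers only ASCII lowercase letters.
_matches() { printf '%s\n' "$1" | LC_ALL=C grep -Eqx "$2"; }

check_upload_fields() {
  local name="$1" tag="$2" description="$3"
  _matches "$name" '[a-z0-9_-]+'  || { echo "invalid name: $name" >&2; return 1; }
  _matches "$tag"  '[a-z0-9._-]+' || { echo "invalid tag: $tag" >&2; return 1; }
  [ "${#description}" -le 256 ]   || { echo "description exceeds 256 characters" >&2; return 1; }
  echo "ok"
}

# Usage:
#   check_upload_fields modified-jupyter v1.0 "Jupyter image with arrow installed"
```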

Validate the customized image

For informational purposes, a DSX administrator can click Validate next to a modified image to mark that it has been successfully tested and approved by other DSX Local users. The DSX administrator can also invalidate the image. Only modified images can have their validation states changed, and invalidated images can still be run by DSX Local users.

Image Management screencap

When an image is validated, the following columns change in the image list:

  • Validated: indicates the current state of the image regarding validation (true or false).
  • Validation Change User: indicates the Admin who initiated the latest validation request.
  • Validation Change Date: indicates the timestamp when the latest validation change was made.

Also, DSX Local users can see an image's validation status from their list of runtime environments.

Create a worker environment for the customized image

To edit a worker environment to use the customized image:

  1. Go to the Jobs page.
  2. Click the Workers tab.
  3. Edit the worker environment and select the customized image.
  4. Save the worker.

As a result, whenever DSX Local users run a batch score or evaluate job, they can select this modified worker in the Advanced settings.

Delete the customized image

A DSX administrator can delete a non-IBM image (that is not running) by clicking Delete next to it. DSX Local then prompts for confirmation before deleting it.