Manage packages as a Watson Studio Local administrator

A Watson Studio Local administrator can install Python or R packages in global directories. These packages are available to all users on the cluster.

Tasks for installing global libraries and packages:

To install a global Python library

  1. Log in to Watson Studio Local as admin and create a Python notebook.
  2. Use the Python pip package installer command to install Python libraries to your notebook. For example, run the following command in a code cell to install the prettyplotlib library:
    !pip install --target /user-home/_global_/python-2.7 prettyplotlib

The installed packages can be used by all notebook users that use the same Python version in the Spark service. Notebook users can now use the Python import command to import the library components. For example, users can run the following command in a code cell:

import prettyplotlib as ppl
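
As a quick check after the installation, a user can confirm from a notebook cell that the global directory is on the interpreter's search path and that the library imports. The following is a minimal sketch, not part of the product: the helper name is_importable is hypothetical, and the sketch assumes that the notebook environment adds /user-home/_global_/python-2.7 to sys.path, which this page implies but does not state.

```python
import importlib
import sys

# Directory used as the pip --target in the install step above.
GLOBAL_DIR = "/user-home/_global_/python-2.7"


def is_importable(module_name):
    """Return True if module_name can be imported in this interpreter."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


# Assumption: the notebook environment puts GLOBAL_DIR on sys.path.
print("global dir on sys.path: %s" % (GLOBAL_DIR in sys.path))
print("prettyplotlib importable: %s" % is_importable("prettyplotlib"))
```

If the second line reports False while the first reports True, the package was probably installed for a different Python version than the one the notebook uses.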

To install a global Python library when the cluster is not connected to the internet

  1. Access the shared volume on the host. As root, do the following actions on the master node:
    1. Create a directory on the master node to mount the user-home volume. For example:
      mkdir -p /mnt/shared-user-home
    2. Find a storage node:
      kubectl get nodes -l is_storage=true

      Example output:

      NAME                                STATUS    AGE
      dev06-kube-storage-1.ibm.com        Ready     31d
      dev06-kube-storage-2.ibm.com        Ready     31d
      dev06-kube-storage-3.ibm.com        Ready     31d

      Pick one of the nodes in the output. In this example, you might pick dev06-kube-storage-1.ibm.com.

    3. Mount the user-home volume:
      mount -t glusterfs <storagehost>:/<namespace>-user-home <mount-point>

      For example:

      mount -t glusterfs dev06-kube-storage-1.ibm.com:/dsx-user-home /mnt/shared-user-home/
  2. From a computer that has access to the internet and that has pip and Python v2.7 installed, run the following command to download the module and its dependencies:
    pip download -d tmp/piptest/prettyplotlib --no-binary :all: prettyplotlib
  3. Use tar or zip to create an archive of the downloaded files:
    tar -cf downloadedModule.tar tmp/piptest/prettyplotlib
  4. Copy the archive to the cluster master node:
    scp downloadedModule.tar root@dev06-kube-master-1:
  5. On the cluster master node, unpack the archive into the _global_ directory of the shared volume that you mounted in step 1:
    cd /mnt/shared-user-home/_global_/
    tar -xf ~/downloadedModule.tar

    Note the location of the directory and the module file:

    [root@dbl164-master-1 _global_]# tar -tf ~/downloadedModule.tar
     tmp/piptest/prettyplotlib/
     tmp/piptest/prettyplotlib/brewer2mpl-1.4.1.zip
     tmp/piptest/prettyplotlib/functools32-3.2.3-2.zip
     tmp/piptest/prettyplotlib/pyparsing-2.2.0.tar.gz
     tmp/piptest/prettyplotlib/cycler-0.10.0.tar.gz
     tmp/piptest/prettyplotlib/python-dateutil-2.6.0.tar.gz
     tmp/piptest/prettyplotlib/six-1.10.0.tar.gz
     tmp/piptest/prettyplotlib/pytz-2017.2.zip
     tmp/piptest/prettyplotlib/matplotlib-2.0.2.tar.gz
     tmp/piptest/prettyplotlib/subprocess32-3.2.7.tar.gz
     tmp/piptest/prettyplotlib/numpy-1.12.1.zip
     tmp/piptest/prettyplotlib/prettyplotlib-0.1.7.tar.gz

    Because the archive was unpacked in the _global_ directory, the location of the module file in this example is:

    /mnt/shared-user-home/_global_/tmp/piptest/prettyplotlib/prettyplotlib-0.1.7.tar.gz

    On the pod that is running the notebook server, this location is:

    /user-home/_global_/tmp/piptest/prettyplotlib/prettyplotlib-0.1.7.tar.gz
  6. From Watson Studio Local, log in as admin, create a new Python notebook, and enter the following command in a cell:
    !pip install --target /user-home/_global_/python-2.7 --no-index --find-links=/user-home/_global_/tmp/piptest/prettyplotlib /user-home/_global_/tmp/piptest/prettyplotlib/prettyplotlib-0.1.7.tar.gz
Note: If the module does not install, repeat these instructions, but remove the --no-binary :all: option from the pip download step.

The installed packages can be used by all notebook users that use the same Python version in the Spark service. Notebook users can now use the Python import command to import the library components. For example, users can run the following command in a code cell:

import prettyplotlib as ppl
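
The offline procedure depends on translating a path on the mounted volume (under /mnt/shared-user-home on the master node) into the path that the notebook pod sees (under /user-home). The following sketch illustrates that mapping and assembles the pip command from it. The function names are hypothetical helpers written for this illustration, and the example path assumes that the archive was unpacked after changing to /mnt/shared-user-home/_global_/, as in step 5.

```python
import posixpath

HOST_MOUNT = "/mnt/shared-user-home"  # mount point on the master node (step 1)
POD_ROOT = "/user-home"               # the same volume as seen inside the notebook pod


def host_to_pod_path(host_path, host_mount=HOST_MOUNT, pod_root=POD_ROOT):
    """Translate a path under the host mount point to its path in the pod."""
    rel = posixpath.relpath(host_path, host_mount)
    if rel == ".." or rel.startswith("../"):
        raise ValueError("%s is not under %s" % (host_path, host_mount))
    return posixpath.normpath(posixpath.join(pod_root, rel))


def offline_pip_command(target_dir, links_dir, archive):
    """Build the notebook cell command that installs from local files only."""
    return "!pip install --target %s --no-index --find-links=%s %s" % (
        target_dir, links_dir, archive)


# Directory that holds the downloaded files after unpacking under _global_.
host_dir = "/mnt/shared-user-home/_global_/tmp/piptest/prettyplotlib"
pod_dir = host_to_pod_path(host_dir)
print(offline_pip_command(
    "/user-home/_global_/python-2.7",
    pod_dir,
    posixpath.join(pod_dir, "prettyplotlib-0.1.7.tar.gz"),
))
```

Deriving both the --find-links directory and the archive path from one variable keeps the two arguments consistent, which is the most common source of errors when the command is typed by hand.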

To install a global R package

  1. Log in to Watson Studio Local as admin and create an R notebook.
  2. Use the R install.packages() function to install new R packages. For example, run the following command in a code cell to install the ggplot2 package for plotting functions:
    install.packages("ggplot2")

    The installed package can be used by all R notebooks that are running in the Spark service.

Now, users can use the R library() function to load the installed package. For example, a user can run the following command in a code cell:

library("ggplot2")

After running this command, users can call plotting functions from the ggplot2 package in their notebooks.

To install a global R library when the cluster is not connected to the internet

  1. Access the shared volume on the host. As root, do the following actions on the master node:
    1. Create a directory on the master node to mount the user-home volume. For example:
      mkdir -p /mnt/shared-user-home
    2. Find a storage node:
      kubectl get nodes -l is_storage=true
      Example output:
      NAME                            STATUS   AGE
      dev06-kube-storage-1.ibm.com    Ready    31d
      dev06-kube-storage-2.ibm.com    Ready    31d
      dev06-kube-storage-3.ibm.com    Ready    31d
      Pick one of the nodes in the output. In this example, you might pick dev06-kube-storage-1.ibm.com.
    3. Mount the user-home volume:
      mount -t glusterfs <storagehost>:/<namespace>-user-home <mount-point>
      For example:
      mount -t glusterfs dev06-kube-storage-1.ibm.com:/dsx-user-home /mnt/shared-user-home/
  2. From a computer that has access to the internet, go to the CRAN website, search for the package, and download its TAR file either directly from the browser or from the command line as follows.

    First, create the destination folder:

    mkdir -p tmp-r

    Then use wget or curl to download the package from the URL listed on the CRAN website. For example, with wget:

    wget https://cran.r-project.org/src/contrib/ggplot2_2.2.1.tar.gz --directory-prefix=tmp-r

    Alternatively, if R is installed on this computer, download the package from an R session:

    download.packages('ggplot2',destdir='tmp-r')

    The TAR file for the package is downloaded to the tmp-r folder:

    $ ls tmp-r
    ggplot2_2.2.1.tar.gz
  3. Copy the tmp-r directory to the shared volume on the cluster master node:
    scp -r tmp-r root@dev06-kube-master-1:/mnt/shared-user-home/
  4. On the cluster master node, check the uploaded file or files. In this example, the location of the module file is:
    /mnt/shared-user-home/tmp-r/ggplot2_2.2.1.tar.gz
    On the pod that is running the notebook server, this location is:
    /user-home/tmp-r/ggplot2_2.2.1.tar.gz
  5. From Watson Studio Local, sign in as admin, create a new R notebook, and enter the following command in a cell:
    install.packages('/user-home/tmp-r/ggplot2_2.2.1.tar.gz', repos=NULL)

The installed packages can be used by all notebook users that use the same R version in the Spark service. Notebook users can now use the R library() command to load the library components. For example, users can run the following command in a code cell:

library(ggplot2)
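
The file names in the offline R procedure follow one pattern: the CRAN source tarball is named <package>_<version>.tar.gz, and the same name appears in the download URL and in the install.packages() call. The following sketch builds both strings from a package name and version. The function names are hypothetical helpers, and the base URL and the /user-home/tmp-r directory are taken from the examples above. The sketch assumes that the requested version is still served from the src/contrib directory; CRAN keeps only the current release of a package there, so a URL for an older version might not resolve.

```python
# Base URL taken from the wget example above.
CRAN_SRC = "https://cran.r-project.org/src/contrib"


def cran_tarball_name(package, version):
    """Return the source tarball file name, for example ggplot2_2.2.1.tar.gz."""
    return "%s_%s.tar.gz" % (package, version)


def cran_source_url(package, version, base=CRAN_SRC):
    """Return the download URL for the source tarball."""
    return "%s/%s" % (base, cran_tarball_name(package, version))


def pod_install_call(package, version, pod_dir="/user-home/tmp-r"):
    """Return the R call that installs the copied tarball from the pod path."""
    return "install.packages('%s/%s', repos=NULL)" % (
        pod_dir, cran_tarball_name(package, version))


print(cran_source_url("ggplot2", "2.2.1"))
print(pod_install_call("ggplot2", "2.2.1"))
```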