Plan for Watson Studio Local

Before you install Watson Studio Local, you must set up the hardware and software for its private cloud architecture.

Planning for storing your user data and Docker images

Decide how you want to set up your file system and storage by considering the following options:
Storing your user data
By default, user data is stored on storage provided by your enterprise NFS service. At least 500 GB of available storage is recommended. NFS storage is required for all production deployments.

If you are creating a small test system and want to bypass requesting NFS storage, you can use GlusterFS to store the user data on local disks spread across the nodes in the cluster. The local storage requirements below list the additional local storage that GlusterFS requires.
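
If you want to confirm ahead of time that the storage you plan to use for user data meets the recommended capacity, a quick check such as the following can help. This is a minimal sketch: the /mnt/user-data mount point is a hypothetical example, and the 500 GB threshold comes from the recommendation above; adjust both for your environment.

    # Minimal sketch: check that the storage planned for user data meets the
    # recommended capacity. The mount point below is a hypothetical example;
    # the 500 GB threshold comes from the recommendation above.
    import shutil
    import sys

    MOUNT_POINT = "/mnt/user-data"   # hypothetical mount point for user-data storage
    REQUIRED_GB = 500                # recommended minimum capacity

    usage = shutil.disk_usage(MOUNT_POINT)
    total_gb = usage.total / (1024 ** 3)
    free_gb = usage.free / (1024 ** 3)

    print(f"{MOUNT_POINT}: {total_gb:.0f} GB total, {free_gb:.0f} GB free")
    if total_gb < REQUIRED_GB:
        sys.exit(f"WARNING: {MOUNT_POINT} is below the recommended {REQUIRED_GB} GB")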

Storing Docker images
Use the following table to determine which Docker storage driver to use for storing images:
If you're using...                            Use...
x86 or POWER using RHEL 7.6                   overlay2 or devicemapper
x86 or POWER using an older version of RHEL   devicemapper

If your RHEL version was patched for overlay2, use overlay2.
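
If you script your pre-installation checks, a sketch like the following can read the installed RHEL release and suggest a storage driver based on the table above. Reading /etc/redhat-release is a standard RHEL convention rather than a Watson Studio Local requirement, and the 7.6 cutoff mirrors the table; treat both as assumptions to adapt.

    # Minimal sketch: suggest a Docker storage driver from the installed RHEL
    # release, following the table above (RHEL 7.6 can use overlay2 or
    # devicemapper; older releases should use devicemapper unless the OS has
    # been patched for overlay2).
    import re

    with open("/etc/redhat-release") as release_file:
        release = release_file.read().strip()

    match = re.search(r"release (\d+)\.(\d+)", release)
    if not match:
        raise SystemExit(f"Could not parse a RHEL version from: {release!r}")

    version = (int(match.group(1)), int(match.group(2)))
    if version >= (7, 6):
        print(f"{release}: use overlay2 (devicemapper is also supported)")
    else:
        print(f"{release}: use devicemapper (overlay2 only if the OS is patched for it)")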

Local storage requirements on your nodes

Use the following file system and storage combinations to determine the disk requirements for each node (a verification sketch follows the list):

NFS + overlay2
    / (root partition): at least 10 GB
    /ibm: at least 500 GB, formatted with XFS

NFS + devicemapper
    / (root partition): at least 10 GB
    /ibm: at least 300 GB, formatted with XFS
    200 GB of additional raw disk

GlusterFS + overlay2
    / (root partition): at least 10 GB
    /ibm: at least 500 GB, formatted with XFS
    Master nodes also require a /data partition of at least 500 GB, formatted with XFS

GlusterFS + devicemapper
    / (root partition): at least 10 GB
    /ibm: at least 300 GB, formatted with XFS
    200 GB of additional raw disk
    Master nodes also require a /data partition of at least 500 GB, formatted with XFS
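
To check a node against the combination you chose, a sketch such as the following can report the size and file system of each required partition. The paths and minimum sizes below mirror the NFS + overlay2 combination in the list above; they are assumptions to adjust for the combination and node role you are validating.

    # Minimal sketch: report the size and file-system type of the partitions
    # named in the requirements above. The minimum sizes reflect the
    # NFS + overlay2 combination; adjust REQUIREMENTS for other combinations
    # (for example, add "/data": 500 on master nodes when using GlusterFS).
    import os
    import shutil

    REQUIREMENTS = {
        "/": 10,      # root partition: at least 10 GB
        "/ibm": 500,  # at least 500 GB, formatted with XFS
    }

    def fs_type(path):
        """Return the file-system type of the mount containing path (Linux only)."""
        best_mount, best_type = "", "unknown"
        with open("/proc/mounts") as mounts:
            for line in mounts:
                _, mount_point, fstype, *_ = line.split()
                if path.startswith(mount_point) and len(mount_point) > len(best_mount):
                    best_mount, best_type = mount_point, fstype
        return best_type

    for path, minimum_gb in REQUIREMENTS.items():
        if not os.path.exists(path):
            print(f"{path}: missing")
            continue
        total_gb = shutil.disk_usage(path).total / (1024 ** 3)
        status = "ok" if total_gb >= minimum_gb else f"below the {minimum_gb} GB minimum"
        print(f"{path}: {total_gb:.0f} GB, {fs_type(path)} ({status})")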

To visualize an example of a file system and storage combination, you can also use the following diagram:

Figure 1. Diagram of a Watson Studio Local and NFS storage example

Decide your node configuration

Before you install Watson Studio Local, consider how many nodes to use in the cluster. Base the selection on the type of workloads that will run and the number of users that will use the cluster. Clusters scale well because additional compute and deployment (production compute) nodes can be added during or after installation. Deployment nodes are the production versions of the compute nodes and have identical requirements.

Restriction: You cannot add more control nodes to any cluster type after installation, so make sure you size your control nodes adequately to support projected growth.

The most basic configuration is a three-node installation where each node serves as a shared control/compute node. A three-node cluster can be scaled by adding compute or deployment nodes.

Figure 2. Architecture for a minimum of four nodes

For a larger production cluster (seven or more nodes), three control nodes and three compute nodes plus one or two deployment nodes are recommended. Compute and deployment nodes can be added during or after installation to scale out the cluster.

Figure 3. Architecture for a minimum of seven nodes

Common configuration examples

The following table shows examples of common configurations. You can select how to break up the cluster based on your requirements.

Cluster type   Node breakdown                        Notes
3 nodes        3 shared control/compute              Unable to deploy assets.
4 nodes        3 shared control/compute + 1 deploy
5 nodes        3 shared control/compute + 2 deploy
7 nodes        3 control + 3 compute + 1 deploy
8 nodes        3 control + 3 compute + 2 deploy
11 nodes       3 control + 6 compute + 2 deploy

Note that asset deployment requires at least one deployment node. Although Watson Studio Local can operate with a single deployment node, deployments cannot be made highly available in this configuration. If that deployment node fails, all deployments go offline and remain offline until the node is repaired or replaced. If you have a standby node available, the downtime is limited to the time required to remove the original node and add the standby node as the new deployment node. If you do not have a standby node, the downtime also includes the time required to repair the original node or provision a replacement. If you choose to configure a single deployment node, ensure that your organization can tolerate such an outage of the deployed models and other analytics assets.
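
If you keep your planned node counts in a script, a small check such as the following can flag the limitations described in this section, such as the lack of high availability with a single deployment node. The check_plan helper and its rules are illustrative only; they are not part of the Watson Studio Local installer.

    # Minimal sketch: flag the planning limitations described in this section
    # for a proposed node breakdown. check_plan is an illustrative helper,
    # not part of the Watson Studio Local installer.
    def check_plan(control, compute, deploy):
        """Return planning warnings for a proposed control/compute/deploy breakdown."""
        warnings = []
        if control < 3:
            warnings.append("Common configurations use 3 control (or shared control/compute) nodes.")
        if deploy == 0:
            warnings.append("No deployment nodes: assets cannot be deployed.")
        elif deploy == 1:
            warnings.append("A single deployment node cannot be made highly available.")
        warnings.append("Control nodes cannot be added after installation; size them for projected growth.")
        return warnings

    # Example: a 4-node cluster with 3 shared control/compute nodes and 1 deployment node.
    for message in check_plan(control=3, compute=3, deploy=1):
        print(message)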