Collaboration in projects
Projects in Watson Studio Local are git repositories. Each project's bare repository is hosted on Watson Studio Local, GitHub, or Bitbucket. When you work on a project, you have your own clone of the bare repository along with your own working copy of the files, stored under /user-home/<uid>/DSX_Projects. You can add, delete, and change files with no risk of affecting your collaborators, who work with their own independent copies of the files. Your clone of the repository tracks the bare repository.
Even if you aren't yet ready to share your work with your collaborators, you can save your changes by committing one or more files. This updates your own git clone of the project with these changes. If files depend on one another, commit the changes to all of them in the same git commit operation.
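The following is a minimal sketch of committing interdependent files together. It uses plain git in a scratch directory rather than the Watson Studio Local commit modal, and the file names config.json and train.py are hypothetical.

```shell
set -eu
# Throwaway identity and scratch directory, so no real project is touched.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q .

# Two hypothetical files that only make sense together.
echo '{"epochs": 5}' > config.json
echo 'print("reads config.json")' > train.py

# Stage both files and record them in a single commit.
git add config.json train.py
git commit -q -m "Add training script and its config"

# List the files recorded in that commit.
git diff-tree --no-commit-id --name-only -r --root HEAD
```

Because both files land in one commit, a collaborator never receives one without the other.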
When you are ready to share your work with your collaborators, you can push the changes in your clone to the bare (master) repository. Note that you cannot push only a selected subset of the files: a push sends all the committed changes for all the files to the bare (master) repository.
To see which repository your project clone is tracking, open the web terminal and run the following git command:
git remote -v
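As a small extension of that command, the sketch below also prints the upstream branch that the current branch tracks. It builds a scratch bare repository and clone so that it can be run anywhere; the directory names seed, project.git, and my-clone are hypothetical.

```shell
set -eu
# Throwaway identity and scratch directory, so no real project is touched.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"

# Build a bare repository with one commit, then clone it.
git init -q seed
(cd seed && echo "hello" > README.txt && git add README.txt && git commit -q -m "Initial commit")
git clone -q --bare seed project.git
git clone -q project.git my-clone
cd my-clone

# Which remote repository does this clone point at?
git remote -v

# Which remote branch does the current branch track?
upstream=$(git rev-parse --abbrev-ref --symbolic-full-name '@{u}')
echo "Tracking: $upstream"
```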
You can:
- Commit changes to your local project repository to establish checkpoints of your work
- Reset the state of your project to a previous commit or checkpoint
- Share file changes with your collaborators by pushing new commits to the remote bare repository
- Resolve merge conflicts using automatic and manual merge conflict strategies
- Create tags to mark a release of your project that is ready for deployment
You can also:
- View commit history
- Use the web terminal to perform git actions on directories you have access to (access is enforced by Linux file permissions)
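To make the checkpoint, history, and reset actions concrete, here is a sketch of their plain git equivalents in a scratch repository. It is an assumption that the Watson Studio Local actions behave like these commands, and the file name notes.txt is hypothetical.

```shell
set -eu
# Throwaway identity and scratch directory, so no real project is touched.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q .

# Two commits act as two checkpoints.
echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "Checkpoint 1"
first_commit=$(git rev-parse HEAD)

echo "version 2" > notes.txt
git commit -q -am "Checkpoint 2"

# View the commit history, newest first.
git log --oneline

# Move the branch back to the first checkpoint and overwrite the working files.
git reset -q --hard "$first_commit"
cat notes.txt
```

A hard reset also discards uncommitted edits, so commit anything you want to keep before using it.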
Enterprise Git Integration
For projects linked to GitHub Enterprise or Bitbucket Enterprise, collaboration settings must be configured within the GitHub or Bitbucket application. The Collaborators tab does not appear in the Watson Studio Local interface for these projects.
Committing file changes
You can commit files through the git commit modal, which handles git add and git commit in one action. Select one or more files in the modal and choose to commit them. Selecting files individually lets you break your changes into multiple commits, but you cannot push while you have uncommitted files. File paths ending with a / character represent entire folders (shown when either the files must be grouped or the directory is empty).
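As an illustration of what the modal's single action corresponds to in plain git, the sketch below runs in a scratch repository with hypothetical file names. It commits two changes separately, and it shows that git itself reports an untracked folder with a trailing / character. How the modal builds its own file list is not covered here.

```shell
set -eu
# Throwaway identity and scratch directory, so no real project is touched.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q .

echo "step one" > analysis.py
git add analysis.py
git commit -q -m "Initial commit"

# One modified file and one new folder containing a file.
echo "step two" >> analysis.py
mkdir reports
echo "draft" > reports/summary.txt

# The untracked folder is listed as a single entry ending in "/".
status_before=$(git status --porcelain --untracked-files=normal)
echo "$status_before"

# Break the work into two commits: git add followed by git commit for each.
git add analysis.py
git commit -q -m "Update analysis script"
git add reports/
git commit -q -m "Add reports folder"

# Empty output means nothing is left uncommitted.
status_after=$(git status --porcelain)
```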
Pushing commits to the remote
After you commit your files, you are prompted to push your commits to the remote bare repository. If your clone of the project is behind the remote bare repository, you must first pull changes from the remote repository. You can see the list of commits that you are pushing to the remote, and you can also create a tag for the last commit made.
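The sketch below illustrates, with plain git and scratch repositories, what being behind the remote means and why a pull has to come before the push. In Watson Studio Local these steps are driven from the UI; the clone names alice and bob and all file names are hypothetical.

```shell
set -eu
# Throwaway identity and scratch directory, so no real project is touched.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"

# One bare repository and two independent clones.
git init -q seed
(cd seed && echo "v1" > data.txt && git add data.txt && git commit -q -m "Initial commit")
git clone -q --bare seed project.git
git clone -q project.git alice
git clone -q project.git bob

# A collaborator pushes first.
(cd bob && echo "bob" > bob.txt && git add bob.txt && git commit -q -m "Bob's change" && git push -q origin HEAD)

cd alice
echo "alice" > alice.txt
git add alice.txt
git commit -q -m "Alice's change"

# Count commits that exist only locally (ahead) and only on the remote (behind).
git fetch -q origin
set -- $(git rev-list --left-right --count 'HEAD...@{u}')
ahead=$1
behind=$2
echo "ahead=$ahead behind=$behind"

# Behind the remote, so merge the remote changes in before pushing.
git pull -q --no-rebase --no-edit

# These are the commits the push will send.
git log --oneline '@{u}..HEAD'
to_push=$(git rev-list --count '@{u}..HEAD')
git push -q origin HEAD
```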
Tagging of commits
You can create tags that are associated with a git commit, which lets you export a project based on a tag. You can tag a commit either in the push modal, before you select the push action, or on the commit history page, which is accessed through the git actions dropdown in the action bar.
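For reference, this is roughly what tagging looks like in plain git, run in a scratch repository. The tag name v1.0 and the file name are hypothetical, and using git archive as a stand-in for the product's export-by-tag feature is an assumption made only for illustration.

```shell
set -eu
# Throwaway identity and scratch directory, so no real project is touched.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q .

echo "model v1" > model.txt
git add model.txt
git commit -q -m "Release candidate"

# Create an annotated tag on the latest commit and list the tags.
git tag -a v1.0 -m "First release ready for deployment"
git tag --list

# Later commits do not move the tag.
echo "model v2" > model.txt
git commit -q -am "Work in progress"

# Write a snapshot of the project exactly as it was at the tag.
out=$(mktemp -d)
git archive --format=tar -o "$out/release-v1.0.tar" v1.0
tar -tf "$out/release-v1.0.tar"
```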
Pulling changes from the remote repository
Pulling changes brings new file changes into your local clone of the project. As part of that action, files are added, deleted, or modified. Modification involves merging file contents. A merge conflict may occur when the files in your clone of the project and the files on the master that you want to merge into have diverged.
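Not every divergence leads to a conflict. The sketch below, which uses plain git in scratch repositories with hypothetical names, has two clones edit different lines of the same file; the pull then merges the file contents without any manual step.

```shell
set -eu
# Throwaway identity and scratch directory, so no real project is touched.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"

git init -q seed
(cd seed && printf 'step 1\nstep 2\nstep 3\nstep 4\nstep 5\nstep 6\nstep 7\n' > pipeline.txt \
  && git add pipeline.txt && git commit -q -m "Initial commit")
git clone -q --bare seed project.git
git clone -q project.git mine
git clone -q project.git collaborator

# The collaborator edits the first line and pushes.
(cd collaborator \
  && printf 'step 1 (remote edit)\nstep 2\nstep 3\nstep 4\nstep 5\nstep 6\nstep 7\n' > pipeline.txt \
  && git commit -q -am "Edit first step" && git push -q origin HEAD)

# The local clone edits the last line of the same file.
cd mine
printf 'step 1\nstep 2\nstep 3\nstep 4\nstep 5\nstep 6\nstep 7 (local edit)\n' > pipeline.txt
git commit -q -am "Edit last step"

# The pull merges both edits into one file.
git pull -q --no-rebase --no-edit
cat pipeline.txt
```

A conflict arises only when both sides change the same or adjacent lines of a file.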
Merge conflict resolution strategies
If there are conflicts when you pull, you will see a list of conflicted files. If you know that all the real merge conflicts should be resolved the same way, either by discarding your changes and keeping the ones from the master repository or by keeping your changes and discarding the ones from the master repository, you can choose the corresponding automatic option. Otherwise, you will need to resolve the conflicts manually.
The simplest way to manually resolve your conflicts is to select the option to use the terminal.
Automatic Resolution: Resolve conflicts in favor of the remote changes
If a merge conflict occurs, the merge algorithm will favor the remote changes. Your local conflicting changes will be discarded and the changes from the remote repository will be kept.
Automatic Resolution: Resolve conflicts in favor of the local changes
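If a merge conflict occurs, the merge algorithm will favor your local changes. The conflicting changes from the remote repository will be discarded and your local changes will be kept.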
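Which git mechanism backs these two options is not stated here. As an assumption, the behavior described resembles git's -X theirs and -X ours merge options, which the sketch below demonstrates in a scratch repository. The branch remote-side is a hypothetical stand-in for the remote repository's version of the project.

```shell
set -eu
# Throwaway identity and scratch directory, so no real project is touched.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q .

echo "rate = 1" > params.txt
git add params.txt
git commit -q -m "Initial commit"

# Branch standing in for the remote version of the project.
git checkout -q -b remote-side
echo "rate = 2" > params.txt
git commit -q -am "Remote change"

# Back on the original branch, make a conflicting local change.
git checkout -q -
echo "rate = 3" > params.txt
git commit -q -am "Local change"
local_tip=$(git rev-parse HEAD)

# Favor the remote side wherever the two versions conflict.
git merge -q --no-edit -X theirs remote-side
favor_remote=$(cat params.txt)

# Undo that merge, then favor the local side instead.
git reset -q --hard "$local_tip"
git merge -q --no-edit -X ours remote-side
favor_local=$(cat params.txt)

echo "favor remote: $favor_remote"
echo "favor local:  $favor_local"
```

Both options affect only the conflicting parts of a file; changes that do not conflict are merged from both sides.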
Manual resolution: Open a web terminal to resolve conflicts
You can manually resolve conflicts in the web terminal. Find the files in conflict using git status. Then edit each conflicted file with vi <fileName> to resolve the conflicts, and stage it with git add <fileName>. Conclude with git merge --continue to create a new commit that resolves the merge conflicts. You can also perform any read, write, or execute actions in the terminal, as long as they involve areas that you have permission to access. This means that git pull and git push will not work for a Watson Studio Local project, because they involve reading files that you do not have permission to access. To perform these actions, you must invoke them from the UI.
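The terminal workflow can be rehearsed safely in a scratch repository. In the sketch below, the branch remote-side is a hypothetical stand-in for the pulled changes, the resolved content is written with echo instead of an interactive vi session, and GIT_EDITOR=true accepts the default merge message so that the script runs unattended.

```shell
set -eu
# Throwaway identity and scratch directory, so no real project is touched.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q .

echo "threshold = 1" > settings.txt
git add settings.txt
git commit -q -m "Initial commit"

git checkout -q -b remote-side
echo "threshold = 2" > settings.txt
git commit -q -am "Remote change"

git checkout -q -
echo "threshold = 3" > settings.txt
git commit -q -am "Local change"

# The merge stops with a conflict, which leaves the repository mid-merge.
git merge -q remote-side || true

# Find the files in conflict.
git status --short
conflicted=$(git diff --name-only --diff-filter=U)

# Resolve: in the web terminal you would edit the file with vi instead.
echo "threshold = 4" > settings.txt
git add settings.txt

# Conclude the merge with a new commit.
GIT_EDITOR=true git merge --continue
```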
Manual resolution: Export the project to your client machine and resolve conflicts
Select this option to export your clone of the project as a zip archive to your client machine. After the project is exported, unzip the archive and load the project in an IDE that supports merge conflict resolution. Resolve the conflicts, zip the project, upload it to Watson Studio Local, and follow the steps to conclude the merge.
What about artifacts that aren't files?
Some of the assets in your project may not appear to be files. Examples include data sources, models, and SPSS Modeler flows. Internally, all these assets are stored as files in the project file system. You can see these files when you use the file explorer view to look at your projects. Git is a generic file versioning system, so it saves versions of these files as well. The lists of files shown during various git operations will include the files containing the definitions of these assets.
What about files I don't recognize?
You may see files in the list of changed or conflicting files that are unfamiliar to you. These will most likely be files automatically created or generated by one of the tools you used, or transient files created by scripts or notebooks that you executed. These files can contain important work that you've done. Only exclude files that you created which are not ready to be committed.
A note about temporary or transient files
There are some files that are created by the tools you are using that are temporary or transient. Examples of this would include the checkpoint files created by Jupyter and intermediary data files stored during the execution of a script. These files are tracked by git and may appear in the list of conflicted files but you should be able to ignore them while resolving any real merge conflicts.
Files that don't need to be merged
It may be that for many of the files either your local copy or the remote copy is the right one to keep. You had to choose the option to resolve conflicts manually only because of other conflicting files. You can use the git command line to select whether to keep the local or remote copy for each of these files.
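One way to make that per-file choice on the git command line is git checkout --ours or git checkout --theirs, followed by git add. The sketch below runs in a scratch repository; the branch remote-side is a hypothetical stand-in for the remote copy, and the file names are hypothetical.

```shell
set -eu
# Throwaway identity and scratch directory, so no real project is touched.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q .

echo "base" > keep_local.txt
echo "base" > keep_remote.txt
git add keep_local.txt keep_remote.txt
git commit -q -m "Initial commit"

git checkout -q -b remote-side
echo "remote" > keep_local.txt
echo "remote" > keep_remote.txt
git commit -q -am "Remote changes"

git checkout -q -
echo "local" > keep_local.txt
echo "local" > keep_remote.txt
git commit -q -am "Local changes"

# Both files conflict, so the merge stops.
git merge -q remote-side || true

# During a merge, --ours is the local copy and --theirs is the copy being merged in.
git checkout --ours -- keep_local.txt
git checkout --theirs -- keep_remote.txt

# Mark both files as resolved and conclude the merge.
git add keep_local.txt keep_remote.txt
GIT_EDITOR=true git merge --continue
```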
Files that you can merge yourself
You can directly merge the files that you create yourself, such as scripts or notebooks. When you open a script file in a text editor, you will see the differences between the versions in the git diff format. As you merge the changes, remove the diff tags so that, when you are finished, the file looks exactly the way you want for the final merged version.
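In a conflicted text file, git brackets the competing versions with marker lines that begin with <<<<<<<, =======, and >>>>>>>. The sketch below produces such a file in a scratch repository, prints it, and then replaces it with a hand-merged version that has no marker lines left. The branch remote-side and the file script.py are hypothetical.

```shell
set -eu
# Throwaway identity and scratch directory, so no real project is touched.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q .

echo 'print("base")' > script.py
git add script.py
git commit -q -m "Initial commit"

git checkout -q -b remote-side
echo 'print("remote")' > script.py
git commit -q -am "Remote change"

git checkout -q -
echo 'print("local")' > script.py
git commit -q -am "Local change"

# The merge stops because both sides changed the same line.
git merge -q remote-side || true

# Show the file with its marker lines, then count them.
cat script.py
markers_before=$(grep -c -E '^(<<<<<<<|=======|>>>>>>>)' script.py)

# Merge by hand: write the final content with every marker line removed.
printf 'print("local")\nprint("remote")\n' > script.py
markers_after=$(grep -c -E '^(<<<<<<<|=======|>>>>>>>)' script.py || true)

git add script.py
GIT_EDITOR=true git merge --continue
```

Searching for leftover marker lines before staging a file is a cheap check that the merge is really finished.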
For Jupyter notebooks, a notebook file is created with comments flagging the Local and Remote changes to enable merging. While keeping the terminal open, log in to the cluster from another browser tab and open the conflicting notebook in Jupyter to see the conflicts most easily and resolve them yourself.
Files that you can't merge directly
For other types of files, you should not attempt to perform the merge at the file level. Use the git command line to keep either your own copy or the copy from the master. You can then use the regular user interface to edit that object to include the changes from the other copy.
Tools that create a set of files
Some of the tools you will be using will store their output as a group of files. In that case, when resolving conflicts you may need to look at the full set of files to make the appropriate edits to resolve your conflicts.