Load and access data for DSX Local

Your notebook can load and access data from these databases and services using the following methods:

Tip: When developing models and other analytics assets, run data preparation on demand and store a copy of the prepared data. As you create scripts for use in your project release, include any necessary data preparation steps as part of each script. For example, when you evaluate a model in deployment, prepare the data just before running the evaluation so that the evaluation runs against the most recent data available.
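The tip above can be sketched in Python as a script that carries its own preparation step and runs it immediately before the evaluation. This is a minimal illustration, not DSX Local API code: `prepare`, `evaluate`, `run_evaluation`, the `load_latest` callable, and the `x`/`y` field names are all hypothetical, and the model is assumed to be a plain callable.

```python
from statistics import mean


def prepare(raw_rows):
    """Hypothetical preparation step: drop incomplete rows, convert to floats."""
    prepared = []
    for row in raw_rows:
        if row.get("x") in (None, "") or row.get("y") in (None, ""):
            continue  # skip rows with a missing field
        prepared.append({"x": float(row["x"]), "y": float(row["y"])})
    return prepared


def evaluate(model, rows):
    """Mean absolute error of the model's predictions on prepared rows."""
    return mean(abs(model(r["x"]) - r["y"]) for r in rows)


def run_evaluation(load_latest, model):
    """Fetch the newest data, prepare it, then evaluate, all in one script run."""
    raw_rows = load_latest()   # assumed to return the most recent raw records
    rows = prepare(raw_rows)   # preparation happens right before evaluation
    return evaluate(model, rows)
```

Because the preparation is part of the script rather than a separate manual step, every scheduled or on-demand run evaluates against freshly prepared data.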

Sometimes the data provided to you might be corrupt or inaccurate, or might not be in a suitable structure for your use. You need to prepare the data (clean and transform it) before you can use it to build models or perform other analytics. You can use three primary strategies:

  • Notebooks, RStudio, and the script editor can all be used to create Python or R code to prepare the data.
  • SPSS Modeler can include nodes to perform data preparation.
  • Data Refinery can be used both to directly prepare the data and to generate an R script that can be used in a job.

You can prepare the data interactively in a notebook, or as a job that runs a script, a notebook, or an SPSS flow.
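As an illustration of the first strategy (Python code written in a notebook or the script editor), the following sketch cleans and transforms a small CSV sample. The sample data, the column names, and the `clean_rows` function are invented for this example. It uses only the Python standard library so that it runs anywhere; in a notebook you might prefer a data frame library instead.

```python
import csv
import io

# Invented sample: padded headers, a missing age, a non-numeric age,
# a duplicate record, and inconsistent capitalization.
RAW_CSV = """id, age ,city
1, 34 ,Austin
2,,Boston
3,abc,Chicago
1, 34 ,Austin
4,51, denver
"""


def clean_rows(text):
    """Return cleaned records: trimmed, typed, and with duplicates removed."""
    reader = csv.DictReader(io.StringIO(text))
    seen_ids = set()
    cleaned = []
    for raw in reader:
        # Clean: strip stray whitespace from column names and values.
        row = {key.strip(): (value or "").strip() for key, value in raw.items()}
        try:
            age = int(row["age"])  # Transform: text to integer
        except ValueError:
            continue  # drop rows whose age is missing or not a number
        if row["id"] in seen_ids:
            continue  # drop duplicate records
        seen_ids.add(row["id"])
        cleaned.append(
            {"id": int(row["id"]), "age": age, "city": row["city"].title()}
        )
    return cleaned


if __name__ == "__main__":
    for record in clean_rows(RAW_CSV):
        print(record)
```

The same function could be called interactively from a notebook cell or saved in a script and run as a job.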