The Zeppelin and Spark notebook environment
Apache Zeppelin notebooks run on kernels and Spark engines.
Apache Zeppelin supports many interpreters such as Scala, Python, and R. The Spark interpreter and Livy interpreter can also be set up to connect to a designated Spark or Livy service.
By default, the Zeppelin Spark interpreter connects to the Spark instance that is local to the Zeppelin container. To point the Spark interpreter at a different Spark version or cluster, update the following Spark interpreter properties:
SPARK_HOME = /usr/local/spark-2.0
master = spark://spark-master-svc:7077
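Before saving the properties, it can help to confirm that the master value is well formed. The following is a minimal sketch, assuming a standalone master of the form spark://host:port. The check_master helper is hypothetical and is not part of Zeppelin or Spark. Other valid master forms, such as local[*] or yarn, are deliberately not covered.

```python
from urllib.parse import urlparse

# Values copied from the example properties above; adjust for your cluster.
properties = {
    "SPARK_HOME": "/usr/local/spark-2.0",
    "master": "spark://spark-master-svc:7077",
}


def check_master(url):
    """Return (host, port) if url looks like spark://host:port, else raise.

    Hypothetical helper for illustration; it only recognizes the
    standalone spark:// form.
    """
    parsed = urlparse(url)
    if parsed.scheme != "spark" or not parsed.hostname or parsed.port is None:
        raise ValueError(f"unexpected Spark master URL: {url!r}")
    return parsed.hostname, parsed.port


host, port = check_master(properties["master"])
print(host, port)
```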
Paragraphs in a notebook can alternate between Scala, Python and R code by specifying an interpreter before each code block:
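To illustrate the idea, the sketch below models how a leading % directive selects the interpreter for a paragraph. It is not Zeppelin's own parser. The split_directive helper is hypothetical, and the interpreter names %spark (Scala), %pyspark, and %r are assumptions that depend on which interpreters are bound to your notebook.

```python
# Illustrative only: this is not how Zeppelin itself parses paragraphs.

def split_directive(paragraph, default="spark"):
    """Return (interpreter, body) for the text of one paragraph.

    The default interpreter name is an assumption; in a real notebook it
    depends on the interpreter bindings.
    """
    text = paragraph.lstrip()
    if not text.startswith("%"):
        return default, text
    parts = text.split(None, 1)  # the directive, then everything after it
    body = parts[1] if len(parts) > 1 else ""
    return parts[0][1:], body


# Three paragraphs, each starting with the interpreter that should run it.
paragraphs = [
    "%spark\nval nums = sc.parallelize(1 to 10)\nprintln(nums.sum())",
    "%pyspark\nnums = sc.parallelize(range(1, 11))\nprint(nums.sum())",
    "%r\nnums <- 1:10\nprint(sum(nums))",
]

for p in paragraphs:
    name, body = split_directive(p)
    print(name, "->", body.splitlines()[0])
```

The same convention applies to single-line paragraphs, where the directive and the statement share one line.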
Tip: If a Zeppelin notebook loads a blank page, refresh the page to load the notebook properly.
Set up your Big SQL interpreter to work within Zeppelin
For a Zeppelin notebook to use the Big SQL data source that you created, you must set up a Big SQL interpreter in the notebook. If you already defined a JDBC Big SQL interpreter, its definition is updated. Complete the following steps from the Zeppelin notebook:
Set up the Big SQL interpreter:
%python
import dsx_core_utils
dsx_core_utils.setup_bigsql_zeppelin("_datasource_name_")
When the setup finishes, run a query. This example uses an interpreter named bigsql:
%bigsql select * from GOSALESDW.EMP_EMPLOYEE_DIM limit 10
Optional: If you have another Big SQL data source defined and want to set up an interpreter with a different name that can also query this database, specify a new interpreter name such as my_bigsql.
You can now run a query with the interpreter:
%my_bigsql select * from GOSALESDW.EMP_EMPLOYEE_DIM limit 10
If the data source definition is updated, for example, by changing the user information or the JDBC URL, you must rerun the setup command so that the interpreter uses the updated definition.