AVL JupyterLab user guide#

Basic usage#

This section briefly introduces the basic features of the JupyterLab environment as deployed on the AVL system. For more in-depth documentation on the individual components, see the links in the ‘Further information’ section.

Logging in#

To log in, navigate to https://jupyter.agriculturevlab.eu/ with a web browser (a recent version of Firefox, Chrome, or Safari is recommended). Log-in is currently managed via GitHub. On your first log-in you will be briefly redirected to GitHub to authorize access and, if necessary, log in to your GitHub account with your password. Subsequent log-ins will usually proceed without this step. If your Jupyter server is not already running you may see a progress bar appear for a few seconds while it is started for you. The JupyterLab interface will then appear in your web browser, ready for use.

Listing and opening datasets#

AVL data is available in Zarr format in three object storage buckets:

  • agriculture-vlab-data, containing stable and tested datasets.
  • agriculture-vlab-data-staging, containing datasets which are currently undergoing testing and evaluation before graduating to the main agriculture-vlab-data bucket.
  • agriculture-vlab-data-test, containing experimental datasets which are not yet sufficiently stable or tested to be stored in the agriculture-vlab-data-staging bucket.

The datasets in a bucket can be listed using an xcube S3 store. The code below searches a bucket to a depth of ten subfolders and lists any xcube-compatible datasets it finds, including Zarr datasets.

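from xcube.core.store import new_data_store
# Create a data store for the staging bucket. anon=True requests anonymous
# (unauthenticated) access; max_depth=10 limits the search depth.
store = new_data_store('s3', root='agriculture-vlab-data-staging',
                       max_depth=10, storage_options=dict(anon=True))
# List the identifiers of all datasets found in the store.
list(store.get_data_ids())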

This will produce a list of dataset identifiers within the store, for example:

path1/path2/dataset1.zarr
path1/path2/dataset2.zarr
path2/dataset3.zarr
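
The identifiers are plain strings, so they can be filtered and grouped with ordinary Python. The following is a minimal sketch using the example identifiers above, hard-coded here so that it runs without network access; in practice the list would come from store.get_data_ids(). The variable names are illustrative only.

```python
from collections import defaultdict

# Example identifiers, hard-coded so the sketch runs without network access;
# in practice this would be list(store.get_data_ids()).
data_ids = [
    'path1/path2/dataset1.zarr',
    'path1/path2/dataset2.zarr',
    'path2/dataset3.zarr',
]

# Keep only the Zarr datasets below a given folder prefix.
in_path1 = [i for i in data_ids
            if i.startswith('path1/') and i.endswith('.zarr')]

# Group all identifiers by their top-level folder.
by_folder = defaultdict(list)
for data_id in data_ids:
    top_level = data_id.split('/', 1)[0]
    by_folder[top_level].append(data_id)

print(in_path1)
print(dict(by_folder))
```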

A dataset from this list can then be opened using the store object:

cube = store.open_data('path1/path2/dataset1.zarr')

Data in non-AVL S3 buckets can be listed and read in the same way.

A dataset can also be opened directly from an S3 path without instantiating a store object, as below. Note that there should be no trailing slash after the Zarr name.

from xcube.core.dsio import open_cube
cube = open_cube('s3://bucket/path/to/dataset.zarr', s3_kwargs=dict(anon=True))
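
To avoid an accidental trailing slash, the S3 path can be assembled programmatically from a bucket name and a dataset identifier. The helper below is a hypothetical convenience function written for this guide, not part of xcube; it only builds the string and does not contact any server.

```python
def s3_url(bucket: str, data_id: str) -> str:
    """Build an s3:// URL for a dataset, without a trailing slash.

    Hypothetical helper for illustration; not part of xcube.
    """
    # strip('/') removes leading and trailing slashes from both parts.
    return f"s3://{bucket.strip('/')}/{data_id.strip('/')}"

# The trailing slash in the identifier is removed by the helper.
url = s3_url('agriculture-vlab-data-staging', 'path1/path2/dataset1.zarr/')
print(url)
```

The resulting string can then be passed as the first argument to open_cube.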

Uploading data#

JupyterLab runs remotely on an AVL server, and can work directly with files stored in your user area on the server. To work with a file stored on your local computer, you must first upload it to the server. You can do this by clicking on the Upload (⇪) icon near the top left of the JupyterLab interface, or simply by dragging the file from your file manager to the file list along the left side of the JupyterLab interface. After upload the file will be directly accessible in the notebook environment.
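
One way to confirm from within a notebook that an upload has arrived is to check for the file with Python's standard library. This is only a sketch: the file name my_data.csv is a placeholder, and it assumes that the file was uploaded to the folder from which the notebook is run.

```python
from pathlib import Path

def check_upload(path):
    """Report whether a file is visible to the notebook, and its size if so."""
    path = Path(path)
    if path.exists():
        return f'{path.name} found ({path.stat().st_size} bytes)'
    return f'{path.name} not found in {path.resolve().parent}'

# Placeholder name; replace with the name of the file you uploaded.
print(check_upload('my_data.csv'))
```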

Creating a notebook#

You can create a new notebook from the JupyterLab File menu (File → New → Notebook). If you are prompted to select a kernel, choose ‘Python 3 (ipykernel)’. You can also create a notebook by clicking on the ‘Python 3 (ipykernel)’ icon under the heading ‘Notebook’ in the JupyterLab launcher. The new notebook will open in the main part of the JupyterLab interface with an empty input cell at the top, ready for your first input to the Python interpreter.

Importing Python libraries#

The AVL Python environment includes a large number of preinstalled scientific libraries to support common use cases in data processing and analysis of EO and agricultural data. A brief list of these libraries can be found in the software reuse file for the exploitation subsystem. You can view a full and current list of installed packages in the notebook itself by entering this command into an input cell in a notebook:

!conda list

Installed libraries can be imported using the standard Python import statement.
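
To check whether a particular module is importable without actually importing it, the standard library's importlib can be used. The sketch below is illustrative; the helper name is_installed is an assumption of this example, and note that the name used in an import statement is not always the same as the name of the conda package that provides it.

```python
import importlib.util

def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None

# 'json' is part of the standard library; the other names may or may not
# be present, depending on the environment.
for name in ['json', 'xarray', 'example_package']:
    print(name, is_installed(name))
```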

If you require a library that isn’t already installed in AVL, please contact AVL support to request it; in most cases it’s quick and easy to add a new library to the environment. This is the preferred method of adding libraries to AVL, but if you require a library urgently, and if it’s available in a conda channel such as conda-forge, you can also install it yourself. For example, to install a package called example_package from the conda-forge distribution channel:

import sys
!mamba install --yes --channel conda-forge --prefix {sys.prefix} example_package

If a package is not available in any conda channel, it can also be installed with pip:

import sys
!{sys.executable} -m pip install example_package

Note: pip installation should only be used if the package is not available in a conda channel, since it can cause conflicts with AVL’s existing conda-based package management.
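
After installing a package by either route, it can be useful to confirm which version the running kernel sees. The following sketch uses the standard library's importlib.metadata (Python 3.8 or later); the helper name installed_version is an assumption of this example, and example_package is the same placeholder as above. If a module was already imported before installation, the kernel may need to be restarted for the change to take effect.

```python
from importlib import metadata

def installed_version(distribution_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None

# Placeholder name; replace with the package you installed.
print(installed_version('example_package'))
```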

Working with the Jupyter notebook#

For more information, see ‘The JupyterLab Interface’ in the JupyterLab documentation.

The Jupyter scientific notebook combines features of an interactive terminal environment (like, for instance, the bash or ipython shell) with features of a programmer’s text editor. Within the notebook you can interact with the Python environment by entering commands or expressions; your command history and the associated output are stored and can be edited, re-run, rearranged, annotated, saved, and shared.

You interact with a Jupyter notebook by typing or pasting an expression or command into an input cell. When you press shift-enter or click the ▶ icon above the notebook, the contents of the cell are evaluated by the Python interpreter, and the result is displayed in a new cell below your input – depending on the command this may be text, an image, or an interactive widget. A new input cell is created below the displayed result, ready for your next input.

By clicking the ▸▸ icon, you can run the entire notebook from start to finish – not unlike a traditional Python script, but with the results from every input cell evaluation interleaved into the notebook.

You can also comment and document your notebooks by including cells that contain not Python code but Markdown source. Markdown is a simple markup language which lets you add symbols to plain text to indicate common formatting operations such as headings, bold or italic text, tables, and lists. In addition to Markdown, you can include LaTeX-style mathematical formatting by enclosing text between $ characters. To use an input cell for Markdown rather than code, use the drop-down menu at the top of the notebook on the right and change its setting from ‘Code’ to ‘Markdown’. After editing, press shift-enter as for a code cell; for a Markdown cell the source text will be hidden and the input cell will show the formatted Markdown until it is opened for editing again.
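
As a small illustration of the LaTeX-style formatting described above, the following line could be entered into a Markdown cell. The formula is an arbitrary example chosen for this guide.

```latex
The area of a circle with radius $r$ is $A = \pi r^2$.
```

When the cell is run, the text between the $ characters is displayed as typeset mathematics.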

Saving results#

The notebook can write to the server-side storage associated with your AVL account, and any file-writing functions in your Python code will write to this area. The resulting files will appear in the file chooser in the left-hand column of the JupyterLab environment. For instance, the following code writes a table to CSV and saves a PNG image of a graph:

import numpy as np
from matplotlib import pyplot as plt

table = np.transpose(np.array([np.arange(10), np.arange(10) ** 2]))
np.savetxt('table.csv', table, delimiter=',')
plt.plot(table[:,0], table[:,1])
plt.savefig('figure.png')
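
Output files can also be collected in a subfolder and read back to confirm what was written. The sketch below uses only the Python standard library rather than NumPy; the folder name results and the column names are assumptions made for this example.

```python
import csv
from pathlib import Path

# Hypothetical output folder, created in the working directory if needed.
out_dir = Path('results')
out_dir.mkdir(exist_ok=True)

# The same values as in the table above (n and n squared).
rows = [(n, n ** 2) for n in range(10)]
csv_path = out_dir / 'table.csv'
with open(csv_path, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['n', 'n_squared'])  # header row
    writer.writerows(rows)

# Read the file back, skipping the header, to check its contents.
with open(csv_path, newline='') as f:
    read_back = [tuple(int(v) for v in row) for row in list(csv.reader(f))[1:]]

print(read_back == rows)
```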

Downloading results#

Saved data files and figures – and the saved notebooks themselves – can be downloaded to your local computer. Right-click on the file in the file chooser at the left and select ‘Download’ from the context menu which appears. Alternatively, select the file in the file chooser, open the ‘File’ menu from the JupyterLab menu bar, and select ‘Download’.
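
If several files are to be downloaded, one option is to bundle them into a single zip archive first and download that instead. This is a sketch using the standard library's zipfile module, not an AVL-specific feature; the function name bundle and the archive name results.zip are assumptions of this example.

```python
import zipfile
from pathlib import Path

def bundle(paths, archive_name='results.zip'):
    """Write the existing files among `paths` into one zip archive."""
    archive = Path(archive_name)
    with zipfile.ZipFile(archive, 'w', compression=zipfile.ZIP_DEFLATED) as zf:
        for p in map(Path, paths):
            if p.is_file():
                # Store each file under its base name, without folders.
                zf.write(p, arcname=p.name)
    return archive

# Files that do not exist are skipped silently.
archive = bundle(['table.csv', 'figure.png'])
with zipfile.ZipFile(archive) as zf:
    print(archive, zf.namelist())
```

The archive then appears in the file chooser and can be downloaded like any other file.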

Further information#