In the IPSES context, the VRE (Virtual Research Environment) is an advanced feature available to registered users. A VRE is an area where users can analyse data previously selected from the IPSES GUI and combine it with other data that they add to a dedicated, persisted space. The VRE is implemented using the popular Jupyter Notebook/Lab technology. The Jupyter and Python ecosystem includes extensions such as data analysis, processing, and visualization components, most of them open source and free of charge. This document gives users an introduction and a basic getting started guide to the IPSES VRE.

The IPSES VRE has been implemented with an approach based on JupyterHub, as outlined in D11.3 Phase 1: Platform Analysis and Design. More information can be found in the VRE section of that document. JupyterHub has been integrated into the IPSES architecture to manage the multi-user JupyterLab environments, as required by the project (VRE JupyterHub component in the architecture diagram). Specifically, in IPSES, JupyterHub has been configured to:

- integrate with the IPSES identity and access management solution, as described in the KeyCloak section, for authentication and authorization. Provided with the right authorizations, users who have logged in to the IPSES GUI will be able to access the VRE directly.
- instantiate a JupyterLab for each user, using the IPSES container-based infrastructure.
- provide a persisted JupyterLab environment and storage for each user, where the user's notebooks and the Python libraries they have installed are saved across different sessions with the VRE.

To support the integration between the IPSES GUI and the VRE, a data sharing service has been developed in the IPSES backend. Through that service, the GUI will be able to save users' bookmarks from their workspace.
The VRE, on its side, will be able to retrieve data from the service and use it, e.g., to download data from providers' services, save it to files, and analyse it using the tools available in the Python ecosystem.

To facilitate access to the user's workspace, a dedicated Python library has been designed and implemented: pyipses. Through pyipses, users can programmatically access their bookmarks and download data from a bookmark's URL. Pyipses also provides a basic set of APIs, e.g., to simplify plotting data on a map, using the popular pandas, ipyleaflet and plotly Python libraries. The pyipses APIs are documented in the library’s project on GitLab.

Accessing JupyterHub from the IPSES GUI

The VRE can be used only by entitled registered IPSES users. To log in, first point your browser to the IPSES GUI, click “Login to personal area” in the top right corner and enter your credentials in the login box. If you have the rights to access the VRE, a “Goto JupyterHub” link will appear. Clicking the “Goto JupyterHub” link opens a new browser tab with a progress bar indicating that the VRE/JupyterHub session is being prepared. On first access this phase takes some time, because the pyipses support library and some basic dependencies (e.g. pandas and ipyleaflet) are installed.

When the JupyterHub session is ready, a ‘standard’ JupyterLab interface will appear. Details about the JupyterLab GUI and its usage (creating notebooks, executing cells, etc.) can be found in the official JupyterLab guide. The file browser in the left sidebar shows the current content of the user's JupyterLab space. In the user space you can create notebook files, upload existing data files and/or create new ones programmatically. The personal space is saved, so files created or edited during a VRE session will be available in a later session.

If needed, you can install libraries from the Python ecosystem, for example by executing a !pip install LIBRARYNAME statement in a Jupyter notebook cell.
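As a complement to the !pip install statement, the sketch below checks from Python whether a module can already be imported and runs pip only when it is missing. It is a minimal illustration, not part of pyipses: the helper names is_installed and ensure_installed are hypothetical, and the sketch assumes that pip is available to the interpreter that runs the notebook kernel.

```python
import importlib
import importlib.util
import subprocess
import sys


def is_installed(module_name):
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(module_name) is not None


def ensure_installed(module_name, package_name=None):
    """Install a package with pip only if its module is not importable yet.

    Hypothetical helper: package_name defaults to module_name, which is an
    assumption that does not hold for every library.
    Returns True if pip was run, False if nothing had to be done.
    """
    if is_installed(module_name):
        return False
    # Use the kernel's own interpreter so the package lands in the same
    # environment the notebook is running in.
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", package_name or module_name]
    )
    # Let the import system notice the newly installed files.
    importlib.invalidate_caches()
    return True


# The standard-library json module is always present, so pip is not run here.
already_there = not ensure_installed("json")
```

In a notebook cell, a call such as ensure_installed("geopandas") would then install the library only on the first run.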
You can find more information about the pip command syntax in the official pip guide. Please note that installing a new library, or upgrading one that is already installed, may require a restart of the JupyterLab kernel. The libraries you install are private to you and persisted by the JupyterHub infrastructure, so the next time you activate the VRE you will find them already installed. By default, you will find some libraries preinstalled in your new JupyterLab environment: pandas, plotly and ipyleaflet, in addition to pyipses, a connector library for the IPSES backend.

The example below assumes that the current user’s workspace in the IPSES GUI is not empty. In the current version of the IPSES GUI, to share the current workspace with the VRE, just click the Share button in the GUI’s workspace section.

The Python code below can be pasted into a notebook cell and run. Through the pyipses APIs, it retrieves the list of bookmarks currently defined in the user’s workspace and prints them: the bookmark name and the available formats for that bookmark. Then it downloads data from the first bookmark found in the list, in this example in ‘GeoJSON’ format, and saves the data to a file in the JupyterLab space.

    from pyipses import Ipses

    # Retrieve the workspace shared from the IPSES GUI and list its bookmarks
    userdata = Ipses.get_workspace()
    bookmarks = userdata.get_all_bookmarks()
    for bookmark in bookmarks:
        print(f"'{bookmark.name} {bookmark.get_format_types()}'")

    # Download the first bookmark's data as GeoJSON and save it to a file
    Ipses.download_data_to_file(bookmarks[0].get_url('geojson'), 'datafile.json')

After saving the data to a file, you can process it using Python libraries.

The sections below provide more details about some of the pyipses features. The complete APIs for the functions mentioned here are documented in the
library project’s source code.

To import the library:

    from pyipses import Ipses

To retrieve and display the current user’s workspace from the shared workspace data service (i.e. the bookmarks collected in the IPSES GUI and shared through the Share button in the Workspace section):

    workspace = Ipses.get_workspace()
    bookmarks = workspace.get_all_bookmarks()
    for bookmark in bookmarks:
        print(f'{bookmark.name}')

To save the workspace content (i.e., the bookmarks) to a JSON file in the JupyterLab space (since the GUI workspace may change over time, freezing its current state to a file can be useful for future use):

    Ipses.write_workspace(workspace, 'myworkspace.json')

To load a previously saved workspace JSON file into a Workspace:

    Ipses.read_workspace('myworkspace.json')

Each bookmark has a list of available data formats, each leading to a different URL that points to the actual data. To retrieve the list of available formats (e.g., ‘json’, ‘csv’) for a bookmark:

    bookmarks[0].get_format_types()

To download data from the bookmark’s URL, given a FORMAT_TYPE:

    Ipses.download_data(bookmarks[0].get_url(FORMAT_TYPE))

To download data from a URL (e.g., from a bookmark’s URL) and save it to a file in the JupyterLab space:

    Ipses.download_data_to_file(bookmarks[0].get_url(FORMAT_TYPE), 'mydata')

Pyipses also provides some basic APIs to display data on a map, using the ipyleaflet Jupyter widget, or in a graph, using plotly. Please note that these features are not meant to replace the official, extensive set of ipyleaflet and plotly APIs. Also note that if these APIs are used to display data from generic IPSES bookmarks, some Python coding will be required to transform that data from its original format into a pandas dataframe.

To import the support APIs:

    from pyipses import IpsesUtils

A dataframe df containing at least two columns with latitude and longitude values can be displayed on a map, using the ipyleaflet library.
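The transformation mentioned above, from a bookmark's original format to a pandas dataframe with latitude and longitude columns, could look like the following sketch for GeoJSON data. It is only an illustration under stated assumptions: the downloaded file is assumed to be a GeoJSON FeatureCollection of Point features, the helper names and the column names 'latitude' and 'longitude' are hypothetical, and none of this is part of pyipses. The actual structure of the data depends on the provider's service.

```python
import json


def geojson_points_to_rows(feature_collection):
    """Flatten the Point features of a GeoJSON FeatureCollection into dicts.

    Each row holds the feature's properties plus 'latitude' and 'longitude'
    (hypothetical column names chosen for this sketch).
    """
    rows = []
    for feature in feature_collection.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue  # this sketch handles Point geometries only
        # GeoJSON stores coordinates as [longitude, latitude, (elevation)]
        lon, lat = geometry["coordinates"][:2]
        row = dict(feature.get("properties") or {})
        row["latitude"] = lat
        row["longitude"] = lon
        rows.append(row)
    return rows


def load_geojson_as_dataframe(path):
    """Read a GeoJSON file and return its Point features as a dataframe."""
    # pandas is imported here so that the helper above needs only the
    # standard library.
    import pandas as pd

    with open(path, encoding="utf-8") as f:
        return pd.DataFrame(geojson_points_to_rows(json.load(f)))


# A tiny made-up sample, only to show the shape of the result.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [12.5, 41.9]},
            "properties": {"name": "Station A"},
        },
        {
            "type": "Feature",
            "geometry": {"type": "LineString", "coordinates": [[0, 0], [1, 1]]},
            "properties": {"name": "ignored"},
        },
    ],
}
rows = geojson_points_to_rows(sample)
```

With a file such as the 'datafile.json' saved in the getting started example, df = load_geojson_as_dataframe('datafile.json') would give a dataframe whose 'latitude' and 'longitude' columns can be passed as lat_col and lon_col.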
The code below displays the map and uses OpenStreetMap for the background layer. The parameters lat_col and lon_col are the names of the columns in the dataframe df that hold the latitude and longitude data, respectively.

    IpsesUtils.map_df(df, lat_col, lon_col, width=800, height=600)

Note that each row in the dataframe is rendered on the map with a marker icon. Clicking a marker displays a popup with all the other attributes found in the dataframe’s row.

The code below plots a time series using the Plotly library. The parameters time_col and value_col are the names of the columns in the dataframe df that hold the time and value data to display, respectively.

    IpsesUtils.plot_df(df, time_col, value_col)

Introduction
VRE, JupyterHub, JupyterLab, pyipses
JupyterLab, user space, how to install libraries
Pyipses getting started example
Pyipses APIs
Import pyipses
Retrieve Workspace shared data
Save workspace data to a file
Load workspace data from a JSON file
Download data from a URL
Download data from a URL to a file
Displaying data
Displays a dataframe on a map
Plots a dataframe with time series data
IPSES VRE PROTOTYPE
Getting Started Guide
IPSES
Italian Platform for Solid Earth Sciences
Version: 0.5