CCR cluster downtime is scheduled for Tuesday 27 July 2021. CCR resources (Globus, UB-HPC, submit) will be unavailable. Details: https://ubccr.freshdesk.com/support/discussions/topics/13000027917

GHub tool users: please note that these tools are currently being moved to Debian10 containers, and may not run as expected: cmctgm; cmcthistplot; crevasseoib; gisplot2. Thanks for your patience!

Accessing Data

GHub is partnered with UB CCR to provide access to large datasets. Read on to learn how to access datasets using the Globus command-line and web-based utilities.

Globus is a fast, convenient, and secure way to move, share, and discover data through a single web interface. It lets you reach multiple sources of data (HPC clusters, cloud storage, a local workstation or laptop) by using your organization's authentication and identity to connect to endpoints that provide access to data storage, all in a web browser.

Contents

  1. GHub's Globus endpoints at CCR
  2. Using the Globus web application
  3. Globus connect personal: download data
  4. Using the Globus command line interface (CLI)
    1. Install Globus CLI
    2. Log in to Globus CLI
    3. Basic CLI use
    4. CLI documentation

GHub's Globus endpoints

Prior to accessing GHub data via Globus, submit a CCR or GHub ticket specifying the dataset or datasets you would like to access, your username, institutional email address, and home institution. The GHub administrators will enable your access to the endpoint.

 

Current Globus endpoints for GHub

Name                       Contents
GHub-CESM                  CESM ice sheet forcing data files from CMIP6 experiments
Panasas scratch ghubjobs   working directories for all GHub jobs run on the CCR cluster
GHub-ISMIP6-Data           ISMIP6 directory on CCR /projects space
GHub-ISMIP6-Projections    Projections data from GHub ISMIP6 project
GHub-scratch-regrid-tool   GHub's scratch space for regrid tool result sets
GHub-cmct                  Datasets for use with the Cryosphere model Comparison tool (CmCt)
GHub-scratch               GHub's scratch space at CCR

 

Accessing GHub endpoints via Globus UI

Access the Globus user interface here: https://app.globus.org/

In the browser window, select your organization or home institution and click Continue:

[Screenshot: Globus web app login]

Your organization or home institution will then prompt you to authenticate. This step varies by organization.

Once you are logged in, the Globus File Manager page will display. Typing "ghub" in the Search text box results in the following listing of GHub endpoints:

[Screenshot: entering "ghub" into the search box displays the GHub endpoints]

Globus displays all known endpoints that match your search. You may not have access to all displayed endpoints. Click on the name of an endpoint to which you have access, and Globus will display the endpoint's contents.

For example, here we view the contents of the 'GHub-CESM' endpoint, which holds the CESM ice sheet forcing data files from CMIP6 experiments:

[Screenshot: GHub-CESM endpoint contents in the File Manager]

More information is available in the excellent Globus docs: How To Log In and Transfer Files with Globus

Globus Connect Personal software

To download files from Globus endpoints to your local workstation, use Globus Connect Personal. Installation and usage instructions are available for each platform:

Globus Connect Personal documentation

Mac OS
Windows
Linux

If you have issues or questions, please contact us by submitting a VHub/GHub or CCR ticket.


Using the Globus command line interface (CLI)

The Globus CLI is a standalone application that you can install on your own machine to access Globus endpoints and the data stored on them. You can interact with the CLI on the command line, or control it with scripting.

Note that the Globus CLI is also installed and available in the Jupyter Notebooks (Debian10) GHub tool, under the geospatial-python3 kernel.

Prior to accessing GHub data via Globus, submit a CCR or GHub ticket specifying the dataset you would like to access, your username, institutional email address and home institution. The GHub administrators will enable your access to the endpoint.

 

Install Globus CLI

The Globus CLI can be installed with either pip or pipx under Python 3. We recommend pipx; instructions for installing it on Linux, macOS, or Windows are found here: https://pypa.github.io/pipx/installation/

Quick install

  1. Install pipx.
  2. Install the Globus CLI:
        pipx install globus-cli

For more information, refer to the Globus docs on installing the CLI.
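
As a quick check after installation, a short script can confirm that the 'globus' executable is on your PATH. The following is a minimal sketch using only the Python standard library; the function names are illustrative and are not part of Globus.

```python
import shutil
import subprocess


def find_globus_cli():
    """Return the full path of the 'globus' executable, or None if it is not on PATH."""
    return shutil.which("globus")


def installed_version(executable):
    """Run 'globus version' and return whatever text the CLI prints."""
    result = subprocess.run(
        [executable, "version"], capture_output=True, text=True, check=False
    )
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    path = find_globus_cli()
    if path is None:
        print("globus not found; try 'pipx install globus-cli' and open a new terminal")
    else:
        print(f"globus found at {path}")
        print(installed_version(path))
```

If the executable is not found right after a pipx install, the pipx binary directory may not yet be on your PATH in the current terminal session.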

Log in to the Globus CLI   

To log in and start using the CLI, run this command in your workstation terminal:

globus login

Your terminal should display:

You are running 'globus login', which should automatically open a browser window for you to login.

Globus will direct you to a browser window. There, select the institution you use to authenticate with Globus and click Continue:

[Screenshot: log in to use the Globus CLI]

In the second window that displays, click Allow to enable Globus to authenticate you:

[Screenshot: Globus CLI requests your consent]

Your workstation terminal should display:

You have successfully logged in to the Globus CLI!

Alternatively, you may see the message "Please authenticate with Globus here:" followed by a long URL. If so:

  1. Copy and paste the URL into a browser window, and then log in to Globus as above.
  2. Once you log in, the browser displays a "Native App Authorization Code."
  3. Your terminal prompts you with "Enter the resulting Authorization Code here:". Paste the "Native App Authorization Code" from the browser window at that prompt.
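
Scripts that depend on the CLI may want to confirm that a login is in place before doing any work. The sketch below wraps 'globus whoami' for that purpose. It assumes the command exits with a non-zero status when no login is available; the helper names are hypothetical.

```python
import subprocess


def interpret_whoami(returncode, stdout):
    """Return the identity printed by 'globus whoami', or None if the call failed.

    Assumption: a non-zero exit status means no usable login.
    """
    if returncode != 0:
        return None
    identity = stdout.strip()
    return identity or None


def current_identity():
    """Run 'globus whoami'; return the identity string, or None if unavailable."""
    try:
        result = subprocess.run(
            ["globus", "whoami"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        # The CLI is not installed or not on PATH.
        return None
    return interpret_whoami(result.returncode, result.stdout)


if __name__ == "__main__":
    identity = current_identity()
    if identity is None:
        print("Not logged in (or CLI missing); run 'globus login' first.")
    else:
        print(f"Logged in as {identity}")
```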

 

Basics of the Globus CLI

The Globus CLI supports a rich set of commands for terminal use or for scripting. Users can search for endpoints and, where authorized, list endpoint contents and download or upload files. A few basic commands are shown here to get you started. Refer to the Globus documentation for complete coverage.

Selected Globus CLI commands

Task                           Command
check your current identity    globus whoami
show all GHub endpoints        globus endpoint search "ghub"
list contents of an endpoint   globus ls <endpoint id>
log out                        globus logout
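
For scripting, the endpoint search can be asked for JSON output with the CLI's '--format json' option and then parsed. The sketch below assumes the output follows the Transfer API layout, with a top-level "DATA" list whose items carry "id" and "display_name" keys. The function names and the sample record are illustrative only.

```python
import json
import subprocess


def parse_endpoint_search(json_text):
    """Extract (id, display_name) pairs from endpoint-search JSON output.

    Assumes a top-level "DATA" list whose items have "id" and
    "display_name" keys.
    """
    document = json.loads(json_text)
    return [
        (item.get("id"), item.get("display_name"))
        for item in document.get("DATA", [])
    ]


def search_endpoints(term):
    """Run 'globus endpoint search TERM --format json' and parse the result."""
    result = subprocess.run(
        ["globus", "endpoint", "search", term, "--format", "json"],
        capture_output=True, text=True, check=True,
    )
    return parse_endpoint_search(result.stdout)


if __name__ == "__main__":
    # Offline demonstration with made-up data; the ID is a placeholder,
    # not a real endpoint.
    sample = (
        '{"DATA": [{"id": "00000000-0000-0000-0000-000000000001", '
        '"display_name": "GHub-CESM"}]}'
    )
    for endpoint_id, name in parse_endpoint_search(sample):
        print(f"{name}: {endpoint_id}")
```

Calling search_endpoints("ghub") while logged in would return the same kind of list for the live search, and the IDs can then be passed to 'globus ls'.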
 

 

Example session

Below is a brief example Globus CLI session. The session shows a successful authentication, a session information query, "GHub" endpoints returned from a search, and a successful listing of the contents of one such endpoint.

[Screenshot: typical Globus CLI session]

Note that the 'globus endpoint search' command returns all endpoints that match the search criteria, even those for which you lack permissions. For example, here I try to list the contents of an endpoint to which I have no access:

[Screenshot: an expected Transfer API Error]
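
Because a search result does not guarantee access, a script may want to probe an endpoint before relying on it. The following sketch runs 'globus ls' and treats any non-zero exit status as "cannot list"; it passes the CLI's own error text through rather than assuming its format. The helper names are hypothetical.

```python
import subprocess


def ls_command(endpoint_id, path=None):
    """Build the argument list for 'globus ls ENDPOINT_ID[:PATH]'."""
    target = endpoint_id if path is None else f"{endpoint_id}:{path}"
    return ["globus", "ls", target]


def can_list(endpoint_id, path=None):
    """Return (ok, output); ok is False when 'globus ls' exits with an error."""
    result = subprocess.run(
        ls_command(endpoint_id, path), capture_output=True, text=True, check=False
    )
    if result.returncode == 0:
        return True, result.stdout
    # Hand back the CLI's error text unchanged instead of guessing its layout.
    return False, result.stderr or result.stdout


if __name__ == "__main__":
    # '<endpoint id>' is a placeholder; substitute an ID from the search results.
    print(" ".join(ls_command("<endpoint id>", "/")))
```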

Globus CLI Documentation

CCR's Globus CLI documentation shows selected additional commands (such as download and scripting information):

https://ubccr.freshdesk.com/support/solutions/articles/13000088456-globus-command-line-interface-cli-
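
As one illustration of scripted downloads, the sketch below assembles a 'globus transfer' command that copies data from a GHub endpoint to your own endpoint, such as one provided by Globus Connect Personal. All endpoint IDs and paths shown are placeholders, and the helper names are not part of the CLI.

```python
import subprocess


def transfer_command(source_id, source_path, dest_id, dest_path,
                     recursive=False, label=None):
    """Build the argument list for 'globus transfer SRC:PATH DST:PATH'."""
    command = [
        "globus", "transfer",
        f"{source_id}:{source_path}",
        f"{dest_id}:{dest_path}",
    ]
    if recursive:
        # Needed when the source path is a directory.
        command.append("--recursive")
    if label:
        command += ["--label", label]
    return command


def submit_transfer(command):
    """Submit the transfer and return the text the CLI prints about the task."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Placeholders only: use a GHub endpoint ID you can access and the ID
    # of your own destination endpoint.
    command = transfer_command(
        "<ghub endpoint id>", "/some/dataset/",
        "<your endpoint id>", "/~/dataset/",
        recursive=True, label="GHub download",
    )
    print(" ".join(command))
```

Building the argument list separately from running it makes it easy to print and review the command before submitting a large transfer.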

Globus documentation provides comprehensive information about installing, using, and scripting the Globus CLI: