Globus: Difference between revisions

From Alliance Doc

Revision as of 22:52, 16 February 2021

Globus is a service for fast, reliable, secure data movement. Designed specifically for researchers, Globus has an easy-to-use interface with background monitoring features that automate the management of file transfers between any two resources, whether they are at Compute Canada, another supercomputing facility, a campus cluster, lab server, desktop or laptop.

Globus leverages GridFTP for its transfer protocol but shields the end user from complex and time-consuming tasks related to GridFTP and other aspects of data movement. It improves transfer performance over GridFTP, rsync, scp, and sftp by automatically tuning transfer settings, restarting interrupted transfers, and checking file integrity.

Globus can be accessed via the main Globus website or via the Compute Canada Globus portal at https://globus.computecanada.ca.


Screenshots out-of-date

The Globus web interface has changed as of April 2019 and control screens will look different from the screenshots on this page. Please see https://docs.globus.org/how-to/ for the latest documentation, while we update this page.


Using Globus

Go to https://globus.computecanada.ca. Your "existing organizational login" is your CCDB account. Ensure that "Compute Canada" is selected in the drop-down, then click Continue. Supply your CCDB username (not your e-mail address or other identifier) and password on the Compute Canada MyProxy page which appears. This takes you to the web portal for Globus.

CC Globus Authentication page.

To Start a Transfer

Globus transfers happen between "collections" (formerly known as "endpoints" in previous Globus versions). Most Compute Canada systems have some standard collections set up for you to use. To transfer files to and from your computer, you need to create a collection for it. This requires a bit of setup initially, but once it has been done, transfers via Globus require little more than making sure the Globus Connect Personal software is running on your machine. More on this below under Personal Computers.

If the "File Manager" page in the Globus Portal is not already showing (see image), select it from the left sidebar.

Globus File Manager.

On the right of the page there is a pair of buttons labelled "Panels". Select the second button (this will allow you to see two collections at the same time).

Click on the first collection where the page says "-- start here, select a collection --".

Selecting a Globus collection.

You can start typing a collection name to select it. For example, if you want to transfer data to or from the Béluga cluster, type "beluga", wait two seconds for a list of matching sites to appear, and select computecanada#beluga-dtn. All Compute Canada resources have names prefixed with computecanada#. For example, computecanada#cedar-dtn, computecanada#graham-globus or computecanada#niagara (note that 'dtn' stands for 'data transfer node').
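The naming convention above (an owner prefix, a `#`, then the collection's display name) can be taken apart with plain shell parameter expansion. The following is only a sketch: `split_collection_name` is a hypothetical helper written for this page, not part of Globus.

```bash
#!/usr/bin/env bash
# Split a collection name of the form owner#display-name into its two parts.
# split_collection_name is a hypothetical helper, not a Globus command.
split_collection_name() {
    local full="$1"
    case "$full" in
        *"#"*) ;;                                   # contains a '#': continue
        *) echo "not a prefixed name: $full" >&2
           return 1 ;;
    esac
    local owner="${full%%#*}"   # everything before the first '#'
    local name="${full#*#}"     # everything after the first '#'
    printf '%s %s\n' "$owner" "$name"
}

split_collection_name "computecanada#beluga-dtn"
```

For the name used in the example above, this prints the owner `computecanada` followed by the display name `beluga-dtn`.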

You may be prompted to authenticate to the collection, depending on which site hosts it. For example, if you are activating a collection hosted on Graham, you will be asked for your Compute Canada username and password. The authentication of a collection remains valid for some time: typically one week for CC collections, whereas personal collections do not expire.

Now select a second collection, authenticating if required.

Once a collection has been activated, you should see a list of directories and files. You can navigate these by double-clicking on directories and using the "up one folder" button. Highlight a file or directory that you want to transfer by single-clicking on it. Control-click to highlight multiple items. Then click one of the big blue buttons with white arrowheads to initiate the transfer. The transfer job will be given a unique number and will begin right away. You will receive an email when the transfer is complete. You can also monitor in-progress transfers and view details of completed transfers from the Activity tab on the Globus Portal.

Initiating a transfer. Note the highlighted file in the left-hand pane.

See also How To Log In and Transfer Files with Globus at the Globus.org site.

Options

Globus provides several other options in the "Transfer & Sync Options" area at the bottom of the File Manager page. Here you can direct Globus to

  • sync - only transfer new or changed files
  • delete files on destination that do not exist on source
  • preserve source file modification times
  • verify file integrity after transfer (on by default)
  • encrypt transfer

Note that enabling encryption significantly reduces transfer performance, so it should only be used for sensitive data.
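The Globus command line interface described later on this page offers comparable options on its `globus transfer` command. The sketch below maps each web option to a flag and only prints the resulting flag string; it does not contact Globus. `build_transfer_flags` is a hypothetical helper, and the flag names are assumptions based on the CLI's help text that may differ between versions, so confirm them with `globus transfer --help` before relying on them.

```bash
#!/usr/bin/env bash
# Translate the web interface's transfer options into globus-cli flags.
# build_transfer_flags is a hypothetical helper. The flag names are ASSUMED
# from `globus transfer --help` and may vary with the installed CLI version.
build_transfer_flags() {
    local flags=""
    local opt
    for opt in "$@"; do
        case "$opt" in
            sync)    flags+=" --sync-level mtime" ;;  # one of several sync levels
            delete)  flags+=" --delete" ;;            # remove extra files on destination
            mtime)   flags+=" --preserve-mtime" ;;    # keep source modification times
            verify)  flags+=" --verify-checksum" ;;   # check integrity after transfer
            encrypt) flags+=" --encrypt" ;;           # slower; for sensitive data only
            *) echo "unknown option: $opt" >&2
               return 1 ;;
        esac
    done
    echo "${flags# }"   # drop the single leading space
}

build_transfer_flags sync verify
```

Keeping `encrypt` as an explicit opt-in mirrors the note above: encryption is requested only when the data is sensitive.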

Personal Computers

Globus provides a desktop client, Globus Connect Personal, to make it easy to transfer files to and from a personal computer running Windows, MacOS X, or Linux.

There are links on the Globus Connect Personal page which walk you through the setup of Globus Connect Personal on the various operating systems, including setting it up from the command line on Linux. If you are running Globus Connect Personal from the command line on Linux, this FAQ on the Globus site describes configuring which paths you share and their permissions.

To install Globus Connect Personal

Finding the installation button.
  1. Go to the Compute Canada Globus portal and log in if you have not already done so.
  2. From the File Manager screen, choose one of the Collection selectors. In either the 'Recent' or 'More Options' tab, click on “Install Globus Connect Personal”.
  3. Enter an "Endpoint Display Name" of your choice, which you will use to access the computer you will be installing Globus Connect Personal on. Example: MacLaptop or WorkPC.
  4. Click the “Generate Setup Key” button. Copy the key to your computer’s clipboard, then click the download link for your operating system.
  5. Install the program.
  6. Once it is installed, run the Globus Connect Personal program.
  7. The first time you run the program, enter the Setup Key from step 4 in the box that pops up.
  8. You should now be able to access the endpoint through Globus. The full endpoint name is [your username]#[name from step 3]. Example: smith#WorkPC

To run Globus Connect Personal

The above steps are only needed once, to set up the endpoint. For later file transfers, you only need to make sure that Globus Connect Personal is running (i.e., start the program) and that the endpoint isn't paused.

Globus Connect Personal application for a personal endpoint.

Note that if the Globus Connect Personal program at your endpoint is closed during a file transfer to or from that endpoint, the transfer will stop. To restart the transfer, simply reopen the program.

Transfer between two personal endpoints

Although you can create endpoints for any number of personal computers, transfers between two personal endpoints are not enabled by default. If you need this capability, please contact globus@computecanada.ca to set up a "Globus Plus" account.

For more information, see the Globus.org how-to pages.

Globus Sharing

Globus sharing makes collaboration with your colleagues easy. Sharing enables people to access files stored in your account on a Compute Canada system even if they do not have an account on that system. Files can be shared with any user, anywhere in the world, who has a Globus account. See How To Share Data Using Globus.

Creating a Shared Endpoint

Before you can share a file or folder on an endpoint, the system hosting the files must have sharing enabled.

Log into globus.computecanada.ca with your Globus credentials. Once you are logged in, you will see a transfer window. In the ‘endpoint’ field, type the endpoint identifier for the endpoint you wish to share from (e.g. computecanada#gpc) and activate the endpoint, if asked to.

Creating a shared endpoint

Select an item that you wish to share, then click the three horizontal lines on the right side of the endpoint’s window to access the menu where you can then select share.

Selecting the share option

Selecting share opens a window that shows your current shared endpoints, if you have any, and a button labeled ‘Add Shared Endpoint’. Clicking that button will bring up the ‘Create New Shared Endpoint’ window. By default it will have values filled based on your previous selections. You can modify these as necessary, then click the ‘Create and Manage Access’ button.

Managing a Shared Endpoint

Managing Access

Once the endpoint is created, you will be shown the current access list, with only your account on it. Since sharing is of little use without someone to share with, click the ‘Add Permission’ button to add people or groups that you wish to share with.

You will now be prompted to select whether to share with people via email, username, or group.

  • E-mail is a good choice if you don’t know a person’s username on Globus. It will also allow you to share with people who do not currently have a Globus account, though they will need to create one to be able to access your share.
  • User presents a search box that allows you to search by name or Globus username. This is best if someone already has a Globus account, as it does not require any action on their part to be added to the share. Enter a name or Globus username (if you know it), select the appropriate match from the list, then click ‘Use Selected’.
  • Group allows you to share with a number of people simultaneously. You can search by group name or UUID. Group names may be ambiguous, so be sure to verify you are sharing with the correct one. This ambiguity can be avoided by using the group’s UUID, which is available on the Groups page (see the Globus Groups section).
Managing Shared Endpoint Permissions

To add or remove write permissions from a user, click the checkbox next to their name under the write column. It is not possible to remove read access.

Deleting users or groups from the list of people you are sharing with is as simple as clicking the ‘x’ at the end of the line containing their information.

Removing a Shared Endpoint

Once you no longer need your shared endpoint, remove it. To do this, go to the top of the page, and select ‘Manage Endpoints’ from the ‘Manage Data’ menu.

You will be shown a list of endpoints that you have created, including Globus Connect Personal or shared endpoints, as well as those you have recently used. Find the shared endpoint you wish to delete in the list, and expand it. Click the ‘delete endpoint’ button, and confirm the removal when prompted.

Removing a Shared Endpoint

The endpoint is now deleted. Your files will not be affected by this action, nor will any files that others may have uploaded.

Sharing Security

Sharing files entails a certain level of risk. By creating a share, you are opening up to others files that until now have been under your exclusive control. The following list covers some things to think about before sharing, though it is far from comprehensive.

  • Make sure you have permission to share the files, if you are not the data’s owner.
  • Make sure you are sharing only with those you intend to. Verify that the person you add to the access list is the person you think it is; there are often people with the same or similar names. Remember that Globus usernames are not linked to Compute Canada usernames. The recommended method of sharing is to use the email address of the person you wish to share with, unless you have the exact account name.
  • If you are sharing with a group you do not control, make sure you trust the owner of the group. They may add people who are not authorized to access your files.
  • If granting write access, make sure that you have backups of important files that are not on the shared endpoint, as users of the shared endpoint may delete or overwrite files, and do anything that you yourself can do to a file.
  • It is highly recommended that sharing be restricted to a subdirectory, rather than your top-level home directory.

Globus Groups

Globus groups provide an easy way to manage permissions for sharing with multiple users. Once you have created a group, you can select it in the sharing interface to control access for all of its members at once.

Creating a Group

Go to the Groups tab at the top of the page. In the ‘My Groups’ tab there is a ‘Create New Group’ button at the bottom of the page. Pressing this button brings up the ‘Create New Group’ window.

Creating a Globus Group
  • Enter the name of the group in the ‘Group Name’ field.
  • Enter the group description in the ‘Group Description’ field.
  • Select whether the group is visible only to group members (private group) or to all Globus users.
  • Click ‘Create Group’ to add the group.

Inviting Users

Once a group has been created, users can be added by selecting ‘Invite users’, and then either entering an email address (preferred) or searching for the username. Once users have been selected for invitation, click the invite button and they will be sent an email inviting them to connect. Once they’ve accepted, they will be visible in the group.

Modifying Membership

Click the pencil icon next to a user to modify their membership. The first tab shows the member info (username, email, name). The Role & Status tab allows the status to be set to Active, Suspended, or Remove, which changes the user's group membership accordingly.

Role allows you to grant permissions to the user, including Admin (Full access), Manager (Change user roles), or Member (no management functions). The ‘Save Changes’ button commits the changes.

Group Settings

  • Policies Tab:
    • Shows current policies
    • Policies can be edited by clicking the ‘edit’ button
  • Membership Requirements:
    • Can require additional information from a user requesting membership.
    • The options are from a list of predefined options.
  • Advanced Tab:
    • Allows the group to be deleted.

Command Line Interface (CLI)

Installing

The Globus command line interface is a Python package which can be installed using pip. Below are the steps to install the Globus CLI on one of our clusters.

  1. Create a virtual environment to install the Globus CLI into (see creating and using a virtual environment).
    $ virtualenv $HOME/.globus-cli-virtualenv
    
  2. Activate the virtual environment
    $ source $HOME/.globus-cli-virtualenv/bin/activate
    
  3. Install Globus CLI into the virtual environment (see installing modules).
    $ pip install globus-cli
    
  4. Then deactivate the virtual environment.
    $ deactivate
    
  5. To avoid having to load that virtual environment every time before using Globus, you can add it to your path.
    $ export PATH=$PATH:$HOME/.globus-cli-virtualenv/bin
    $ echo 'export PATH=$PATH:$HOME/.globus-cli-virtualenv/bin'>>$HOME/.bashrc
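Running the `echo ... >> $HOME/.bashrc` command from step 5 more than once leaves duplicate lines in `.bashrc`. One way to avoid this, sketched here with a hypothetical helper named `add_line_once`, is to append the line only if it is not already present. The demonstration writes to a temporary file rather than to the real `.bashrc`.

```bash
#!/usr/bin/env bash
# Append a line to a file only when that exact line is not already there.
# add_line_once is a hypothetical helper name used for this sketch.
add_line_once() {
    local line="$1" file="$2"
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demonstration on a temporary file instead of the real ~/.bashrc.
rc="$(mktemp)"
add_line_once 'export PATH=$PATH:$HOME/.globus-cli-virtualenv/bin' "$rc"
add_line_once 'export PATH=$PATH:$HOME/.globus-cli-virtualenv/bin' "$rc"
cat "$rc"    # the export line appears once, despite two calls
rm -f "$rc"
```

To apply it for real, pass `"$HOME/.bashrc"` as the second argument.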
    

See the Globus docs page on installation for information on installing on different platforms, updating, and uninstalling.

Using
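As a rough sketch of basic use, assuming the CLI is installed as described above: a session usually starts with `globus login`, then looks up collection UUIDs with `globus endpoint search`, lists a directory with `globus ls`, and submits a transfer with `globus transfer`. The wrapper below is hypothetical (`globus_copy_dir` and `GLOBUS_DRY_RUN` are names invented for this example), and `SRC_UUID` and `DST_UUID` are placeholders for real collection UUIDs. With `GLOBUS_DRY_RUN=1` it only prints the command it would run.

```bash
#!/usr/bin/env bash
# Submit a recursive directory transfer between two collections.
# globus_copy_dir and GLOBUS_DRY_RUN are hypothetical names for this sketch;
# real collection UUIDs must be looked up with `globus endpoint search`.
globus_copy_dir() {
    local src_id="$1" src_path="$2" dst_id="$3" dst_path="$4"
    local cmd=(globus transfer --recursive --label "wiki-example"
               "${src_id}:${src_path}" "${dst_id}:${dst_path}")
    if [ "${GLOBUS_DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "${cmd[*]}"   # print the command instead of running it
    else
        "${cmd[@]}"                 # submit the transfer task to Globus
    fi
}

# Typical interactive steps (commented out: they need a login and real UUIDs):
#   globus login
#   globus endpoint search "beluga"
#   globus ls "SRC_UUID:/~/"
GLOBUS_DRY_RUN=1 globus_copy_dir SRC_UUID /~/data/ DST_UUID /~/backup/
```

In Globus paths, `/~/` refers to your home directory on the collection. Printing the command first is a cheap way to check the source and destination before submitting a real transfer.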

Scripting

Support and More Information

If you would like more information on Compute Canada’s use of Globus, or require support in using this service, please send an email to globus@computecanada.ca and provide the following information:

  • Name
  • Compute Canada Role Identifier (CCRI)
  • Institution
  • Inquiry or issue. Be sure to indicate which sites you want to transfer to and from.