Arbutus object storage clients

<translate>
<!--T:1-->
For information on obtaining Arbutus Object Storage, please see [[Arbutus object storage|this page]]. Below, we describe how to configure and use three common object storage clients:
# s3cmd
# WinSCP
# awscli


<!--T:2-->
It is important to note that Arbutus' Object Storage solution does not use Amazon's [https://documentation.help/s3-dg-20060301/VirtualHosting.html S3 Virtual Hosting] (i.e. DNS-based bucket) approach, which these clients assume by default. They need to be configured not to use that approach, as described below.
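
To illustrate the difference (a sketch only; the bucket and file names are placeholders), the two addressing styles look like this, and Arbutus requires the path style:

<pre>
# Virtual-hosted style (assumed by default by many clients; not supported by Arbutus):
https://BUCKET_NAME.object-arbutus.cloud.computecanada.ca/FILE_NAME.dat

# Path style (used by Arbutus):
https://object-arbutus.cloud.computecanada.ca/BUCKET_NAME/FILE_NAME.dat
</pre>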
 
== s3cmd == <!--T:3-->
=== Installing s3cmd ===
Depending on your Linux distribution, the <code>s3cmd</code> command can be installed using the appropriate <code>yum</code> (RHEL, CentOS) or <code>apt</code> (Debian, Ubuntu) command:
 
<!--T:4-->
<code>$ sudo yum install s3cmd</code><br/>
<code>$ sudo apt install s3cmd </code>
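
If your distribution does not package it, <code>s3cmd</code> is also published on PyPI and can usually be installed with pip (an alternative sketch; availability depends on your Python setup):

<code>$ pip install s3cmd</code>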
 
=== Configuring s3cmd === <!--T:5-->
To configure the <code>s3cmd</code> tool, use the command:<br/>
<code>$ s3cmd --configure</code>
 
<!--T:6-->
Then make the following settings, using the keys provided to you or created with the <code>openstack ec2 credentials create</code> command:
<pre>
Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.
 
<!--T:7-->
Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key []: 20_DIGIT_ACCESS_KEY
Secret Key []: 40_DIGIT_SECRET_KEY
Default Region [US]:
 
<!--T:8-->
Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint []: object-arbutus.cloud.computecanada.ca
 
<!--T:9-->
Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket []: object-arbutus.cloud.computecanada.ca
 
<!--T:10-->
Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password []: PASSWORD
Path to GPG program []: /usr/bin/gpg
 
<!--T:11-->
When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol []: Yes
 
<!--T:12-->
On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:
</pre>
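
Once configured, you can verify that the credentials and endpoint work by listing your buckets (this assumes the configuration above was saved):

<code>$ s3cmd ls</code>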
 
=== Create buckets === <!--T:13-->
The next task is to make a bucket. Buckets contain files. Bucket names must be unique across the Arbutus object storage solution, so you will need to create a uniquely named bucket which will not conflict with those of other users. For example, buckets <tt>s3://test/</tt> and <tt>s3://data/</tt> are likely already taken. Consider creating buckets reflective of your project, for example <tt>s3://def-test-bucket1</tt> or <tt>s3://atlas_project_bucket</tt>. Valid bucket names may only use upper-case characters, lower-case characters, digits, periods, hyphens, and underscores (i.e. A-Z, a-z, 0-9, ., -, and _).
 
<!--T:14-->
To create a bucket, use the tool's <code>mb</code> (make bucket) command:
 
<!--T:15-->
<code>$ s3cmd mb s3://BUCKET_NAME/</code>
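
For example, using one of the project-style names suggested above:

<code>$ s3cmd mb s3://def-test-bucket1/</code>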
 
<!--T:16-->
To see the status of a bucket, use the <code>info</code> command:
 
<!--T:17-->
<code>$ s3cmd info s3://BUCKET_NAME/</code>
 
<!--T:18-->
The output will look something like this:
 
<!--T:19-->
<pre>
s3://BUCKET_NAME/ (bucket):
  Location:  default
  Payer:    BucketOwner
  Expiration Rule: none
  Policy:    none
  CORS:      none
  ACL:      *anon*: READ
  ACL:      USER: FULL_CONTROL
  URL:      http://object-arbutus.cloud.computecanada.ca/BUCKET_NAME/
</pre>
 
=== Upload files === <!--T:20-->
To upload a file to the bucket, use the <code>put</code> command similar to this:
 
<!--T:21-->
<code>$ s3cmd put --guess-mime-type FILE_NAME.dat s3://BUCKET_NAME/FILE_NAME.dat</code>
 
<!--T:22-->
Here the bucket name and the file name are specified in the command. Multipurpose Internet Mail Extensions (MIME) is a mechanism for handling files based on their type. The <code>--guess-mime-type</code> parameter guesses the MIME type from the file extension. The default MIME type is <code>binary/octet-stream</code>.
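
If guessing from the extension is not appropriate, the type can instead be set explicitly with the <code>--mime-type</code> option (a sketch; the type shown is only an example):

<code>$ s3cmd put --mime-type=text/plain FILE_NAME.txt s3://BUCKET_NAME/FILE_NAME.txt</code>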
 
=== Delete files === <!--T:23-->
To delete a file from the bucket, use the <code>rm</code> command similar to this:<br/>
<code>$ s3cmd rm s3://BUCKET_NAME/FILE_NAME.dat</code>
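
An empty bucket can likewise be removed with the <code>rb</code> (remove bucket) command (shown for illustration; the bucket must be emptied first):<br/>
<code>$ s3cmd rb s3://BUCKET_NAME/</code>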
 
=== Access Control Lists (ACLs) and Policies === <!--T:24-->
Buckets can have ACLs and policies which govern who can access what resources in the object store.  These features are quite sophisticated.  Here are two simple examples of using ACLs using the tool's <code>setacl</code> command.
 
<!--T:25-->
<code>$ s3cmd setacl --acl-public -r s3://BUCKET_NAME/</code>
 
<!--T:26-->
The result of this command is that the public can access the bucket and, recursively (<code>-r</code>), every file in the bucket. Files can be accessed via URLs such as<br/>
<code><nowiki>https://object-arbutus.cloud.computecanada.ca/BUCKET_NAME/FILE_NAME.dat</nowiki></code>
 
<!--T:27-->
The second ACL example limits access to the bucket to only the owner:
 
<!--T:28-->
<code>$ s3cmd setacl --acl-private s3://BUCKET_NAME/</code>
 
<!--T:29-->
Other, more sophisticated examples can be found on the s3cmd [https://www.s3express.com/help/help.html help site] or in the s3cmd(1) man page.
 
== WinSCP == <!--T:30-->
 
=== Installing WinSCP === <!--T:31-->
WinSCP can be installed from https://winscp.net/.
 
=== Configuring WinSCP === <!--T:32-->
Under "New Session", make the following configurations:
<ul>
<li>File protocol: Amazon S3</li>
<li>Host name: object-arbutus.cloud.computecanada.ca</li>
<li>Port number: 443</li>
<li>Access key ID: 20_DIGIT_ACCESS_KEY provided by the Arbutus team</li>
</ul>
and "Save" these settings as shown below
 
<!--T:33-->
[[File:WinSCP Configuration.png|600px|thumb|center|WinSCP configuration screen]]
 
<!--T:34-->
Next, click on the "Edit" button, then click on "Advanced..." and navigate from "Environment" to "S3" to "Protocol options" to "URL style:", which <b>must</b> be changed from "Virtual Host" to "Path" as shown below:
 
<!--T:35-->
[[File:WinSCP Path Configuration.png|600px|thumb|center|WinSCP Path Configuration]]
 
<!--T:36-->
This "Path" setting is important, otherwise WinSCP will not work and you will see hostname resolution errors, like this:
[[File:WinSCP resolve error.png|400px|thumb|center|WinSCP resolve error]]
 
=== Using WinSCP === <!--T:37-->
Click on the "Login" button and use the WinSCP GUI to create buckets and to transfer files:
 
<!--T:38-->
[[File:WinSCP transfers.png|800px|thumb|center|WinSCP file transfer screen]]
 
=== Access Control Lists (ACLs) and Policies === <!--T:41-->
Right-clicking on a file will allow you to set a file's ACL, like this:
[[File:WinSCP ACL.png|400px|thumb|center|WinSCP ACL screen]]
 
== AWS CLI == <!--T:43-->
 
<!--T:44-->
The <code>awscli</code> client also works with the object storage service, with better support for large (>5 GB) files and the helpful <code>sync</code> command. However, not all of its features have been tested.
 
=== Installing awscli === <!--T:45-->
 
<!--T:46-->
<pre>
pip install awscli awscli-plugin-endpoint
</pre>
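
To keep these packages separate from system-wide Python packages, you may prefer to install them in a virtual environment first (a sketch; the environment name is arbitrary):

<pre>
python3 -m venv awscli-venv
source awscli-venv/bin/activate
pip install awscli awscli-plugin-endpoint
</pre>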
 
=== Configuring awscli === <!--T:47-->
 
<!--T:48-->
Generate an access key ID and secret key:
 
<!--T:49-->
<pre>
openstack ec2 credentials create
</pre>
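
This command prints the new access key and secret key. If you need to look them up again later, existing credentials can be listed (assuming your OpenStack command-line environment is set up for your project):

<pre>
openstack ec2 credentials list
</pre>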
 
<!--T:50-->
Edit or create <code>~/.aws/credentials</code> and add the credentials generated above:
 
<!--T:51-->
<pre>
[default]
aws_access_key_id = <access_key>
aws_secret_access_key = <secret_key>
</pre>
 
<!--T:52-->
Edit <code>~/.aws/config</code> and add the following configuration:
 
<!--T:53-->
<pre>
[plugins]
endpoint = awscli_plugin_endpoint
 
<!--T:54-->
[profile default]
s3 =
  endpoint_url = https://object-arbutus.cloud.computecanada.ca
  signature_version = s3v4
s3api =
  endpoint_url = https://object-arbutus.cloud.computecanada.ca
</pre>
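
Alternatively, if you prefer not to install the endpoint plugin, the endpoint can be passed directly on the command line with the standard <code>--endpoint-url</code> option (a sketch; only the endpoint shown is specific to Arbutus):

<pre>
aws --endpoint-url=https://object-arbutus.cloud.computecanada.ca s3 ls
</pre>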
 
=== Using awscli === <!--T:55-->
 
<!--T:56-->
<pre>
export AWS_PROFILE=default
aws s3 ls s3://BUCKET_NAME/
aws s3 sync local_directory s3://BUCKET_NAME/prefix
</pre>
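
Individual files can be copied to or from a bucket with <code>cp</code> (an illustrative sketch using the same placeholder names as above):

<pre>
aws s3 cp FILE_NAME.dat s3://BUCKET_NAME/FILE_NAME.dat
aws s3 cp s3://BUCKET_NAME/FILE_NAME.dat FILE_NAME.dat
</pre>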
 
<!--T:57-->
More examples can be found here: https://docs.ovh.com/us/en/storage/getting_started_with_the_swift_S3_API/


<!--T:42-->
[[Category:Cloud]]
</translate>
