Accessing object storage with s3cmd
This page contains instructions on how to set up and access [[Arbutus object storage]] with s3cmd, one of the [[Arbutus_object_storage_clients|object storage clients]] available for this storage type.
== Installing s3cmd ==
Depending on your Linux distribution, the <code>s3cmd</code> command can be installed using the appropriate <code>yum</code> (RHEL, CentOS) or <code>apt</code> (Debian, Ubuntu) command:
<code>$ sudo yum install s3cmd</code><br/>
<code>$ sudo apt install s3cmd</code>
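To confirm that the client is installed and on your <code>PATH</code>, you can ask it for its version (the version reported will vary with your distribution):

<code>$ s3cmd --version</code>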
== Configuring s3cmd ==
To configure the <code>s3cmd</code> tool, use the command:<br/>
<code>$ s3cmd --configure</code>
Then make the following configurations with the keys provided to you or created with the <code>openstack ec2 credentials create</code> command:
<pre>
Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key []: 20_DIGIT_ACCESS_KEY
Secret Key []: 40_DIGIT_SECRET_KEY
Default Region [US]:

Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint []: object-arbutus.cloud.computecanada.ca

Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket []: object-arbutus.cloud.computecanada.ca

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password []: PASSWORD
Path to GPG program []: /usr/bin/gpg

When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol []: Yes

On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:
</pre>
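If you do not yet have a key pair, you can generate one with the OpenStack command-line client before running the configuration; this sketch assumes the client is installed and that you have sourced the OpenStack RC file for your project:

<code>$ openstack ec2 credentials create</code>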
This should produce an <code>s3cmd</code> configuration file like the example below. You are also free to explore additional s3cmd configuration options to fit your use case. Note that in the example the keys are redacted; you will need to replace them with your own key values:
<pre>
[default]
access_key = <redacted>
check_ssl_certificate = True
check_ssl_hostname = True
host_base = object-arbutus.cloud.computecanada.ca
host_bucket = object-arbutus.cloud.computecanada.ca
secret_key = <redacted>
use_https = True
</pre>
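To verify that the configuration works, you can list your buckets; the command should return without an authentication error (and the list will be empty if you have not created any buckets yet):

<code>$ s3cmd ls</code>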
== Create buckets ==
The next task is to make a bucket. Buckets contain files. Bucket names must be unique across the Arbutus object storage solution, so you will need to create a uniquely named bucket which will not conflict with other users. For example, buckets <tt>s3://test/</tt> and <tt>s3://data/</tt> are likely already taken. Consider creating buckets reflective of your project, for example <tt>s3://def-test-bucket1</tt> or <tt>s3://atlas_project_bucket</tt>. Valid bucket names may only contain uppercase letters, lowercase letters, digits, periods, hyphens, and underscores (i.e. A-Z, a-z, 0-9, ., -, and _).
To create a bucket, use the tool's <code>mb</code> (make bucket) command:

<code>$ s3cmd mb s3://BUCKET_NAME/</code>
To see the status of a bucket, use the <code>info</code> command:

<code>$ s3cmd info s3://BUCKET_NAME/</code>
The output will look something like this:
<pre>
s3://BUCKET_NAME/ (bucket):
   Location:  default
   Payer:     BucketOwner
   Expiration Rule: none
   Policy:    none
   CORS:      none
   ACL:       *anon*: READ
   ACL:       USER: FULL_CONTROL
   URL:       http://object-arbutus.cloud.computecanada.ca/BUCKET_NAME/
</pre>
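If you need to delete a bucket, the tool also provides an <code>rb</code> (remove bucket) command; note that a bucket must normally be emptied before it can be removed:

<code>$ s3cmd rb s3://BUCKET_NAME/</code>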
== Upload files ==
To upload a file to the bucket, use the <code>put</code> command, similar to this:

<code>$ s3cmd put --guess-mime-type FILE_NAME.dat s3://BUCKET_NAME/FILE_NAME.dat</code>
Here both the bucket name and the file name are specified. Multipurpose Internet Mail Extensions (MIME) is a mechanism for handling files based on their type. The <code>--guess-mime-type</code> option guesses the MIME type from the file extension; the default MIME type is <code>binary/octet-stream</code>.
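To download a file back from the bucket, there is a corresponding <code>get</code> command; the trailing local file name is optional and defaults to the object name:

<code>$ s3cmd get s3://BUCKET_NAME/FILE_NAME.dat FILE_NAME.dat</code>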
== Delete file ==
To delete a file from the bucket, use the <code>rm</code> command, similar to this:<br/>
<code>$ s3cmd rm s3://BUCKET_NAME/FILE_NAME.dat</code>
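You can confirm that the file is gone by listing the contents of the bucket:

<code>$ s3cmd ls s3://BUCKET_NAME/</code>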
== Access Control Lists (ACLs) and Policies ==
Buckets can have ACLs and policies which govern who can access what resources in the object store. These features are quite sophisticated. Here are two simple examples of using ACLs with the tool's <code>setacl</code> command.
<code>$ s3cmd setacl --acl-public -r s3://BUCKET_NAME/</code>
The result of this command is that the public can access the bucket and, recursively (<code>-r</code>), every file in the bucket. Files can then be accessed via URLs such as<br/>
<code><nowiki>https://object-arbutus.cloud.computecanada.ca/BUCKET_NAME/FILE_NAME.dat</nowiki></code>
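You can test public access from any machine with an HTTP client; for example, assuming <code>curl</code> is available, a HEAD request should return a success status for a readable object:

<code>$ curl -I <nowiki>https://object-arbutus.cloud.computecanada.ca/BUCKET_NAME/FILE_NAME.dat</nowiki></code>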
The second ACL example limits access to the bucket to only the owner:

<code>$ s3cmd setacl --acl-private s3://BUCKET_NAME/</code>
The current configuration of a bucket can be viewed via the <code>info</code> command:

<code>$ s3cmd info s3://testbucket</code>
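ACLs can also be applied to a single object rather than a whole bucket; for example, to make one file public while leaving the rest of the bucket private:

<code>$ s3cmd setacl --acl-public s3://BUCKET_NAME/FILE_NAME.dat</code>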
More sophisticated examples can be found on the s3cmd [https://www.s3express.com/help/help.html help site] or in the s3cmd(1) man page.
Instructions on [[Arbutus_object_storage#Managing_data_container_(bucket)_policies_for_your_Arbutus_Object_Store|managing bucket policies]] for your object store, including examples using s3cmd, are available on the main [[Arbutus_object_storage|object storage]] page.

[[Category:Cloud]]