Accessing object storage with s3cmd

From Alliance Doc
Revision as of 19:59, 27 February 2023

= Installing s3cmd =

Depending on your Linux distribution, the s3cmd command can be installed using the appropriate yum (RHEL, CentOS) or apt (Debian, Ubuntu) command:


$ sudo yum install s3cmd
$ sudo apt install s3cmd

= Configuring s3cmd =

To configure the s3cmd tool, use the command:
$ s3cmd --configure

When prompted, enter the keys that were provided to you or that you created with the openstack ec2 credentials create command:

Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key []: 20_DIGIT_ACCESS_KEY
Secret Key []: 40_DIGIT_SECRET_KEY
Default Region [US]:

Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint []: object-arbutus.cloud.computecanada.ca

Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket []: object-arbutus.cloud.computecanada.ca

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password []: PASSWORD
Path to GPG program []: /usr/bin/gpg

When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol []: Yes

On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:

This should produce an s3cmd configuration file as in the example below. You are also free to explore additional s3cmd configuration options to fit your use case. Note that in the example the keys are redacted and you will need to replace them with your provided key values:

[default]
access_key = <redacted>
check_ssl_certificate = True
check_ssl_hostname = True
host_base = object-arbutus.cloud.computecanada.ca
host_bucket = object-arbutus.cloud.computecanada.ca
secret_key = <redacted>
use_https = True
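By default, s3cmd writes this configuration to ~/.s3cfg in your home directory. Since the file holds your secret key, it is worth making sure only you can read it; a minimal sketch (a stand-in temporary file is used here so nothing real is touched):

```shell
# s3cmd stores its configuration in ~/.s3cfg by default; the file
# contains your secret key, so restrict it to your user only.
CFG="$(mktemp)"          # stands in for ~/.s3cfg in this illustration
chmod 600 "$CFG"         # owner read/write only
stat -c %a "$CFG"        # → 600
```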

Create buckets[edit]

The next task is to make a bucket. Buckets contain files. Bucket names must be unique across the Arbutus object storage solution, so you will need to create a uniquely named bucket which does not conflict with other users. For example, buckets s3://test/ and s3://data/ are likely already taken. Consider creating buckets reflective of your project, for example s3://def-test-bucket1 or s3://atlas_project_bucket. Valid bucket names may only contain uppercase letters, lowercase letters, digits, periods, hyphens, and underscores (i.e. A-Z, a-z, 0-9, ., -, and _).
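The allowed character set can be checked locally before creating a bucket; a small sketch (the valid_bucket_name helper is hypothetical, not part of s3cmd):

```shell
# Hypothetical helper: check a candidate name against the character set
# Arbutus accepts (A-Z, a-z, 0-9, ., -, _). Not part of s3cmd itself.
valid_bucket_name() {
    printf '%s' "$1" | grep -Eq '^[A-Za-z0-9._-]+$'
}

valid_bucket_name "def-test-bucket1" && echo "ok"        # → ok
valid_bucket_name "bad name!"        || echo "rejected"  # → rejected
```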

To create a bucket, use the tool's mb (make bucket) command:

$ s3cmd mb s3://BUCKET_NAME/

To see the status of a bucket, use the info command:

$ s3cmd info s3://BUCKET_NAME/

The output will look something like this:

s3://BUCKET_NAME/ (bucket):
   Location:  default
   Payer:     BucketOwner
   Expiration Rule: none
   Policy:    none
   CORS:      none
   ACL:       *anon*: READ
   ACL:       USER: FULL_CONTROL
   URL:       http://object-arbutus.cloud.computecanada.ca/BUCKET_NAME/

= Upload files =

To upload a file to the bucket, use the put command similar to this:

$ s3cmd put --guess-mime-type FILE_NAME.dat s3://BUCKET_NAME/FILE_NAME.dat

Here the bucket name and the file name are specified. Multipurpose Internet Mail Extensions (MIME) is a mechanism for handling files based on their type. The --guess-mime-type option guesses the MIME type from the file extension; the default MIME type is binary/octet-stream.
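As a rough illustration of what extension-based guessing means (this is a sketch, not s3cmd's actual lookup table):

```shell
# Sketch of extension-based MIME guessing; illustrative only, not
# s3cmd's real implementation.
guess_mime() {
    case "$1" in
        *.html) echo "text/html" ;;
        *.txt)  echo "text/plain" ;;
        *.json) echo "application/json" ;;
        *)      echo "binary/octet-stream" ;;  # s3cmd's default type
    esac
}

guess_mime page.html      # → text/html
guess_mime FILE_NAME.dat  # → binary/octet-stream
```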

= Delete files =

To delete a file from the bucket, use the rm command similar to this:

$ s3cmd rm s3://BUCKET_NAME/FILE_NAME.dat

= Access Control Lists (ACLs) and Policies =

Buckets can have ACLs and policies which govern who can access what resources in the object store. These features are quite sophisticated. Here are two simple examples of using ACLs with the tool's setacl command.

$ s3cmd setacl --acl-public -r s3://BUCKET_NAME/

The result of this command is that the public can access the bucket and recursively (-r) every file in the bucket. Files can be accessed via URLs such as
https://object-arbutus.cloud.computecanada.ca/BUCKET_NAME/FILE_NAME.dat
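Such URLs follow path-style addressing: the endpoint, then the bucket name, then the object key. A small sketch of the composition:

```shell
# Path-style object URL on Arbutus: endpoint / bucket / object key.
ENDPOINT="https://object-arbutus.cloud.computecanada.ca"
BUCKET="BUCKET_NAME"
OBJECT="FILE_NAME.dat"
echo "${ENDPOINT}/${BUCKET}/${OBJECT}"
# → https://object-arbutus.cloud.computecanada.ca/BUCKET_NAME/FILE_NAME.dat
```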

The second ACL example limits access to the bucket to only the owner:

$ s3cmd setacl --acl-private s3://BUCKET_NAME/

The current configuration of a bucket can be viewed via the command:

$ s3cmd info s3://testbucket

Other more sophisticated examples can be found in the s3cmd help site or s3cmd(1) man page.

= Bucket policies =

Attention: be careful with policies, because an ill-conceived policy can lock you out of your bucket.

Currently, Arbutus Object Storage only implements a subset of Amazon's specification for bucket policies. The following example shows how to create, apply, and view a bucket's policy. The first step is to create a policy JSON file, e.g. testbucket.policy:

{
    "Version": "2012-10-17",
    "Id": "S3PolicyId1",
    "Statement": [
        {
            "Sid": "IPAllow",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::testbucket",
                "arn:aws:s3:::testbucket/*"
            ],
            "Condition": {
                "NotIpAddress": {
                    "aws:SourceIp": [
                        "206.12.0.0/16",
                        "142.104.0.0/16"
                    ]
                }
            }
        }
    ]
}

This example denies access except from the specified source IP address ranges in Classless Inter-Domain Routing (CIDR) notation. In this example the s3://testbucket is limited to the public IP address range (206.12.0.0/16) used by the Arbutus cloud and the public IP address range (142.104.0.0/16) used by the University of Victoria.
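Because an ill-conceived policy can lock you out of the bucket, it is worth checking that the file at least parses as valid JSON before applying it; a minimal sketch, assuming python3 is available (jq works equally well):

```shell
# Write the policy and verify it parses as valid JSON before setpolicy.
cat > testbucket.policy <<'EOF'
{
    "Version": "2012-10-17",
    "Id": "S3PolicyId1",
    "Statement": [
        {
            "Sid": "IPAllow",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::testbucket",
                "arn:aws:s3:::testbucket/*"
            ],
            "Condition": {
                "NotIpAddress": {
                    "aws:SourceIp": ["206.12.0.0/16", "142.104.0.0/16"]
                }
            }
        }
    ]
}
EOF
python3 -m json.tool testbucket.policy > /dev/null && echo "valid JSON"
```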

Once you have your policy file, you can implement that policy on the bucket:

$ s3cmd setpolicy testbucket.policy s3://testbucket

To view the policy you can use the following command:

$ s3cmd info s3://testbucket