
4 Easy Ways to Upload a File to S3 Using Python

Posted on: October 1, 2023 at 05:17 AM

In this tutorial, we will learn four ways to upload a file to S3 using Python. This is a continuation of the series where we write scripts to work with AWS S3 in Python.

Setting up permissions for S3

For this tutorial to work, we will need an IAM user with permission to upload files to S3. We can configure this user on our local machine using the AWS CLI or use its credentials directly in the Python script. We have already covered how to create an IAM user with S3 access. If you do not have this user set up, please follow that blog first and then continue with this one.

Upload a file to S3 using the S3 client

One of the most common ways to upload files from your local machine to S3 is using the S3 client class. You need to provide the bucket name, the path of the file you want to upload, and the object name in S3.

import boto3
from pprint import pprint
import pathlib
import os

def upload_file_using_client():
    """
    Uploads file to S3 bucket using S3 client object
    :return: None
    """
    s3 = boto3.client("s3")
    bucket_name = "binary-guy-frompython-1"
    object_name = "sample1.txt"
    file_name = os.path.join(pathlib.Path(__file__).parent.resolve(), "sample_file.txt")

    response = s3.upload_file(file_name, bucket_name, object_name)
    pprint(response)  # prints None
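On success, upload_file returns None; failures surface as exceptions instead. As a minimal sketch (assuming the same bucket and object names as above), you can wrap the call to handle them:

from boto3.exceptions import S3UploadFailedError
from botocore.exceptions import ClientError

def upload_file_with_error_handling():
    """
    Same upload as above, with basic error handling.
    Depending on the failure (missing bucket, denied permissions, etc.),
    boto3 raises S3UploadFailedError or ClientError.
    :return: None
    """
    s3 = boto3.client("s3")
    file_name = os.path.join(pathlib.Path(__file__).parent.resolve(), "sample_file.txt")
    try:
        s3.upload_file(file_name, "binary-guy-frompython-1", "sample1.txt")
    except (S3UploadFailedError, ClientError) as error:
        print(f"Upload failed: {error}")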

When you run this function, it uploads “sample_file.txt” to S3 under the name “sample1.txt”. We can verify this in the console.

[Screenshot: S3 console showing the uploaded file]
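If you prefer to verify from code instead of the console, a quick head_object call confirms the object exists (this sketch assumes the same bucket and key as above):

def verify_upload():
    """Confirms the uploaded object exists by fetching its metadata."""
    s3 = boto3.client("s3")
    response = s3.head_object(Bucket="binary-guy-frompython-1", Key="sample1.txt")
    print(response["ContentLength"])  # size of the uploaded object in bytes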

In the above code, we have not specified any user credentials. In such cases, boto3 uses your local machine’s default AWS CLI profile. If you have multiple profiles on your machine, you can also tell boto3 which one to use. All you need to do is add the line below to your code.

# setting up default profile for session
boto3.setup_default_session(profile_name='PROFILE_NAME_FROM_YOUR_MACHINE')
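If you would rather not change the default session globally, you can also create a dedicated Session for a profile and build the client from it:

# create a session bound to a specific profile and build the client from it
session = boto3.Session(profile_name='PROFILE_NAME_FROM_YOUR_MACHINE')
s3 = session.client("s3")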

Another option is to specify the access key ID and secret access key directly in the code. This is not a recommended approach, and I believe using IAM credentials directly in code should be avoided in most cases. If you have to do this, you can pass the access key ID and secret access key as shown below.

s3 = boto3.client("s3",
                  aws_access_key_id=ACCESS_KEY,
                  aws_secret_access_key=SECRET_KEY)
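If you cannot use profiles, a safer alternative is to let boto3 pick the credentials up from the standard environment variables, which it checks automatically:

# boto3 reads these automatically; no keys appear in the code
#   export AWS_ACCESS_KEY_ID=...
#   export AWS_SECRET_ACCESS_KEY=...
s3 = boto3.client("s3")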

Upload a file to S3 using the S3 resource class

Another option to upload files to S3 with Python is the S3 resource class.

def upload_file_using_resource():
    """
    Uploads file to S3 bucket using S3 resource object.
    This is useful when you are dealing with multiple buckets at the same time.
    :return: None
    """
    s3 = boto3.resource("s3")
    bucket_name = "binary-guy-frompython-2"
    object_name = "sample2.txt"
    file_name = os.path.join(pathlib.Path(__file__).parent.resolve(), "sample_file.txt")

    bucket = s3.Bucket(bucket_name)
    response = bucket.upload_file(file_name, object_name)
    print(response)  # Prints None

The above code will also upload the file to S3. This approach is beneficial when you are working with multiple buckets: you can create a bucket object for each bucket and reuse them to upload files, as sketched below.
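Here is a minimal sketch of that pattern ("binary-guy-frompython-3" is a hypothetical second bucket, used only for illustration):

def upload_to_multiple_buckets():
    """Reuses one resource object to upload the same file to two buckets."""
    s3 = boto3.resource("s3")
    file_name = os.path.join(pathlib.Path(__file__).parent.resolve(), "sample_file.txt")

    # one bucket object per bucket; each can be reused for further uploads
    for name in ["binary-guy-frompython-2", "binary-guy-frompython-3"]:
        bucket = s3.Bucket(name)
        bucket.upload_file(file_name, "sample2.txt")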


Uploading a file to S3 using put_object

We have seen two ways to upload files to S3. Both are easy, but we do not get much control over the files we upload. What if we want to encrypt files when we upload them to S3, or decide which access level the file should have? (We will dive deep into file/object access levels in another blog.)

When we need such fine-grained control while uploading files to S3, we can use the put_object function, as shown in the code below.

def upload_file_to_s3_using_put_object():
    """
    Uploads a file to S3 using the put_object function of the resource object.
    The same function is available on the S3 client object as well.
    put_object gives us many more options: we can set the object access policy, encryption, metadata, etc.
    :return: None
    """
    s3 = boto3.resource("s3")
    bucket_name = "binary-guy-frompython-2"
    object_name = "sample_using_put_object.txt"
    file_name = os.path.join(pathlib.Path(__file__).parent.resolve(), "sample_file.txt")

    bucket = s3.Bucket(bucket_name)
    # put_object expects the object's contents in Body, so we open the file
    # and pass the file object (passing the path string would upload the path text itself)
    with open(file_name, "rb") as data:
        response = bucket.put_object(
            ACL="private",
            Body=data,
            ServerSideEncryption="AES256",
            Key=object_name,
            Metadata={"env": "dev", "owner": "binary guy"},
        )
    print(
        response
    )  # prints s3.Object(bucket_name='binary-guy-frompython-2', key='sample_using_put_object.txt')

[Screenshot: S3 console showing the object uploaded with put_object]

When we run the above code, we see that our file has been uploaded to S3. But we also need to check that the other properties set in our code were applied. In the S3 console, click on an object to see its details. When we click “sample_using_put_object.txt,” we see the details below.

[Screenshot: object details showing encryption and metadata]

We can see that our object is encrypted and that our metadata entries are shown on the object. There are many other options you can set for objects using the put_object function. You can find the details in the boto3 documentation for put_object.
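For instance, here is a short sketch of a few of those options, reusing the bucket and file_name from the snippet above (the key, content type, tags, and storage class are illustrative values):

with open(file_name, "rb") as data:
    bucket.put_object(
        Key="sample_with_more_options.txt",
        Body=data,
        ContentType="text/plain",        # how S3 (and browsers) should serve the object
        Tagging="project=demo&env=dev",  # object tags as URL-encoded key=value pairs
        StorageClass="STANDARD_IA",      # cheaper storage class for infrequent access
    )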

Uploading byte data to S3

Sometimes, you have byte data as the output of some process and want to upload it to S3. You might think this is easy: we write the data to a file and upload that file to S3. But what if there were a simple way that does not require writing the byte data to a file at all?

Of course, there is. We use the upload_fileobj function to upload byte data directly to S3. In the code below, I am opening a file in binary mode and using its file object to create an object in S3, but you can write any binary data to S3 this way.

def upload_file_to_s3_using_file_object():
    """
    Uploads a file to S3 using the upload_fileobj function of the S3 client object.
    A similar function is available on the S3 resource object as well.
    Instead of passing a file path, we open the file and stream its contents to S3.
    This is useful when you already have binary data as the output of some process:
    we do not have to write that data to a local file and then upload the file.
    :return: None
    """
    s3 = boto3.client("s3")
    bucket_name = "binary-guy-frompython-1"
    object_name = "sample_file_object.txt"
    file_name = os.path.join(pathlib.Path(__file__).parent.resolve(), "sample_file.txt")

    with open(file_name, "rb") as data:
        s3.upload_fileobj(data, bucket_name, object_name)

Let us check whether this has created an object in S3.

[Screenshot: S3 console showing the object created with upload_fileobj]

As we can see, it has successfully created an S3 object using our byte data.
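Because upload_fileobj accepts any file-like object, byte data generated in memory can skip the local file entirely if you wrap it in io.BytesIO (a minimal sketch, assuming the same bucket as above):

import io

def upload_bytes_from_memory():
    """Uploads in-memory byte data to S3 without touching the local disk."""
    s3 = boto3.client("s3")
    data = b"some bytes produced by another process"

    # wrap the bytes in a file-like object that upload_fileobj can read from
    s3.upload_fileobj(io.BytesIO(data), "binary-guy-frompython-1", "in_memory_bytes.txt")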

Conclusion

In this blog, we have learned four different ways to upload files and binary data to S3 using Python. You can get all the code from this blog on GitHub. I hope you found this helpful. In the next blog, we will learn different ways to list objects in an S3 bucket. See you soon.