In this tutorial, we will learn about 4 different ways to upload a file to S3 using Python. This is a continuation of the series in which we are writing scripts to work with AWS S3 in Python.
Setting up permissions for S3
For this tutorial to work, we need an IAM user with permission to upload files to S3. We can configure this user on our local machine using the AWS CLI, or we can use its credentials directly in the Python script. We have already covered how to create an IAM user with S3 access. If you do not have this user set up, please follow that blog first and then continue with this one.
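Before moving on, you may want to confirm that boto3 can actually see your credentials. A minimal sanity check (my own suggestion, not required for the rest of the tutorial) is an STS call, which works for any valid IAM user:

import boto3

# Ask AWS "who am I?" - this succeeds only if boto3 can find valid credentials.
sts = boto3.client("sts")
identity = sts.get_caller_identity()
print(identity["Account"], identity["Arn"])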
Upload a file to S3 using S3 client
One of the most common ways to upload a file from your local machine to S3 is to use the client class for S3. You need to provide the bucket name, the file you want to upload, and the object name in S3.
import boto3
import os
import pathlib
from pprint import pprint


def upload_file_using_client():
    """
    Uploads file to S3 bucket using S3 client object
    :return: None
    """
    s3 = boto3.client("s3")
    bucket_name = "binary-guy-frompython-1"
    object_name = "sample1.txt"
    file_name = os.path.join(pathlib.Path(__file__).parent.resolve(), "sample_file.txt")

    response = s3.upload_file(file_name, bucket_name, object_name)
    pprint(response)  # prints None
When you run this function, it will upload “sample_file.txt” to S3 under the name “sample1.txt”. We can verify this in the console.
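If you prefer to verify the upload from code instead of the console, a quick check with list_objects_v2 (using the same bucket and key as above) might look like this:

import boto3

s3 = boto3.client("s3")
# Look for the key we just uploaded; "Contents" is absent if nothing matches.
response = s3.list_objects_v2(Bucket="binary-guy-frompython-1", Prefix="sample1.txt")
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"], obj["LastModified"])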
In the above code, we have not specified any user credentials. In such cases, boto3 uses the default AWS CLI profile set up on your local machine. If you have multiple profiles on your machine, you can tell boto3 which one to use by adding the line below to your code.
# setting up default profile for session
boto3.setup_default_session(profile_name="PROFILE_NAME_FROM_YOUR_MACHINE")
Another option is to specify the access key ID and secret access key in the code itself. This is not a recommended approach, and I strongly believe that using IAM credentials directly in code should be avoided in most cases. If you do have to do it, you can pass them as shown below.
s3 = boto3.client("s3", aws_access_key_id=ACCESS_KEY, aws_secret_access_key=SECRET_KEY)
Upload a file to S3 using S3 resource class
Another option for uploading files to S3 with Python is to use the S3 resource class.
def upload_file_using_resource():
    """
    Uploads file to S3 bucket using S3 resource object.
    This is useful when you are dealing with multiple buckets at the same time.
    :return: None
    """
    s3 = boto3.resource("s3")
    bucket_name = "binary-guy-frompython-2"
    object_name = "sample2.txt"
    file_name = os.path.join(pathlib.Path(__file__).parent.resolve(), "sample_file.txt")

    bucket = s3.Bucket(bucket_name)
    response = bucket.upload_file(file_name, object_name)
    print(response)  # Prints None
The above code will also upload the file to S3. This approach is especially useful when you are dealing with multiple buckets: you can create a different bucket object for each bucket and use them to upload files, as in the sketch below.
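For example, here is a minimal sketch that uploads the same local file to both buckets from this series (the object key "sample_copy.txt" is just an illustration):

import os
import pathlib

import boto3


def upload_to_multiple_buckets():
    """Upload the same local file to several buckets with one resource object."""
    s3 = boto3.resource("s3")
    file_name = os.path.join(pathlib.Path(__file__).parent.resolve(), "sample_file.txt")
    for bucket_name in ["binary-guy-frompython-1", "binary-guy-frompython-2"]:
        # Each Bucket object wraps one bucket; upload_file works the same on each.
        s3.Bucket(bucket_name).upload_file(file_name, "sample_copy.txt")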
Uploading a file to S3 using put object
So far we have seen 2 ways to upload files to S3. Both are easy, but they do not give us much control over the files we are uploading. What if we want to encrypt files when we upload them to S3, or decide what kind of access level our file has? (We will dive deep into file/object access levels in another blog.)
When we need such fine-grained control while uploading files to S3, we can use the put_object function as shown in the below code.
def upload_file_to_s3_using_put_object():
    """
    Uploads file to S3 using the put_object function of the resource object.
    The same function is available on the S3 client object as well.
    put_object gives us many more options and we can set the object access
    policy, tags, encryption etc.
    :return: None
    """
    s3 = boto3.resource("s3")
    bucket_name = "binary-guy-frompython-2"
    object_name = "sample_using_put_object.txt"
    file_name = os.path.join(pathlib.Path(__file__).parent.resolve(), "sample_file.txt")

    bucket = s3.Bucket(bucket_name)
    # put_object expects bytes or a file-like object in Body, not a file path,
    # so we open the file and pass its contents.
    with open(file_name, "rb") as data:
        response = bucket.put_object(
            ACL="private",
            Body=data,
            ServerSideEncryption="AES256",
            Key=object_name,
            Metadata={"env": "dev", "owner": "binary guy"},
        )
    print(response)
    # prints s3.Object(bucket_name='binary-guy-frompython-2', key='sample_using_put_object.txt')
When we run the above code, we can see that our file has been uploaded to S3. But we also need to check whether the file has the other properties mentioned in our code. In S3, click on an object to see its details. When we click on “sample_using_put_object.txt” we will see the details below.
We can see that our object is encrypted and our tags show up in the object metadata. There are many other options that you can set for objects using the put_object function; you can find the details in the boto3 documentation for put_object.
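You can also confirm these properties from code rather than the console. A minimal sketch using head_object (with the bucket and key from the example above):

import boto3

s3 = boto3.client("s3")
# head_object returns the object's properties without downloading its body.
response = s3.head_object(
    Bucket="binary-guy-frompython-2", Key="sample_using_put_object.txt"
)
print(response["ServerSideEncryption"])  # AES256
print(response["Metadata"])  # {'env': 'dev', 'owner': 'binary guy'}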
Uploading byte data to S3
In some cases, you may have byte data as the output of some process and want to upload that to S3. The obvious route is to write that data to a file and upload the file to S3. But what if there were a simple way to skip writing the byte data to a file at all?
Of course, there is. We can use the upload_fileobj function to upload byte data to S3 directly. In the code below, I am reading a file in binary format and then using that data to create an object in S3, but you can write any binary data to S3 this way.
def upload_file_to_s3_using_file_object():
    """
    Uploads file to S3 using the upload_fileobj function of the S3 client object.
    A similar function is available on the S3 resource object as well.
    In this case, instead of copying the file, we open it and stream its data
    to S3. This can be useful when you already have binary data as the output
    of some process: we do not have to write it to a local file and then
    upload that file.
    :return: None
    """
    s3 = boto3.client("s3")
    bucket_name = "binary-guy-frompython-1"
    object_name = "sample_file_object.txt"
    file_name = os.path.join(pathlib.Path(__file__).parent.resolve(), "sample_file.txt")

    with open(file_name, "rb") as data:
        s3.upload_fileobj(data, bucket_name, object_name)
Let us check if this has created an object in S3 or not.
As we can see, it has successfully created an S3 object using our byte data.
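The stream does not have to come from a file at all. If your byte data is already in memory, you can wrap it in a BytesIO buffer and upload it the same way (the object key "sample_bytes.txt" here is just an illustration):

import io

import boto3


def upload_bytes_directly():
    """Upload in-memory bytes to S3 without touching the local file system."""
    s3 = boto3.client("s3")
    # upload_fileobj needs a readable, file-like object; BytesIO provides one.
    data = io.BytesIO(b"some bytes produced by another process")
    s3.upload_fileobj(data, "binary-guy-frompython-1", "sample_bytes.txt")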
Conclusion
In this blog, we have learned 4 different ways to upload files and binary data to S3 using Python. You can get all the code in this blog at GitHub. I hope you found this useful. In the next blog, we will learn different ways to list objects in an S3 bucket. See you soon.