
I am trying to write and save a CSV file to a specific folder in S3 (which already exists). This is my code:

from io import BytesIO
import pandas as pd
import boto3
s3 = boto3.resource('s3')

d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)

csv_buffer = BytesIO()

bucket = 'bucketName/folder/'
filename = "test3.csv"
df.to_csv(csv_buffer)
content = csv_buffer.getvalue()

def to_s3(bucket,filename,content):
  s3.Object(bucket,filename).put(Body=content)

to_s3(bucket,filename,content)

this is the error that I get:

Invalid bucket name "bucketName/folder/": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$"

I also tried :

bucket = bucketName/folder

and:

bucket = bucketName
key = folder/
s3.Object(bucket,key,filename).put(Body=content)

Any suggestions?

4 Answers


Saving to S3 buckets can also be done with upload_file, if the .csv file already exists on disk:

import boto3
s3 = boto3.resource('s3')

bucket = 'bucket_name'
filename = 'file_name.csv'
s3.meta.client.upload_file(Filename=filename, Bucket=bucket, Key=filename)
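If the data only exists as a DataFrame, a minimal sketch (using the question's data) would first write the CSV to disk and then hand that file to upload_file; the upload line is commented out here because it needs real credentials and a real bucket:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

# upload_file expects a file on disk, so serialize the DataFrame first
df.to_csv('test3.csv', index=False)

# Then upload it; note the "folder" goes in the Key, not the Bucket:
# import boto3
# s3 = boto3.resource('s3')
# s3.meta.client.upload_file(Filename='test3.csv', Bucket='bucketName', Key='folder/test3.csv')
```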

This should work:

def to_s3(bucket, filename, content):
    client = boto3.client('s3')
    # note the trailing slash: without it the prefix runs into the file name
    key = "folder/subfolder/" + filename
    client.put_object(Bucket=bucket, Key=key, Body=content)
  • where do I put the content?
    – HilaD
    Commented Jan 23, 2018 at 11:29
  • so how it will save content as a csv file?
    – HilaD
    Commented Jan 23, 2018 at 11:30
  • Oh sorry you need to replace the filename with file.
    – Rakesh
    Commented Jan 23, 2018 at 11:32
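To the comment above asking where the content comes from: a minimal sketch (using the question's DataFrame) that builds the `content` string in memory before passing it to put_object:

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

# Serialize the DataFrame to CSV text in memory; this string is what
# gets passed as the Body/content argument to put_object
csv_buffer = StringIO()
df.to_csv(csv_buffer, index=False)
content = csv_buffer.getvalue()
```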

This should work:

from io import StringIO

s3 = boto3.client('s3')

bucket = bucketName
key = f"{folder}/{filename}"
csv_buffer = StringIO()
df.to_csv(csv_buffer)
content = csv_buffer.getvalue()
s3.put_object(Bucket=bucket, Body=content, Key=key)

AWS bucket names are not allowed to contain slashes ("/"); slashes belong in the Key instead. AWS uses them to show "virtual" folders in the dashboard. Since CSV is a text format, I'm using StringIO instead of BytesIO.
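The slash rule can be wrapped in a small helper (the name `s3_key` is hypothetical, not part of boto3):

```python
def s3_key(folder, filename):
    # Slashes belong in the object key, never in the bucket name;
    # strip stray slashes from the folder and join the parts ourselves
    return f"{folder.strip('/')}/{filename}" if folder.strip('/') else filename

# s3_key('folder/', 'test3.csv') gives 'folder/test3.csv'
```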


This works for me.

import os
import pandas as pd
import boto3
from io import StringIO

from dotenv import load_dotenv
load_dotenv("/path/to/.env", override=True)


def df_to_s3(df, bucket, key):
    # Create a session
    session = boto3.session.Session(profile_name=os.environ.get("AWS_SECRETS_PROFILE_NAME"))
    aws_s3_client = session.client(
        service_name="s3",
        region_name=os.environ.get("AWS_SECRETS_REGION_NAME"),
    )

    # Create a CSV string from the DataFrame
    csv_buffer = StringIO()
    df.to_csv(csv_buffer, index=False)

    # Put the CSV string to S3
    aws_s3_client.put_object(
        Body=csv_buffer.getvalue(),
        Bucket=bucket,
        Key=key
    )
    print(f'Successfully put DataFrame to {bucket}/{key}')

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
# the folder goes in the key, not the bucket name
df_to_s3(df, 'bucketName', 'folder/test3.csv')
