
I am uploading files to an Amazon S3 bucket. The files are uploaded, but I get the following warning:

WARNING: No content length specified for stream data. Stream contents will be buffered in memory and could result in out of memory errors.

So I added the following line to my code:

metaData.setContentLength(IOUtils.toByteArray(input).length);

but then I got the following message. I don't even know whether it is a warning or an error:

Data read has a different length than the expected: dataLength=0; expectedLength=111992; includeSkipped=false; in.getClass()=class sun.net.httpserver.FixedLengthInputStream; markedSupported=false; marked=0; resetSinceLastMarked=false; markCount=0; resetCount=0

How can I set the content length in the metadata for an InputStream? Any help would be greatly appreciated.

3 Answers


When you read the data with IOUtils.toByteArray, this consumes the InputStream. When the AWS API then tries to read it, the stream is already exhausted, which is why it reports a data length of zero.

Read the contents into a byte array and provide an InputStream wrapping that array to the API:

byte[] bytes = IOUtils.toByteArray(input);
metaData.setContentLength(bytes.length);
ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(bytes);
PutObjectRequest putObjectRequest = new PutObjectRequest(bucket, key, byteArrayInputStream, metaData);
client.putObject(putObjectRequest);

You should consider using the multipart upload API to avoid loading the whole InputStream into memory. For example:

// BUFFER_SIZE should be at least 5 MB: S3 requires every part except the last to be at least 5 MB
byte[] bytes = new byte[BUFFER_SIZE];
String uploadId = client.initiateMultipartUpload(new InitiateMultipartUploadRequest(bucket, key)).getUploadId();

int partNumber = 1;
List<UploadPartResult> results = new ArrayList<>();
int bytesRead = input.read(bytes);
while (bytesRead >= 0) {
    UploadPartRequest part = new UploadPartRequest()
        .withBucketName(bucket)
        .withKey(key)
        .withUploadId(uploadId)
        .withPartNumber(partNumber)
        .withInputStream(new ByteArrayInputStream(bytes, 0, bytesRead))
        .withPartSize(bytesRead);
    results.add(client.uploadPart(part));
    bytesRead = input.read(bytes);
    partNumber++;
}
CompleteMultipartUploadRequest completeRequest = new CompleteMultipartUploadRequest()
    .withBucketName(bucket)
    .withKey(key)
    .withUploadId(uploadId)
    .withPartETags(results);
client.completeMultipartUpload(completeRequest);
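
One caveat not shown in the snippet above: if an exception is thrown after initiateMultipartUpload, the parts already uploaded stay in the bucket (and are billed) until the upload is aborted. A minimal sketch of a cleanup helper, assuming the same v1 client; the class and method names here are illustrative, not part of the original answer:

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.AbortMultipartUploadRequest;

class MultipartCleanup {
    // Best-effort abort so S3 discards the already-uploaded parts after a failure.
    static void abortQuietly(AmazonS3 client, String bucket, String key, String uploadId) {
        try {
            client.abortMultipartUpload(new AbortMultipartUploadRequest(bucket, key, uploadId));
        } catch (RuntimeException ignored) {
            // nothing more to do here; a bucket lifecycle rule, if configured, can clean up later
        }
    }
}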
  • Note that if you are uploading large files to S3, the above approach first loads the data into a byte array, which can cause an out-of-memory exception. Commented Mar 23, 2017 at 15:32
  • @maxTrialfire In that case, how would you prevent OOM from happening when you are uploading a large file?
    – user482594
    Commented Jul 12, 2017 at 8:12
  • @user482594 In that case you need to do a chunked (multipart) upload. Commented Jul 13, 2017 at 19:23
  • You could also write to a file first to avoid OOM; see the temp-file sketch after this list.
    – Dave Moten
    Commented Apr 26, 2018 at 1:54
  • This still loads the entire file into memory. It did not solve the problem; it just moved it.
    – BrianC
    Commented May 15, 2019 at 17:53
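
As a follow-up to the last two comments, here is a minimal sketch of the write-to-a-file-first approach: spool the stream to a temporary file so the length is known, then upload the file. The client parameter and method name are illustrative, not from the original answer:

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.PutObjectRequest;

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class TempFileUpload {
    // Copy the stream to a temp file (constant memory), then let the SDK
    // take the content length from the file itself.
    static void uploadViaTempFile(AmazonS3 client, String bucket, String key, InputStream input) throws IOException {
        Path tmp = Files.createTempFile("s3-upload-", ".tmp");
        try {
            Files.copy(input, tmp, StandardCopyOption.REPLACE_EXISTING);
            client.putObject(new PutObjectRequest(bucket, key, tmp.toFile()));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}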

Note that by buffering the stream into a byte array you simply do manually what the AWS SDK already did for you automatically. It still buffers the entire stream in memory, so it is no better than the original code that produced the warning from the SDK.

You can only get rid of the memory problem if you have another way to know the length of the stream, for instance, when you create the stream from a file:

void uploadFile(String bucketName, File file) throws IOException {
    try (final InputStream stream = new FileInputStream(file)) {
        ObjectMetadata metadata = new ObjectMetadata();
        metadata.setContentLength(file.length());
        s3client.putObject(
                new PutObjectRequest(bucketName, file.getName(), stream, metadata)
        );
    }
}
  • This only works for specific cases; the InputStream interface does not let you know the length unless you read the stream first.
    – kidnan1991
    Commented Jan 26, 2021 at 2:59
  • So there's no way if that file is being downloaded from the internet using the URL class? (See the sketch after this list.)
    – shinzou
    Commented Nov 8, 2021 at 13:29
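
Regarding the URL question above: if the remote server sends a Content-Length header, you can pass that value through instead of buffering the body. A rough sketch, assuming the v1 SDK; the method name and parameters are illustrative, and it only works when the header is actually present:

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.PutObjectRequest;

import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

class UrlUpload {
    // Streams the HTTP body straight to S3, using the server-provided Content-Length.
    static void uploadFromUrl(AmazonS3 client, String bucket, String key, URL url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        try (InputStream body = connection.getInputStream()) {
            long length = connection.getContentLengthLong(); // -1 if the header is missing
            if (length < 0) {
                // No Content-Length: fall back to a multipart upload or a temp file (see above)
                throw new IOException("Content-Length not provided by " + url);
            }
            ObjectMetadata metadata = new ObjectMetadata();
            metadata.setContentLength(length);
            client.putObject(new PutObjectRequest(bucket, key, body, metadata));
        } finally {
            connection.disconnect();
        }
    }
}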

Breaking News! AWS SDK 2.0 has built-in support for uploading files:

        s3client.putObject(
                (builder) -> builder.bucket(myBucket).key(file.getName()),
                RequestBody.fromFile(file)
        );

There are also RequestBody methods that take Strings or buffers and set Content-Length automatically and efficiently. Only when you have some other kind of InputStream do you still need to provide the length yourself; however, that case should be rarer now with all the other options available.
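
For illustration, a short sketch of those RequestBody variants (the bucket, keys, and the knownLength parameter are placeholders):

import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;

import java.io.InputStream;

class RequestBodyExamples {
    static void upload(S3Client s3client, String myBucket, InputStream stream, long knownLength) {
        // String body: the SDK computes Content-Length from the encoded bytes.
        s3client.putObject(builder -> builder.bucket(myBucket).key("hello.txt"),
                RequestBody.fromString("hello world"));

        // Plain InputStream: you still have to supply the length yourself.
        s3client.putObject(builder -> builder.bucket(myBucket).key("stream.bin"),
                RequestBody.fromInputStream(stream, knownLength));
    }
}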

  • This method does not close the input stream automatically, so eventually you will run into a "Too many open files" exception.
    – Zoraida
    Commented Mar 20, 2019 at 16:03
  • But you still have to know the length, which you cannot do if you are only passing an input stream. So this still requires you to load the entire file into memory before sending it to S3, or have S3 load the file into memory. Either way you are not streaming the data.
    – BrianC
    Commented May 15, 2019 at 17:55
  • What if you need to upload using a stream from a public URL? Will it not load the entire file into memory or a file?
    – shinzou
    Commented Nov 8, 2021 at 13:28
