How do I convert String to S3ObjectInputStream?
Mapping S3ObjectInputStream into an InputStream
How to download an S3 object directly into memory in Java
How to write an S3 object to a file?
Use the AWS SDK for Java and Apache Commons IO, like so:
// import com.amazonaws.services.s3.AmazonS3;
// import com.amazonaws.services.s3.AmazonS3Client;
// import com.amazonaws.services.s3.model.S3Object;
// import org.apache.commons.io.IOUtils;
AmazonS3 s3 = new AmazonS3Client(credentials); // anonymous credentials are possible if this isn't your bucket
S3Object object = s3.getObject("bucket", "key");
byte[] byteArray = IOUtils.toByteArray(object.getObjectContent());
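If the object holds text, one more line turns the byte array into a String; a sketch assuming UTF-8 content (adjust the charset if yours differs):
// import java.nio.charset.StandardCharsets
String contents = new String(byteArray, StandardCharsets.UTF_8);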
Not sure what you mean by "get it removed", but note that IOUtils.toByteArray() does not close the stream for you, so close the object's input stream yourself once you're done with it. If you mean you want to delete the object from S3, that's as easy as:
s3.deleteObject("bucket", "key");
As of AWS SDK for Java 2.x, you can use ResponseTransformer to convert the response to different types (see the Javadoc).
Below is an example of getting the object as bytes:
GetObjectRequest request = GetObjectRequest.builder().bucket(bucket).key(key).build();
ResponseBytes<GetObjectResponse> result = s3Client.getObject(request, ResponseTransformer.toBytes());
byte[] bytes = result.asByteArray(); // the object contents as a byte array
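As a side note, SDK 2.x also offers a convenience overload, getObjectAsBytes(), that does the same thing in one call; a minimal sketch (asUtf8String() assumes the object body is UTF-8 text):
ResponseBytes<GetObjectResponse> objectBytes = s3Client.getObjectAsBytes(request);
byte[] bytes = objectBytes.asByteArray();
String text = objectBytes.asUtf8String(); // only for text objects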
Since Java 7 (published back in July 2011), there's a better way: the Files.copy() utility from java.nio.file.
Copies all bytes from an input stream to a file.
So you need neither an external library nor a hand-rolled byte-array loop. Two examples below, both of which use the input stream from S3Object.getObjectContent().
InputStream in = s3Client.getObject("bucketName", "key").getObjectContent();
1) Write to a new file at a specified path:
Files.copy(in, Paths.get("/my/path/file.jpg"));
2) Write to a temp file in the system's default tmp location:
File tmp = File.createTempFile("s3test", "");
Files.copy(in, tmp.toPath(), StandardCopyOption.REPLACE_EXISTING);
(Without specifying the option to replace an existing file, you'll get a FileAlreadyExistsException.)
Also note that getObjectContent() Javadocs urge you to close the input stream:
If you retrieve an S3Object, you should close this input stream as soon as possible, because the object contents aren't buffered in memory and stream directly from Amazon S3. Further, failure to close this stream can cause the request pool to become blocked.
So it is safest to close the stream in a finally block or, more simply, to use try-with-resources.
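For example, a minimal sketch using try-with-resources (S3Object implements Closeable in recent 1.x SDKs, so both the object and its stream are closed automatically; the bucket, key, and path are placeholders):
try (S3Object s3Object = s3Client.getObject("bucketName", "key");
     InputStream in = s3Object.getObjectContent()) {
    Files.copy(in, Paths.get("/my/path/file.jpg"), StandardCopyOption.REPLACE_EXISTING);
}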
The above assumes that you use the official SDK from Amazon (aws-java-sdk-s3).
While IOUtils.copy() and IOUtils.copyLarge() are great, I prefer the old-school way of looping through the input stream until it returns -1. Why? I used IOUtils.copy() before, but there was a specific use case where, if I started downloading a large file from S3 and the downloading thread was interrupted, the download would not stop; it would go on until the whole file was downloaded.
Of course, this has nothing to do with S3, just the IOUtils library.
So, I prefer this:
// Copy the object to a file, checking for thread interruption on every chunk.
try (InputStream in = s3Object.getObjectContent();
     OutputStream out = new FileOutputStream(file)) {
    byte[] buf = new byte[1024];
    int count;
    while ((count = in.read(buf)) != -1) {
        if (Thread.interrupted()) {
            throw new InterruptedException();
        }
        out.write(buf, 0, count);
    }
}
Note: this also means you don't need any additional libraries.
Because the original question was never answered, and I ran into this same problem myself: the solution to the MD5 problem is that S3 doesn't want the hex-encoded MD5 string we normally think of.
Instead, I had to do this.
// import org.apache.commons.codec.digest.DigestUtils
// import org.apache.commons.codec.binary.Base64
// content is a passed-in InputStream
byte[] resultByte = DigestUtils.md5(content); // raw 16-byte MD5 digest
String streamMD5 = new String(Base64.encodeBase64(resultByte));
metaData.setContentMD5(streamMD5);
Essentially, what S3 wants for the MD5 value is the Base64-encoded raw MD5 byte array, not the hex string. Once I switched to this, it started working great for me.
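If you'd rather not depend on Commons Codec, the JDK alone can produce the same Base64-encoded digest; a sketch, assuming the payload is already in a byte array named contentBytes:
// java.security.MessageDigest + java.util.Base64 (Java 8+)
byte[] digest = MessageDigest.getInstance("MD5").digest(contentBytes); // declares NoSuchAlgorithmException
metaData.setContentMD5(Base64.getEncoder().encodeToString(digest));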
If all you are trying to do is solve the content-length error from Amazon, you can read the input stream into a byte array, take its length, and set that on the metadata:
/*
 * Obtain the content length of the input stream for the S3 header.
 */
byte[] contentBytes = null;
try {
    InputStream is = event.getFile().getInputstream();
    contentBytes = IOUtils.toByteArray(is);
} catch (IOException e) {
    System.err.printf("Failed while reading bytes: %s%n", e.getMessage());
}

Long contentLength = Long.valueOf(contentBytes.length);

ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentLength(contentLength);

/*
 * Reobtain the tmp uploaded file as an input stream.
 */
InputStream inputStream = event.getFile().getInputstream();

/*
 * Put the object in S3.
 */
try {
    s3client.putObject(new PutObjectRequest(bucketName, keyName, inputStream, metadata));
} catch (AmazonServiceException ase) {
    System.out.println("Error Message:    " + ase.getMessage());
    System.out.println("HTTP Status Code: " + ase.getStatusCode());
    System.out.println("AWS Error Code:   " + ase.getErrorCode());
    System.out.println("Error Type:       " + ase.getErrorType());
    System.out.println("Request ID:       " + ase.getRequestId());
} catch (AmazonClientException ace) {
    System.out.println("Error Message: " + ace.getMessage());
} finally {
    if (inputStream != null) {
        inputStream.close();
    }
}
Note that this exact approach reads the input stream twice, so if you are uploading a very large file you might want to read it once into a byte array and then upload from that array, as in the sketch below.
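For example, a sketch of that one-pass variant, reusing the names from the snippet above:
// Read the upload once; reuse the bytes for both the length and the body.
byte[] contentBytes = IOUtils.toByteArray(event.getFile().getInputstream());

ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentLength(contentBytes.length);

s3client.putObject(new PutObjectRequest(
        bucketName, keyName, new ByteArrayInputStream(contentBytes), metadata));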