Scenario: I have a REST API endpoint built with Spring Boot. The endpoint dynamically generates an Excel file based on input parameters. Once the file has been generated, the endpoint returns it as an InputStreamResource. The main goal is minimum memory usage.
I'm using fastexcel to create the Excel file, and I'm flushing it to the OutputStream after every row is written. Right now, I am using a FileOutputStream to write to disk. When the Excel file has been generated, I read it back in using InputStreamResource and stream the response. My thought process is that a ByteArrayOutputStream keeps everything in memory even if I flush the Excel file after every row, so I used the FileOutputStream. Does my logic track here? Or am I unnecessarily slowing things down with expensive filesystem IO?
You can do it using a FileOutputStream and the writeTo method.
ByteArrayOutputStream byteArrayOutputStream = getByteStreamMethod();
try (OutputStream outputStream = new FileOutputStream("thefilename")) {
    byteArrayOutputStream.writeTo(outputStream);
}
Source: "Creating a file from ByteArrayOutputStream in Java." on Code Inventions
You can use a FileOutputStream for this.
// try-with-resources closes the stream even if writing fails,
// and avoids a NullPointerException when the file cannot be opened
try (FileOutputStream fos = new FileOutputStream(new File("myFile"))) {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    // Put data in your baos
    baos.writeTo(fos);
} catch (IOException ioe) {
    // Handle exception here
    ioe.printStackTrace();
}
ByteArrayOutputStream writes bytes to a byte array in memory. Not to any other destination, such as a file or a network socket. After writing the data, you can get the byte array by calling toByteArray() on it.
BufferedOutputStream wraps another, underlying OutputStream and provides buffering for that underlying stream, to make I/O operations more efficient. The underlying stream can be any kind of OutputStream, for example one that writes to a file or a network socket.
Why you might want to use buffering: Writing a large block of data to the file system is more efficient than writing byte by byte. If your program needs to write many small pieces of data, it's more efficient to first gather these small pieces in a buffer and then write the entire buffer to disk at once. This is what BufferedOutputStream does automatically for you.
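To make the contrast concrete, here is a minimal sketch that pushes the same single-byte writes through a BufferedOutputStream wrapped around a file stream and through a ByteArrayOutputStream. The class name, method names, and the 8192-byte buffer size are assumptions chosen for illustration, not something from the answers above.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class BufferingSketch {

    // Writes `count` single bytes through a BufferedOutputStream into a file.
    // The buffer gathers the small writes and passes them on in larger chunks.
    static long writeBuffered(Path target, int count) throws IOException {
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(target), 8192)) {
            for (int i = 0; i < count; i++) {
                out.write(i & 0xFF);
            }
        } // close() flushes whatever is still sitting in the buffer
        return Files.size(target);
    }

    // Writes the same bytes into memory; all of them stay on the heap.
    static byte[] writeToMemory(int count) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < count; i++) {
            out.write(i & 0xFF);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("buffering-sketch", ".bin");
        try {
            System.out.println("bytes on disk:   " + writeBuffered(tmp, 100_000));
            System.out.println("bytes in memory: " + writeToMemory(100_000).length);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

In the buffered case the heap only ever holds the 8192-byte buffer, while the in-memory variant holds every byte written.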
Just look at the javadoc:
ByteArrayOutputStream:
This class implements an output stream in which the data is written into a byte array.
BufferedOutputStream:
The class implements a buffered output stream. By setting up such an output stream, an application can write bytes to the underlying output stream without necessarily causing a call to the underlying system for each byte written.
So, those are really two very different things:
- the first one you use when you know that you have some data that, in the end, you need as an array of bytes
- the second one is just a wrapper around any other kind of output stream, which adds buffering.
That is all there is to this!
And if you want to experience a different behavior: create a buffered one that writes to a file, and an array one. Then just keep pushing bytes into each one. The array one will cause a memory problem at some point, the other one might not stop until all of your disk space is used up.
There seem to be many links and other such stuff, but no actual code using pipes. The advantage of using java.io.PipedInputStream and java.io.PipedOutputStream is that there is almost no additional consumption of memory: the pipe only holds a small fixed-size buffer (1024 bytes by default). ByteArrayOutputStream.toByteArray() returns a copy of the original buffer, so that means that whatever you have in memory, you now have two copies of it. If you then copy that data again into an InputStream's own buffer, you have three copies of the data.
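The copying behaviour of toByteArray() is easy to check. The sketch below (class and method names are made up for illustration) changes the returned array and then asks the stream for its contents again.

```java
import java.io.ByteArrayOutputStream;

class ToByteArrayCopySketch {

    // Returns true when toByteArray() hands back an independent copy
    // rather than the stream's internal buffer.
    static boolean returnsIndependentCopy() {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        baos.write(1);
        baos.write(2);

        byte[] first = baos.toByteArray();
        first[0] = 99;                      // change the returned array only

        byte[] second = baos.toByteArray(); // ask for the contents again
        return first != second && second[0] == 1 && baos.size() == 2;
    }

    public static void main(String[] args) {
        System.out.println("independent copy: " + returnsIndependentCopy());
    }
}
```

Because every call allocates a new array of the full size, calling toByteArray() on a large stream roughly doubles the memory held at that moment.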
The code using lambdas (hat-tip to @John Manko from the comments):
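PipedInputStream in = new PipedInputStream();
final PipedOutputStream out = new PipedOutputStream(in);
// in a background thread, write the given output stream to the
// PipedOutputStream for consumption
new Thread(() -> {
    try (PipedOutputStream o = out) { // closing signals end-of-stream to the reader
        originalOutputStream.writeTo(o);
    } catch (IOException e) { /* logging and exception handling should go here */ }
}).start();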
One thing that @John Manko noted is that in certain cases, when you don't have control of the creation of the OutputStream, you may end up in a situation where the creator may clean up the OutputStream object prematurely. If you are getting an IOException saying "Pipe closed", then you should try inverting the constructors:
final PipedOutputStream out = new PipedOutputStream();
PipedInputStream in = new PipedInputStream(out);
new Thread(() -> { try { originalOutputStream.writeTo(out); }
                   catch (IOException e) { /* log and handle */ } }).start();
Note you can invert the constructors for the examples below too.
Thanks also to @AlexK for correcting me with starting a Thread instead of just kicking off a Runnable.
The code using try-with-resources:
// take the copy of the stream and re-write it to an InputStream
PipedInputStream in = new PipedInputStream();
new Thread(new Runnable() {
    public void run () {
        // try-with-resources here
        // putting the try block outside the Thread will cause the
        // PipedOutputStream resource to close before the Runnable finishes
        try (final PipedOutputStream out = new PipedOutputStream(in)) {
            // write the original OutputStream to the PipedOutputStream
            // note that in order for the below method to work, you need
            // to ensure that the data has finished writing to the
            // ByteArrayOutputStream
            originalByteArrayOutputStream.writeTo(out);
        }
        catch (IOException e) {
            // logging and exception handling should go here
        }
    }
}).start();
The original code I wrote:
// take the copy of the stream and re-write it to an InputStream
PipedInputStream in = new PipedInputStream();
final PipedOutputStream out = new PipedOutputStream(in);
new Thread(new Runnable() {
    public void run () {
        try {
            // write the original OutputStream to the PipedOutputStream
            // note that in order for the below method to work, you need
            // to ensure that the data has finished writing to the
            // ByteArrayOutputStream
            originalByteArrayOutputStream.writeTo(out);
        }
        catch (IOException e) {
            // logging and exception handling should go here
        }
        finally {
            // close the PipedOutputStream here because we're done writing data
            // once this thread has completed its run
            try {
                // close() itself can throw, so it needs its own try block
                out.close();
            }
            catch (IOException e) {
                // nothing more can be done if closing fails
            }
        }
    }
}).start();
This code assumes that the originalByteArrayOutputStream is a ByteArrayOutputStream as it is usually the only usable output stream, unless you're writing to a file. The great thing about this is that since it's in a separate thread, it also is working in parallel, so whatever is consuming your input stream will be streaming out of your old output stream too. That is beneficial because the buffer can remain smaller and you'll have less latency and less memory usage.
If you don't have a ByteArrayOutputStream, then instead of using writeTo(), you will have to use one of the write() methods in the java.io.OutputStream class or one of the other methods available in a subclass.
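For the case where the producer only knows how to write to a plain OutputStream, the pipe approach might look like the sketch below. The Producer interface and the helper names asInputStream and readAll are hypothetical, introduced only for this example. The writer thread closes its end of the pipe so the reader sees end-of-stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

class PipeSketch {

    // Hypothetical producer: any code that writes to a plain OutputStream.
    interface Producer {
        void writeTo(OutputStream out) throws IOException;
    }

    // Returns an InputStream that yields whatever the producer writes.
    static InputStream asInputStream(Producer producer) throws IOException {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);
        Thread writer = new Thread(() -> {
            // Closing the pipe's write end tells the reader the data is complete.
            try (OutputStream o = out) {
                producer.writeTo(o);
            } catch (IOException e) {
                // A real application would log this failure.
            }
        });
        writer.setDaemon(true);
        writer.start();
        return in;
    }

    // Drains the stream; used here only to show that the data arrives.
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        int n;
        while ((n = in.read(buf)) != -1) {
            sink.write(buf, 0, n);
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = asInputStream(out -> {
            for (int i = 0; i < 3; i++) {
                out.write(("row " + i + "\n").getBytes(StandardCharsets.UTF_8));
            }
        })) {
            System.out.print(readAll(in));
        }
    }
}
```

The writer blocks whenever the pipe's buffer is full, so the producer never runs far ahead of the consumer and memory use stays bounded by the pipe size.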
An OutputStream is one where you write data to. If some module exposes an OutputStream, the expectation is that there is something reading at the other end.
Something that exposes an InputStream, on the other hand, is indicating that you will need to listen to this stream, and there will be data that you can read.
So it is possible to connect an InputStream to an OutputStream:
InputStream----read---> intermediateBytes[n] ----write----> OutputStream
As someone mentioned, this is what the copy() method from IOUtils lets you do. It does not make sense to go the other way... hopefully this makes some sense
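The read-then-write loop in that diagram can be sketched in plain Java as follows. This is an illustration of the idea, not the actual IOUtils source; the class name and the 8192-byte buffer size are assumptions. On Java 9 and later, InputStream.transferTo(OutputStream) does the same job.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class CopyLoopSketch {

    // Reads chunks from `in` into an intermediate buffer and writes each
    // chunk to `out`. Returns the total number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream source = new ByteArrayInputStream(new byte[20_000]);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println("copied " + copy(source, sink) + " bytes");
    }
}
```

Only the intermediate buffer is held in memory at any time, regardless of how much data passes through.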
UPDATE:
Of course, the more I think about this, the more I can see how this actually would be a requirement. I know some of the comments mentioned piped input/output streams, but there is another possibility.
If the output stream that is exposed is a ByteArrayOutputStream, then you can always get the full contents by calling the toByteArray() method. Then you can create an input stream wrapper by using the ByteArrayInputStream sub-class. These two are pseudo-streams, they both basically just wrap an array of bytes. Using the streams this way, therefore, is technically possible, but to me it is still very strange...
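A minimal sketch of that round trip, with made-up class and method names, could look like this. Note that toByteArray() copies the data, so the whole content is briefly held twice.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ByteArrayRoundTripSketch {

    // Wraps the current contents of the output stream in an input stream.
    static InputStream asInputStream(ByteArrayOutputStream baos) {
        return new ByteArrayInputStream(baos.toByteArray());
    }

    // Writes the text into a ByteArrayOutputStream and reads it back
    // through a ByteArrayInputStream.
    static String roundTrip(String text) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        baos.write(bytes, 0, bytes.length);

        ByteArrayInputStream in = new ByteArrayInputStream(baos.toByteArray());
        byte[] back = new byte[in.available()];
        in.read(back, 0, back.length);
        return new String(back, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello from the byte array"));
    }
}
```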
To convert a file to byte array, ByteArrayOutputStream class is used. This class implements an output stream in which the data is written into a byte array. The buffer automatically grows as data is written to it. The data can be retrieved using toByteArray() and toString().
To convert byte array back to the original file, FileOutputStream class is used. A file output stream is an output stream for writing data to a File or to a FileDescriptor.
The following code has been fully tested.
public static void main(String[] args) throws IOException {
    File file = new File("java.pdf");
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    byte[] buf = new byte[1024];
    // try-with-resources closes the input stream even if reading fails
    try (FileInputStream fis = new FileInputStream(file)) {
        for (int readNum; (readNum = fis.read(buf)) != -1;) {
            // writes readNum bytes from buf, starting at offset 0,
            // to the byte array output stream
            bos.write(buf, 0, readNum);
            System.out.println("read " + readNum + " bytes,");
        }
    } catch (IOException ex) {
        Logger.getLogger(genJpeg.class.getName()).log(Level.SEVERE, null, ex);
    }
    byte[] bytes = bos.toByteArray();
    // below is the different part: write the bytes back out to a new file
    File someFile = new File("java2.pdf");
    try (FileOutputStream fos = new FileOutputStream(someFile)) {
        fos.write(bytes);
        fos.flush();
    }
}
The following example shows how to write a byte array to a file using a FileOutputStream. The FileOutputStream is an output stream for writing data to a File or to a FileDescriptor.
public static void main(String[] args) {
    String s = "input text to be written in output stream";
    File file = new File("outputfile.txt");
    FileOutputStream fos = null;
    try {
        fos = new FileOutputStream(file);
        // Writes bytes from the specified byte array to this file output stream
        fos.write(s.getBytes());
    }
    catch (FileNotFoundException e) {
        System.out.println("File not found" + e);
    }
    catch (IOException ioe) {
        System.out.println("Exception while writing file " + ioe);
    }
    finally {
        // close the streams using close method
        try {
            if (fos != null) {
                fos.close();
            }
        }
        catch (IOException ioe) {
            System.out.println("Error while closing stream: " + ioe);
        }
    }
}
You may use ByteArrayOutputStream like this:
private byte[] filetoByteArray(String path) {
    // try-with-resources closes both streams even if reading fails
    try (InputStream input = new FileInputStream(path);
         ByteArrayOutputStream output = new ByteArrayOutputStream(1024)) {
        byte[] buffer = new byte[1024];
        int bytesRead;
        // read in chunks rather than one byte at a time
        while ((bytesRead = input.read(buffer)) != -1) {
            output.write(buffer, 0, bytesRead);
        }
        return output.toByteArray();
    } catch (Exception e) {
        e.printStackTrace();
        return null;
    }
}
Heh, sounds like they copied and pasted code from different sources? :-P No, seriously, unless you need to inspect the decompressed data, you can just use a BufferedOutputStream for both compression and decompression.
The ByteArrayOutputStream is more memory hogging since it stores the entire content in Java's memory (in the form of a byte[]). The FileOutputStream writes to disk directly and is hence less memory hogging. I don't see any sensible reason to use ByteArrayOutputStream in this particular case. It is not modifying the individual bytes afterwards. It just gets written unchanged to a file afterwards. It's thus an unnecessary intermediate step.
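For the scenario in the question (generate to disk, then stream the result back), a framework-free sketch of the file-based approach might look like the following. The generate method is a hypothetical stand-in for the real report writer, and the class and method names are assumptions. The returned stream is opened with StandardOpenOption.DELETE_ON_CLOSE so the temporary file is cleaned up once the consumer is done with it.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class TempFileSketch {

    // Hypothetical stand-in for the report generator: writes `rows` lines
    // to whatever stream it is given.
    static void generate(OutputStream out, int rows) throws IOException {
        for (int i = 0; i < rows; i++) {
            out.write(("row " + i + "\n").getBytes(StandardCharsets.UTF_8));
        }
    }

    // Generates into a temp file, then returns a stream over that file.
    // The file is removed when the returned stream is closed.
    static InputStream generateToTempFile(int rows) throws IOException {
        Path tmp = Files.createTempFile("report-", ".tmp");
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(tmp))) {
            generate(out, rows);
        } catch (IOException e) {
            Files.deleteIfExists(tmp); // do not leave a half-written file behind
            throw e;
        }
        return Files.newInputStream(tmp, StandardOpenOption.DELETE_ON_CLOSE);
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = generateToTempFile(1000)) {
            long bytes = 0;
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                bytes += n;
            }
            System.out.println("streamed " + bytes + " bytes from the temp file");
        }
    }
}
```

With this shape, heap usage is bounded by the write buffer and the read buffer, at the cost of one round trip through the filesystem.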