You need to seek back to the beginning of the file after writing the initial in memory file...
myio.seek(0)
Answer from mgilson on Stack Overflow
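Here is a minimal sketch of why the seek matters, assuming myio is a BytesIO that has just been written to (the payload is made up for illustration):

```python
from io import BytesIO

myio = BytesIO()
myio.write(b"payload")      # the position is now at the end of the buffer

at_end = myio.read()        # nothing is left to read from the current position
myio.seek(0)                # rewind to the start
from_start = myio.read()    # now the whole buffer comes back

print(at_end, from_start)
```

Without the seek, anything that reads from the buffer starts at the end and sees no data.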
How about we write and read gzip content in the same context like this?
#!/usr/bin/env python
from io import BytesIO
import gzip
content = b"does it work"
# write bytes to zip file in memory
gzipped_content = None
with BytesIO() as myio:
    with gzip.GzipFile(fileobj=myio, mode='wb') as g:
        g.write(content)
    gzipped_content = myio.getvalue()
print(gzipped_content)
print(content == gzip.decompress(gzipped_content))
shutil has a utility that writes the file efficiently: copyfileobj() copies in chunks. The default chunk size was 16 KiB in older Python versions (since Python 3.8 it is 64 KiB, or 1 MiB on Windows). Any multiple of 4 KiB should be a good cross-platform choice. I chose 131072 (128 KiB) rather arbitrarily: the data goes to the OS cache in RAM before it reaches the disk, so the chunk size isn't that big of a deal.
import shutil
myBytesIOObj.seek(0)
with open('myfile.ext', 'wb') as f:
    shutil.copyfileobj(myBytesIOObj, f, length=131072)
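For a version you can run as-is, here is a sketch of the same idea. The buffer contents, the temporary directory and the name my_bytes_io are assumptions made for the example:

```python
import os
import shutil
import tempfile
from io import BytesIO

my_bytes_io = BytesIO(b"x" * 300_000)   # hypothetical in-memory data
my_bytes_io.seek(0)                      # copyfileobj reads from the current position

# Hypothetical destination inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "myfile.ext")
    with open(path, "wb") as f:
        shutil.copyfileobj(my_bytes_io, f, length=131072)
    written = os.path.getsize(path)

print(written)
```

Because copyfileobj starts at the current position, dropping the seek(0) after a write would copy nothing.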
BTW, there was no need to close the file object at the end. The with statement acts as a context manager: the file stays open for the duration of the block and is closed automatically when the block exits, even if an exception is raised.
Since Python 3.2 it's possible to use the BytesIO.getbuffer() method as follows:
from io import BytesIO
buf = BytesIO(b'test')
with open('path/to/file', 'wb') as f:
    f.write(buf.getbuffer())
This way it doesn't copy the buffer's content, streaming it straight to the open file.
Note: StringIO does not have a getbuffer() method (as of Python 3.9).
getbuffer() exposes the whole buffer regardless of the current position. If you stream the BytesIO with read() or shutil.copyfileobj() instead, set its position to the beginning first:
buf.seek(0)
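If you want to check how the position interacts with the two approaches, a small sketch like this should do. It assumes CPython's io.BytesIO, and the variable names are only for illustration:

```python
from io import BytesIO

buf = BytesIO(b"test")
buf.read()                            # move the position to the end of the buffer

view_bytes = bytes(buf.getbuffer())   # view over the whole buffer, position ignored
rest = buf.read()                     # read() starts at the current position

buf.seek(0)
after_seek = buf.read()               # the full contents again after rewinding

print(view_bytes, rest, after_seek)
```

So the seek matters for read()-style streaming, while getbuffer() always covers the full contents.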
# Create an example
from io import BytesIO
bytesio_object = BytesIO(b"Hello World!")
# Write the stuff
with open("output.txt", "wb") as f:
    f.write(bytesio_object.getbuffer())
It would be helpful if you supplied the library you were using to work on Excel files, but here's a buckshot of solutions, based on some assumptions I'm making:
- Based on the first paragraph in the io module's documentation, it sounds like all the concrete classes, including BytesIO, are file-like objects. Without knowing what code you've tried so far, I don't know if you have tried passing the BytesIO to the module you're using.
- On the off chance that doesn't work, you can simply convert the BytesIO to another io Writer/Reader/Wrapper by passing it to the constructor. Example:
import io
b = io.BytesIO(b"Hello World") ## Some random BytesIO Object
print(type(b)) ## For sanity's sake
with open("test.xlsx") as f: ## Excel File
    print(type(f)) ## Open file is TextIOWrapper
bw=io.TextIOWrapper(b) ## Conversion to TextIOWrapper
print(type(bw)) ## Just to confirm
- You may need to check which kind of Reader/Writer/Wrapper is expected by the module you're using to convert the BytesIO to the correct one
- I believe I have heard that (for memory reasons, due to extremely large Excel files) Excel modules do not load the entire file. If this ends up meaning that what you need is a physical file on the disk, then you can easily write the Excel file temporarily and just delete it when you're done. Example:
import io
import os
with open("test.xlsx",'rb') as f:
    g=io.BytesIO(f.read()) ## Getting an Excel File represented as a BytesIO Object
temporarylocation="testout.xlsx"
with open(temporarylocation,'wb') as out: ## Open temporary file as bytes
    out.write(g.read()) ## Read bytes into file
## Do stuff with module/file
os.remove(temporarylocation) ## Delete file when done
I hope one of these points solves your problem.
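As a variation on the temporary-file idea, here is a sketch that lets the tempfile module pick the file name instead of hard-coding one. The bytes in g are a made-up stand-in rather than a real workbook, and the "do stuff" step is reduced to a size check:

```python
import io
import os
import tempfile

g = io.BytesIO(b"placeholder bytes, not a real workbook")  # hypothetical stand-in

# delete=False so the file can be reopened by name after this block.
with tempfile.NamedTemporaryFile(suffix=".xlsx", delete=False) as out:
    out.write(g.getvalue())            # getvalue() ignores the current position
    temporary_location = out.name

try:
    # Do stuff with module/file; here we only check the size on disk.
    size_on_disk = os.path.getsize(temporary_location)
finally:
    os.remove(temporary_location)      # delete the file when done

print(size_on_disk)
```

The try/finally makes sure the temporary file is removed even if the processing step fails.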
For simplicity's sake, let's consider writing instead of reading for now.
When you use open(), for example:
with open("test.dat", "wb") as f:
    f.write(b"Hello World")
    f.write(b"Hello World")
    f.write(b"Hello World")
After executing that, a file called test.dat will be created, containing 3x Hello World. The data won't be kept in memory after it's written to the file (unless it is still referenced by a name).
Now when you consider io.BytesIO() instead:
with io.BytesIO() as f:
    f.write(b"Hello World")
    f.write(b"Hello World")
    f.write(b"Hello World")
Here, instead of being written to a file, the contents are written to an in-memory buffer, in other words a chunk of RAM. Writing the following would essentially be the equivalent:
buffer = b""
buffer += b"Hello World"
buffer += b"Hello World"
buffer += b"Hello World"
In the example with the with statement, leaving the block would additionally correspond to a del buffer at the end.
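To see that the two approaches really produce the same bytes, here is a small sketch. The names from_bytesio and closed_after_with are mine, added for the comparison:

```python
import io

with io.BytesIO() as f:
    f.write(b"Hello World")
    f.write(b"Hello World")
    f.write(b"Hello World")
    from_bytesio = f.getvalue()   # must be read before the with block closes f

buffer = b""
buffer += b"Hello World"
buffer += b"Hello World"
buffer += b"Hello World"

closed_after_with = f.closed      # the buffer is discarded once the block exits
print(from_bytesio == buffer, closed_after_with)
```

Calling f.getvalue() after the block would raise a ValueError, since the BytesIO is already closed at that point.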
The key difference here is optimization and performance. io.BytesIO is able to do some optimizations that make it faster than simply concatenating all the b"Hello World" one by one.
Just to prove it, here's a small benchmark:
- Concat: 1.3529 seconds
- BytesIO: 0.0090 seconds
import io
import time
begin = time.time()
buffer = b""
for i in range(0, 50000):
    buffer += b"Hello World"
end = time.time()
seconds = end - begin
print("Concat:", seconds)
begin = time.time()
buffer = io.BytesIO()
for i in range(0, 50000):
    buffer.write(b"Hello World")
end = time.time()
seconds = end - begin
print("BytesIO:", seconds)
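If you want to repeat the measurement on your own machine, a sketch using the standard timeit module might look like this. The function names, the smaller N and the repeat count are my assumptions, chosen to keep the run short, and the timings will differ from the numbers above:

```python
import io
import timeit

CHUNK = b"Hello World"
N = 10_000   # smaller than the 50000 above, to keep the run short

def concat():
    # Build the result by repeated bytes concatenation.
    buffer = b""
    for _ in range(N):
        buffer += CHUNK
    return buffer

def bytesio():
    # Build the same result by writing into an in-memory buffer.
    buffer = io.BytesIO()
    for _ in range(N):
        buffer.write(CHUNK)
    return buffer.getvalue()

for fn in (concat, bytesio):
    seconds = timeit.timeit(fn, number=5)
    print(f"{fn.__name__}: {seconds:.4f} seconds")
```

Both functions return the same bytes, so the comparison only measures how the result is built.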
Besides the performance gain, using BytesIO instead of concatenating has the advantage that BytesIO can be used in place of a file object. So say you have a function that expects a file object to write to. Then you can give it that in-memory buffer instead of a file.
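As a sketch of that idea, take a hypothetical function write_report() that only needs something with a write() method. The function and the file name are made up for the example:

```python
import io
import os
import tempfile

def write_report(fileobj):
    """Hypothetical function that writes two lines to any binary file object."""
    fileobj.write(b"header\n")
    fileobj.write(b"row 1\n")

# In-memory target: nothing touches the disk.
memory_target = io.BytesIO()
write_report(memory_target)
in_memory = memory_target.getvalue()

# Real file target: the same function, unchanged.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "report.dat")
    with open(path, "wb") as f:
        write_report(f)
    with open(path, "rb") as f:
        on_disk = f.read()

print(in_memory == on_disk)
```

The function cannot tell the difference between the two targets, which is also handy for testing code that normally writes to disk.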
The difference is that open("myfile.jpg", "rb") returns a file object backed by myfile.jpg on disk, from which the contents can be read; whereas BytesIO, again, is just an in-memory buffer containing some data.
Since BytesIO is just a buffer - if you wanted to write the contents to a file later - you'd have to do:
buffer = io.BytesIO()
# ...
with open("test.dat", "wb") as f:
    f.write(buffer.getvalue())
Also, you didn't mention a version; I'm using Python 3. In the examples I'm using the with statement instead of calling f.close().
Using open opens a file on your hard drive. Depending on what mode you use, you can read or write (or both) from the disk.
A BytesIO object isn't associated with any real file on the disk. It's just a chunk of memory that behaves like a file does. It has the same API as a file object returned from open (with mode r+b, allowing reading and writing of binary data).
BytesIO (and its close sibling StringIO, which is always in text mode) can be useful when you need to pass data to or from an API that expects to be given a file object, but where you'd prefer to pass the data directly. You can load your input data into the BytesIO before giving it to the library. After it returns, you can get any data the library wrote to the file from the BytesIO using the getvalue() method. (Usually you'd only need to do one of those, of course.)
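To illustrate both directions, here is a sketch using the standard zipfile module as the library that expects a file object. The member name and its content are made up for the example:

```python
import io
import zipfile

# Output direction: let the library write into a BytesIO, then collect the bytes.
out_buffer = io.BytesIO()
with zipfile.ZipFile(out_buffer, mode="w") as zf:
    zf.writestr("hello.txt", "Hello World")   # hypothetical member name and content
archive_bytes = out_buffer.getvalue()

# Input direction: wrap existing bytes in a BytesIO and hand it to the library.
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
    names = zf.namelist()
    text = zf.read("hello.txt").decode("utf-8")

print(names, text)
```

No file is created on disk at any point; the archive only ever exists as bytes in memory.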