What Is io.BytesIO Useful For?
io.BytesIO is a class from Python's standard library that creates an in-memory binary stream: it behaves like a file but exists only in our program's memory. This means you can read from and write to it just like a file, but without creating any actual file on disk. In Python, file-like objects are objects that implement methods like read(), write(), and seek(), allowing you to interact with data in a way similar to working with files.
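As a quick illustration of that file-like behavior, here is a minimal sketch that uses only the standard library. The variable names are made up for the example.

```python
import io

# Create an empty in-memory binary stream.
buffer = io.BytesIO()

# write() appends bytes at the current position, like a file opened in "wb" mode.
buffer.write(b"Hello, ")
buffer.write(b"world!")

# After writing, the position is at the end, so rewind before reading.
buffer.seek(0)
first = buffer.read(5)  # the first five bytes
rest = buffer.read()    # everything from the current position to the end

# getvalue() returns the full contents regardless of the current position.
everything = buffer.getvalue()
```

Note that nothing here touches the disk: the bytes live in the buffer object until it is garbage-collected or closed.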
How is BytesIO useful? Consider, for example, the following table of products stored in a Pandas DataFrame:
```python
import pandas as pd

products = pd.DataFrame(
    columns=("name", "price"),
    data=(
        ("Keyboard", 20),
        ("Mouse", 5),
        ("Printer", 100),
        ("Headphones", 35.5)
    )
)
```
Let's suppose we want to convert this products table to an Excel file. Pandas supports that by using:
products.to_excel("products.xlsx")
This will create a new products.xlsx file in the current working directory with our table of products. But what if we want to keep that Excel file in memory instead? Luckily, the first argument of to_excel() can be either a string with a file name or path, or a so-called file-like object, which, as we said above, is any object that supports file methods such as write() and flush(). io.BytesIO supports these methods, so passing a BytesIO instance to to_excel() stores the Excel file of products in memory.
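A minimal sketch of that call follows. It assumes an Excel writer engine such as openpyxl is installed alongside pandas. The products table is redefined so the block runs on its own, and the names excel_file and excel_bytes are chosen for illustration.

```python
import io

import pandas as pd

# Same table as above, repeated so this block is self-contained.
products = pd.DataFrame(
    columns=["name", "price"],
    data=[
        ("Keyboard", 20),
        ("Mouse", 5),
        ("Printer", 100),
        ("Headphones", 35.5),
    ],
)

# Pass the in-memory stream where a file name would normally go.
excel_file = io.BytesIO()
products.to_excel(excel_file)

# The whole workbook is now available as a bytes object.
excel_bytes = excel_file.getvalue()
```

Since an .xlsx file is a ZIP archive, excel_bytes starts with the ZIP signature b"PK", which is an easy way to check that the workbook really was written to the buffer.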
This could be really helpful if we want, for example, to serve the contents of the Excel file from an HTTP endpoint. We can easily do that with Flask.
Save this code as endpoint.py and run python endpoint.py, and we will have a simple web service that triggers the download of the products table as an Excel file at http://127.0.0.1:5000/. This way, we don't need to store the Excel file on disk in order to return it in a web response.