What Is io.BytesIO Useful For?
io.BytesIO is a class in Python's standard library that creates an in-memory binary stream: it behaves like a file opened in binary mode, but exists only in our program's memory. This means you can read from and write to it just like a file, without creating any actual file on disk. In Python, file-like objects are objects that implement methods such as read(), write(), and seek(), allowing you to interact with data in much the same way as when working with files.
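As a quick illustration of that file-like behavior, here is a minimal sketch that writes to a BytesIO object, rewinds it, and reads it back. The variable names (buffer, first_five, rest, content) are only illustrative.

```python
import io

# Create an empty in-memory binary stream.
buffer = io.BytesIO()

# Write bytes to it, as we would with a file opened in "wb" mode.
buffer.write(b"Hello, ")
buffer.write(b"world!")

# After writing, the stream position is at the end,
# so we move back to the start before reading.
buffer.seek(0)
first_five = buffer.read(5)  # Reads the first five bytes.
rest = buffer.read()         # Reads everything that is left.

# getvalue() returns the whole content, whatever the current position is.
content = buffer.getvalue()
print(first_five, rest, content)
```

Note that a BytesIO object only accepts bytes; for text there is the analogous io.StringIO class.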
How is BytesIO useful? Consider, for example, the following table of products stored in a Pandas DataFrame:
import pandas as pd

products = pd.DataFrame(
    columns=("name", "price"),
    data=(
        ("Keyboard", 20),
        ("Mouse", 5),
        ("Printer", 100),
        ("Headphones", 35.5)
    )
)
Let's suppose we want to convert this products table to an Excel file. Pandas supports that through the to_excel() method:
products.to_excel("products.xlsx")
This will create a new products.xlsx file in the current working directory with our table of products. But what if we want to store that Excel file in memory? Luckily, the first argument of to_excel() can be either a string with a file name or path, or a so-called file-like object, which for this purpose is any object that can be written to through methods such as write() and flush(). And io.BytesIO supports these methods! Hence, passing an io.BytesIO instance to to_excel() stores the Excel file of products in memory.
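A minimal sketch of this in-memory export, assuming the openpyxl package is installed so pandas can use it as the Excel engine. The variable names buffer and excel_bytes are illustrative.

```python
import io

import pandas as pd

products = pd.DataFrame(
    columns=("name", "price"),
    data=(
        ("Keyboard", 20),
        ("Mouse", 5),
        ("Printer", 100),
        ("Headphones", 35.5),
    ),
)

# In-memory binary stream that will receive the Excel file.
buffer = io.BytesIO()

# Assumption: openpyxl is available (pip install openpyxl).
# A BytesIO instance is passed instead of a file name.
products.to_excel(buffer, engine="openpyxl")

# Raw bytes of the .xlsx file; nothing has been written to disk.
excel_bytes = buffer.getvalue()
print(len(excel_bytes), "bytes held in memory")
```

From here, excel_bytes can be sent over the network, stored in a database, or attached to an email without ever touching the file system.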
This could be really helpful if we want, for example, to serve the contents of the Excel file from an HTTP endpoint. This is easy to do with Flask.
Save this code as endpoint.py and run python endpoint.py, and we will have a simple web service that triggers the download of the products table as an Excel file at http://127.0.0.1:5000/. This way, we don't need to store the Excel file on disk in order to send it in a web response.