What Is io.BytesIO Useful For?

Python Assets

2024-07-18

io.BytesIO is a standard-library class that creates an in-memory binary stream: it behaves like a file but exists only in our program's memory. This means you can read from and write to it just like a file, but without creating any actual files on disk. In Python, file-like objects are objects that implement methods like read(), write(), and seek(), allowing you to interact with data in a way similar to working with files.
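As a minimal sketch of that file-like behavior, the following snippet writes some bytes to a BytesIO, rewinds it, and reads them back. The variable names are only illustrative.

```python
from io import BytesIO

# Create an empty in-memory binary stream and write to it like a file.
buffer = BytesIO()
buffer.write(b"hello ")
buffer.write(b"world")

# The stream position now sits at the end of what was written.
position_after_write = buffer.tell()

# Rewind to the beginning before reading, just as with a real file.
buffer.seek(0)
first_five = buffer.read(5)
rest = buffer.read()

# getvalue() returns the whole content regardless of the current position.
everything = buffer.getvalue()
```

Note that read() depends on the current position, while getvalue() always returns the full content of the buffer.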

How is BytesIO useful? Consider, for example, the following table of products stored in a Pandas DataFrame:

	`import pandas as pd`

	`products = pd.DataFrame(`
	`    columns=("name", "price"),`
	`    data=(`
	`        ("Keyboard", 20),`
	`        ("Mouse", 5),`
	`        ("Printer", 100),`
	`        ("Headphones", 35.5)`
	`    )`
	`)`

Let's suppose we want to convert this products table to an Excel file. Pandas supports that (provided an Excel engine such as openpyxl is installed) with:

	`products.to_excel("products.xlsx")`

This will create a new products.xlsx file in the current working directory with our table of products. But what if we want to keep that Excel file in memory instead? Luckily, the first argument of to_excel() can be either a string with a file name or path, or a so-called file-like object: as described above, an object that behaves like a file, in this case one that accepts binary writes. io.BytesIO fits that description! Hence this will store the Excel file of products in memory:

	`import pandas as pd`
	`from io import BytesIO`

	`products = pd.DataFrame(`
	`    columns=("name", "price"),`
	`    data=(`
	`        ("Keyboard", 20),`
	`        ("Mouse", 5),`
	`        ("Printer", 100),`
	`        ("Headphones", 35.5)`
	`    )`
	`)`
	`buffer = BytesIO()`
	`products.to_excel(buffer)`
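The same "path or file-like object" pattern shows up across the standard library too. As an illustrative sketch that does not depend on pandas, zipfile.ZipFile accepts either a path or a file-like object, so a ZIP archive (the container format that .xlsx files are built on) can be created entirely in memory. The write_archive() helper and the hello.txt entry are hypothetical names made up for this example.

```python
import zipfile
from io import BytesIO


def write_archive(target):
    """Write a one-file ZIP archive to `target`.

    `target` may be a path on disk or a binary file-like object
    (hypothetical helper, for illustration only).
    """
    with zipfile.ZipFile(target, "w") as archive:
        archive.writestr("hello.txt", "hello from memory")


# Pass a BytesIO instead of a file name: nothing is written to disk.
buffer = BytesIO()
write_archive(buffer)
data = buffer.getvalue()

# The bytes form a regular ZIP archive that can be read back from memory.
with zipfile.ZipFile(BytesIO(data)) as archive:
    names = archive.namelist()
    text = archive.read("hello.txt").decode("utf-8")
```

Calling write_archive("hello.zip") instead would produce the same archive on disk, which is the flexibility to_excel() offers as well.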

This can be really helpful if we want, for example, to serve the contents of the Excel file from an HTTP endpoint. We can easily do that with Flask:

	`from io import BytesIO`
	`import pandas as pd`
	`from flask import make_response, Flask`

	`app = Flask(__name__)`

	`@app.route("/")`
	`def hello_world():`
	`    products = pd.DataFrame(`
	`        columns=("name", "price"),`
	`        data=(`
	`            ("Keyboard", 20),`
	`            ("Mouse", 5),`
	`            ("Printer", 100),`
	`            ("Headphones", 35.5)`
	`        )`
	`    )`
	`    # Write the Excel file to memory instead of to disk.`
	`    buffer = BytesIO()`
	`    products.to_excel(buffer)`
	`    # Send the raw bytes of the buffer as the response body.`
	`    resp = make_response(buffer.getvalue())`
	`    mime = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"`
	`    resp.content_type = mime`
	`    return resp`

	`app.run()`

Save this code as endpoint.py and run python endpoint.py, and we will have a simple web service that triggers the download of the products table as an Excel file at http://127.0.0.1:5000/. This way, we don't need to store the Excel file on disk in order to send it in a web response.
