Bodo DataFrame Library: First Look

May 19, 2025

Rohit Krishnan

Last week, we shipped Bodo 2025.5, the first release that bundles our new Bodo DataFrame library: a drop-in replacement for Pandas that combines database-style query optimization with Bodo’s MPI-based parallel backend.

Why This Matters

Data scientists love the Pandas API but hit a wall the moment data stops fitting in memory or single-core processing becomes too slow. SQL and SQL-like engines solve scale but abandon Python ergonomics. Bodo DataFrame closes that gap: easy as Pandas, fast as a data warehouse, on your laptop or a 1,000-node cluster.

What’s Inside 2025.5

I/O: read_parquet, from_pandas
Transform: Series.map, DataFrame.apply, string ops (str.lower, str.strip)
Query: Column projection, filter & limit push-down, head()
Mutate: In-place column assignment
Engine: DuckDB optimizer, lazy plans, streaming execution to avoid OOM
Safety net: Automatic fallback to Pandas for unsupported ops

Under the hood, DuckDB’s optimizer rewrites the logical plan, and the high-performance execution runtime shared by Bodo and BodoSQL runs the optimized plan. Anything not yet covered falls back to Pandas, so you can start migrating notebooks today without rewrites.
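To make "lazy plans" and "push-down" concrete, here is a toy model in plain Python. It is only a sketch of the general idea, not Bodo's or DuckDB's implementation: the LazyFrame class, its filter/select/head methods, and the trips generator are all hypothetical names invented for this illustration. Operations are recorded rather than executed, and head() pulls rows through the filter and projection one at a time, stopping as soon as it has enough.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Callable, Iterable, Iterator, Optional

Row = dict  # one record, e.g. {"fare_amount": 12.5, "trip_distance": 3.1}


@dataclass(frozen=True)
class LazyFrame:
    """Toy lazy plan (hypothetical): records operations, runs nothing until head()."""
    source: Callable[[], Iterable[Row]]   # callable that opens a fresh row stream
    predicates: tuple = ()                # filters, evaluated on raw rows during the scan
    columns: Optional[tuple] = None       # projection; None means "all columns"

    def filter(self, pred: Callable[[Row], bool]) -> "LazyFrame":
        # Build a new plan; no data is touched here.
        return LazyFrame(self.source, self.predicates + (pred,), self.columns)

    def select(self, *cols: str) -> "LazyFrame":
        return LazyFrame(self.source, self.predicates, cols)

    def _scan(self) -> Iterator[Row]:
        # Filter and projection are applied while streaming, row by row.
        for row in self.source():
            if all(pred(row) for pred in self.predicates):
                if self.columns is None:
                    yield row
                else:
                    yield {c: row[c] for c in self.columns}

    def head(self, n: int = 5) -> list:
        # The limit stops the scan early instead of materializing everything.
        return list(islice(self._scan(), n))


stats = {"rows_read": 0}  # counts how many source rows were actually produced


def trips() -> Iterator[Row]:
    # Synthetic stand-in for a large Parquet scan; the values are made up.
    for i in range(1_000_000):
        stats["rows_read"] += 1
        yield {"trip_id": i, "fare_amount": 2.5 + i, "trip_distance": float(i % 20)}


short = (
    LazyFrame(trips)
    .filter(lambda r: r["trip_distance"] < 10)
    .select("fare_amount", "trip_distance")
)
rows_before_head = stats["rows_read"]  # still 0: building the plan read nothing
result = short.head(3)
print(result, stats["rows_read"])
```

In this toy, only the rows needed to satisfy head(3) are ever generated out of the million available. A real engine applies the same principle at the file and row-group level.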

Quick Taste

import bodo.pandas as bd       # used like `import pandas as pd`; ships with `pip install bodo`

taxi = bd.read_parquet("s3://nyc/trips_2024/")
short = taxi[taxi.trip_distance < 10][["fare_amount", "trip_distance"]]
print(short.head())            # compiles, optimizes, runs in parallel
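For reference, the operations listed under "What’s Inside 2025.5" look like this in ordinary Pandas. The snippet is written against plain Pandas so it runs anywhere, and the tiny in-memory table, its values, and its column names are made up for illustration. The intent described in this post is that the same code works after swapping the import for the Bodo DataFrame module. Whether a given call runs natively or through the Pandas fallback in a particular release is best checked against the docs.

```python
import pandas as pd

# Small in-memory stand-in for the taxi data; the values are invented.
trips = pd.DataFrame({
    "vendor": ["  CMT ", "VTS", " cmt"],
    "fare_amount": [12.0, 40.0, 7.5],
    "trip_distance": [3.0, 15.0, 1.5],
})

# String ops plus in-place column assignment: normalize the vendor code.
trips["vendor"] = trips["vendor"].str.strip().str.lower()

# Series.map: element-wise transform of one column.
trips["fare_cents"] = trips["fare_amount"].map(lambda x: int(x * 100))

# DataFrame.apply with axis=1: a row-wise user-defined function.
trips["fare_per_mile"] = trips.apply(
    lambda r: r["fare_amount"] / r["trip_distance"], axis=1
)

# Filter and column projection, then head().
short = trips[trips.trip_distance < 10][["vendor", "fare_per_mile"]]
print(short.head())
```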

What’s Next

Expect rapidly expanding coverage of the Pandas API, vectorized UDFs, and tighter Iceberg integration. As always, we value brutal feedback: file issues, benchmark us, break things.

Read the design backstory in Rethinking DataFrames: Easy as Pandas, Fast as a Data Warehouse and check the docs for the growing API matrix.

Our goal is to provide a DataFrame experience that is intuitive for Pandas users while delivering the speed and scalability of a distributed data warehouse.

This is an early experimental release of the Bodo DataFrame library, and we encourage you to try it out. You can install it with pip install bodo. Visit our GitHub repository for more information and join the conversation in our community Slack.