Bodo’s July release builds on the Bodo DataFrame library with new features that range from a better way to work with Iceberg to database-grade analytics.
New features
DataFrame.to_iceberg() now writes Iceberg tables (including partition spec and sort order).
DataFrame.to_parquet() adds first‑class Parquet write.
read_iceberg() supports simple filesystem reads.
So what?
You can move computation‐heavy Bodo pipelines straight into open‑table‑format data lakes—no Spark detour, no flaky export scripts. That shrinks end‑to‑end latency and lets you query the same files immediately from DuckDB, Trino, or any Iceberg‑aware engine.
New features
DataFrame.groupby() with sum, count, max.
DataFrameGroupBy and SeriesGroupBy.agg().
So what?
The workhorse of analytical code is now compiled and parallelized by Bodo. Complex aggregations that once forced a round‑trip to Pandas or PySpark stay in‑process and scale linearly with cores.
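As a rough illustration of the aggregations listed above, the sketch below uses the standard pandas groupby API on a small, made-up sales table. It runs with plain pandas. The assumption is that the same calls work under Bodo once the import is swapped for import bodo.pandas as pd. Which argument forms of agg() Bodo accepts is not stated in this post, so treat the string and list forms here as examples rather than a compatibility guarantee.

```python
# Assumption: with Bodo installed, swap this for `import bodo.pandas as pd`.
import pandas as pd

# Hypothetical sales data, invented for illustration.
sales = pd.DataFrame(
    {
        "region": ["east", "west", "east", "west", "east"],
        "units": [3, 5, 2, 7, 4],
        "revenue": [30.0, 55.0, 18.0, 70.0, 44.0],
    }
)

# DataFrame.groupby() followed by a single reduction on one column.
units_by_region = sales.groupby("region")["units"].sum()

# SeriesGroupBy.agg() computing sum, count, and max in one pass.
revenue_stats = sales.groupby("region")["revenue"].agg(["sum", "count", "max"])

# DataFrameGroupBy.agg() applying one reduction to every non-key column.
totals = sales.groupby("region").agg("sum")

print(units_by_region)
print(revenue_stats)
print(totals)
```

Each result is indexed by region, so the summaries can be joined back together or written out with the I/O methods described earlier.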
Bodo 2025.7 eliminates the “last‑mile” friction between high‑speed DataFrame computation and production‑grade data lakes, while pushing API coverage toward parity with Pandas. If you’re building heterogeneous pipelines that must both crunch numbers fast and land in Iceberg/Parquet for everyone else to consume, this release is the missing piece.
👉 Try it today: pip install bodo
👉 Check out our GitHub repo