Procedural derive macros for converting your Rust types into Polars DataFrames.
Deriving ToDataFrame on your structs and tuple structs generates fast, allocation-conscious code to:
- Convert a single value to a polars::prelude::DataFrame
- Convert a slice of values via a columnar path (efficient batch conversion)
- Inspect the schema (column names and DataTypes) at compile time via a generated method
It supports nested structs (flattened with dot notation), Option<T>, Vec<T>, tuple structs, and key domain types like chrono::DateTime<Utc> and rust_decimal::Decimal.
Add the macro crate and Polars. You will also need a trait defining the to_dataframe behavior (you can use your own runtime crate/traits; see the override section below). For a minimal inline trait you can copy, see the Quick start example.
[dependencies]
df-derive = "0.2.0"
polars = { version = "0.52", features = ["timezones", "dtype-decimal"] }
# If you use these types in your models
chrono = { version = "0.4", features = ["serde"] }
rust_decimal = { version = "1.36", features = ["serde"] }
Copy-paste runnable example without any external runtime traits. This is a complete working example that you can run with cargo run --example quickstart.
Cargo.toml:
[package]
name = "quickstart"
version = "0.1.0"
edition = "2024"
[dependencies]
df-derive = "0.2"
polars = { version = "0.52", features = ["timezones", "dtype-decimal"] }
src/main.rs:
use df_derive::ToDataFrame;
mod dataframe {
    use polars::prelude::{DataFrame, DataType, PolarsResult};
    pub trait ToDataFrame {
        fn to_dataframe(&self) -> PolarsResult<DataFrame>;
        fn empty_dataframe() -> PolarsResult<DataFrame>;
        fn schema() -> PolarsResult<Vec<(&'static str, DataType)>>;
    }
    pub trait Columnar: Sized {
        fn columnar_to_dataframe(items: &[Self]) -> PolarsResult<DataFrame>;
    }
    pub trait ToDataFrameVec {
        fn to_dataframe(&self) -> PolarsResult<DataFrame>;
    }
    impl<T> ToDataFrameVec for [T]
    where
        T: Columnar + ToDataFrame,
    {
        fn to_dataframe(&self) -> PolarsResult<DataFrame> {
            if self.is_empty() {
                return <T as ToDataFrame>::empty_dataframe();
            }
            <T as Columnar>::columnar_to_dataframe(self)
        }
    }
}
#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")] // Columnar path auto-infers to crate::dataframe::Columnar
struct Trade {
    symbol: String,
    price: f64,
    size: u64,
}
fn main() -> polars::prelude::PolarsResult<()> {
    let t = Trade { symbol: "AAPL".into(), price: 187.23, size: 100 };
    let df_single = <Trade as crate::dataframe::ToDataFrame>::to_dataframe(&t)?;
    println!("{}", df_single);
    let trades = vec![
        Trade { symbol: "AAPL".into(), price: 187.23, size: 100 },
        Trade { symbol: "MSFT".into(), price: 411.61, size: 200 },
    ];
    use crate::dataframe::ToDataFrameVec;
    let df_batch = trades.as_slice().to_dataframe()?;
    println!("{}", df_batch);
    Ok(())
}
Run it:
cargo run
- Nested structs (flattening): fields of nested structs appear as outer.inner columns
- Vec of primitives and structs: becomes Polars List columns; Vec<Nested> becomes multiple outer.subfield list columns
- Option<T>: null-aware materialization for both scalars and lists
- Tuple structs: supported; columns are named field_0, field_1, ...
- Empty structs: produce a DataFrame of shape (1, 0) for instances and (0, 0) for empty frames
- Schema discovery: T::schema() -> PolarsResult<Vec<(&'static str, DataType)>>
- Columnar batch conversion: [T]::to_dataframe() via the Columnar implementation
Use #[df_derive(as_string)] to stringify values during conversion. This is particularly useful for enums:
#[derive(Clone, Debug, PartialEq)]
enum Status { Active, Inactive }
// Required: implement Display for the enum
impl std::fmt::Display for Status {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Status::Active => write!(f, "Active"),
            Status::Inactive => write!(f, "Inactive"),
        }
    }
}
#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct WithEnums {
    #[df_derive(as_string)]
    status: Status,
    #[df_derive(as_string)]
    opt_status: Option<Status>,
    #[df_derive(as_string)]
    statuses: Vec<Status>,
}
Columns will use DataType::String (or List<String> for Vec<_>), and values are produced via ToString. See the complete working example with cargo run --example as_string.
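To make the stringification concrete, here is a minimal standard-library sketch of what "values are produced via ToString" could mean for a scalar, an Option, and a Vec field. It does not use df-derive or Polars, and the helper names scalar_cell, optional_cell, and list_cell are hypothetical, introduced only for this illustration; the code the macro actually generates may look different.

```rust
use std::fmt;

#[derive(Clone, Debug, PartialEq)]
enum Status {
    Active,
    Inactive,
}

// Display gives the enum a blanket ToString implementation.
impl fmt::Display for Status {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Status::Active => write!(f, "Active"),
            Status::Inactive => write!(f, "Inactive"),
        }
    }
}

// Hypothetical helper: a plain field becomes one string cell.
fn scalar_cell<T: ToString>(value: &T) -> String {
    value.to_string()
}

// Hypothetical helper: None stays None (a null cell), Some(v) is stringified.
fn optional_cell<T: ToString>(value: &Option<T>) -> Option<String> {
    value.as_ref().map(ToString::to_string)
}

// Hypothetical helper: each element of a Vec field is stringified,
// giving the contents of one list cell.
fn list_cell<T: ToString>(values: &[T]) -> Vec<String> {
    values.iter().map(ToString::to_string).collect()
}
```

Because the conversion only relies on ToString, any field type with a Display implementation can be handled the same way in this sketch, not just enums.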
- Primitives: String, bool, integer types (i8/i16/i32/i64/isize, u8/u16/u32/u64/usize), f32, f64
- Time: chrono::DateTime<Utc> → materialized as Datetime(Milliseconds, None)
- Decimal: rust_decimal::Decimal → Decimal(38, 10)
- Wrappers: Option<T>, Vec<T> in any nesting order
- Custom structs: any other struct deriving ToDataFrame (supports nesting and Vec<Nested>)
- Tuple structs: unnamed fields are emitted as field_{index}
- Named struct fields: field_name
- Nested structs: outer.inner (recursively)
- Vec of custom structs: vec_field.subfield (list dtype)
- Tuple structs: field_0, field_1, ...
For every #[derive(ToDataFrame)] type T the macro generates implementations of two traits (paths configurable via #[df_derive(...)]):
- ToDataFrame for T:
  - fn to_dataframe(&self) -> PolarsResult<DataFrame>
  - fn empty_dataframe() -> PolarsResult<DataFrame>
  - fn schema() -> PolarsResult<Vec<(&'static str, DataType)>>
- Columnar for T:
  - fn columnar_to_dataframe(items: &[Self]) -> PolarsResult<DataFrame>
This crate includes several runnable examples in the examples/ directory. You can run any example with:
cargo run --example <example_name>
Or run all examples to see the full feature set:
cargo run --example quickstart && \
cargo run --example nested && \
cargo run --example vec_custom && \
cargo run --example tuple && \
cargo run --example datetime_decimal && \
cargo run --example as_string
- quickstart: Basic usage with single and batch DataFrame conversion
- nested: Nested structs with dot notation column naming
- vec_custom: Vec of custom structs creating List columns
- tuple: Tuple structs with field_0, field_1 naming
- datetime_decimal: DateTime and Decimal type support
- as_string: #[df_derive(as_string)] attribute for enum conversion
#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct Address { street: String, city: String, zip: u32 }
#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct Person { name: String, age: u32, address: Address }
// Columns: name, age, address.street, address.city, address.zip
Note: the runnable examples define a small dataframe module with the traits used by the macro. Some helper trait items are not used in every snippet (for example empty_dataframe or Columnar). To avoid noise during cargo run --example …, the examples annotate that module with #[allow(dead_code)].
#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct Quote { ts: i64, open: f64, high: f64, low: f64, close: f64, volume: u64 }
#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct MarketData { symbol: String, quotes: Vec<Quote> }
// Columns include: symbol, quotes.ts, quotes.open, quotes.high, ... (each a List)
#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct SimpleTuple(i32, String, f64);
// Columns: field_0 (Int32), field_1 (String), field_2 (Float64)
#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct TxRecord { amount: rust_decimal::Decimal, ts: chrono::DateTime<chrono::Utc> }
// Schema dtypes: amount = Decimal(38, 10), ts = Datetime(Milliseconds, None)
Why #[allow(dead_code)] in examples? The examples include a minimal dataframe module to provide the traits that the macro implements. Not every example calls every method (e.g., empty_dataframe, schema), and compile-time warnings would otherwise distract from the output. Adding #[allow(dead_code)] to that module keeps the examples clean while remaining fully correct.
#[derive(Clone, Debug, PartialEq)]
enum Status { Active, Inactive }
impl std::fmt::Display for Status {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Status::Active => write!(f, "Active"),
            Status::Inactive => write!(f, "Inactive"),
        }
    }
}
#[derive(ToDataFrame)]
#[df_derive(trait = "crate::dataframe::ToDataFrame")]
struct WithEnums {
    #[df_derive(as_string)]
    status: Status,
    #[df_derive(as_string)]
    opt_status: Option<Status>,
    #[df_derive(as_string)]
    statuses: Vec<Status>,
}
// Columns use DataType::String or List<String>
Note: All examples require the trait definitions shown in the Quick start section. See the complete working examples in the examples/ directory.
- Unsupported container types: maps/sets like HashMap<_, _> are not supported.
- Enums: derive on enums is not supported; use #[df_derive(as_string)] on enum fields.
- Generics: generic structs are not supported by the derive (see tests/fail for examples).
- All nested types must also derive: if you nest a struct, it must also derive ToDataFrame.
- The derive implements an internal Columnar path used by the runtime to convert slices efficiently, avoiding per-row DataFrame builds.
- Criterion benches in benches/ exercise wide, deep, and nested-Vec shapes (100k+ rows), demonstrating consistent performance across shapes.
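The idea behind the columnar path can be sketched without Polars: rather than building one small frame per row and stacking them, walk the slice once and push each field into its own pre-sized column buffer. The TradeColumns struct and the to_columns and sample_trades functions below are hypothetical names for this illustration; the generated Columnar implementation builds Polars columns and may be organized differently.

```rust
struct Trade {
    symbol: String,
    price: f64,
    size: u64,
}

/// Hypothetical column buffers: one Vec per field of Trade.
struct TradeColumns {
    symbol: Vec<String>,
    price: Vec<f64>,
    size: Vec<u64>,
}

/// Single pass over the slice; each buffer is allocated once up front.
fn to_columns(items: &[Trade]) -> TradeColumns {
    let mut cols = TradeColumns {
        symbol: Vec::with_capacity(items.len()),
        price: Vec::with_capacity(items.len()),
        size: Vec::with_capacity(items.len()),
    };
    for trade in items {
        cols.symbol.push(trade.symbol.clone());
        cols.price.push(trade.price);
        cols.size.push(trade.size);
    }
    cols
}

/// Sample input matching the quickstart data.
fn sample_trades() -> Vec<Trade> {
    vec![
        Trade { symbol: "AAPL".into(), price: 187.23, size: 100 },
        Trade { symbol: "MSFT".into(), price: 411.61, size: 200 },
    ]
}
```

With this layout the number of allocations depends on the number of fields rather than the number of rows, which is the property a per-row approach lacks.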
Performance is continuously monitored and tracked using Bencher.
- Rust edition: 2024
- Polars: 0.52 (tested)
- Enable Polars features timezones and dtype-decimal if you use DateTime<Utc> or Decimal.
MIT. See LICENSE.
This crate currently resolves default trait paths to a dataframe module under the paft ecosystem. Concretely, it attempts to implement:
- paft::dataframe::ToDataFrame and paft::dataframe::Columnar (or paft-core::dataframe::...) if those crates are present.
You can override these paths for any runtime by annotating your type with #[df_derive(...)]:
#[derive(df_derive::ToDataFrame)]
#[df_derive(trait = "my_runtime::dataframe::ToDataFrame")] // Columnar will be inferred as my_runtime::dataframe::Columnar
struct MyType { /* fields */ }
If you need to override both explicitly:
#[derive(df_derive::ToDataFrame)]
#[df_derive(
trait = "my_runtime::dataframe::ToDataFrame",
columnar = "my_runtime::dataframe::Columnar",
)]
struct MyType { /* fields */ }
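When only trait is given, the Columnar path is inferred as a sibling in the same module, as the comments above note. A minimal sketch of that rule on path strings follows. The function infer_columnar_path is a hypothetical name, not part of df-derive, and the handling of a bare trait name with no module prefix is an assumption made for this sketch.

```rust
/// Replace the last path segment with `Columnar`, keeping the module prefix.
/// Assumption for this sketch: a bare name with no `::` maps to plain `Columnar`.
fn infer_columnar_path(trait_path: &str) -> String {
    match trait_path.rsplit_once("::") {
        Some((module, _trait_name)) => format!("{module}::Columnar"),
        None => "Columnar".to_string(),
    }
}
```

For example, "my_runtime::dataframe::ToDataFrame" maps to "my_runtime::dataframe::Columnar". If your Columnar trait lives in a different module, the explicit columnar = "..." form is the one to use.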