Silva is a tiny inference engine for tree ensemble models (a.k.a. forest models) in Rust.
Silva makes it easier for Rust programs to use pre-trained XGBoost and LightGBM models by providing a lightweight inference engine that avoids runtime dependencies on external machine learning libraries. Key benefits include:
- Pure Rust: Entirely written in Rust for high performance, memory safety, and zero-cost abstractions
- Simple Codebase: Minimal, clean implementation that's easy to understand, integrate, and maintain
- No External Dependencies: Parses and runs models from XGBoost and LightGBM without requiring those libraries to be installed or linked

Supported model formats:

- Native format:
  - Uses efficient serde serialization
  - Most compact and fastest to load
- XGBoost:
  - Booster types: `gbtree` only (`gblinear` and `dart` are not supported)
  - Supported objectives:
    - `reg:squarederror` (regression)
    - `binary:logistic` (binary classification)
    - `multi:softmax` (multiclass classification)
    - `multi:softprob` (multiclass classification)
  - Note: Unsupported booster types and objectives return descriptive errors
- LightGBM:
  - All regression and classification models
  - Text format only (binary format not supported)
  - Tree structure only (no linear models)
  - Note: LightGBM incorporates all bias into leaf values (no separate `base_score`)

```sh
cargo add silva
```

`MultiOutputForest` is a container for multi-output models (e.g., multi-class classification). It holds a vector of `Forest` instances, one per output class, and returns a vector of predictions, one per output.

`Forest` is a single-output tree ensemble containing:

- `base_value`: Bias/baseline score added to all predictions
- `trees`: Vector of decision trees

Prediction formula: `base_value + Σ tree_predictions`

`Tree` is an individual decision tree represented as:

- `node_map`: Hash map of node ID → `TreeNode`
- `root`: Root node ID

Prediction traverses the tree from root to leaf based on feature comparisons.

`TreeNode` is a single node with:

- `split_index`: Feature index for splitting
- `split_condition`: Threshold value (`NotNan`)
- `left`/`right`: Child node IDs (`None` for leaves)
- `value`: Leaf value (`NotNan`)

Leaves have no children; internal nodes contain split logic.
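
The traversal described above can be sketched without any dependencies. This is a minimal illustration, not the crate's implementation: `Node`, `SketchTree`, and `sketch_tree` are hypothetical names, plain `f64` stands in for `NotNan`, and the rule that a feature strictly below the threshold goes left is an assumption (check the crate's source for the actual comparison).

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the crate's `TreeNode` (plain f64, no NotNan).
#[derive(Debug, Clone)]
struct Node {
    split_index: usize,
    split_condition: f64,
    left: Option<u32>,
    right: Option<u32>,
    value: f64,
}

/// Hypothetical stand-in for the crate's `Tree`.
struct SketchTree {
    node_map: HashMap<u32, Node>,
    root: u32,
}

impl SketchTree {
    /// Walks from the root to a leaf and returns the leaf value.
    /// Returns None if a node ID or feature index is missing.
    /// ASSUMPTION: a feature strictly below the threshold goes left.
    fn predict(&self, features: &[f64]) -> Option<f64> {
        let mut node = self.node_map.get(&self.root)?;
        loop {
            match (node.left, node.right) {
                (Some(l), Some(r)) => {
                    let x = *features.get(node.split_index)?;
                    let next = if x < node.split_condition { l } else { r };
                    node = self.node_map.get(&next)?;
                }
                // No children: this node is a leaf.
                _ => return Some(node.value),
            }
        }
    }
}

/// Builds a three-node tree: split on feature 0 at 2.5, leaves 3.0 and 5.0.
fn sketch_tree() -> SketchTree {
    let leaf = |value| Node {
        split_index: 0,
        split_condition: 0.0,
        left: None,
        right: None,
        value,
    };
    let mut node_map = HashMap::new();
    node_map.insert(
        0,
        Node { split_index: 0, split_condition: 2.5, left: Some(1), right: Some(2), value: 0.0 },
    );
    node_map.insert(1, leaf(3.0));
    node_map.insert(2, leaf(5.0));
    SketchTree { node_map, root: 0 }
}
```

Under these assumptions, `sketch_tree().predict(&[1.5])` reaches the left leaf, and an input of `2.5` goes right because the comparison is strict.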
```json
{
  "forests": [
    {
      "base_value": 0.5,
      "trees": [
        {
          "nm": {
            "0": {"id": 0, "si": 0, "sc": 2.5, "l": 1, "r": 2, "v": 0.0},
            "1": {"id": 1, "si": 1, "sc": 1.5, "l": null, "r": null, "v": 3.0},
            "2": {"id": 2, "si": 1, "sc": 3.5, "l": null, "r": null, "v": 5.0}
          },
          "root": 0
        },
        {
          "nm": {
            "0": {"id": 0, "si": 0, "sc": 5.0, "l": 1, "r": 2, "v": 0.0},
            "1": {"id": 1, "si": 1, "sc": 2.0, "l": null, "r": null, "v": 10.0},
            "2": {"id": 2, "si": 1, "sc": 3.0, "l": null, "r": null, "v": 20.0}
          },
          "root": 0
        }
      ]
    }
  ]
}
```

| Abbreviation | Full Name | Description |
|---|---|---|
| `nm` | `node_map` | Hash map mapping node ID to `TreeNode` |
| `si` | `split_index` | Feature index used for splitting at this node |
| `sc` | `split_condition` | Threshold value for the split comparison |
| `l` | `left` | ID of left child node (`null` for leaves) |
| `r` | `right` | ID of right child node (`null` for leaves) |
| `v` | `value` | Leaf prediction value (only used in leaf nodes) |

```
MultiOutputForest
└── forests: Forest[]
    ├── base_value: f64 (baseline score)
    └── trees: Tree[]
        ├── nm: {node_id: TreeNode}
        │   ├── id: node ID
        │   ├── si: feature index to split on
        │   ├── sc: split threshold
        │   ├── l: left child ID (or null)
        │   ├── r: right child ID (or null)
        │   └── v: leaf value
        └── root: ID of the root node
```

Prediction Flow: Start at root → compare `feature[si]` with `sc` → follow `l` or `r` → repeat until leaf → sum all tree values → add `base_value`
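
To make the last two steps of the flow concrete, the sketch below hand-translates the two trees from the JSON example into plain functions and sums them with the base value. `SketchForest`, `predict_multi`, `tree_a`, `tree_b`, and `example_forest` are hypothetical names used only for illustration, and the translation assumes that `feature[si] < sc` follows `l` (the README does not state the comparison rule).

```rust
/// A tree abstracted as "features in, leaf value out".
type TreeFn = fn(&[f64]) -> f64;

/// Hypothetical stand-in for `Forest`: base value plus the sum of tree outputs.
struct SketchForest {
    base_value: f64,
    trees: Vec<TreeFn>,
}

impl SketchForest {
    /// base_value + sum of every tree's leaf value for these features.
    fn predict(&self, features: &[f64]) -> f64 {
        self.base_value + self.trees.iter().map(|tree| tree(features)).sum::<f64>()
    }
}

/// Hypothetical stand-in for `MultiOutputForest`: one prediction per forest.
fn predict_multi(forests: &[SketchForest], features: &[f64]) -> Vec<f64> {
    forests.iter().map(|f| f.predict(features)).collect()
}

// The two trees from the JSON example, hand-translated.
// ASSUMPTION: feature[si] < sc follows `l`, otherwise `r`.
fn tree_a(x: &[f64]) -> f64 {
    if x[0] < 2.5 { 3.0 } else { 5.0 }
}

fn tree_b(x: &[f64]) -> f64 {
    if x[0] < 5.0 { 10.0 } else { 20.0 }
}

/// The single forest from the JSON example (base_value 0.5, two trees).
fn example_forest() -> SketchForest {
    SketchForest {
        base_value: 0.5,
        trees: vec![tree_a as TreeFn, tree_b as TreeFn],
    }
}
```

With these assumptions, the input `[1.5, 2.3, 0.8]` lands in the left leaf of both trees, so the forest output is 0.5 + 3.0 + 10.0 = 13.5.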
The predict methods work with feature vectors (&[f64]) and return prediction values.
```rust
use silva::Tree;

let tree = Tree::new(node_map, root_id);
let prediction = tree.predict(&[1.5, 2.3, 0.8]); // returns NotNan<f64>
```

```rust
use silva::Forest;

let forest = Forest::new(base_value, trees);
let prediction = forest.predict(&[1.5, 2.3, 0.8]); // returns NotNan<f64>
```

```rust
use silva::MultiOutputForest;

let model = MultiOutputForest::new(forests);
let predictions = model.predict(&[1.5, 2.3, 0.8]); // returns Vec<NotNan<f64>>
```

```rust
use silva::MultiOutputForest;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load model from file
    let model = MultiOutputForest::from_file("model.json")?;

    // Prepare feature data
    let features = vec![vec![1.5, 2.3, 0.8], vec![0.5, 1.2, 3.4]];

    // Make predictions
    for x in &features {
        let prediction = model.predict(x);
        println!("Predictions: {:?}", prediction);
    }

    Ok(())
}
```

The predict methods return raw values that may require post-processing depending on the model type and objective:

For binary classification using `binary:logistic`, apply sigmoid to the raw prediction:

```rust
let raw = forest.predict(&features);
let probability = 1.0 / (1.0 + (-raw).exp()); // sigmoid
```

For multiclass classification using `multi:softmax` or `multi:softprob`, apply softmax to the predictions:

```rust
let raw_values = model.predict(&features);
let exp_values: Vec<f64> = raw_values.iter().map(|&v| v.exp()).collect();
let sum: f64 = exp_values.iter().sum();
let probabilities: Vec<f64> = exp_values.iter().map(|&v| v / sum).collect();
```

For Poisson regression objectives, apply exponential to the raw prediction:

```rust
let raw = forest.predict(&features);
let count_prediction = raw.exp();
```

For standard regression (e.g., `reg:squarederror`), the raw value can be used directly.

LightGBM models incorporate bias into leaf values, so no separate base_score adjustment is needed for predictions.
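
The plain softmax shown earlier can overflow when raw scores are large, because `exp` of a big positive number is infinite. A common remedy is to subtract the maximum score before exponentiating, which leaves the probabilities unchanged. The sketch below is not part of Silva's API: `softmax` and `argmax` are hypothetical helpers that work on plain `f64` slices, so `NotNan` values would need converting to `f64` first.

```rust
/// Softmax with the maximum subtracted first, so `exp` never receives a
/// positive argument and cannot overflow.
fn softmax(raw: &[f64]) -> Vec<f64> {
    let max = raw.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = raw.iter().map(|&v| (v - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// Index of the largest score, i.e. the predicted class label.
/// Returns None for an empty slice.
fn argmax(raw: &[f64]) -> Option<usize> {
    raw.iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i)
}
```

For example, `softmax(&[1000.0, 1000.0])` yields two equal probabilities, whereas the unshifted version would divide infinity by infinity. `argmax` can be applied to either the raw scores or the probabilities, since softmax preserves their order.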
For more examples, see examples/prediction.rs.
