Important note: The {duckspatial} R package is undergoing major changes in preparation for v1.0.0. This release will align with DuckDB v1.5 and its native CRS support in the Spatial extension.
Previously, {duckspatial} relied on a workaround to store CRS information in a dedicated column (named “crs_duckspatial” by default and referenced via the crs_column argument). With native CRS support now available, this workaround is no longer needed and will be removed. As a result, the crs and crs_column arguments are deprecated as of v0.9.0.
The upcoming version also introduces a new native spatial class, duckspatial_df. This represents a lazy spatial table backed by a temporary DuckDB view, allowing workflows to remain fully lazy until results are explicitly materialized with ddbs_collect(). In practice, this means you can work with {duckspatial} without eagerly loading data into R.
Finally, v1.0.0 will include many new functions and broader API improvements.
{duckspatial} provides fast, memory-efficient functions for analysing and manipulating large spatial vector datasets in R. It bridges DuckDB’s spatial extension with R’s spatial ecosystem — in particular {sf} — so you can leverage DuckDB’s analytical power without leaving your familiar R workflow.
Starting from v1.0.0, {duckspatial} introduces a native S3 class called
duckspatial_df: a tibble-like object with a geometry column that
lives outside R’s memory. Data is read and evaluated lazily (similar
to how duckplyr handles lazy tables) and is only loaded into R when
you explicitly materialise it (e.g. with ddbs_collect()).
When the first duckspatial_df is created (either by reading a file or
converting an sf object) a temporary view is registered in a default
DuckDB connection with the spatial extension enabled. All spatial
operations run inside that connection, letting DuckDB apply its own
query optimisations before any data reaches R.
All spatial functions follow the ddbs_*() prefix (DuckDB Spatial),
and their names deliberately mirror the {sf} package, so users
already familiar with sf can get started immediately.
Install the stable release from CRAN:
# install.packages("pak")
pak::pak("duckspatial")Install the latest GitHub version (more features, fewer accumulated bugs):
pak::pak("Cidree/duckspatial")Install the development version (may be unstable):
pak::pak("Cidree/duckspatial@dev")A central design principle of {duckspatial} is that the same spatial operation can be used in different ways depending on how your data is stored and how you want to manage memory and performance.
Most functions support four complementary input/output combinations:
| Input | Output |
|---|---|
duckspatial_df / sf |
duckspatial_df / sf |
duckspatial_df / sf |
DuckDB table |
| DuckDB table | duckspatial_df / sf |
| DuckDB table | DuckDB table |
This means you can keep data inside DuckDB for as long as possible, pulling results into R only when you need them. See the Get Started vignette for worked examples of each workflow.
Bug reports, feature requests, and pull requests are very welcome!
