fastduck provides some development experience improvements for the
standard duckdb python API.
pip install fastduckimport fastduck as fuck
from fastduck import databasedb = database('../data/chinook.duckdb')
dbDuckDBPyConnection (chinook_main)
dt = db.t
dt(chinook_main) Tables: Album, Artist, Customer, Employee, Genre, Invoice, InvoiceLine, MediaType, Playlist, PlaylistTrack, Track, todos
You can use this to grab a single table…
artist = dt.Artist
artist| ArtistId | Name |
|---|---|
| 1 | AC/DC |
| 2 | Accept |
| 3 | Aerosmith |
| … | … |
| 274 | Nash Ensemble |
| 275 | Philip Glass Ensemble |
275 rows x 2 cols
customer = dt['Customer']
customer| CustomerId | FirstName | LastName | Company | Address | City | State | Country | PostalCode | Phone | Fax | SupportRepId | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Luís | Gonçalves | Embraer - Empresa Brasileira de Aeronáutica S.A. | Av. Brigadeiro Faria Lima, 2170 | São José dos Campos | SP | Brazil | 12227-000 | +55 (12) 3923-5555 | +55 (12) 3923-5566 | [email protected] | 3 |
| 2 | Leonie | Köhler | Theodor-Heuss-Straße 34 | Stuttgart | Germany | 70174 | +49 0711 2842222 | [email protected] | 5 | |||
| 3 | François | Tremblay | 1498 rue Bélanger | Montréal | QC | Canada | H2G 1A7 | +1 (514) 721-4711 | [email protected] | 3 | ||
| … | … | … | … | … | … | … | … | … | … | … | … | … |
| 58 | Manoj | Pareek | 12,Community Centre | Delhi | India | 110017 | +91 0124 39883988 | [email protected] | 3 | |||
| 59 | Puja | Srivastava | 3,Raj Bhavan Road | Bangalore | India | 560001 | +91 080 22289999 | [email protected] | 3 |
59 rows x 13 cols
… or multiple tables at once:
dt['Artist', 'Album', 'Genre'][DuckDBPyRelation BASE TABLE chinook.main.Artist
,
DuckDBPyRelation BASE TABLE chinook.main.Album
,
DuckDBPyRelation BASE TABLE chinook.main.Genre
]
It also provides auto-complete in Jupyter, IPython and nearly any other interactive Python environment:
You can check if a table is in the database already:
'Artist' in dtTrue
Column work in a similar way to tables, using the c property:
ac = artist.c
ac, artist.columns(chinook.main.Artist Columns: ArtistId, Name, ['ArtistId', 'Name'])
Auto-complete works for columns too:
The tables and views of a database got some interesting new attributes….
artist.meta{'base': DuckDBPyConnection (chinook_main),
'catalog': 'chinook',
'schema': 'main',
'name': 'Artist',
'type': 'BASE TABLE',
'comment': None,
'shape': (275, 2)}
artist.model[{'name': 'ArtistId',
'type': 'INTEGER',
'nullable': False,
'default': None,
'pk': True},
{'name': 'Name',
'type': 'VARCHAR',
'nullable': True,
'default': None,
'pk': False}]
artist.cls, type(artist.cls)(fastduck.core.Artist, type)
duckdb replacement scans keep working and are wonderful for usage in
SQL statements:
db.sql("select * from artist where artist.Name like 'AC/%'")| ArtistId | Name |
|---|---|
| 1 | AC/DC |
1 rows x 2 cols
You can view the results of a query as records
db.sql("select * from artist where artist.Name like 'AC/%'").to_recs()[{'ArtistId': 1, 'Name': 'AC/DC'}]
or as a list of lists
db.sql("select * from artist where artist.Name like 'AC/%'").to_list()[[1, 'AC/DC']]
And you there is also an alias for sql with to_recs simply called
q
db.q("select * from artist where artist.Name like 'AC/%'")[{'ArtistId': 1, 'Name': 'AC/DC'}]
As we briefly saw, a dataclass type with the names, types and defaults
of the table is added to the Relation:
abm = db.t.Album
art = db.t.Artist
acca_sql = f"""
select abm.*
from abm join art using (ArtistID)
where art.Name like 'AC/%'
"""
acca_dacca = db.q(acca_sql)
acca_dacca[{'AlbumId': 1,
'Title': 'For Those About To Rock We Salute You',
'ArtistId': 1},
{'AlbumId': 4, 'Title': 'Let There Be Rock', 'ArtistId': 1}]
let_b_rock_obj = abm.cls(**acca_dacca[-1])
let_b_rock_objAlbum(AlbumId=4, Title='Let There Be Rock', ArtistId=1)
You can get the definition of the dataclass using fastcore’s
dataclass_src – everything is treated as nullable, in order to handle
auto-generated database values:
from fastcore.xtras import hl_md, dataclass_src
src = dataclass_src(db.t.Album.cls)
hl_md(src, 'python')@dataclass
class Album:
AlbumId: int32 = None
Title: str = None
ArtistId: int32 = None