Thanks to visit codestin.com
Credit goes to Github.com

Skip to content

Duckdb server you can talk to over http

License

Notifications You must be signed in to change notification settings

gramian/radio-duck

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

60 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Radio-Duck

Connect to a Duckdb server over Http.

Project Image

Project Objectives

  • Talk to duckDb over transport protocols like Http.
  • Provide an alternative to embedding duckDb
  • Make DuckDb highly available for apps like Business intelligence dashboards
  • Ability to query data while data is being loaded.

Use cases

  • Act as data backend for Business intelligence dashboards like (metabase/superset) without embedding duckDb

UseCase1 Image

  • Highly available Duckdb

UseCase2 Image

Features

Run sql on Duckdb over http

Run sql

You can send a sql via http post request. refer to docs @ http://localhost:8000/docs

Transaction support

It's possible but in a limited fashion (i.e. in a single http post request)

#example:
curl -X 'POST' \
  'http://localhost:8000/v1/sql/' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "sql": "BEGIN; select * from pond; COMMIT;"
}'

Load data from cloud blob storage like s3/azure

It's the same as you are interacting with duckDb using sql.

Consume data from streams like kafka

[todo]

Requirements

cd PROJECT_DIR
mamba create -n radio-duck python=3.10
mamba activate radio-duck
mamba install poetry
poetry install

Run Tests

pytest

Live Demo

Feel free to run some sql with https://radioduck.pythonanywhere.com/docs

Note: Please try it on firefox/chrome. safari has some issue.

Try me with Docker

#pull from docker hub
docker pull jaihind213/radio-duck:latest
# for arm64 -> 'docker run --platform linux/arm64 -p 8000:8000 -t jaihind213/radio-duck:latest'
docker run -p 8000:8000 -t jaihind213/radio-duck:latest
#or
docker run -p 8000:8000 -v <path_to_data_dir>:/radio-duck/pond -t jaihind213/radio-duck:latest
#or
docker run -p 8000:8000 -v <path_to_data_dir>:/radio-duck/pond -v <path_to_my_config.ini>:/radio-duck/pond/my_config.ini -t jaihind213/radio-duck:latest python3 /radio-duck/server.py /radio-duck/pond/my_config.ini

Then access http://localhost:8000/docs and play with the api.

Configure RadioDuck with your secret credentials to access Azure blob storage / Aws s3

refer to https://duckdb.org/docs/configuration/secrets_manager.html

Build Docker image

sh buildDocker.sh <proj_version> <duckb_version>
#sh buildDocker.sh 0.1.2 0.9.0 local no

Docker Image versioning

The docker image version is in the following format:

d<DUCKDB_VERSION>-v<PROJECT_VERSION>-<DOCKERFILE_VERSION>
ex: d0.10.0-v0.1.2-c1d9c346 

This specifies that radio duck is running with duckdb=0.10.0, the project version being 0.1.2.
These versions are derived from pyproject.toml

Try me with Apache Superset

#run radio-duck
docker run -p 8000:8000 -t jaihind213/radio-duck:latest
# for macM1 -> 'add --platform linux/amd64 '
#run superset
docker pull jaihind213/superset-radioduck:latest
# for macM1 -> 'add --platform linux/amd64 '
docker run -e TALISMAN_ENABLED="False" -e ADMIN_USERNAME=admin -e ADMIN_PASSWORD=admin -e SUPERSET_SECRET_KEY=<some_good_key> -p 8088:8088 jaihind213/superset-radioduck:latest

Then access http://localhost:8088/

Then configure database with alchemy URI 'radio_duck+district5://user:pass@<HOST_IP>:8000/?api=/v1/sql/&scheme=http'

Video of Demo.

Demo

Try me with Python code via District5 library

https://github.com/jaihind213/district5

#python 3.9^
pip install district5
#start the duckdb server i.e. radio-duck
docker run -p 8000:8000 -t jaihind213/radio-duck:latest
#the duckdb starts up with a sample table whose ddl is: 
#'create table pond(duck_type string, total int)'
echo "we will try to query that"

from sqlalchemy import create_engine, text
from sqlalchemy.dialects import registry

registry.register(
    "radio_duck.district5", "radio_duck.sqlalchemy", "RadioDuckDialect"
)


#run docker instance of radio-duck
#docker run -p 8000:8000 -t jaihind213/radio-duck:latest
engine = create_engine(
    "radio_duck+district5://user:pass@localhost:8000/?api=/v1/sql/&scheme=http"
)
# Establish a database connection
conn = engine.connect()

# Define a SQL query using qmark style or positional style
try:
    query_1 = text("""SELECT duck_type, total FROM pond where total > :total""")
    params = {"total": 0}
    result = conn.execute(query_1, params)
    # Fetch and print the results
    for row in result:
        print(row)

    print("--------------")
    query_2 = "SELECT duck_type, total FROM pond where total > ?"
    result = conn.execute(query_2, (0,))

    for row in result:
        print(row)

except Exception as e:
    print(f"An error occurred: {e}")

finally:
    # Close the database connection
    conn.close()
    engine.dispose()

Check for Docker image Vulnerabilities

sh buildDocker.sh <version>
sh docker_sec_check.sh
# github actions is scanning for Vulnerabilities

Notes:

  • todo: stream large datasets / compress results say json-smile?
  • todo: limit number of conns to duckdb ?
  • todo: return query_id ,useful for debugging.
  • todo: handle transactions over http ? i.e. multiple http requests.

About

Duckdb server you can talk to over http

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 82.4%
  • Dockerfile 9.3%
  • Shell 8.3%