This AI Agent Should Have Been a SQL Query
Explores building AI Agents as streaming SQL queries using platforms like Apache Flink for improved consistency, scalability, and developer experience.
Part two of building a personal recommendation system, covering data collection from Pocket and content extraction using the Jina Reader API.
A developer documents the first steps in building a personalized content recommendation system using saved articles, text embeddings, and algorithms.
Introduces the 'leopards' Python library for filtering and aggregating lists, offering a lightweight alternative to pandas for basic data operations.
A cleaned-up, de-interleaved transcript of text message exhibits from the Twitter v. Elon Musk lawsuit, presented for clarity.
A talk on using Python to efficiently process and analyze large datasets from mass spectrometry, presented at a Python Frederick event.
Explains the APPROX_COUNT_DISTINCT function for faster, memory-efficient distinct counts in SQL, comparing it to exact COUNT(DISTINCT).
A guide to using the Unix command-line for efficient data science workflows, including data processing, exploration, and modeling.
A guide to using SQLite and Python's sqlite3 module to efficiently manage and query large datasets from text files.
A guide to seven essential command-line tools (jq, csvkit, Rio, etc.) for data scientists to obtain, scrub, explore, and model data.