data-load-tool

dlt is an open-source library that you can add to your Python scripts to load data from various and often messy data sources into well-structured, live datasets.

Extracting API data with a generator

Premise:

For this example, a simple HTTP API was created that returns JSON page by page, with 1000 records per page.

It accepts a parameter called page, representing the page number. If we request a page number beyond the available data, we get an empty response.

To get the pages, we write a loop that asks for pages starting from 1 and increasing, until we receive an empty page.

Because we do not know ahead of time how many pages contain data or whether they all fit in memory, yielding each page as it arrives scales better than collecting every page in memory and returning them all at once.
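The loop described in the premise could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the names `paginated_getter`, `fetch_page_http`, and `fake_fetch_page` are hypothetical, the API's real URL is not given here and must be supplied by the caller, and the sketch assumes an "empty response" means either an empty body or an empty JSON list. An in-memory stand-in replaces the real API so the example runs offline.

```python
import json
from typing import Callable, Iterator, List
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_page_http(base_url: str, page: int) -> List[dict]:
    """Hypothetical helper: request one page from an API that takes ?page=N."""
    with urlopen(f"{base_url}?{urlencode({'page': page})}") as resp:
        body = resp.read().decode("utf-8").strip()
    # Assumption: an empty body and an empty JSON list both mean "no more data".
    return json.loads(body) if body else []


def paginated_getter(
    fetch_page: Callable[[int], List[dict]], start_page: int = 1
) -> Iterator[List[dict]]:
    """Yield pages one at a time, stopping at the first empty page."""
    page = start_page
    while True:
        records = fetch_page(page)
        if not records:
            break  # an empty page signals that the data is exhausted
        yield records  # hand over this page before requesting the next one
        page += 1


def fake_fetch_page(page: int, total: int = 2500, page_size: int = 1000) -> List[dict]:
    """In-memory stand-in for the API: 2500 records, 1000 per page."""
    start = (page - 1) * page_size
    return [{"id": i} for i in range(start, min(start + page_size, total))]


# Only one page is held in memory at a time while iterating.
page_sizes = [len(p) for p in paginated_getter(fake_fetch_page)]
print(page_sizes)  # [1000, 1000, 500]
```

To point the same generator at a real endpoint, one could pass `lambda page: fetch_page_http(base_url, page)` as `fetch_page`, with `base_url` set to the API's address. Because the generator yields each page as soon as it is fetched, a consumer can process or load a page and discard it before the next request is made.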

