Data Structure encapsulation for New York taxi datasets

yleprince/ravioly

ravioly 🍝

Context 🚗 🏙️ 🇺🇸

The goal is to have production-ready code to analyse New York taxi data.

Ravioly is a way to encapsulate the raw data contained in NYC taxi CSV files. As mentioned in the documentation (https://yleprince.github.io/ravioly/), it is built on top of pandas.DataFrame and provides specific processing functionality and dedicated methods.

Install and Use: 🌱

💡 Python ^3.7 is required.

Install using pip:

pip install git+https://github.com/yleprince/ravioly.git

In your Python code:

>>> from ravioly.datastructure import Ravioly

>>> df = Ravioly('../data/nyc_data.csv', nrows=1000)
>>> df.km_by_dow()

day_of_week
0    480.647876
1    466.137703
2    553.287868
3    427.187865
4    465.982398
5    489.352866
6    557.113716
Name: km_by_dow, dtype: float64
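
To illustrate how a method such as km_by_dow can live on a pandas.DataFrame subclass, here is a minimal sketch. It is not Ravioly's actual implementation: the class name TaxiFrame, the column names (pickup_datetime, pickup_latitude, and so on), the haversine distance, and the choice to sum distances per day are all assumptions made for this example.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0


class TaxiFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass, for illustration only."""

    @property
    def _constructor(self):
        # Make pandas operations (head, filtering, ...) return a TaxiFrame
        # instead of falling back to a plain DataFrame.
        return TaxiFrame

    def km_by_dow(self) -> pd.Series:
        """Sum of haversine trip distances (km) per pickup day of week (0 = Monday)."""
        # Column names below are assumed, not taken from the real dataset.
        lat1 = np.radians(self["pickup_latitude"])
        lon1 = np.radians(self["pickup_longitude"])
        lat2 = np.radians(self["dropoff_latitude"])
        lon2 = np.radians(self["dropoff_longitude"])
        a = (
            np.sin((lat2 - lat1) / 2) ** 2
            + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
        )
        km = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))
        dow = pd.to_datetime(self["pickup_datetime"]).dt.dayofweek
        return km.groupby(dow.rename("day_of_week")).sum().rename("km_by_dow")


# Two made-up trips: one on a Monday with no displacement,
# one on a Tuesday covering one degree of latitude.
trips = TaxiFrame(
    {
        "pickup_datetime": ["2016-01-04 08:00:00", "2016-01-05 09:30:00"],
        "pickup_latitude": [40.0, 40.0],
        "pickup_longitude": [-74.0, -74.0],
        "dropoff_latitude": [40.0, 41.0],
        "dropoff_longitude": [-74.0, -74.0],
    }
)
result = trips.km_by_dow()
print(result)
```

Overriding _constructor is the mechanism pandas documents for keeping the subclass type across operations, so the dedicated methods remain available after slicing or filtering.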

Tools used: ⚙️

  • Dev: 💻

    • pandas: the Ravioly class is a subclass of the pandas.DataFrame data structure
    • poetry: package management is handled with poetry
  • Lint: 📐

    • isort: to sort the imports
    • mypy: to check the use of types within the code
    • black: to format the code automatically in line with PEP 8
    • flake8: to check code style and catch common errors
  • Tests: 🧑‍🏫

    • pytest: to unit test the code
    • pytest-cov: to measure the percentage of code covered by the tests
  • CI: 🤖

  • Documentation: 📚

  • Bonus: 🎁

    • pre-commit: runs the lint and test workflows automatically before each commit
    • pre-push: updates the documentation every time the code is pushed to GitHub
