🚀 DataHorse is an open-source tool and Python library that simplifies data science for everyone. It lets users interact with data in plain English 📝, without needing technical skills or watching tutorials 🎥 to learn how to use it. With DataHorse, you can create graphs 📊, modify data 🛠️, and even build machine learning models 🤖 to get answers or make predictions. It’s designed to help businesses and individuals 💼, regardless of technical background, quickly understand their data and make data-driven decisions. ✨
pip install datahorse

We’re using the Iris flower dataset as an example to demonstrate how DataHorse simplifies data analysis. This example showcases how our tool can handle real-world data, making it easier to work with and understand.
Setup and usage examples are available in this Google Colab notebook.
import datahorse
df = datahorse.read('https://raw.githubusercontent.com/plotly/datasets/master/iris-data.csv')
df = df.chat('convert species names to numeric codes')

seed=int: Ensures that the generated function is reproducible across different runs.
cache_req=True: Enables caching for the API request, ensuring that identical prompts won't trigger unnecessary API calls.
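
To make the idea behind request caching concrete, here is a minimal sketch of how a cache keyed on the prompt and seed could avoid repeated API calls. This is not DataHorse's implementation: `CachedClient`, `fake_backend`, and the seed value 42 are hypothetical names and values invented for illustration, and the real library may cache differently.

```python
import hashlib
import json


class CachedClient:
    """Hypothetical stand-in for an LLM-backed client (not DataHorse's code)."""

    def __init__(self, backend):
        self._backend = backend   # callable(prompt, seed) -> str
        self._cache = {}          # cache key -> previously generated result
        self.backend_calls = 0    # counts how often the backend was really hit

    def _key(self, prompt, seed):
        # Identical (prompt, seed) pairs always map to the same cache key.
        payload = json.dumps({"prompt": prompt, "seed": seed}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def chat(self, prompt, seed=None, cache_req=False):
        key = self._key(prompt, seed)
        if cache_req and key in self._cache:
            return self._cache[key]          # served from cache, no backend call
        self.backend_calls += 1
        result = self._backend(prompt, seed)
        if cache_req:
            self._cache[key] = result
        return result


def fake_backend(prompt, seed):
    # Deterministic placeholder for a real API request.
    return f"generated code for {prompt!r} (seed={seed})"


client = CachedClient(fake_backend)
first = client.chat("convert species names to numeric codes", seed=42, cache_req=True)
second = client.chat("convert species names to numeric codes", seed=42, cache_req=True)
print(first == second, client.backend_calls)
```

In this sketch the second call is answered from the cache, so the backend is reached only once. Changing the prompt or the seed produces a different key and therefore a fresh request.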
df = df.chat('convert species names to numeric codes', seed=int, cache_req=True)
df.chat('train a classification model and save the model')
datahorse.test("path of the saved model", [["list of testing features"]])

git clone https://github.com/DeDolphins/DataHorse.git
cd DataHorseUI
pip install -r requirements.text
streamlit run app.py

⭐️ Star DataHorse to increase our visibility
Found a bug or have an improvement in mind? Fantastic!
Got a solution ready? That's even better!
Ready to share it with us? We're all ears!
Start at the contributing guide!