Here's a summary of my data journey so far, highlighting what I've accomplished and what I'm working on:
- Python Libraries: Explored pandas, NumPy, Matplotlib, seaborn, Plotly, Altair, scikit-learn, Streamlit, Prophet, NeuralProphet, etc.
- Visualization: Learned different chart types and customization with Altair, and how to export visuals to HTML for deployment. Also used Streamlit to build quick web applications.
- Machine Learning: Developed a Bayesian network, using a novel approach, to accurately predict and profile manufacturing defects and to discover causal relationships among the interacting process variables (a real industry project).
- Web Scraping: Exploring web scraping, including getting past the anti-bot protections of reputable websites, for data mining.
- Algorithms and Data Structures: Learning about binary search, linear search, arrays, linked lists, Big O notation, selection sort, stacks, queues, quicksort, hash tables, collisions, load factor, and hash functions.
- Mathematics: Exploring Bayesian statistics, hypothesis testing, probability sampling, statistical significance, designing tests, and inferential statistics.
- Future Interests: Rust, MLOps, DevOps, cloud computing, and handling big data with Hadoop and Spark.
- Building Data Pipelines: Used Python, AWS Lambda, AWS Redshift, and cron scheduling, and encrypted data columns containing personally identifiable information (PII).
- Web Development: Gaining experience in HTML, CSS, JavaScript, and AWS for computing, storage, network routing, and authorization.
- Certification: Aiming to earn DataCamp's Data Engineer and Data Scientist certifications in May 2024.
- Competition: Looking for teammates for Kaggle competitions.
