Welcome! This repository contains the official starting environment for all data engineering projects and tutorials on the Developyr YouTube Channel.
Ever heard the dreaded phrase, "but it works on my machine"? This dev container is designed to eliminate that problem. It provides a clean, isolated, and reproducible environment using Docker and VS Code, so your data projects get the same tools and dependencies wherever you run them.
We use this exact setup as the foundation for exploring and building the 6 Modern Lakehouse Archetypes on the channel. By using this container, you can follow along with every tutorial, step by step.
- Perfect Reproducibility: Get the exact same environment as the tutorials, every single time.
- Keep Your Host Machine Clean: All tools and dependencies are installed inside the container, not on your main OS.
- Get Started Fast: Pre-configured with the essentials for modern data engineering.
- The Perfect Sandbox: A safe place to experiment with data tools and pipelines without affecting your host machine.
Before you begin, make sure you have the following installed:
- Visual Studio Code
- Docker Desktop
- The Dev Containers extension for VS Code
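A quick way to confirm the prerequisites before you start is a short shell check like the one below. It is a minimal sketch and not part of this repository: `check_tool` is a hypothetical helper name, and the script assumes the `docker` and `code` commands are on your PATH (on macOS the `code` command may first need to be installed from the VS Code Command Palette). `ms-vscode-remote.remote-containers` is the Marketplace ID of the Dev Containers extension.

```shell
#!/usr/bin/env sh

# Hypothetical helper: report whether a command is available on PATH.
# Returns 0 when the command is found and 1 when it is missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# "|| true" keeps the script going so every missing tool is reported.
check_tool docker || true
check_tool code || true

# Only query extensions when the VS Code CLI is available.
if command -v code >/dev/null 2>&1; then
    if code --list-extensions | grep -qi 'ms-vscode-remote.remote-containers'; then
        echo "found: Dev Containers extension"
    else
        echo "missing: Dev Containers extension"
    fi
fi
```

Anything reported as missing needs to be installed before the container can build. Docker Desktop also has to be running, which this check does not verify.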
It only takes two steps to get up and running:
- Clone the repository:
    `git clone [Your-Repo-URL-Here]`
- Open the project in VS Code and reopen it in the container:
    - Open the cloned folder in VS Code.
    - You'll see a pop-up in the bottom-right corner asking to "Reopen in Container". Click it. If the pop-up doesn't appear, run "Dev Containers: Reopen in Container" from the Command Palette.
    - VS Code will now build the container. This might take a few minutes on the first run.
That's it! Your terminal is now running inside the container, and you're ready to start building.
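To double-check that the terminal really is running inside the container, something like the following can be run in it. This is a rough heuristic rather than an official check: it relies on Docker creating a `/.dockerenv` file at the root of a container's filesystem, and `in_container` is a hypothetical helper name. The marker path is a parameter only so the function can be tried out anywhere.

```shell
#!/usr/bin/env sh

# Hypothetical helper: succeed when the marker file exists.
# Defaults to /.dockerenv, which Docker places inside containers.
in_container() {
    [ -f "${1:-/.dockerenv}" ]
}

if in_container; then
    echo "inside a container"
else
    echo "on the host"
fi
```

If it reports "on the host", VS Code is probably still showing the local folder, so reopen it in the container and try again.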
.devcontainer/: This folder contains the magic.
- `devcontainer
This repo is just the beginning. Join me on my journey to master modern data engineering:
- YouTube: youtube.com/@developyr
- LinkedIn: linkedin.com/in/wallacedanielk
- Substack: developyr.substack.com
- Bluesky: bsky.app/profile/developyr.bsky.social