Download, wrangle & explore all Irish energy datasets used by the codema-dev team
⚠️ Some projects use closed-access datasets, which you will need permission from the codema-dev team to use! Email us at [email protected]
Run the projects in your browser by clicking on the following buttons:
⬅️ click me to launch workspace
⬅️ click me
Binder can take a few minutes to set up this workspace. Click Build logs > show to view the build progress.
- Double click on the project you want to open
- Right click on the README.md file, Open With > Notebook and run all cells
❓ Binder runs this code in the cloud for free with the help of NumFOCUS. If you find this useful, consider donating to them.
This link was generated using Binder on https://jupyterhub.github.io/nbgitpuller/link.html with:
- environment repository = https://github.com/codema-dev/projects-sandbox
- content repository = https://github.com/codema-dev/projects-sandbox
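For the curious, a link of this kind wraps one URL inside another: the Binder address names the environment repository, and its `urlpath` parameter carries an nbgitpuller `git-pull` request for the content repository. The sketch below rebuilds such a link in Python. The function name, the `main` branch names and the `lab` landing path are assumptions for illustration and may not match the settings used for the link above.

```python
from urllib.parse import urlencode


def binder_nbgitpuller_link(
    env_repo: str,
    content_repo_url: str,
    env_branch: str = "main",      # assumed branch name
    content_branch: str = "main",  # assumed branch name
    open_path: str = "lab",        # assumed landing page (JupyterLab)
) -> str:
    """Build a mybinder.org link that pulls a content repo via nbgitpuller."""
    # Inner request handled by nbgitpuller once the Binder server is running
    inner = "git-pull?" + urlencode(
        {"repo": content_repo_url, "urlpath": open_path, "branch": content_branch}
    )
    # Outer Binder address; the inner request is URL-encoded a second time here
    outer_query = urlencode({"urlpath": inner})
    return f"https://mybinder.org/v2/gh/{env_repo}/{env_branch}?{outer_query}"


link = binder_nbgitpuller_link(
    env_repo="codema-dev/projects-sandbox",
    content_repo_url="https://github.com/codema-dev/projects-sandbox",
)
```

Printing `link` gives a URL that can be compared against the one produced by the generator page.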
⬅️ click me
- Double click on the project you want to open
- Right click README.md > Open Preview to view the project guide
- Change your Terminal directory to a project folder by running:
cd NAME-OF-PROJECT
- If (/workspace/projects/venv) disappears from your prompt, this means your Terminal no longer has access to all of the dependencies required to run projects, so you need to reactivate it by running:
conda activate /workspace/projects/venv
- If the Terminal disappears from the bottom of your screen, click ≡ > Terminal > New Terminal
💻 Running locally
Easy:
- Install Anaconda
- Import the environment.yml of a project via Anaconda Navigator
- Launch VSCode from Anaconda Navigator
- Install Python for VSCode
- Follow the GitPod instructions
Lightweight:
- Install:
- Install all project dependencies via each project's environment.yml in your Terminal:
conda env create --file environment.yml && conda activate NAME-OF-ENVIRONMENT
Click the environment.yml to view the environment name
- Follow the GitPod instructions
⚠️ Accessing closed-access data
- Create a new file called .env in your project directory
- Add your s3 credentials to the .env file:
AWS_ACCESS_KEY_ID = "AKIA...."
AWS_SECRET_ACCESS_KEY = "KXY6..."
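Projects can only reach the closed-access data if these two values are visible to boto3/botocore, and environment variables are one of the places it looks for credentials. Here is a minimal sketch of reading the .env file with the standard library alone. `load_env_file` and `missing_aws_credentials` are hypothetical helpers written for this example; the projects themselves may load .env differently (for example via a dotenv package).

```python
from __future__ import annotations

import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Copy KEY = "value" pairs from a .env file into os.environ."""
    loaded: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for raw_line in env_path.read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments and anything that is not KEY = value
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        # Drop surrounding whitespace and quotes around the value
        value = value.strip().strip('"').strip("'")
        os.environ[key] = value
        loaded[key] = value
    return loaded


def missing_aws_credentials() -> list[str]:
    """Return the names of required AWS variables that are not set."""
    required = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"]
    return [name for name in required if not os.environ.get(name)]
```

Calling `load_env_file()` from a project directory and then checking that `missing_aws_credentials()` returns an empty list is a quick way to confirm the file is being picked up before running a project.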
❓ FAQ
- If after running a project you see ...
(1) botocore.exceptions.NoCredentialsError: Unable to locate credentials
... follow the instructions at ⚠️ Accessing closed-access data
(2) ModuleNotFoundError
... install the missing module with conda install NAME or pip install NAME and raise an issue on our GitHub
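To catch a ModuleNotFoundError before a long-running project starts, you can ask Python which modules it can find in the active environment. This is a small sketch using the standard library; `missing_modules` is a hypothetical helper, and the module names passed to it are placeholders for whatever a project imports.

```python
from __future__ import annotations

import importlib.util


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "json" ships with Python; the second name is deliberately made up
not_found = missing_modules(["json", "module_that_does_not_exist_xyz"])
```

Anything listed in `not_found` is a candidate for `conda install NAME` or `pip install NAME`. Note that the import name of a module does not always match its package name.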
In previous years all data wrangling was performed solely using Microsoft Excel. Although this is useful for small datasets, it soon becomes a burden when working with multiple, large datasets.
For example, when generating the previous residential energy estimates it was necessary to create up to 16 separate workbooks for each local authority, each containing as many as 15 sheets, as the datasets were too large to fit into a single workbook. Although each workbook performed the same logic to clean and merge datasets, changing this logic meant changing all of the separate workbooks one at a time.
Moving to open-source scripting tools means that the logic used to wrangle and merge data files is written down in scripts (or text files), thus separating data from the logic operating on it. If any dataset is updated, re-generating outputs is as simple as running a few scripts. Furthermore, these scripts can be shared without sharing the underlying datasets.
This environment.yml is built by merging the environment.yml from each project. Binder & GitPod use it to create a sandbox environment in which all dependencies are installed.
To update this file run:
conda env create --file environment.meta.yml --name codema-dev-projects-meta
conda activate codema-dev-projects-meta
invoke merge-environment-ymls
conda env create creates a virtual environment by reading environment.meta.yml, in which invoke is defined as a dependency. invoke then runs the function merge_environment_ymls from tasks.py, which merges the environment.yml from each project and from environment.meta.yml together into a single environment.yml
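To give a feel for what merging environment files involves, here is a minimal sketch that combines already-parsed environment.yml contents (plain dictionaries) into one. It is not the code in tasks.py: `merge_environments` is a hypothetical helper, and the channel and package names in the example are placeholders. It only removes duplicates; it does not reconcile conflicting version pins, which is left to conda.

```python
from __future__ import annotations

from typing import Any


def merge_environments(name: str, environments: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge parsed environment.yml contents into a single environment."""
    channels: list[str] = []
    conda_deps: list[str] = []
    pip_deps: list[str] = []
    for env in environments:
        # Keep channels in first-seen order, since channel order matters
        for channel in env.get("channels", []):
            if channel not in channels:
                channels.append(channel)
        for dep in env.get("dependencies", []):
            if isinstance(dep, dict):
                # pip packages are nested as {"pip": [...]} inside dependencies
                for pip_dep in dep.get("pip", []):
                    if pip_dep not in pip_deps:
                        pip_deps.append(pip_dep)
            elif dep not in conda_deps:
                conda_deps.append(dep)
    dependencies: list[Any] = sorted(conda_deps)
    if pip_deps:
        dependencies.append({"pip": sorted(pip_deps)})
    return {"name": name, "channels": channels, "dependencies": dependencies}


# Placeholder project environments, for illustration only
project_a = {
    "channels": ["conda-forge"],
    "dependencies": ["pandas", "geopandas", {"pip": ["ploomber"]}],
}
project_b = {"channels": ["conda-forge", "defaults"], "dependencies": ["pandas", "dask"]}
merged = merge_environments("sandbox", [project_a, project_b])
```

Writing `merged` back out as YAML would give a single file that lists every dependency once.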
To speed up Binder builds, Binder reads the codema-dev/projects dependencies from a separate repository, codema-dev/projects-sandbox. You must also update the environment.yml in that repository with your newly generated environment.yml to keep Binder up to date!
Every time any file is changed, Binder rebuilds the entire repository and reinstalls the dependencies. By keeping the environment and the content separate, Binder only reinstalls dependencies when the dependencies change. This means that it no longer has to download dependencies and resolve dependency conflicts on every change, which can take ~20 minutes.
