
john-oshea/projects

 
 



Welcome to codema-dev projects!

Download, wrangle & explore all Irish energy datasets used by the codema-dev team

⚠️ Some projects use closed-access datasets, which you need permission from the codema-dev team to use! Email us at [email protected]

Setup

Run the projects in your browser by clicking on the following buttons:

Binder ⬅️ click me to launch workspace


Binder can take a few minutes to set up this workspace. Click Build logs > show to view the build progress.

  • Double click on the project you want to open

  • Right click on the README.md file, select Open With > Notebook and run all cells

open-with-notebook.png


Binder runs this code in the cloud for free with the help of NumFOCUS. If you find this useful, consider donating to them.

This link was generated using:


Gitpod ready-to-code ⬅️ click me to launch workspace

  • Double click on the project you want to open

  • Right click README.md > Open Preview to view the project guide

  • Change your Terminal directory to a project folder by running:

    cd NAME-OF-PROJECT

⚠️ Warning! ⚠️

  • If (/workspace/projects/venv) disappears from your prompt, your Terminal no longer has access to the dependencies required to run the projects. Reactivate the environment by running:
    conda activate /workspace/projects/venv
  • If the Terminal disappears from the bottom of your screen, click ≡ > Terminal > New Terminal

💻 Running locally


Easy:

Lightweight:

  • Install:

  • Install all project dependencies via each project's environment.yml in your Terminal:

    conda env create --file environment.yml && conda activate NAME-OF-ENVIRONMENT
    

    Open environment.yml to find the environment name

  • Follow the GitPod instructions


How-To Guides

⚠️ Accessing closed-access data

  • Create a new file called .env in your project directory

  • Add your S3 credentials to the .env file:

AWS_ACCESS_KEY_ID = "AKIA...."
AWS_SECRET_ACCESS_KEY = "KXY6..."
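
How the projects themselves load .env is not shown here, so the following is only a minimal sketch, assuming the file holds KEY = "value" lines as above. The helper names read_env_file and missing_credentials are hypothetical, and the values written in the demo are placeholders rather than real keys. It may help to confirm that the file is complete before running a project:

```python
import tempfile
from pathlib import Path

# The two keys named in the instructions above.
REQUIRED_KEYS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def read_env_file(path):
    """Parse KEY = "value" lines into a dict, skipping blank lines and comments."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_credentials(values):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Demo against a throwaway file with placeholder values (not real credentials).
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text(
        'AWS_ACCESS_KEY_ID = "placeholder-id"\n'
        'AWS_SECRET_ACCESS_KEY = "placeholder-secret"\n'
    )
    demo_values = read_env_file(env_path)

print(missing_credentials(demo_values))
```

To check a real file, call read_env_file(".env") from the project directory. An empty list means both keys are present.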

❓ FAQ

  • If after running a project you see ...

    (1)

    botocore.exceptions.NoCredentialsError: Unable to locate credentials

    ... follow the instructions at ⚠️ Accessing closed-access data

    (2)

    ModuleNotFoundError

    ... install the missing module with conda install NAME or pip install NAME and raise an issue on our GitHub
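
To find every missing module in one pass rather than hitting ModuleNotFoundError one import at a time, a small check along these lines may help. It is a sketch that only handles top-level module names. The helper find_missing_modules and the module name hypothetical_missing_module_xyz are made up for illustration:

```python
from importlib.util import find_spec


def find_missing_modules(module_names):
    """Return the top-level module names that cannot be imported in this environment."""
    return [name for name in module_names if find_spec(name) is None]


# "json" ships with Python; the second name is a deliberately nonexistent placeholder.
missing = find_missing_modules(["json", "hypothetical_missing_module_xyz"])
print(missing)
```

Replace the list with the modules your project imports. Anything reported can then be installed with conda install NAME or pip install NAME.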


Why?

In previous years all data wrangling was performed solely in Microsoft Excel. Although Excel is useful for small datasets, it soon becomes a burden when working with multiple, large datasets.

For example, generating the previous residential energy estimates required up to 16 separate workbooks for each local authority, each containing as many as 15 sheets, because the datasets were too large to fit into a single workbook. Although each workbook performed the same logic to clean and merge datasets, changing this logic meant changing all of the separate workbooks one at a time.

Moving to open-source scripting tools means that the logic for wrangling and merging data files is written down in scripts (plain text files), which separates the data from the logic operating on it. If any dataset is updated, re-generating outputs is as simple as re-running a few scripts. Furthermore, these scripts can be shared without sharing the underlying datasets.


Keeping the global environment.yml up to date

This environment.yml is built by merging the environment.yml from each project. Binder & GitPod use it to create a sandbox environment in which all dependencies are installed.

To update this file run:

conda env create --file environment.meta.yml --name codema-dev-projects-meta
conda activate codema-dev-projects-meta
invoke merge-environment-ymls

conda env create creates a virtual environment from environment.meta.yml, in which invoke is defined as a dependency. invoke then runs the function merge_environment_ymls from tasks.py, which merges the environment.yml from each project with environment.meta.yml into a single environment.yml.
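
The merge step can be pictured roughly as follows. This is not the merge_environment_ymls function from tasks.py, which is not shown here. It is a minimal sketch that assumes each environment.yml has already been parsed into a dictionary (for example with a YAML library) and that a simple de-duplicated union of channels and dependencies is acceptable. The function name merge_environments, the environment name merged-env and the package names in the demo are illustrative only, and conflicting version pins are not resolved:

```python
def merge_environments(name, environments):
    """Merge parsed environment.yml mappings into a single mapping."""
    channels = []
    conda_deps = []
    pip_deps = []
    for env in environments:
        # Keep channels in first-seen order, since channel order sets priority.
        for channel in env.get("channels", []):
            if channel not in channels:
                channels.append(channel)
        for dep in env.get("dependencies", []):
            if isinstance(dep, dict):
                # conda nests pip packages as a {"pip": [...]} entry.
                for pip_dep in dep.get("pip", []):
                    if pip_dep not in pip_deps:
                        pip_deps.append(pip_dep)
            elif dep not in conda_deps:
                conda_deps.append(dep)
    dependencies = sorted(conda_deps)
    if pip_deps:
        dependencies.append({"pip": sorted(pip_deps)})
    return {"name": name, "channels": channels, "dependencies": dependencies}


# Demo with made-up project environments.
project_a = {
    "channels": ["conda-forge"],
    "dependencies": ["python=3.9", "pandas", {"pip": ["hypothetical-pip-package"]}],
}
project_b = {
    "channels": ["conda-forge", "defaults"],
    "dependencies": ["pandas", "geopandas"],
}
meta = {"channels": ["conda-forge"], "dependencies": ["invoke"]}

merged = merge_environments("merged-env", [project_a, project_b, meta])
print(merged)
```

Sorting the dependency list keeps the generated file stable between runs, so a diff of environment.yml shows only real dependency changes.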

To speed up Binder builds, Binder reads the codema-dev/projects dependencies from a separate repository, codema-dev/projects-sandbox. You must also replace the environment.yml in that repository with your newly generated environment.yml to keep Binder up to date!

If the environment and the content lived in the same repository, Binder would rebuild the entire repository and reinstall the dependencies every time any file changed. By keeping the environment and the content separate, Binder only reinstalls dependencies when the dependencies themselves change. This means that it no longer has to download dependencies and resolve conflicts on every change, which can take ~20 minutes.
