AlexandreVerept/AirIQ2

AIRIQ 2

Goal of this student project

This is a first-year master's research project carried out by Mohamed Boulanouar, Maxime Thoor and Alexandre Verept, supervised by Kévin Hérissé (PhD student) at the ISEN engineering school.

The goal of this project is to predict the air quality in Lille 1, 2 and 3 days in advance, as accurately as possible.

To accomplish this, we use machine learning and statistical techniques.

NB: Here you can find our first-semester project, which consisted of forecasting the air quality index in Lille based on only a small amount of data collected by a beehive placed on the roof of the school. We had the chance to present that project at a public event organized by the MEL (Métropole Européenne de Lille).

Jeudi du Numérique

The result

See the result here! (It is a bit empty for now, as the project is not yet over.)

Our project architecture

Here is the general structure of our project:

Architecture

Data Collector:

  • This script runs in real time: it collects the data from the different APIs, reshapes it if needed, then sends the result to our Backend API to be stored in the database.
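
As a rough illustration of the collect, shape and send steps, here is a minimal sketch using only the Python standard library. The raw field names (`t` for a temperature in kelvin, `u` for relative humidity), the payload keys and the `BACKEND_URL` endpoint are assumptions made for the example, not the project's actual schema or routes.

```python
import json
import urllib.request

# Hypothetical endpoint of the Backend API (assumption, not the real URL).
BACKEND_URL = "https://backend.example/measurements"


def shape_record(raw):
    """Rename raw fields and convert units before storage."""
    return {
        "timestamp": raw["date"],
        "temperature_c": round(raw["t"] - 273.15, 2),  # kelvin -> Celsius
        "humidity_pct": raw["u"],
    }


def build_request(record, url=BACKEND_URL):
    """Prepare (but do not send) the POST request carrying one shaped record."""
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical raw reading, as an external weather API might return it.
raw = {"date": "2020-05-01T12:00:00", "t": 288.15, "u": 70}
request = build_request(shape_record(raw))
# urllib.request.urlopen(request) would actually send it to the backend.
```

Keeping the shaping step separate from the sending step makes it possible to check the payload without any network access.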

Real-time prediction script:

  • This is the script that runs every day to make the predictions.
  • It requests all the information it needs from the Data Collector.
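
The daily run for the three horizons could look roughly like the sketch below. The `persistence_model` function is only a placeholder that repeats the last observed index; it is not the project's trained model, and the record keys are hypothetical.

```python
from datetime import date, timedelta

HORIZONS = (1, 2, 3)  # predict 1, 2 and 3 days ahead


def persistence_model(history):
    """Placeholder model (assumption): repeat the last observed index."""
    return [history[-1]] * len(HORIZONS)


def daily_predictions(today, history, model=persistence_model):
    """Return one prediction record per horizon for the run made on `today`."""
    values = model(history)
    return [
        {
            "made_on": today.isoformat(),
            "target_date": (today + timedelta(days=horizon)).isoformat(),
            "horizon_days": horizon,
            "predicted_index": value,
        }
        for horizon, value in zip(HORIZONS, values)
    ]


# Example run with a made-up history of daily air quality indices.
records = daily_predictions(date(2020, 5, 1), [3, 4, 5])
```

Any callable that maps a history to three values can be passed as `model`, so a trained network could replace the placeholder without changing the rest of the script.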

Back-end and front-end API:

  • Receives useful data from the Data Collector and stores it in the database.

  • Provides data to our real-time prediction script, receives the prediction results and stores them in the database.

  • Can be used to freely consult our predictions stored in the database.

    • Note: our original plan was to create two distinct APIs, so you may find some outdated references to the Frontend and Backend APIs elsewhere in this repository.
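
The three roles listed above can be sketched as a small Flask application. The route names and payloads are hypothetical, and plain Python lists stand in for the MySQL tables so that the example stays self-contained.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory stand-ins for the database tables (assumption: illustration only).
measurements = []
predictions = []


@app.route("/measurements", methods=["POST"])
def add_measurement():
    """Called by the Data Collector to store one shaped record."""
    measurements.append(request.get_json())
    return jsonify({"stored": len(measurements)}), 201


@app.route("/measurements", methods=["GET"])
def list_measurements():
    """Called by the prediction script to fetch its input data."""
    return jsonify(measurements)


@app.route("/predictions", methods=["POST"])
def add_prediction():
    """Called by the prediction script to store one result."""
    predictions.append(request.get_json())
    return jsonify({"stored": len(predictions)}), 201


@app.route("/predictions", methods=["GET"])
def list_predictions():
    """Public read access to the stored predictions."""
    return jsonify(predictions)
```

Flask's built-in test client (`app.test_client()`) can exercise these routes without starting a server.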

Database:

  • MySQL database used to store all the data we need: the different open-source datasets, the predictions, etc.
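
A possible layout for these two kinds of data is sketched below. The table and column names are hypothetical, and the standard-library `sqlite3` module is used here in place of MySQL so that the example runs without a database server.

```python
import sqlite3

# Hypothetical schema: one table for collected data, one for predictions.
SCHEMA = """
CREATE TABLE measurements (
    id INTEGER PRIMARY KEY,
    source TEXT NOT NULL,
    measured_at TEXT NOT NULL,
    name TEXT NOT NULL,
    value REAL
);
CREATE TABLE predictions (
    id INTEGER PRIMARY KEY,
    made_on TEXT NOT NULL,
    target_date TEXT NOT NULL,
    predicted_index REAL NOT NULL
);
"""


def open_database(path=":memory:"):
    """Open a database and create both tables (SQLite stand-in for MySQL)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


conn = open_database()
conn.execute(
    "INSERT INTO predictions (made_on, target_date, predicted_index) VALUES (?, ?, ?)",
    ("2020-05-01", "2020-05-02", 4.0),
)
rows = conn.execute("SELECT target_date, predicted_index FROM predictions").fetchall()
```

Storing both the date a prediction was made and the date it targets allows the 1, 2 and 3 day forecasts for the same day to be compared later.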

Final display:

  • Please see the result here, made in R with Shiny.

Technologies we used

We used and learned several technologies and tools during this project:

  • Most of our scripts run on the Google Cloud Platform, with a MySQL database and several App Engine services.
  • Our final visualization is made in R with Shiny.
  • Both our APIs use Flask.
  • We trained our models using Google Colab and TensorFlow.

What have we learned?

In a student project, the most important thing is what we learn from it and the experience we gain:

  • As mentioned above, we discovered a lot of technologies by ourselves to create this product, such as the Google Cloud Platform for the hosting or Flask for the API.

  • We improved our skills with Keras and TensorFlow, particularly for recurrent neural networks and architectures with several inputs and outputs.

  • As all the work was done in the context of Covid-19, we had to adapt our teamwork methods, especially the planning.

Open data

The datasets we use must provide real-time data to be useful for the prediction, but we also need archived data over a long period of time in order to create a training dataset for our predictive model.
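
One common way to turn an archived daily series into training samples is a sliding window: each sample pairs the last few observed days with the values 1, 2 and 3 days later. The sketch below illustrates the idea; the function name and the window length are assumptions, not the project's actual preprocessing.

```python
def build_training_windows(series, n_past, horizons=(1, 2, 3)):
    """Pair each window of `n_past` days with the values at the given horizons."""
    samples = []
    # Last start index that still leaves room for the furthest horizon.
    last_start = len(series) - n_past - max(horizons)
    for start in range(last_start + 1):
        end = start + n_past  # index of the first day after the input window
        inputs = series[start:end]
        targets = [series[end + horizon - 1] for horizon in horizons]
        samples.append((inputs, targets))
    return samples


# Made-up daily air quality indices for one week.
windows = build_training_windows([1, 2, 3, 4, 5, 6, 7], n_past=3)
```

A series that is too short for even one window simply yields an empty list.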

Potential open data APIs to exploit:

| Name | Source | Description | Frequency | Time frame |
| Indice qualité de l'air | MEL | Air quality index. | Every day | 5-year window, combining the dataset the MEL sent us and the public data online. |
| Données SYNOP Essentielles OMM | Météo France | Wide range of weather data, including wind, pressure, humidity and temperature. | Every 3 hours | Since 1997. |
| Historique de l'indice Atmo | ATMO | Provides an index based on the daily measurements of NO2, O3 and PM10. | Every day | Since 2012. |
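
Because the weather data arrives every 3 hours while the air quality indices are daily, the two need to be aligned before they can be combined. A minimal sketch of one option, a daily mean, is shown below; the input format (ISO timestamp, value) is an assumption, and other aggregates such as the daily maximum may suit some variables better.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_means(readings):
    """Average sub-daily (timestamp, value) readings into one value per day."""
    by_day = defaultdict(list)
    for timestamp, value in readings:
        by_day[datetime.fromisoformat(timestamp).date()].append(value)
    return {day.isoformat(): mean(values) for day, values in sorted(by_day.items())}


# Made-up 3-hourly temperature readings.
readings = [
    ("2020-05-01T00:00:00", 10.0),
    ("2020-05-01T03:00:00", 14.0),
    ("2020-05-02T00:00:00", 8.0),
]
daily = daily_means(readings)
```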

About

Student Project on air quality prediction (2020). Mainly focused on data engineering.
