
PFE Book - Mass Analytics - 2022

Internship Offers at MASS Analytics. Departments: Marketing Analytics and R&D. MASS Analytics is offering internship projects in its Marketing Analytics and R&D departments. In Marketing Analytics, there are four potential projects involving marketing mix modeling, panel data analysis, handling endogeneity, and Bayesian regression analysis. In R&D, seven potential projects are described, including time series cross validation and improving the Expectation Maximization algorithm. The qualifications required for both departments include strong problem solving, programming, mathematical modeling, and communication skills. Interested candidates should apply by sending their resume and motivation letter.

Uploaded by

farouk benalia

Internship Offers

at MASS Analytics
Departments: Marketing Analytics and R&D

2022

OUTLINE
1 About MASS Analytics
2 Projects Offered – Marketing Analytics
3 Projects Offered – R&D

1 About MASS Analytics
MASS Analytics (www.mass-analytics.com) specializes in collecting and analyzing big data using
the latest technology available to assist companies in their strategic decision making.

MASS Analytics offers two types of services:

Marketing Mix Modeling
We help companies understand and evaluate the performance of their marketing mix and measure the
return on investment (ROI) from each media channel used and each marketing activity run. The evaluation
phase is then followed by clear recommendations on how best to optimize budgets going forward.

Tools Development
We use the latest technology available to develop smart and integrated tools that specialize in
organizing, querying, charting and analyzing several streams of data. Our flagship product “MassTer” takes
the user through an end-to-end analysis journey from data loading to data processing, charting, modelling,
optimization, forecasting and reporting.

2 Projects Offered – Marketing Analytics
MASS has four potential projects offering a fantastic opportunity for a technically proficient Data Scientist with
strong mathematical modeling skills. The candidate will help build new and maintain existing software components.
The position will require supporting all phases of the product lifecycle including analysis, development, and testing.
World class standards are to be applied and consequently a high level of commitment is expected.

The successful candidate will have:

• A good degree in Computer Science, Electrical Engineering, Data Analytics or similar.
• A keen interest in the latest development tools and technologies.
• Strong problem-solving skills.
• Familiarity with mathematical modeling and analytics (a plus).

Project 1: Facebook Robyn / Ridge Regression

Robyn is an open-source R library published by Facebook for running Marketing Mix Modeling. The library includes
various modules for data preparation, modeling and reporting. For modeling, Robyn mainly uses Ridge Regression
and time-series cross-validation techniques to deal with the multi-collinearity and over-fitting issues commonly
encountered in traditional predictive analytics based on linear regression.

The project aims to explore the capabilities of Robyn and compare its models with those built using MASS Analytics'
proprietary modeling software, where other advanced modeling techniques are used.

• Keywords: Regression Analysis, Ridge Regression, Cross-Validation
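
As a rough illustration of why Ridge Regression helps with multi-collinearity, the sketch below solves the penalized normal equations (X'X + penalty * I) b = X'y in plain Python. It is not Robyn's implementation: the function names, the two media variables and all numbers are hypothetical, and there is no intercept or variable scaling, which a real analysis would normally include.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so that row operations update both sides at once.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge_fit(rows, y, penalty):
    """Return the coefficients minimizing ||y - X b||^2 + penalty * ||b||^2."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    for i in range(p):
        xtx[i][i] += penalty  # the ridge term added to the diagonal
    return solve_linear_system(xtx, xty)


# Two nearly collinear media variables (hypothetical numbers).
tv = [1.0, 2.0, 3.0, 4.0, 5.0]
radio = [1.1, 1.9, 3.2, 3.9, 5.1]
sales = [2.0 * a + 1.0 * b for a, b in zip(tv, radio)]
X = list(zip(tv, radio))

weak = ridge_fit(X, sales, penalty=1e-9)    # practically ordinary least squares
strong = ridge_fit(X, sales, penalty=10.0)  # coefficients shrunk towards zero
print("penalty 1e-9:", weak)
print("penalty 10  :", strong)
```

A larger penalty trades some bias for more stable coefficients when predictors move together; in practice the penalty would be chosen by cross-validation rather than fixed by hand.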

Project 2: N-Level Panel Data Analysis Using Random Effects

Panel data analysis estimates model parameters across multiple levels such as regions, groups and segments.
Various modeling techniques can be used to this end, namely Fixed effects, Random effects, and Mixed effects
(a combination of the first two). At MASS Analytics, a Mixed Effects estimator is implemented using the
popular Expectation Maximization (EM) algorithm.

The goal of this project is to apply the three estimators to different business problems and compare their results both
statistically and in terms of the business insights extracted. In particular, we are interested in using Mixed Effects
Modeling to estimate n-level hierarchical effects. In addition, analyzing and understanding the convergence properties
of the EM algorithm is a plus.

• Keywords: Fixed effects, Random effects, Mixed effects, Expectation Maximization Algorithm, Hierarchical Modeling
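
To make the EM idea concrete, here is a minimal sketch for the simplest mixed model, a one-level random intercept y_ij = mu + b_i + e_ij with b_i ~ N(0, tau2) and e_ij ~ N(0, sigma2). It is a textbook-style illustration under these assumptions, not the MASS Analytics estimator; the function name, starting values and regional figures are hypothetical.

```python
def em_random_intercept(groups, iterations=200):
    """Fit y_ij = mu + b_i + e_ij, b_i ~ N(0, tau2), e_ij ~ N(0, sigma2), by EM."""
    n_total = sum(len(g) for g in groups)
    mu = sum(sum(g) for g in groups) / n_total
    sigma2, tau2 = 1.0, 1.0  # arbitrary positive starting values
    for _ in range(iterations):
        # E-step: posterior variance and mean of each random intercept b_i.
        post_var = [1.0 / (len(g) / sigma2 + 1.0 / tau2) for g in groups]
        post_mean = [v * sum(y - mu for y in g) / sigma2
                     for g, v in zip(groups, post_var)]
        # M-step: closed-form updates of mu, tau2 and sigma2.
        mu = sum(y - m for g, m in zip(groups, post_mean) for y in g) / n_total
        tau2 = sum(m * m + v for m, v in zip(post_mean, post_var)) / len(groups)
        sigma2 = sum((y - mu - m) ** 2 + v
                     for g, m, v in zip(groups, post_mean, post_var)
                     for y in g) / n_total
    return mu, tau2, sigma2, post_mean


# Hypothetical sales for three regions, four observations each.
groups = [
    [10.1, 9.8, 10.3, 9.9],
    [12.2, 11.9, 12.4, 12.0],
    [8.1, 7.7, 8.0, 8.3],
]
mu, tau2, sigma2, effects = em_random_intercept(groups)
print("overall mean      :", mu)
print("between-region var:", tau2)
print("within-region var :", sigma2)
print("region effects    :", effects)
```

The estimated region effects are shrunk towards zero relative to the raw region means, which is the behaviour that distinguishes random effects from fixed effects. Tracking how the parameters change from one iteration to the next is one simple way to start studying convergence.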

Project 3: Handling Endogeneity Using 2SLS Estimator

Endogeneity is a serious problem commonly encountered in regression analysis, yet it is often ignored by data
analysts.

The objective of this project is to study endogeneity thoroughly: its causes, how to detect it, and the
solutions proposed in the literature to deal with it. In particular, the Two-Stage Least Squares (2SLS) method will be
applied and its ability to handle endogeneity will be studied.

• Keywords: Endogeneity, Regression Analysis, 2SLS
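
The two stages can be sketched for the simplest case of one endogenous regressor x and one instrument z. The data below are constructed by hand (hypothetical) so that the unobserved term u affects both x and y, while z is uncorrelated with u; the true slope of y on x is 2. Function and variable names are illustrative only.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(z, x, y):
    """2SLS with a single instrument z and a single endogenous regressor x."""
    # Stage 1: project the endogenous regressor on the instrument.
    a1, b1 = simple_ols(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    # Stage 2: regress the outcome on the fitted values from stage 1.
    return simple_ols(x_hat, y)


z = [1.0, -1.0, 1.0, -1.0]   # instrument, uncorrelated with u
u = [1.0, 1.0, -1.0, -1.0]   # unobserved confounder
x = [zi + ui for zi, ui in zip(z, u)]              # x depends on u: endogenous
y = [2.0 * xi + 3.0 * ui for xi, ui in zip(x, u)]  # true slope on x is 2

ols_slope = simple_ols(x, y)[1]
tsls_slope = two_stage_least_squares(z, x, y)[1]
print(ols_slope)   # 3.5, biased upwards by the confounder
print(tsls_slope)  # 2.0, the true slope
```

The standard errors from a naive second-stage regression are not valid 2SLS standard errors, so this sketch only illustrates the point estimate.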

Project 4: Bayesian / MAP Regression Analysis

Regression analysis is often conducted using Maximum Likelihood Estimators (MLE), which offer satisfactory point estimates for the
model parameters. However, MLE does not benefit from any prior knowledge that may have already been gathered about the problem at
hand. For this reason, Maximum A Posteriori (MAP) estimators, where a prior distribution is considered, can be used instead.

Similarly, Bayesian regression has gained considerable popularity among data analysts in recent years because it can
incorporate this prior knowledge when estimating the model parameters. In addition, Bayesian analysis can estimate the full
distribution of these parameters (as opposed to a single point estimate), which makes it possible to quantify the uncertainty of the model.

The objective of this project is to apply this novel Bayesian regression approach to Marketing Mix Modeling using Python or R.

• Keywords: Bayesian Regression, MAP, MLE
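
The difference between MLE, MAP and a full posterior can be seen in a one-parameter example. The sketch assumes a regression through the origin, y = b * x + e, with known Gaussian noise variance and a Gaussian prior on b; under these assumptions the posterior is Gaussian and the MAP estimate equals the posterior mean. The function name, the prior and the data are hypothetical.

```python
def gaussian_posterior_slope(x, y, noise_var, prior_mean, prior_var):
    """Posterior (mean, variance) of b in y = b * x + e.

    Assumes e ~ N(0, noise_var) with known variance and b ~ N(prior_mean, prior_var).
    """
    # Precisions add up: information from the data plus information from the prior.
    precision = sum(v * v for v in x) / noise_var + 1.0 / prior_var
    weighted = sum(a * b for a, b in zip(x, y)) / noise_var + prior_mean / prior_var
    return weighted / precision, 1.0 / precision


spend = [1.0, 2.0, 3.0]   # hypothetical media spend
sales = [2.1, 3.9, 6.2]   # hypothetical sales response

# Maximum likelihood estimate: ordinary least squares through the origin.
mle = sum(a * b for a, b in zip(spend, sales)) / sum(a * a for a in spend)

# Prior belief that the response per unit of spend is about 1.0.
map_estimate, posterior_var = gaussian_posterior_slope(
    spend, sales, noise_var=1.0, prior_mean=1.0, prior_var=0.1)

print("MLE               :", mle)
print("MAP               :", map_estimate)
print("posterior variance:", posterior_var)
```

The MAP estimate lies between the prior mean and the MLE, and the posterior variance gives the uncertainty that a point estimate alone does not provide. With realistic marketing mix models the posterior is usually not available in closed form and is approximated by sampling.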

Qualifications
Marketing Analytics Department

Skills Required (essential)

• Good problem-solving skills
• Familiarity with design patterns
• Ability to communicate with technical and non-technical staff in a clear and effective manner
• Object-oriented programming
• Capable of working independently
• Reliably producing high-quality work to tight deadlines

Interested candidates please send your CV and motivation letter to [email protected]

3 Projects Offered – R&D

MASS has potential projects offering a fantastic opportunity for a technically proficient Java Developer and/or a
Data Scientist with strong mathematical modelling skills. The candidate will help build new and maintain existing
software components. The position will require supporting all phases of the product lifecycle including analysis,
development, and testing. World class standards are to be applied and consequently a high level of commitment is
expected.

The successful candidate will have:

• A good degree in Computer Science, Electrical Engineering, Data Analytics or similar.
• A keen interest in the latest development tools and technologies.
• Strong problem-solving skills.
• Familiarity with mathematical modelling and analytics (a plus).

Project 1: Time Series Cross Validation

Cross Validation is a technique widely used to assess the quality of machine learning models. It quantifies
the predictive capabilities of these models and their robustness to common over-fitting problems.

The objective of this project is to implement a cross validation module for time series to reinforce the quality-of-fit
metrics used in our existing regression analysis engines.
Upon success, the module will also be integrated into our automatic model selection engines
based on Genetic Algorithms. The cross-validation score will then be used to select the best combination of
independent variables.

• Keywords: Regression Analysis, Cross-Validation, Over-fitting


• Technology: Java
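
One common way to cross-validate a time series is an expanding window, where each fold trains only on observations that precede its test block. The sketch below is written in Python for brevity, although the project itself targets Java; the function names, the fold layout and the naive mean forecaster are assumptions made for illustration and are not part of the existing engines.

```python
def expanding_window_splits(n_obs, n_folds, horizon):
    """Yield (train_indices, test_indices) pairs that never train on the future."""
    first_test_start = n_obs - n_folds * horizon
    if first_test_start < 1:
        raise ValueError("not enough observations for the requested folds")
    for fold in range(n_folds):
        test_start = first_test_start + fold * horizon
        yield (list(range(test_start)),
               list(range(test_start, test_start + horizon)))


def cross_validation_score(series, fit, predict, n_folds=3, horizon=2):
    """Mean squared out-of-sample error over all expanding-window folds."""
    errors = []
    for train_idx, test_idx in expanding_window_splits(len(series), n_folds, horizon):
        model = fit([series[i] for i in train_idx])
        errors.extend((series[i] - predict(model, i)) ** 2 for i in test_idx)
    return sum(errors) / len(errors)


def fit_mean(train_values):
    """Naive placeholder model: remember the training mean."""
    return sum(train_values) / len(train_values)


def predict_mean(model, index):
    """Forecast the stored mean for any future index."""
    return model


weekly_sales = [10.0, 11.0, 12.0, 11.5, 13.0, 12.5, 14.0, 13.5, 15.0, 14.5]  # hypothetical
for train_idx, test_idx in expanding_window_splits(len(weekly_sales), 3, 2):
    print("train up to", train_idx[-1], "test on", test_idx)
print("CV score:", cross_validation_score(weekly_sales, fit_mean, predict_mean))
```

A model selection engine could compute this score for each candidate set of independent variables and prefer the candidate with the lowest out-of-sample error.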

Project 2: Expectation Maximization Algorithm

The Expectation-Maximization (EM) algorithm is a very important machine learning technique used in various
applications. We use this algorithm to estimate the parameters of so-called Mixed-Effects Models (Random &
Fixed effects), which are often needed to solve complex business problems.

The objective of this project is to analyze the convergence and performance properties of the EM algorithm under
various conditions and modelling requirements. Based on the outcome of this analysis, areas to improve the
stability and efficiency of the algorithm will be addressed.

• Keywords: Expectation Maximization Algorithm, Regression Analysis, Mixed (Fixed & Random) Effects Models
• Technology: Java

Project 3: Hybrid Stepwise & Genetic Algorithms Model Selection

Stepwise and Genetic Algorithms are two widely used approaches for model selection in Regression Analysis.

The objective of this project is to build a hybrid approach where the two techniques will be combined in one single
algorithm. Different complex scenarios should be tested and analyzed to acquire a deep understanding of the
algorithm and its convergence properties.

• Keywords: Genetic Algorithms, Stepwise Model Selection, Regression Analysis.


• Technology: Java

Project 4: Model Selection with Random Forests

Random Forests is a machine learning technique used to solve regression and classification problems. It is also
often used as a model selection technique.

This project consists of adapting Random Forests so that our automatic regression analysis engine can select
the best combination of explanatory independent variables.

• Keywords: Regression Analysis, Random Forests, Model Selection


• Technology: Java

Project 5: Snowflake and Data Connectivity with Spark

Snowflake is a Data Cloud platform increasingly being adopted by major companies worldwide. It offers an
easy-to-manipulate proxy to major Cloud platforms such as AWS, Azure and Google Cloud.

To maintain the competitiveness of our software, we aim to connect our various software solutions to Snowflake
to continue to serve our global clients.
The candidate will have to implement a data access module (ETL) using Spark SQL Technology to ensure
robust and efficient data connectivity with Snowflake.

• Keywords: Snowflake, Apache Spark, SQL, Databases, ETL


• Technology: Java

Project 6: A Web Reports Visualization Dashboard

This project consists of creating a dashboard to display different reports in a browser. The reports can be the
output of various machine learning models and/or different cross sections of a large data cube. This process
should be automated and customizable. The dashboard should be compatible with all major browsers.

Upon success, the dashboard will also be deployed on a web server to display reports deployed on Snowflake (a
popular Data Cloud platform).

• Keywords: Dashboard, Web Development, Snowflake


• Technology: Java, Spring Boot, Angular JS

Project 7: Parallel Processing on a Compute Farm

In regression analysis, selecting the best combination of predictors from thousands of potential candidates is
often a very time-consuming task, especially when advanced iterative algorithms are deployed.

The objective of this project is to exploit the properties of our model selection engines (mainly based on Genetic
Algorithms) to parallelize the process and use multi-threading to improve the performance of these engines.
Upon success, this approach will be generalized to distributed computing on a dedicated compute farm on
premises or potentially in the cloud.

• Keywords: Parallel processing, Distributed Computing, Multi-threading


• Technology: Java

Qualifications
R&D Department

Required Skills

• Good problem-solving skills
• Object-oriented programming and familiarity with design patterns
• Ability to communicate with technical and non-technical staff in a clear and effective manner
• Ability to work independently and reliably produce high-quality work to tight deadlines

Useful Additional Skills

• Java, Python, C#, R
• Swing, Hibernate, XML, JSF, PHP, SQL
• Exposure to agile development practices and Test-Driven Development
• SVN, JUnit, Maven
• Mathematical skills (e.g. econometric analysis), machine learning, …

Interested candidates please send your CV and motivation letter to [email protected]


THANK YOU!
For questions or more info:

PHONE NUMBER:
(+44) (0) 208 123 2461

EMAIL ADDRESS:
[email protected]
