Assignment for Site Reliability Engineer position
Here at MovieRama, the engineering team is developing an app that has become quite popular. The
application is a movie aggregator that retrieves data from a few public sources of movie
information through their APIs and displays it on a web page. The app displays
movies that are in theatres this week, but it also allows users to search by title for past or future
movies. There are also some personalised features, user comments and reviews.
Architecturally, the app consists of a few microservices, some Redis instances for short-term data
caching, a PostgreSQL instance for storing personalisation info, a MongoDB instance to cache some of the
aggregated data, and an Elasticsearch cluster for performant searching through movies, reviews,
etc. So far the company hasn't had anyone to worry about hosting, running and monitoring the
app, and the sudden success has raised a lot of challenges. Your team's objective is to build the new
infrastructure that will host the app, following an IaC design, and to build and
manage the infrastructure that will provide monitoring for both the infrastructure and the
company's microservices.
Your task is to automate the provisioning of some of the supported services and also
build a sample app that will be a blueprint for developers to build and deploy their own
web apps.
Tasks
1. Create one of the following infrastructure pieces using automation tools:
a. Replicated MongoDB cluster
b. Master-slave PostgreSQL setup
c. ELK stack
You can use the automation tool of your choice: Terraform (preferred), Ansible (preferred),
Puppet, Chef, etc.
2. Create a sample web app that has:
a. A web server that replies to the specific endpoints documented below.
b. Optional: Communicates with an external postgres db
This sample app should expose the following endpoints:
a. Reply to a /health endpoint with HTTP code 200 when the pod is up and
running
b. Reply to a /metrics endpoint that exposes the number of times each endpoint
has been called, preferably in Prometheus format.
c. Optional: Reply to a /ready endpoint with:
i. HTTP code of 200 when the app can communicate with the local postgres
ii. HTTP code of 503 when the app can't communicate with the local postgres
You can use the virtualization technology of your choice: Kubernetes/Docker
(preferred), Vagrant, etc.
3. Optional:
a. Deploy the prometheus / grafana stack.
b. Collect any relevant metrics from the services deployed in previous tasks.
NOTE: If you haven’t managed to deploy a metrics endpoint you can still work on
this task and use this endpoint to collect metrics:
https://vhs.workabledemo.com/metrics
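The endpoint contract described in task 2 could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not a reference solution. The metric name http_requests_total, the port 8080, the postgres address localhost:5432 and the helper names (handle, render_metrics, postgres_reachable, serve) are all assumptions made for the example; the readiness check only tests TCP reachability, whereas a real probe would probably run a query.

```python
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer
import socket

# Hypothetical defaults: the assignment does not specify the postgres address.
PG_HOST, PG_PORT = "localhost", 5432

REQUEST_COUNTS = Counter()  # number of calls per known endpoint path


def postgres_reachable(host=PG_HOST, port=PG_PORT, timeout=1.0):
    """Return True if a TCP connection to postgres can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def render_metrics(counts):
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Number of calls per endpoint.",
        "# TYPE http_requests_total counter",
    ]
    for path in sorted(counts):
        lines.append(f'http_requests_total{{endpoint="{path}"}} {counts[path]}')
    return "\n".join(lines) + "\n"


def handle(path, db_check=postgres_reachable):
    """Count the call and return (status_code, body) for a request path."""
    if path not in ("/health", "/metrics", "/ready"):
        return 404, "not found\n"  # unknown paths are not counted
    REQUEST_COUNTS[path] += 1
    if path == "/health":
        return 200, "ok\n"
    if path == "/metrics":
        return 200, render_metrics(REQUEST_COUNTS)
    # /ready: 200 when postgres is reachable, 503 otherwise
    return (200, "ready\n") if db_check() else (503, "postgres unreachable\n")


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Drop any query string before routing
        status, body = handle(self.path.split("?", 1)[0])
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Start the blocking HTTP server (port 8080 is an arbitrary choice)."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Calling serve() starts the server; the routing logic lives in handle() so that it can be exercised without opening a socket. A production app would more likely use an established Prometheus client library and a real database driver instead of hand-rolled counters and a TCP check.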
Deliverables
The final deliverable should contain:
● A private repo named sre-movierama in Bitbucket/GitHub with (please include ALL
deliverables in the repository):
○ Your code / config files
○ A simple Readme.md that will describe the way to deploy the services along with
any other external systems that you may have used.
Notes
● Instructions will be provided about who to share your private repo with.
● You can use either appropriate virtualization technologies (Vagrant, minikube, etc) or the
free tier of any of the cloud providers to set up the requested infrastructure.
● You are free to include any additional deployments/functionalities that you may find
relevant or could showcase your skills but please bear in mind that you should cover the
core requirements first before attempting any improvements.
● If you don't finish all the tasks, please document what you have done, what is missing
and how you would go about implementing the missing parts.
● In the documentation, feel free to discuss any choices/compromises you had
to make.
● Your solution will be judged based on the requirements listed above and the clarity of the
submission (deployment files, scripts, explanation).