Jan 15, 2020
Conftest is a tool that helps you write tests against structured configuration data. It relies on Rego, a query language that ships with a rich set of ready-to-use built-in functions. With it, you can write tests against the config types below:
- YAML/JSON
- INI
- TOML
- HOCON
- HCL/HCL2
- CUE
- Dockerfile
- EDN
- XML
When it comes to conftest's pros and cons, it has some unique features that other testing tools don't.
Pros:
You can:
- write declarative tests (policies) that go beyond simple assertions (a minimal example follows below).
- write tests against many kinds of config types.
- use the --combine flag to merge different files into one context so their values can be referenced globally.
- use the parse command to see how the inputs are parsed.
- combine different input types in one test run and apply a combined policy against them.
- pull/push policies from different kinds of sources such as S3, a Docker registry, a GitHub file, etc.
- find real-world examples in the examples/ folder.
Cons:
- Learning Rego can take a little time.
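As a quick illustration, here is what a minimal policy and a test run might look like. The manifest name and the policy itself are placeholders I made up for this sketch; conftest looks for policies in the ./policy directory by default.
# a tiny Rego policy: deny Deployments whose pods may run as root
mkdir -p policy && cat > policy/deployment.rego <<'EOF'
package main

deny[msg] {
  input.kind == "Deployment"
  not input.spec.template.spec.securityContext.runAsNonRoot
  msg := "containers must not run as root"
}
EOF

# test a manifest against the policy
conftest test deployment.yaml

# see how conftest parses the input
conftest parse deployment.yaml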
Finally, I encourage folks to look at both conftest's source code and the Rego language.
It's a simple, single-threaded command-line tool. I recommend integrating it into your organization, and PRs are welcome.
Here's the repo: https://github.com/instrumenta/conftest
Thanks!
Apr 16, 2019
Hello All,
In this article, I'm gonna show you how we moved our ETL processes to Spark jobs that run as Kubernetes pods.
Before that, we used custom Python code for our ETLs.
The problem with that approach was the need for a distributed key-value store; when we picked a solution like Redis, it created too much internal I/O between the slave Docker containers and Redis. The performance with Spark is much better.
Also, the master creates a number of slaves and manages the containers. Sometimes the docker-py library fails to communicate with the Docker engine and the master can't delete the slave or Redis containers, which causes idempotency problems.
You also have to distribute the slave containers across your Docker cluster, which means putting too many cross-functional concerns next to your business code.
We inspected the Spark-on-Kubernetes documentation because we were already using Kubernetes in our production environment.
We use Spark 2.3.3 for running on Kubernetes.
You can have a look at this: https://spark.apache.org/docs/2.3.3/running-on-kubernetes.html
Even though the Spark documentation says the feature is experimental for now, we started running Spark jobs on our Kubernetes cluster.
This feature allows us to run Spark across our cluster:
- It's easy to use.
- It's secured, because you have to create a dedicated service account for the Spark driver and executors.
- It exposes enough Kubernetes parameters (node selector for computation, core limits, number of executors, etc.).
We bundled the spark-submit script with our artifact JAR.
After this step, the Docker container can make a request to the k8s master and start the driver pod, and the driver pod creates the executors from the same image.
This allows us to bundle everything in one image. If the code changes, CI creates a new bundle and publishes it to the registry.
The image below describes the architecture.

First of all, you have to create a base image.
Download the "spark-2.3.3-bin-hadoop2.7" from here https://spark.apache.org/downloads.html and unzip it.
Create an image from this.
./bin/docker-image-tool.sh -r internal-registry-url.com:5000 -t base build
./bin/docker-image-tool.sh -r internal-registry-url.com:5000 -t base push
We created a multi-stage Dockerfile like this:
FROM hseeberger/scala-sbt:11.0.1_2.12.7_1.2.6 AS build-env
COPY . /app
WORKDIR /app
ENV SPARK_APPLICATION_MAIN_CLASS Main
RUN sbt update && \
sbt clean assembly
RUN SPARK_APPLICATION_JAR_LOCATION=`find /app/target -iname '*-assembly-*.jar' | head -n1` && \
export SPARK_APPLICATION_JAR_LOCATION && \
mkdir /publish && \
cp -R ${SPARK_APPLICATION_JAR_LOCATION} /publish/ && \
ls -la ${SPARK_APPLICATION_JAR_LOCATION} && \
ls -la /publish
FROM internal-registry-url.com:5000/spark:base
RUN apk add --no-cache tzdata
ENV TZ=Europe/Istanbul
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
COPY --from=build-env /publish/* /opt/spark/examples/jars/
COPY --from=build-env /app/secrets/* /opt/spark/secrets/
COPY --from=build-env /app/run.sh /opt/spark/
WORKDIR /opt/spark
CMD [ "/opt/spark/run.sh" ]
And our run.sh script looks like this:
#!/bin/bash
bin/spark-submit \
--master k8s://https://${KUBERNETS_MASTER}:6443 \
--deploy-mode cluster \
--name coverage-${MORDOR_ENV} \
--class Main \
--conf spark.executor.instances=${NUMBER_OF_EXECUTORS} \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.kubernetes.driverEnv.MORDOR_ENV=${MORDOR_ENV} \
--conf spark.kubernetes.driver.label.app=coverage-${MORDOR_ENV} \
--conf spark.kubernetes.container.image.pullPolicy=Always \
--conf spark.kubernetes.container.image=internal-registry-url.com:5000/coveragecalculator:${VERSION} \
--conf spark.kubernetes.driver.pod.name=coverage-${MORDOR_ENV} \
--conf spark.kubernetes.authenticate.submission.caCertFile=/opt/spark/secrets/${CRT_FILE} \
--conf spark.kubernetes.authenticate.submission.oauthToken=${CRT_TOKEN} \
--conf spark.kubernetes.driver.limit.cores=${DRIVER_CORE_LIMIT} \
--conf spark.kubernetes.executor.limit.cores=${EXECUTOR_CORE_LIMIT} \
local:///opt/spark/examples/jars/CoverageCalculator-assembly-0.1.jar
Notice that you have to place the secrets in the secrets/ folder in order to create the pods with a single image.
After the driver pod is created, it uses the internal executor pod creation logic that also ships in the spark:base image, as described in the Spark-on-Kubernetes documentation.
We created the pipelines as build-push -> run-on-qa-cluster -> run-on-preprod-cluster -> run-on-prod-cluster

The run scripts placed in the pipeline pass the parameters to run.sh, and we run it like this:
docker run -i --entrypoint /bin/bash -e KUBERNETS_MASTER='yourkubernetesmasterip' -e NUMBER_OF_EXECUTORS=5 -e MORDOR_ENV='qa' -e VERSION=$GO_PIPELINE_LABEL -e CRT_FILE='non_prod_ca.crt' -e CRT_TOKEN='THE_USER_CRT_TOKEN' -e DRIVER_CORE_LIMIT=2 -e EXECUTOR_CORE_LIMIT=2 -v /etc/resolv.conf:/etc/resolv.conf:ro -v /etc/localtime:/etc/localtime:ro 192.168.57.20:5000/coveragecalculator:$GO_PIPELINE_LABEL /opt/spark/run.sh
This command creates one driver pod with a core limit of 2.
After that, 5 executor pods are created from the spark:base image, each of them also with a core limit of 2.
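To verify, you can list the pods created by the submission. This assumes kubectl points at the same cluster and that the executor pod names share the app-name prefix ("coverage-qa" in this example); the driver pod name is set explicitly via spark.kubernetes.driver.pod.name above.
# list the driver and executor pods
kubectl get pods | grep coverage-qa
# follow the driver logs
kubectl logs -f coverage-qa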

Nov 19, 2018
Hi All,
In this article, I'm gonna talk about the Kubernetes Horizontal Pod Autoscaler object, the Custom Metrics API, and how we scale our APIs at Hepsiburada.
Before digging into HPA, take a look at https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale
HPA determines whether we need more pods and scales the number of Pods accordingly. You can scale on CPU and memory metrics using the "K8s Metrics Server".
However, Kubernetes 1.6 added support for custom metrics in the Horizontal Pod Autoscaler. With custom metrics, you can attach InfluxDB, Prometheus, or another third-party time-series database.
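For the basic CPU path, a resource-based HPA can be created in one line; the deployment name here is just a placeholder:
kubectl autoscale deployment podinfo --cpu-percent=80 --min=2 --max=10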
There is a nice project with ready-to-go YAMLs on GitHub, https://github.com/stefanprodan/k8s-prom-hpa, which describes the autoscale mechanism in detail.
Prometheus collects metrics from your applications/pods and stores them. You can control scraping with annotations in your deployment YAMLs.
The default path is "/metrics":
annotations:
  prometheus.io/scrape: 'true'
  prometheus.io/path: '/metrics-text'
The Custom Metrics API is responsible for collecting data from Prometheus and passing it to the HPA.
After you connect your HPA, you can test and verify that it's working properly:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
The exposed metrics, which also exist in Prometheus, are shown below.

For example, the "application_httprequests_active" metric is exposed by our API and can be used in an HPA like this:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: podinfo
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: podinfo
  minReplicas: 5
  maxReplicas: 40
  metrics:
  - type: Pods
    pods:
      metricName: application_httprequests_active
      targetAverageValue: 1000
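Once the HPA is applied, you can check both the raw metric exposed through the custom metrics API and the autoscaler itself. The metric name and the default namespace below follow the example above; adjust them to your setup.
# read the current value of the custom metric for all pods in the default namespace
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/application_httprequests_active" | jq .
# watch the autoscaler's current/target values and the replica count
kubectl get hpa podinfo -w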
Below are snapshots of our Grafana dashboards, which are connected to Prometheus and show the autoscaling in Kubernetes. You can inspect the pod memory, and the newly created pods can be seen there. At "07:56" and "08:00" people started using the Search API more, and after the scaling kicked in, the metrics returned to normal.

Jul 17, 2018
It's been a long time since I wrote my last post. In this period, I mostly dug into Kubernetes. Kubernetes is a deployment automation system that manages containers in distributed environments. It simplifies common tasks like deployment, scaling, configuration, versioning, log management, and a lot more.
In this article, you will find how a dotnetcore app can be put into Kubernetes using blue-green deployment and pipeline as code. In this case, I used GoCD and its YAML plugin: https://github.com/tomzo/gocd-yaml-config-plugin
First of all, you have to dockerise your dotnetcore app. Here is an example snippet:
FROM microsoft/dotnet:2.0.5-sdk-2.1.4 AS build-env
WORKDIR /workdir
COPY . /workdir
RUN dotnet restore ./WebApp.sln
RUN dotnet test ./src/tests/WebApp.IntegrationTests
RUN dotnet test ./src/tests/WebApp.UnitTests
RUN dotnet publish ./src/WebApp/WebApp.csproj -c Release -o /publish
FROM microsoft/dotnet:2.0.5-runtime
WORKDIR /app
COPY --from=build-env ./publish .
EXPOSE 3333/tcp
CMD ["dotnet", "WebApp.dll", "--server.urls", "http://*:3333"]
After that, put a "kubernetes" folder in your project's root. The folder structure can be like this:
- kubernetes
-- deployment.yaml
-- service.yaml
-- switch_environment.sh
- src
....
- ci.gocd.yaml
- Dockerfile
- WebApp.sln
Your "deployment.yaml" should be like this :
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: webapp-${ENV}
spec:
  replicas: ${PODS}
  template:
    metadata:
      labels:
        app: webapp
        ENV: ${ENV}
    spec:
      containers:
      - name: webapp
        image: yourdockerregistry:5000/webapp:${IMAGE_TAG}
        resources:
          requests:
            cpu: "750m"
        ports:
        - containerPort: 3333
        readinessProbe:
          tcpSocket:
            port: 3333
          initialDelaySeconds: 15
          periodSeconds: 5
        livenessProbe:
          httpGet:
            path: /status
            port: 3333
          initialDelaySeconds: 15
          periodSeconds: 10
      terminationGracePeriodSeconds: 30
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-${ENV}
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: webapp-${ENV}
  minReplicas: 10
  maxReplicas: 25
  metrics:
  - type: Pods
    pods:
      metricName: cpu_usage # metric coming from Prometheus; list available metrics with: kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
      targetAverageValue: 0.6 # if the average cpu_usage per pod goes over 0.6 (60%), pods will be scaled up
In this snippet, you will see some environment variables for parametric values like the image tag, the deployment environment, the blue-green switch, etc.
You can also use Helm for rolling deployments and version bump-ups, but I will use a much simpler tool: "envsubst".
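For instance, rendering and applying the deployment manifest with envsubst looks like this; the tag value is just a placeholder:
ENV=blue IMAGE_TAG=1.1.42 PODS=10 envsubst < deployment.yaml | kubectl apply -f -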
The other mechanism is horizontal scaling in the cluster. You can merge the deployment and the scaling into one YAML.
In this instance, I used the k8s custom metrics API.
Take a look if you want, or just skip it: https://github.com/stefanprodan/k8s-prom-hpa
And the service.yaml should be like this:
apiVersion: v1
kind: Service
metadata:
  name: webapp-svc
spec:
  type: NodePort
  ports:
  - port: 3333
    nodePort: 30333
    targetPort: 3333
    protocol: TCP
    name: http
  selector:
    app: webapp
    ENV: ${ENV}
We will use k8s selectors to get a blue-green switch for deployments. The selector will pick the matching pods and bind them to the service.
I used NodePort in order to bind the service to an external load balancer.
You can bind it like this:
AGENTIP1:30333 http://servicedns.com
AGENTIP2:30333 http://servicedns.com
AGENTIP3:30333 http://servicedns.com
You don't have to give every agent's IP to the load balancer, because k8s also does internal load balancing. (To be fair, this is not the best approach; managing the load balancing inside k8s is simply better.)
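To see which environment the service currently points at, you can read its selector directly (assuming kubectl is configured for the target cluster):
kubectl get svc webapp-svc -o jsonpath='{.spec.selector.ENV}'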
Your "switch_environment.sh" file can be like this.
#!/bin/bash
if [ -z "$1" ]
then
  echo "No argument supplied"
  exit 1
fi
if ! kubectl get svc $1
then
  echo "No service found : ${1}"
  exit 1
fi
ENVIRONMENT=$(kubectl describe svc $1 | grep ENV | awk '{print $2}' | cut -d"," -f1 | cut -d"=" -f2)
if [ "$ENVIRONMENT" == "blue" ]; then
  ENV=green envsubst < service.yaml | kubectl apply -f -
  echo "Switched to green"
else
  ENV=blue envsubst < service.yaml | kubectl apply -f -
  echo "Switched to blue"
fi
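You can also run the switch manually outside the pipeline; the argument is the service name defined above:
./switch_environment.sh webapp-svc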
Finally, tie all these pieces together in one GoCD YAML file:
format_version: 2
environments:
  WebAPI:
    pipelines:
      - webapp-build-and-push
      - webapp-deploy-to-prod-blue
      - webapp-deploy-to-prod-green
      - webapp-switch-environment
pipelines:
  webapp-build-and-push:
    group: webapp
    label_template: "1.1.${COUNT}"
    materials:
      project:
        git: http://github.com/example/webapp.git
        branch: master
        destination: app
    stages:
      - buildAndPush:
          clean_workspace: true
          jobs:
            buildAndPush:
              tasks:
                - exec:
                    working_directory: app/build-scripts
                    command: /bin/bash
                    arguments:
                      - -c
                      - './build-and-publish.sh'
  webapp-deploy-to-prod-blue:
    group: webapp
    label_template: "${webapp-build-and-push}"
    materials:
      webapp-build-and-push:
        type: pipeline
        pipeline: webapp-build-and-push
        stage: deploy
      project:
        git: http://github.com/example/webapp.git
        branch: master
        destination: app
    stages:
      - build:
          approval:
            type: manual
          clean_workspace: true
          jobs:
            build:
              tasks:
                - exec:
                    working_directory: app/kubernetes
                    command: /bin/bash
                    arguments:
                      - -c
                      - 'ENV=blue IMAGE_TAG=$GO_PIPELINE_LABEL PODS=10 envsubst < deployment.yaml | kubectl apply -f -'
                - exec:
                    working_directory: app/kubernetes
                    command: /bin/bash
                    arguments:
                      - -c
                      - 'kubectl rollout status deployment webapp-blue'
  webapp-deploy-to-prod-green:
    group: webapp
    label_template: "${webapp-build-and-push}"
    materials:
      webapp-build-and-push:
        type: pipeline
        pipeline: webapp-build-and-push
        stage: deploy
      project:
        git: http://github.com/example/webapp.git
        branch: master
        destination: app
    stages:
      - build:
          approval:
            type: manual
          clean_workspace: true
          jobs:
            build:
              tasks:
                - exec:
                    working_directory: app/kubernetes
                    command: /bin/bash
                    arguments:
                      - -c
                      - 'ENV=green IMAGE_TAG=$GO_PIPELINE_LABEL PODS=10 envsubst < deployment.yaml | kubectl apply -f -'
                - exec:
                    working_directory: app/kubernetes
                    command: /bin/bash
                    arguments:
                      - -c
                      - 'kubectl rollout status deployment webapp-green'
  webapp-switch-environment:
    group: webapp
    label_template: "${COUNT}"
    materials:
      webapp-build-and-push:
        type: pipeline
        pipeline: webapp-build-and-push
        stage: deploy
      project:
        git: http://github.com/example/webapp.git
        branch: master
        destination: app
    stages:
      - build:
          approval:
            type: manual
          clean_workspace: true
          jobs:
            build:
              tasks:
                - exec:
                    working_directory: app/kubernetes
                    command: /bin/bash
                    arguments:
                      - -c
                      - './switch_environment.sh webapp-svc'
Now you have 4 pipelines:
- webapp-build-and-push
- webapp-deploy-to-prod-blue
- webapp-deploy-to-prod-green
- webapp-switch-environment
You can define your build script to build and dockerise the application.
If you have Test and Staging environments, put them in the "gocd.yaml" too. (To keep it simple, I removed those lines.)
That's it! After that, you have:
- Dockerised dotnetcore app
- Kubernetes Deployment pipelines
- Blue-Green Switch Pipeline which controls kubernetes service (You have to configure kubectl for gocd agents)
- Horizontal Pod Autoscaler (CPU-based autoscale mechanism in the cluster)
Dec 12, 2017
Hi guys, in this article I'm gonna talk about how microservices are managed by Istio and why we should prefer it. Before Istio, people who owned a microservice architecture were complaining about managing microservices, visualizing the service mesh, monitoring distributed services, service discovery, and so on. The announcement of Istio came at a good time because of those issues.
Istio is a platform hosted on top of your Kubernetes cluster. You deploy your applications (containers) with a special sidecar proxy throughout your environment; the proxy intercepts all network communication between microservices and is configured and managed using Istio's control-plane functionality. This way you can use Envoy for managing the network, routing rules, etc. Service discovery is also supported by Istio, so you don't have to think about whether or not the network was configured correctly.
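For example, injecting the sidecar proxy into an existing manifest manually (the file name here is just a placeholder) can be done like this:
istioctl kube-inject -f deployment.yaml | kubectl apply -f -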
I did the PoC described on their website: https://istio.io/docs/guides/bookinfo.html. It's easy to follow because it's hosted on top of Kubernetes, and you can use your favorite cloud provider (I used GCP's fully managed Kubernetes cluster). Content-based request routing (A/B testing, for example), traffic shifting, and fault injection are the most satisfying parts of using Istio.
Istio also has ready-to-use plugins and add-ons; have a look at the add-ons section: Grafana/Prometheus for monitoring, Jaeger for tracing, dotviz for visualizing the service mesh... All of them are ready to inject, and you manage and customize them via their provisioning YAMLs.
Check out these URLs if you are interested:
https://istio.io/docs/concepts/what-is-istio/overview.html
https://github.com/istio/istio
Aug 28, 2017
>devops
non-programmer babysitting servers
>frontend developer
web programmer that can't do their job all the way
>fullstack developer
web programmer that can do their job all the way
>backend developer
a regular programmer
>systems architect
programmer that is too hot shit to actually make programs
>information security analyst
programmer that used to be script kiddie who wanted to be a hacker
>systems engineer
programmer who knows perl and RHEL (that's it)
>network engineer
non-programmer glorified tv cable guy
Jul 26, 2017
In this article, I wanna focus on Continuous Delivery. As described by Martin Fowler, Continuous Delivery is a software development discipline where you build software in such a way that it can be released to production at any time. This means our packages should be battle-tested, reliable, automatically deployable, and configurable. That's why we do Continuous Integration. Frequent builds, in turn, lead to more frequent releases. At that point, I'm on the side of trunk-based development rather than git flow. In my opinion, each commit should be deployed to the environments instead of waiting for a silly manual merging operation. We gain more agility, and each change becomes simpler and lower risk.
From the business perspective, this idea is simply perfect, because it allows organizations to adjust rapidly to changing market conditions.
From the developer perspective, we need to develop and deploy more carefully; adding more test suites to our pipeline, not only unit tests but also integration, contract, security tests, etc., is good. Maybe we should do more pair programming, which works like continuous code review. Frequent production releases make us more aware and we discover new approaches, and the approaches we find are, in fact, Continuous Delivery best practices.
Sep 11, 2016
DevOps...
Although it looks like the rising trend of recent times in the software world, at its core it is a set of concepts and disciplined processes. Taken together, we can actually call it a culture.
Literally, the word means the collaboration of the Development and Operations processes.
If that definition feels a bit abstract, let's examine the topic more deeply.
The definition that resonates with me most is this: developers being responsible for deploying the code they write. From that point of view, DevOps is not a responsibility that sits with only a few people; it is a trait every developer in your organization should have. Looked at through this lens, DevOps is actually a process, like Agile or Waterfall.
Spotify sees it as part of its Agile culture, and its teams are responsible for all of their products' processes: design, development, deployment, and so on (end to end). The teams are completely cross-functional. The automation journey starts with automating the teams' operational work and later extends to machine management, immutable infrastructure, and Infrastructure as Code. These teams write the tools that provide the automation and are made up of Site Reliability Engineers. The goal: self-healing systems and auto-scalable infrastructure. At this point we see manual configuration changes by humans dropping to zero and software systems managing the datacenters. Google's Borg, for instance, successfully orchestrates all operations across multiple datacenters. Software systems, rather than sysadmins, are in charge. For a similar technology, see http://mesos.apache.org/
The point to note is that Site Reliability Engineers are not a team that blocks the product-delivery teams; on the contrary, they build technology for those teams and offer it for developers to use.
What sounds interesting at this point is that the "DevOps Engineers" the market is looking for are actually Site Reliability Engineers. (See: Love DevOps? Wait until you meet SRE)
In my next post, I will cover the relationship between Continuous Delivery and DevOps.
Related links:
https://en.wikipedia.org/wiki/Continuous_delivery
http://martinfowler.com/bliki/ContinuousDelivery.html
https://www.youtube.com/watch?v=dxk8b9rSKOo
https://air.mozilla.org/continuous-delivery-at-google/