# Failure prediction for machines in a Google cluster
This repository contains the scripts used to perform the data analysis presented in the paper
"Towards Data-Driven Autonomics in Data Centers" by Alina Sîrbu and Ozalp Babaoglu.

The scripts are released under the GNU General Public License v3.0.

Files included:

* `big_query.sh` - commands for BigQuery pre-processing and post-processing of results
* `classification.py` - Python script for training the Random Forest (RF) ensemble and evaluating the results
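
As a rough illustration of the "evaluating the results" step, the sketch below averages the scores of several ensemble members and computes precision, recall and false positive rate at a threshold. It is not taken from `classification.py`: the function names `ensemble_scores` and `evaluate`, the unweighted averaging, and the 0.5 threshold are all assumptions made for this example, and the actual script may combine and score its classifiers differently.

```python
from statistics import mean


def ensemble_scores(per_model_scores):
    """Average the scores each ensemble member gives to each sample.

    per_model_scores: one list of scores per model, all of equal length.
    Unweighted averaging is an assumption made for this sketch.
    """
    return [mean(sample) for sample in zip(*per_model_scores)]


def evaluate(scores, labels, threshold=0.5):
    """Return (precision, recall, false_positive_rate) at a threshold.

    labels: 1 for a machine failure, 0 for normal operation.
    """
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y == 1 for p, y in zip(predicted, labels))
    fp = sum(p and y == 0 for p, y in zip(predicted, labels))
    fn = sum((not p) and y == 1 for p, y in zip(predicted, labels))
    tn = sum((not p) and y == 0 for p, y in zip(predicted, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return precision, recall, fpr


# Toy data: two hypothetical models scoring four samples.
model_outputs = [
    [0.9, 0.2, 0.6, 0.1],
    [0.7, 0.4, 0.2, 0.3],
]
true_labels = [1, 0, 1, 0]

combined = ensemble_scores(model_outputs)
precision, recall, fpr = evaluate(combined, true_labels)
print(precision, recall, fpr)
```

Lowering the threshold trades precision for recall, so in practice the metrics would be inspected over a range of thresholds rather than at a single value.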