Data mining homework implemented with Apache Spark RDDs in Python and Scala.
HW1. MapReduce
HW2. Implemented SON algorithm to find Frequent Itemsets
HW3. Completed CF Recommendation System with MinHash and Locality Sensitive Hashing(LSH) algorithms
HW4. Community Detection by Girvan-Newman algorithm
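
As a rough illustration of the MinHash and LSH step named in HW3, here is a minimal sketch in plain Python. It is not the Spark RDD code from the homework. The function names, the hash family (a*x + b) mod p, the band/row settings, and the sample data are all assumptions made for this example.

```python
import random
from collections import defaultdict
from itertools import combinations

# Assumption: a large prime modulus (2^31 - 1); item ids must be smaller than it.
PRIME = 2_147_483_647


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(item_ids, params):
    """For each hash function, keep the minimum hash value over the set."""
    return [min((a * x + b) % PRIME for x in item_ids) for a, b in params]


def lsh_candidate_pairs(signatures, bands, rows):
    """Return key pairs whose signatures match on all rows of at least one band."""
    candidates = set()
    for band in range(bands):
        buckets = defaultdict(list)
        for key, sig in signatures.items():
            chunk = tuple(sig[band * rows:(band + 1) * rows])
            buckets[chunk].append(key)
        for keys in buckets.values():
            candidates.update(combinations(sorted(keys), 2))
    return candidates


def jaccard(a, b):
    """Exact Jaccard similarity, used to verify candidate pairs."""
    return len(a & b) / len(a | b)


# Hypothetical sample data: each user maps to the set of item ids they rated.
ratings = {
    "u1": {1, 2, 3, 4, 5},
    "u2": {1, 2, 3, 4, 5},
    "u3": {100, 101, 102},
}
params = make_hash_params(num_hashes=8, seed=42)
signatures = {user: minhash_signature(items, params)
              for user, items in ratings.items()}
candidates = lsh_candidate_pairs(signatures, bands=4, rows=2)
```

With this sample data, u1 and u2 have identical item sets, so they share every band and come out as a candidate pair, while u3 shares no items with them and is never bucketed together with either. In a Spark version, the signature step would likely be a map over the user RDD and the banding step a group by (band, chunk), but that mapping is a guess at the design rather than a description of the homework code.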