This repository contains example code and sample data for the Introduction to Apache Spark session. Follow the steps below to set up Spark on your machine, and clone this repository to get the code and data.
You should have Java installed on your machine.
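A quick way to confirm the Java prerequisite on Linux or Mac is sketched below. The `have_cmd` helper is a hypothetical name introduced here for illustration; it only checks whether a command is on your PATH and says nothing about which Java version Spark needs.

```shell
#!/bin/sh
# have_cmd is a hypothetical helper: succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd java; then
    # java -version prints its version report to stderr
    java -version
else
    echo "java not found on PATH; install Java before setting up Spark"
fi
```

On Windows, running `java -version` in a command prompt gives the same information.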
Download Spark from the Apache website. The steps below assume the spark-1.2.1-bin-hadoop2.4.tgz package.
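Extract the downloaded archive. On Linux or Mac:
tar -zxvf spark-1.2.1-bin-hadoop2.4.tgz
On Windows, use 7-Zip to extract the file. You have to extract twice: once to unpack the .tgz into a .tar file, and once more to unpack the .tar file into a folder.
Then copy the spark-1.2.1-bin-hadoop2.4 folder to the root of the C drive.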
Change into the spark-1.2.1-bin-hadoop2.4 folder.
Start spark-shell. On Linux or Mac:
bin/spark-shell
On Windows:
bin\spark-shell.cmd
If spark-shell starts successfully, Spark is installed on your machine.
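If the shell does not start, one thing worth checking is whether the extracted folder actually contains the launcher script. The sketch below assumes a Linux or Mac machine and that the archive was extracted into the current directory; `spark_home_ok` is a hypothetical helper name, not part of Spark.

```shell
#!/bin/sh
# spark_home_ok is a hypothetical helper: succeeds if the given folder
# contains a bin/spark-shell launcher file.
spark_home_ok() {
    [ -f "$1/bin/spark-shell" ]
}

# Assumption: the archive was extracted into the current directory.
SPARK_DIR="spark-1.2.1-bin-hadoop2.4"
if spark_home_ok "$SPARK_DIR"; then
    echo "found $SPARK_DIR/bin/spark-shell"
else
    echo "bin/spark-shell not found under $SPARK_DIR; check the extraction step"
fi
```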
Clone this repository using the following command to get the example code and sample data:
git clone https://github.com/phatak-dev/introduction-to-spark.git
Please pull before coming to the session to get the latest code.
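Pulling can be done along the lines of the sketch below. It assumes the repository was cloned into the current directory under its default folder name, introduction-to-spark; `is_git_clone` is a hypothetical helper added only to avoid running `git pull` in the wrong place.

```shell
#!/bin/sh
# is_git_clone is a hypothetical helper: succeeds if the folder has a .git directory.
is_git_clone() {
    [ -d "$1/.git" ]
}

# Assumption: the repository was cloned here under its default name.
REPO_DIR="introduction-to-spark"
if is_git_clone "$REPO_DIR"; then
    # Fetch and merge the latest commits from the remote
    (cd "$REPO_DIR" && git pull)
else
    echo "$REPO_DIR is not a git clone; clone the repository first"
fi
```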