
Data Mining Lab Manual

The lab manual for Data Mining & Data Warehousing (CSN447) at DIT University outlines various experiments using the WEKA tool, including data preprocessing, visualization, and machine learning techniques. It details the installation of WEKA, the exploration of datasets, and the creation of ARFF files, along with data processing techniques and OLAP operations. The manual serves as a practical guide for students to apply machine learning methods to real-world data mining problems.


Lab Manual

of
Data Mining & Data Warehousing
(CSN447)
MASTER OF COMPUTER APPLICATION

Session 2024-25

SCHOOL OF COMPUTING

DIT University

Submitted to:
Dr. R.K. Saini
School of Computing
DIT University, Dehradun

Submitted by:
Sargam Arora
1000026809
EXPERIMENT 1: Exploring Weka tool

WHAT IS WEKA?
WEKA is open-source software that provides tools for data preprocessing, implementations of
several machine learning algorithms, and visualization tools, so that you can develop machine
learning techniques and apply them to real-world data mining problems. What WEKA offers is
summarized in the following diagram.

STEP 1: Installing the WEKA tool in a virtual machine using Internet Explorer.


The WEKA GUI Chooser application will start, and you will see the following:

The GUI Chooser application allows you to run five different types of applications as listed here:
· Explorer
· Experimenter
· KnowledgeFlow
· Workbench
· Simple CLI

WEKA TOOL INTERFACE


STEP 2: Opening Weka Explorer And Opening Dataset Using The Following Path:
When you click on the Explorer button in the Applications selector, it opens the following screen.
On the top, you will see several tabs as listed here:
· Preprocess
· Classify
· Cluster
· Associate
· Select Attributes
· Visualize

Click on the Open file ... button. A directory navigator window opens, as shown in the following
screen. Follow the path below to load the data:
C:\Program Files\Weka-3-8-6\data

STEP 3: Visualizing The Given Data Graphically


STEP 4: Every Attribute Along With Its Graph Has Some Conclusions
For example, the time duration attribute of the labour dataset, its graph and conclusion are as follows:

From the graph, as well as from the values pre-calculated by the WEKA tool, we can see:
· The mean of the time duration is: 2.161
· The standard deviation is: 0.707

· Increase of wage in second year, its graph and conclusion.


From the graph, as well as from the values pre-calculated by the WEKA tool, we can see:
· The mean of the wage increment in second year is: 3.972
· The standard deviation is: 1.164

· Working hours, its graph and conclusion.


From the graph, as well as from the values pre-calculated by the WEKA tool, we can see:
· The mean of the working hours is: 38.039
· The standard deviation is: 2.506

· Increase of wage in third year, its graph and conclusion.


From the graph, as well as from the values pre-calculated by the WEKA tool, we can see:
· The mean of the wage increment in third year is: 3.913
· The standard deviation is: 1.304
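
The mean and standard deviation that WEKA shows for a numeric attribute can be reproduced by hand. The following is a minimal Python sketch, not WEKA's own code: the function name summarize and the sample values are hypothetical, and it assumes the sample (n-1) standard deviation and that missing values are skipped.

```python
import statistics


def summarize(values):
    """Return (mean, sample standard deviation), each rounded to 3 decimals.

    Missing values are represented as None and are skipped (an assumption
    made for this sketch).
    """
    present = [v for v in values if v is not None]
    mean = statistics.mean(present)
    std = statistics.stdev(present)  # sample standard deviation (n - 1)
    return round(mean, 3), round(std, 3)


# Hypothetical values for a numeric attribute; not taken from the labour dataset.
durations = [1, 2, 3, 2, None, 3, 2]
mean, std = summarize(durations)
print("Mean:", mean)
print("Standard deviation:", std)
```

Comparing such hand-computed values with the numbers in the Preprocess tab is a quick way to check that an attribute was loaded as expected.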
EXPERIMENT 2: Creating a New ARFF File

Step 1:
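
As an illustration of what a new ARFF file contains, the sketch below writes one from Python. An ARFF file has a header (@relation, then one @attribute line per column) followed by an @data section with comma-separated rows, where ? marks a missing value. The helper write_arff, the relation name student, its attributes, and the rows are all hypothetical examples and are not taken from the manual.

```python
import os
import tempfile


def write_arff(path, relation, attributes, rows):
    """Write a minimal ARFF file and return its text.

    attributes: list of (name, type) pairs, where type is either a string
    such as "numeric" or a list of nominal values.
    rows: list of value lists; None is written as "?" (missing).
    """
    lines = ["@relation " + relation, ""]
    for name, kind in attributes:
        if isinstance(kind, (list, tuple)):
            kind = "{" + ",".join(kind) + "}"  # nominal attribute
        lines.append("@attribute " + name + " " + kind)
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join("?" if v is None else str(v) for v in row))
    text = "\n".join(lines) + "\n"
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return text


# Hypothetical example data.
arff_path = os.path.join(tempfile.gettempdir(), "student.arff")
arff_text = write_arff(
    arff_path,
    "student",
    [("age", "numeric"), ("marks", "numeric"), ("result", ["pass", "fail"])],
    [[21, 78, "pass"], [22, None, "fail"]],
)
print(arff_text)
```

The resulting file can then be opened in the Explorer with the Open file ... button, in the same way as the bundled datasets.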
EXPERIMENT 3: Data Processing Techniques on Dataset

Pre-processing involves:
1. Converting nominal attributes to binary, and numeric attributes to nominal (in the form of 0 or 1).

AGE
EDUCATION_NUM
BEFORE

AFTER
2. Detecting missing values and replacing them with a user-specified constant or a
system-generated value.
3. Detecting outliers and removing them using the interquartile range (IQR).

NO OUTLIERS IN AGE ATTRIBUTE


OUTLIERS PRESENT IN FNLWGT
NO OUTLIERS IN CAPITAL_LOSS ATTRIBUTE

NO OUTLIERS IN CAPITAL_GAIN ATTRIBUTE


HOURS_PER_WEEK
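
The three pre-processing steps above can be sketched in plain Python to show the idea behind them. This is an illustration under stated assumptions, not the code of the WEKA filters: the function names and sample values are hypothetical, the "system-generated" replacement is assumed to be the mean, and the IQR factor of 1.5 is the common textbook choice, which may differ from the setting used in WEKA.

```python
import statistics


def nominal_to_binary(values):
    """Return one 0/1 indicator column per distinct nominal value."""
    categories = sorted(set(values))
    return {c: [1 if v == c else 0 for v in values] for c in categories}


def replace_missing(values, constant=None):
    """Replace None with a user constant, or with the mean if none is given."""
    present = [v for v in values if v is not None]
    fill = constant if constant is not None else statistics.mean(present)
    return [fill if v is None else v for v in values]


def split_outliers(values, factor=1.5):
    """Split values into (kept, outliers) using the interquartile range."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles Q1, Q2, Q3
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    kept = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return kept, outliers


# Hypothetical sample values.
binary = nominal_to_binary(["Male", "Female", "Male"])
filled_mean = replace_missing([25, None, 35])
filled_const = replace_missing([25, None, 35], constant=0)
kept, outliers = split_outliers([10, 12, 11, 13, 12, 11, 100])
print(binary, filled_mean, filled_const, kept, outliers)
```

An attribute with no values outside the computed bounds, such as AGE above, would return an empty outlier list.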
EXPERIMENT 4: Create a cube and illustrate the following OLAP operations.
1) Roll-up 2) Drill-down 3) Slice 4) Dice 5) Pivot
5) Pivot: It rotates the cube, sub-cube, or rolled-up or drilled-down cube, thus
changing the view of the cube.
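
The five OLAP operations can be illustrated on a very small cube. The sketch below uses plain Python rather than an OLAP server; the fact table, its dimensions (year, quarter, city, item), the sales figures, and all function names are hypothetical and serve only to show what each operation does.

```python
from collections import defaultdict

# Hypothetical fact table: each row is one cell of the cube.
FACTS = [
    {"year": 2023, "quarter": "Q1", "city": "Delhi", "item": "Phone", "sales": 10},
    {"year": 2023, "quarter": "Q2", "city": "Delhi", "item": "Laptop", "sales": 20},
    {"year": 2023, "quarter": "Q1", "city": "Mumbai", "item": "Phone", "sales": 15},
    {"year": 2024, "quarter": "Q1", "city": "Delhi", "item": "Phone", "sales": 30},
    {"year": 2024, "quarter": "Q2", "city": "Mumbai", "item": "Laptop", "sales": 25},
]


def aggregate(facts, dims):
    """Sum sales grouped by the given dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        totals[tuple(fact[d] for d in dims)] += fact["sales"]
    return dict(totals)


def slice_cube(facts, dim, value):
    """Slice: fix a single dimension to a single value."""
    return [f for f in facts if f[dim] == value]


def dice_cube(facts, conditions):
    """Dice: select a sub-cube by restricting two or more dimensions."""
    return [f for f in facts
            if all(f[d] in allowed for d, allowed in conditions.items())]


def pivot(view):
    """Pivot: swap the row and column axes of a two-dimensional view."""
    return {(col, row): total for (row, col), total in view.items()}


drill_down = aggregate(FACTS, ["year", "quarter"])  # finer level: year, quarter
roll_up = aggregate(FACTS, ["year"])                # coarser level: year only
sliced = slice_cube(FACTS, "year", 2023)
diced = dice_cube(FACTS, {"city": {"Delhi"}, "item": {"Phone"}})
city_by_item = aggregate(FACTS, ["city", "item"])
item_by_city = pivot(city_by_item)
print(roll_up, drill_down, len(sliced), len(diced), item_by_city)
```

Roll-up and drill-down are the same aggregation at different levels of the time hierarchy, while slice and dice only filter cells, and pivot changes the orientation of the view without changing any value.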
