Data Warehousing Lab Exercise
Ex.No:1
Date:
Study of WEKA Tool
Introduction
Weka (pronounced to rhyme with Mecca) is a workbench that contains a collection of
visualization tools and algorithms for data analysis and predictive modeling, together with
graphical user interfaces for easy access to these functions. The original non-Java version of
Weka was a Tcl/Tk front-end to (mostly third-party) modeling algorithms implemented in
other programming languages, plus data preprocessing utilities in C and a Makefile-based
system for running machine learning experiments. This original version was primarily
designed as a tool for analyzing data from agricultural domains, but the more recent fully
Java-based version (Weka 3), for which development started in 1997, is now used in many
different application areas, in particular for educational purposes and research. Advantages
of Weka include:
Free availability under the GNU General Public License.
Portability, since it is fully implemented in the Java programming language and thus
runs on almost any modern computing platform
A comprehensive collection of data preprocessing and modeling techniques
Ease of use due to its graphical user interfaces
Description
Open the program. Once the program has been loaded on the user's machine, it is opened by
navigating to the Programs/Start menu; the exact path depends on the user's operating system.
Figure 1.1 is an example of the initial opening screen on a computer.
There are four options available on this initial screen:
1. Explorer - the graphical interface used to conduct experimentation on raw data. After
clicking the Explorer button, the Weka Explorer interface appears.
Fig: 1.2 Pre-processor
Inside the Weka Explorer window there are six tabs:
i) Preprocess - used to choose the data file to be used by the application.
Open file - allows the user to select files residing on the local machine or on recorded
media
Open URL - provides a mechanism to locate a file or data source at a different location
specified by the user
Open Database - allows the user to retrieve files or data from a database source provided by
the user
ii) Classify- used to test and train different learning schemes on the preprocessed data file
under experimentation
iii) Cluster- used to apply different tools that identify clusters within the data file.
The Cluster tab opens the process that is used to identify commonalties or clusters of
occurrences within the data set and produce information for the user to analyze.
iv) Associate - used to apply different rules to the data file that identify associations within
the data. The Associate tab opens a window to select the options for mining associations within
the dataset.
v) Select attributes - used to apply attribute (feature) selection methods that identify the most
relevant attributes in the data file.
vi) Visualize - used to display scatter plots of every attribute pair for visual exploration of the
dataset.
Result:
The core features, general characteristics, and applications of the Weka tool
have been studied.
Ex.No:2
Date:
Data exploration and integration with Weka
Aim:
To implement data exploration and integration with Weka
Procedure:
Step 1: Launch Weka Explorer
- Open Weka and select the "Explorer" from the Weka GUI Chooser.
Step 2: Load the dataset
- Click on the "Open file" button and select "datasets" > "iris.arff" from the Weka
installation directory. This will load the Iris dataset.
Step 3: To know more about the Iris dataset, open iris.arff in Notepad++ or a similar tool
and read the comments.
Step 4: Fill in the following tables:

Flower type        Count
-----------        -----

Attribute
---------
Sepal length
Sepal width
Petal length
Petal width
Result:
Thus data exploration and integration with Weka was performed.
Ex.No:3
Date:
Data Validation Using Weka
Aim:
To implement data validation using Weka
Procedure:
Step 1: Launch Weka Explorer
- Open Weka and select the "Explorer" from the Weka GUI Chooser.
Step 2: Load the dataset
- Click on the "Open file" button and select "datasets" > "iris.arff" from the Weka
installation directory. This will load the Iris dataset.
Step 3: Split your data into training and testing sets. Under the "Classify" tab, in the
"Test options" area, select a testing method. Weka offers options like cross-validation,
percentage split, and a supplied (user-defined) test set. Configure the
options according to your needs.
Step 4: Select a classifier algorithm. Weka offers a wide range of algorithms for
classification, regression, clustering, and other tasks. Under the "Classify" tab, click on the
"Choose" button next to the "Classifier" area and choose an algorithm. Configure its
parameters, if needed.
Step 5: Click on the "Start" button under the "Classify" tab to run the training and testing
process. Weka will train the model on the training set and test its performance on the testing
set using the selected algorithm.
Validation Techniques:
Cross-Validation: Go to the "Classify" tab and choose a classifier. Then, under the "Test
options," select the type of cross-validation you want to perform (e.g., 10-fold cross-
validation). Click "Start" to run the validation.
Train-Test Split: You can also split your data into a training set and a test set by choosing the
"Percentage split" option under "Test options" in the "Classify" tab; the model is trained on the
chosen percentage of the data and evaluated on the remainder.
Step 6: Evaluate the model's performance. Once the process finishes, Weka will display
various performance measures like accuracy, precision, recall, and the ROC curve (for
classification tasks) or RMSE and MAE (for regression tasks). These measures appear in the
"Classifier output" pane, and each completed run is added to the "Result list"; standard
definitions of these measures are given after the procedure.
Step 7: Analyze the results and interpret them. Examine the performance measures to assess
the model's quality and suitability for your dataset. Compare different models or validation
methods if you have tried more than one.
Step 8: Repeat steps 4-7 with different algorithms or validation methods if desired. This will
help you compare the performance of different models and choose the best one.
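For reference, the standard definitions of the main evaluation measures reported by Weka
(general statistics, not specific to this dataset) are:

Accuracy  = (TP + TN) / (TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall    = TP / (TP + FN)
RMSE      = sqrt( (1/n) * Σ (actual_i - predicted_i)^2 )
MAE       = (1/n) * Σ | actual_i - predicted_i |

where TP, TN, FP, and FN are the counts of true positives, true negatives, false positives,
and false negatives, and actual_i and predicted_i are the actual and predicted values.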
Output
Result:
Thus simple data validation and testing of the dataset using Weka was implemented.
Ex.No:4
Date:
Training the Given Dataset for an Application
Aim:
To apply the concept of Linear Regression for training the given dataset.
Procedure:
Step 1: Open the weka tool.
Step 2: Download a dataset from the UCI Machine Learning Repository.
Step 3: Apply replace missing values.
Step 4: Apply normalize filter.
Step 5: Click the Classify Tab.
Step 6: Choose the Simple Linear Regression option.
Step 7: Select the training set of data.
Step 8: Start the validation process.
Step 9: Note the output.
Linear Regression:
In statistics, linear regression is an approach for modeling the relationship between a scalar
dependent variable Y and one or more explanatory variables denoted X. The case of a single
explanatory variable is called Simple Linear Regression.
The regression equation is given by: Y = aX + b, where a is the regression coefficient (slope)
and b is the intercept.
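For reference, the least-squares estimates of the slope a and intercept b (standard formulas,
not specific to this dataset) are:

a = Σ (x_i - mean_x)(y_i - mean_y) / Σ (x_i - mean_x)^2
b = mean_y - a * mean_x

where mean_x and mean_y are the means of the x (experience) and y (salary) values.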
Problem:
Consider the dataset below, where x is the number of years of working experience of a college
graduate and y is the corresponding salary of the graduate. Build a regression equation and
predict the salary of a college graduate whose experience is 10 years.
Input:
Output:
Result: Thus the concept of Linear Regression for training the given dataset was applied and
implemented.
Ex.No:5
Date:
Testing the Given Dataset for an Application
Aim:
To apply Naive Bayes classification for testing the given dataset.
Procedure:
Step 1: Open the weka tool.
Step 2: Download a dataset from the UCI Machine Learning Repository.
Step 3: Apply replace missing values.
Step 4: Apply normalize filter.
Step 5: Click the Classify tab.
Step 6: Apply the Naive Bayes classifier.
Step 7: Find the Classified Value.
Step 8: Note the output.
Example: Predict whether a customer will buy a computer or not. Customers are described by
two attributes: age and income. X is a 35-year-old customer with an income of 40k. H is the
hypothesis that the customer will buy a computer. P(H|X) reflects the probability that customer
X will buy a computer given that we know the customer's age and income.
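For reference, Naive Bayes computes this posterior probability using Bayes' theorem, with the
naive assumption that the attributes are conditionally independent given the hypothesis:

P(H|X) = P(X|H) * P(H) / P(X)
P(X|H) = P(age = 35 | H) * P(income = 40k | H)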
Input:
Output:
Result:
Thus Naive Bayes classification for testing the given dataset was implemented.
Ex.No:6
Date:
Write the Query for Schema Definition
Ex.No.6.1 Query for Star schema using SQL Server Management Studio
Aim:
To execute and verify query for star schema using SQL Server Management Studio
Procedure:
Step 1: Install SQL Server Express (SQLEXPR) and SQL Server Management Studio
Step 2: Launch SQL Server Management Studio
Step 3: Create a new database and write the query for creating the star schema tables
Step 4: Execute the query for schema
Step 5: Explore the database diagram for Star schema
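Sample Query (a minimal sketch only; the table and column names below are hypothetical,
assuming a sales fact table with Date, Product, and Salesperson dimensions):

CREATE TABLE DimDate (
    DateKey        INT PRIMARY KEY,
    FullDate       DATE,
    MonthName      VARCHAR(20),
    CalendarYear   INT
);

CREATE TABLE DimProduct (
    ProductKey     INT PRIMARY KEY,
    ProductName    VARCHAR(100),
    Category       VARCHAR(50)
);

CREATE TABLE DimSalesperson (
    SalespersonKey  INT PRIMARY KEY,
    SalespersonName VARCHAR(100),
    Region          VARCHAR(50)
);

-- Fact table at the centre of the star, holding measures and dimension keys
CREATE TABLE FactSales (
    SalesKey        INT IDENTITY(1,1) PRIMARY KEY,
    DateKey         INT FOREIGN KEY REFERENCES DimDate(DateKey),
    ProductKey      INT FOREIGN KEY REFERENCES DimProduct(ProductKey),
    SalespersonKey  INT FOREIGN KEY REFERENCES DimSalesperson(SalespersonKey),
    Quantity        INT,
    SalesAmount     DECIMAL(10,2)
);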
Result:
Thus the Query for Star Schema was created and executed successfully
Ex.No.6.2 Query for SnowFlake schema using SQL Server Management Studio
Aim:
To execute and verify query for SnowFlake schema using SQL Server Management Studio
Procedure:
Step 1: Install SQL Server Express (SQLEXPR) and SQL Server Management Studio
Step 2: Launch SQL Server Management Studio
Step 3: Create a new database and write the query for creating the snowflake schema tables
Step 4: Execute the query
Step 5: Explore the database diagram for the snowflake schema
Step 6: Connect the Geography table to the Salesperson and Product tables through the Geography key
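Sample Query (a minimal sketch; it assumes the hypothetical star-schema tables from
Ex.No.6.1 and normalizes the geography attributes into a separate table, as in Step 6):

-- Geography is split out into its own dimension table (snowflaking)
CREATE TABLE DimGeography (
    GeographyKey  INT PRIMARY KEY,
    City          VARCHAR(50),
    Region        VARCHAR(50),
    Country       VARCHAR(50)
);

-- The Salesperson and Product dimensions reference Geography through a key
ALTER TABLE DimSalesperson ADD GeographyKey INT
    FOREIGN KEY REFERENCES DimGeography(GeographyKey);

ALTER TABLE DimProduct ADD GeographyKey INT
    FOREIGN KEY REFERENCES DimGeography(GeographyKey);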
Output
Result:
Thus the Query for SnowFlake Schema was created and executed successfully
Ex.No:7
Date:
Design Data Warehouse for Real Time Applications
Aim:
To design and execute a data warehouse for a real-time application using SQL Server Management
Studio
Procedure:
Step 1: Launch SQL Server Management Studio
Step 2: Explore the created database
Step 3: 3.1 Right-click on the table name and click on the Edit top 200 rows option.
3.2. Enter the data inside the table or use the top 1000 rows option and enter the query.
Step 4: Execute the query, and the data will be updated in the table.
Step 5: Right-click on the database and click on the tasks option. Use the import data option to
import files to the database.
Sample Query
INSERT INTO dbo.person(first_name,last_name,gender) VALUES
('Kavi','S','M'), ('Nila','V','F'), ('Nirmal','B','M'), ('Kaviya','M','F');
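The INSERT above assumes that a dbo.person table already exists; a minimal, hypothetical
definition that would make the sample runnable is:

CREATE TABLE dbo.person (
    person_id   INT IDENTITY(1,1) PRIMARY KEY,  -- surrogate key
    first_name  VARCHAR(50) NOT NULL,
    last_name   VARCHAR(50) NOT NULL,
    gender      CHAR(1)                         -- 'M' or 'F'
);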
Result:
Thus, the data warehouse for real-time applications was designed successfully.
Ex.No:8
Date:
Case Study Using OLAP
Aim:
To evaluate the implementation and impact of OLAP technology in a real-world business
context, analyzing its effectiveness in enhancing data analysis, decision-making, and overall
operational efficiency.
Introduction:
OLAP stands for On-Line Analytical Processing. OLAP is a category of
software technology that enables analysts, managers, and executives to gain insight into
information through fast, consistent, interactive access to a wide variety of possible views of
data that has been transformed from raw information to reflect the real dimensionality of the
enterprise as understood by its users. It is used to analyze business data from different
points of view. Organizations collect and store data from multiple data sources, such as
websites, applications, smart meters, and internal systems.
Methodology
OLAP (Online Analytical Processing) methodology refers to the approach and techniques
used to design, create, and use OLAP systems for efficient multidimensional data analysis. Here
are the key components and steps involved in the OLAP methodology:
1. Requirement Analysis:
The process begins with understanding the specific analytical requirements of the
users. Analysts and stakeholders define the dimensions, measures, hierarchies, and data sources
that will be part of the OLAP system. This step is crucial to ensure that the OLAP system meets
the business needs.
2. Dimensional Modeling:
Dimension tables are designed to represent attributes like time, geography, and
product categories. Fact tables contain the numerical data (measures) and the keys to
dimension tables.
3. Star Schema:
This is a common design in OLAP systems where the fact table is at the center, connected to
dimension tables.
Operations in OLAP
In OLAP (Online Analytical Processing), operations are the fundamental actions performed on
multidimensional data cubes to retrieve, analyze, and present data in a way that facilitates
decision-making and data exploration. The main operations in OLAP are:
1. Slice: Slicing selects a single value for one dimension of the cube, producing a sub-cube
with one dimension less. For example, you can slice the cube to view sales data for a single
year across all products and regions.
2. Dice: Dicing is the process of selecting specific values from two or more dimensions to
create a subcube. It allows you to focus on a particular combination of attributes. For
example, you can dice the cube to view sales data for a specific product category and region
within a certain time frame.
3. Roll-up (Drill-up): Roll-up allows you to move from a more detailed level of data to a
higher-level summary. For instance, you can roll up from daily sales data to monthly or yearly
sales data, aggregating the information.
4. Drill-down (Drill-through): Drill-down is the opposite of roll-up, where you move from
a higher-level summary to a more detailed view of the data. For example, you can drill
down from yearly sales data to quarterly, monthly, and daily data, getting more granularity.
5. Pivot (Rotate): Pivoting involves changing the orientation of the cube, which means
swapping dimensions to view the data from a different perspective. This operation is useful for
exploring data in various ways.
6. Slice and Dice: Combining slicing and dicing allows you to select specific values from
different dimensions to create subcubes. This operation helps you focus on a highly specific
subset of the data.
7. Drill-across: Drill-across involves navigating between cubes that are related but have
different dimensions or hierarchies. It allows users to explore data across different OLAP cubes.
8. Data Filtering: In OLAP, you can filter data to view only specific data points or subsets
that meet certain criteria. This operation is useful for narrowing down data to what is most
relevant for analysis.
(Figures: illustrations of the Slice, Dice, Roll-Up, Pivot, and Drill-Down operations.)
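On a relational warehouse, several of these operations map directly onto SQL. The following
sketches assume the hypothetical FactSales, DimProduct, and DimDate tables from Ex.No.6.1:

-- Slice: fix one dimension value (sales for a single year only)
SELECT p.Category, SUM(f.SalesAmount) AS TotalSales
FROM FactSales f
JOIN DimProduct p ON f.ProductKey = p.ProductKey
JOIN DimDate d    ON f.DateKey    = d.DateKey
WHERE d.CalendarYear = 2023
GROUP BY p.Category;

-- Roll-up: aggregate from product level up to category level and a grand total
SELECT p.Category, p.ProductName, SUM(f.SalesAmount) AS TotalSales
FROM FactSales f
JOIN DimProduct p ON f.ProductKey = p.ProductKey
GROUP BY ROLLUP (p.Category, p.ProductName);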
3. Data Loading:
Load the integrated and preprocessed transaction data into the OLAP cube. Ensure that the
cube is regularly updated to reflect the most recent data.
4. OLAP Cube Design:
Define hierarchies and relationships within the cube to enable effective analysis. For instance,
you might have hierarchies that allow drilling down from product categories to individual
products.
5. Market Basket Analysis:
Although OLAP cubes are not designed for direct market basket analysis, they can
facilitate it in several ways:
Conclusion
OLAP is a powerful technology for businesses and organizations seeking data insights,
informed decisions, and performance improvement. It enables multidimensional data
analysis, especially in complex, data-intensive environments, and it empowers businesses to
analyze data efficiently and effectively, offering a competitive advantage in today's
data-driven world.
Ex.No:9
Date:
Case Study Using OLTP
Aim:
To develop an OLTP system that enables the e-commerce company to process a high volume of
online orders, track inventory, manage customer information, and handle financial
transactions in real-time, ensuring data integrity and providing a seamless shopping
experience for customers.
Introduction:
In today's digital age, businesses across various industries are relying heavily on technology to
streamline their operations and provide seamless services to their customers. One crucial
aspect of this technological transformation is the development and implementation of
efficient Online Transaction Processing (OLTP) systems. This case study delves into the
design and implementation of an OLTP system for a fictional e-commerce company,
"TechTrend Electronics," and examines the key considerations, challenges, and aims
associated with such a project.
This case study aims to showcase the process of developing an OLTP system tailored to
TechTrend Electronics' unique requirements. The objective is to ensure that the company can
efficiently handle a multitude of real-time transactions while maintaining data accuracy and
providing a seamless shopping experience for its customers.
Methodology:
The methodology for developing an OLTP (Online Transaction Processing) system for a case
study involves a systematic approach to designing, implementing, and testing the system.
Below is a step-by-step methodology for creating an OLTP system for a case study, using the
fictional e-commerce company "TechTrend Electronics" as an example:
1. Database Design:
Develop a well-structured relational database schema that aligns with the business
requirements.
Normalize the data to eliminate redundancy and ensure data consistency.
Create entity-relationship diagrams and define data models for key entities like customers,
products, orders, payments, and inventory (a minimal SQL sketch is given after this methodology).
2. Technology Selection:
Choose appropriate technologies for the database management system (e.g., MySQL,
PostgreSQL, Oracle) and programming languages (e.g., Java, Python, C#) for the OLTP
system.
Evaluate and select suitable frameworks, libraries, and tools that align with the chosen
technologies.
3. System Architecture:
Design the system's architecture, which may include multiple application layers, a web
interface, and a database layer.
Implement a layered architecture, separating concerns for scalability, maintainability, and
security.
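A minimal sketch of the database-design step, assuming hypothetical products and orders tables
for TechTrend Electronics; the transaction at the end shows how an order and the matching stock
update are committed as one atomic unit of work:

CREATE TABLE products (
    product_id    INT IDENTITY(1,1) PRIMARY KEY,
    product_name  VARCHAR(100) NOT NULL,
    unit_price    DECIMAL(10,2) NOT NULL,
    stock_qty     INT NOT NULL
);

CREATE TABLE orders (
    order_id     INT IDENTITY(1,1) PRIMARY KEY,
    customer_id  INT NOT NULL,
    product_id   INT NOT NULL FOREIGN KEY REFERENCES products(product_id),
    quantity     INT NOT NULL,
    order_date   DATETIME NOT NULL DEFAULT GETDATE()
);

-- Record an order and decrement stock atomically (OLTP transaction)
BEGIN TRANSACTION;
    INSERT INTO orders (customer_id, product_id, quantity)
    VALUES (1, 1, 2);

    UPDATE products
    SET stock_qty = stock_qty - 2
    WHERE product_id = 1;
COMMIT TRANSACTION;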
Conclusion:
In conclusion, OLTP systems play a pivotal role in modern business operations, facilitating
real-time transaction processing, data integrity, and customer interactions. These systems are
designed for high concurrency, low-latency, and consistent data access, making them
essential for day-to-day operations in various industries, such as finance, e-commerce,
healthcare, and more.
Overall, OLTP systems are the backbone of modern business operations, ensuring the
seamless execution of day-to-day transactions and delivering a positive customer experience.
Ex.No:10
Date:
Implementation of Warehouse Testing.
Aim:
To perform load testing using JMeter and interact with a SQL Server database using SQL
Management Studio, you'll need to set up JMeter to send SQL queries to the database
and collect the results for analysis.
Procedure:
1. Install Required Software:
Install JMeter: Download and install JMeter from the official Apache JMeter website.
Install SQL Server and SQL Management Studio: If you haven't already, set up SQL
Server and SQL Management Studio to manage your database.
2. Create a Test Plan in JMeter:
Launch JMeter and create a new Test Plan.
3. Add Thread Group:
Add a Thread Group to your Test Plan to simulate the number of users and requests.
4. Add JDBC Connection Configuration:
Add a JDBC Connection Configuration element to your Thread Group. Configure it
with the database connection details, such as the JDBC URL, username, and password.
This element will allow JMeter to connect to your SQL Server database.
5. Add a JDBC Request Sampler:
Add a JDBC Request sampler to the Thread Group, set its pool variable name to match the
one defined in the JDBC Connection Configuration, and enter the SQL query to be executed
during the test (a sample query is given after this procedure).
6. Add Listeners:
Add listeners to your Test Plan to collect and view the test results. Common
listeners include View Results Tree, Summary Report, and Response Times
Over Time.
7. Configure Your Test Plan:
Configure the number of threads (virtual users), ramp-up time, and loop count in the
Thread Group to simulate the desired load.
8. Run the Test:
Start the test by clicking the Start button (or Run > Start) in JMeter.
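Sample Query (hypothetical; it reuses the dbo.person table from Ex.No.7, and the JDBC URL in
the comment follows the standard Microsoft SQL Server driver format):

-- JDBC URL for the JDBC Connection Configuration (the database name is an assumption):
--   jdbc:sqlserver://localhost:1433;databaseName=TestDB;encrypt=false
-- JDBC Driver class: com.microsoft.sqlserver.jdbc.SQLServerDriver
-- Query entered in the JDBC Request sampler:
SELECT first_name, last_name, gender
FROM dbo.person
WHERE gender = 'F';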
Conclusion
Using JMeter in conjunction with SQL Management Studio can be a powerful combination
for load testing and performance analysis of applications that rely on SQL Server databases.
This approach allows you to simulate a realistic user load, send SQL queries to the database,
and evaluate the system's performance under various conditions.
JMeter in combination with SQL Management Studio provides a robust solution for assessing
the performance of applications that rely on SQL Server databases. Through thorough testing,
analysis, and optimization, you can ensure your application is capable of delivering a reliable
and responsive experience to users even under heavy load conditions.