30-Apr-24
Accounting Information Systems
Fifteenth Edition, Global Edition
Chapter 6
Transforming Data
• Copyright © 2021 Pearson Education Ltd.
Learning Objectives
• Describe the principles of data structuring related to
data aggregation, data joining, and data pivoting.
• Describe data parsing, data concatenation, cryptic data
values, misfielded data values, data formatting, and data
consistency and how they relate to data standardization.
• Describe how to diagnose and fix the data cleaning errors
related to data duplication, data filtering, data contradiction
errors, data threshold violations, violated attribute
dependencies, and data entry errors.
• List and describe four different techniques to perform data
validation.
Table 6.1 Attributes of High-Quality Data
Data Structuring (1 of 2)
• Data structuring is the process of changing the
organization and relationships among data fields to
prepare the data for analysis.
• Extracted data often needs to be structured in a manner
that will enable analysis. This can entail
– aggregating the data at different levels of detail
– joining different data together, and/or
– pivoting the data
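As a minimal sketch of aggregating data at different levels of detail, the snippet below subtotals hypothetical sales records by region and also computes a grand total; the field names and values are illustrative, not taken from the chapter's data:

```python
from collections import defaultdict

# Hypothetical sales records; field names and values are illustrative.
sales = [
    {"region": "East", "month": "Jan", "amount": 100.0},
    {"region": "East", "month": "Feb", "amount": 150.0},
    {"region": "West", "month": "Jan", "amount": 200.0},
]

# Aggregate at two levels of detail: a subtotal per region and a grand total.
by_region = defaultdict(float)
for row in sales:
    by_region[row["region"]] += row["amount"]

grand_total = sum(row["amount"] for row in sales)
```

The same pattern extends to any grouping level (month, region-and-month, and so on) by changing the dictionary key.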
Data Structuring (2 of 2)
• Aggregate data is the presentation of data in a
summarized form.
• Data joining is the process of combining different data
sources.
• Data pivoting is rotating data from rows to columns.
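A minimal sketch of pivoting: the snippet below rotates long-format rows (vendor, product category, cost) into a wide layout in which each product category becomes a column keyed under its vendor. The values are illustrative:

```python
# Pivot long-format rows (vendor, category, cost) into a wide layout in
# which each product category becomes a column. Values are illustrative.
rows = [
    ("B&D", 1, 15982.00),
    ("B&D", 2, 2529.00),
    ("Honeywell", 1, 43282.53),
]

pivot = {}
for vendor, category, cost in rows:
    pivot.setdefault(vendor, {})[category] = cost
```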
Figure 6.4 Examples of Different Levels of Aggregating Data
Figure 6.5 Pivoting S&S Data
Figure 6.6 Pivoting Figure 6.5 Data
VendorName         ProdCat  TotalCosts
B&D                1        $15,982.00
B&D                2        $2,529.00
Black and Decker   1        $2,220.06
Black and Decker   2        $568.00
Black and Decker   3        $13,024.57
Calphalon          1        $19,509.75
Honeywell          2        $5,516.90
Honeywell          1        $43,282.53
Oster              1        $28,020.11
Panasonic          1        $15,765.12
Panasonic          2        $5,693.50
Data Standardization (1 of 3)
• Data standardization is the process of standardizing the
structure and meaning of each data element so it can be
analyzed and used in decision making.
– It is particularly important when merging data from
several sources.
– It may involve changing data to a common format, data
type, or coding scheme.
– It encompasses ensuring the information is contained
in the correct field and the fields are organized in a
useful manner.
Data Standardization (2 of 3)
• Data parsing involves separating data from a single field
into multiple fields.
– It is often an iterative process that relies heavily on
pattern recognition.
• Data concatenation is the combining of data from two or
more fields into a single field.
– It is often used to create a unique identifier for a row.
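Both operations can be sketched in a few lines. Below, a single "Last, First" name field is parsed into two fields, and an invoice number and line number are concatenated into a unique row identifier; the field names and values are hypothetical:

```python
# Parsing: split a single "Last, First" name field into two fields.
full_name = "Ashton, Scott"   # illustrative value
last, first = [part.strip() for part in full_name.split(",")]

# Concatenation: combine invoice and line numbers into a unique row key.
invoice_no, line_no = "10231", "03"   # illustrative values
row_id = f"{invoice_no}-{line_no}"
```

Real name data is messier (middle names, suffixes, missing commas), which is why parsing is usually iterative and pattern-driven, as noted above.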
Figure 6.7 Data Parsing Example
Figure 6.8 Data Concatenation Example
Data Standardization (3 of 3)
• Cryptic data values are data items that have no meaning
without understanding a coding scheme.
– When a field contains only two different responses,
typically 0 or 1, this field is called a dummy variable or
dichotomous variable.
• Misfielded data values are data values that are correctly
formatted but not listed in the correct field.
• Data consistency is the principle that every value in a
field should be stored in the same way.
Data Cleaning (1 of 3)
• Data cleaning is the process of updating data to be
consistent, accurate, and complete.
– Dirty data is data that is inconsistent, inaccurate, or
incomplete.
– To be useful, dirty data must be cleaned.
• Data de-duplication is the process of analyzing data and,
when two or more records contain identical information,
removing the redundant copies so only one remains.
• Data filtering is the process of removing records or fields
of information from a data source.
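The two steps above can be sketched together: drop exact duplicate records (keeping the first copy) and then filter out records below a cutoff. The records and the $100 cutoff are illustrative:

```python
# De-duplication: drop exact duplicate records, keeping the first copy.
# Filtering: then remove records below a cutoff. Values are illustrative.
records = [("INV-1", 500.0), ("INV-2", 25.0), ("INV-1", 500.0)]

deduped = list(dict.fromkeys(records))            # preserves first-seen order
filtered = [r for r in deduped if r[1] >= 100.0]  # keep invoices of $100+
```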
Data Cleaning (2 of 3)
• Data imputation is the process of replacing a null or
missing value with a substituted value.
– It only works with numeric data.
• Data contradiction errors are errors that exist when the
same entity is described in two conflicting ways.
– Contradiction errors need to be investigated and
resolved appropriately.
• Data threshold violations are data errors that occur when
a data value falls outside an allowable level.
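A minimal sketch combining the two ideas above, under illustrative assumptions: missing (None) values in a numeric field are imputed with the mean of the observed values, and any value outside an allowable 0-80 range for weekly hours is flagged as a threshold violation:

```python
from statistics import mean

# Mean imputation for missing (None) values, followed by a threshold check
# (here: weekly hours must fall between 0 and 80). Values are illustrative.
hours = [40, None, 38, 45, 120]
observed = [h for h in hours if h is not None]
imputed = [h if h is not None else mean(observed) for h in hours]

violations = [h for h in imputed if not 0 <= h <= 80]
```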
Data Cleaning (3 of 3)
• Violated attribute dependencies are errors that occur
when a secondary attribute in a row of data does not
match the primary attribute.
• Data entry errors are all types of errors that come from
inputting data incorrectly.
– They often occur in human data entry and can also be
introduced by the computer system.
– They may be indistinguishable from data formatting
and data consistency errors in an output data file.
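A violated attribute dependency can be diagnosed with a lookup: if one attribute (here, city) should determine another (state), any row where the two disagree is flagged for review. The mapping and rows below are hypothetical:

```python
# Violated attribute dependency check: a city should determine its state,
# so any row whose state disagrees with the lookup is flagged for review.
# The mapping and rows are hypothetical.
city_to_state = {"Dayton": "OH", "Boise": "ID"}
rows = [
    {"city": "Dayton", "state": "OH"},
    {"city": "Boise", "state": "OH"},   # dependency violated
]

violations = [r for r in rows if city_to_state.get(r["city"]) != r["state"]]
```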
Data Validation (1 of 2)
• Data validation is the process of analyzing data to make
certain the data has the properties of high-quality data.
– It is both a formal and informal process.
– It is an important precursor to data cleaning.
– The techniques used to validate data can be thought of
as a continuum from simple to complex.
Data Validation (2 of 2)
• Visual inspection is the process of examining data using
human vision to see if there are problems.
• Basic statistical tests can be performed to validate the
data.
• Auditing a sample is one of the best techniques for
assuring data quality.
• Advanced testing techniques are possible with a deeper
understanding of the content of data.
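As a minimal sketch of the basic statistical tests mentioned above, the snippet summarizes a field's null count, minimum, and maximum, any of which can quickly expose missing or out-of-range values. The data is illustrative:

```python
from statistics import mean

# A basic statistical validation pass: null count, minimum, maximum, and
# mean can quickly expose missing or out-of-range values. Data illustrative.
amounts = [19.99, 250.00, None, 4.50, 99999.00]
observed = [a for a in amounts if a is not None]

summary = {
    "n_null": amounts.count(None),
    "min": min(observed),
    "max": max(observed),
    "mean": round(mean(observed), 2),
}
```

Here the suspiciously large maximum would prompt a closer look, which is exactly the kind of lead a validation pass is meant to surface.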
Key Terms
• Data structuring
• Aggregate data
• Data pivoting
• Data standardization
• Data parsing
• Data concatenation
• Cryptic data values
• Dummy variable or dichotomous variable
• Misfielded data values
• Data consistency
• Dirty data
• Data cleaning
• Data de-duplication
• Data filtering
• Data imputation
• Data contradiction errors
• Data threshold violations
• Violated attribute dependencies
• Data entry errors
• Data validation
• Visual inspection