Course Title: Data Mining and Data Warehousing
Module Title
Module Code: M4913 Course Code: INSC 4914
CP/ECTS: 6
Study Hours Lecture: 48 Laboratory: 48 Tutorial: 0 Home Study: 93
Learning Outcomes By the end of this course, students should be able to:
Explain the concepts of data warehousing and data mining
Describe the dimensional modeling technique for designing a data warehouse
Describe data warehouse architectures, OLAP, and the project planning aspects
of building a data warehouse
Explain the knowledge discovery process
Course Content
Topic Duration References
Chapter 1: Introduction to Data Mining 1-3 Text: Chapter
o What is data mining?
o Data Mining Goals
o Stages of the Data Mining Process
o Overview of Data Mining Techniques
o Knowledge Representation Methods
o Integration of a data mining system with related
technologies: Machine Learning, DBMS, OLAP, Data
Warehouses, Statistics
o Major issues in Data Mining
o Applications of Data Mining
Chapter 2: Data Warehousing and Online Analytical Processing 4-7 Text: Chapter
2.1. Introduction to Data Warehouse
2.2. Characteristics of Data Warehouse
2.3. Types of Data Warehouse
2.4. Differences between Operational Database Systems
and Data Warehouses
2.5. Tools for Data Warehouse Development
2.6. Data Warehousing: A Multitiered Architecture
2.7. Data Warehouse Models: Enterprise Warehouse, Data
Mart, and Virtual Warehouse
2.8. Extraction, Transformation, and Loading
2.9. Metadata Repositories
2.10. Advantages and Applications of Data Warehouse
2.11. Introduction to OLAP
2.12. Characteristics of OLAP
2.13. Steps in the OLAP Creation Process
2.14. Advantages of OLAP
2.15. OLAP Architectures
2.16. Differences between OLTP Systems and Data
Warehousing
Chapter 3: Data Preparation for Knowledge Discovery 8-9
3.1. Introduction
3.2. Need for preprocessing or preparing the data
3.3. Data cleaning
3.4. Data integration and transformation
3.5. Data reduction
3.6. Discretization and concept hierarchy generation
Chapter 4: Data Mining Algorithms 10-12 Text: Chapter
4.1. Association rules
4.1.1. Basic Algorithms
4.1.2. Advanced Association Rule Techniques
4.1.3. Measuring the Quality of Rules
4.2. Classification and Prediction
4.2.1. Basic issues regarding classification and prediction
4.2.2. Classification by Decision Tree
4.2.3. Bayesian classification
4.2.4. Classification by backpropagation
4.2.5. Associative classification
4.2.6. Prediction
4.2.7. Statistical-Based Algorithms
4.2.8. Decision Tree-Based Algorithms
4.2.9. Neural Network-Based Algorithms
4.2.10. Rule-Based Algorithms
4.2.11. Combining Techniques
4.2.12. Classifier Accuracy and Error Measures
4.3. Clustering
4.3.1. Basic issues in clustering
4.3.2. First conceptual clustering system: Cluster/2
4.3.3. Partitioning methods: k-means, expectation
maximization (EM)
4.3.4. Hierarchical methods: distance-based agglomerative
and divisive clustering
4.3.5. Conceptual clustering: Cobweb
4.3.6. Experiments with Weka - k-means, EM, Cobweb
Chapter 5: Advanced Techniques, Data Mining Software and Applications 13-14 Text: Chapter
5.1. Text mining: extracting attributes (keywords),
structural approaches (parsing, soft parsing).
5.2. Bayesian approach to classifying text
5.3. Web mining: classifying web pages, extracting
knowledge from the web
5.4. Data Mining software and applications
Chapter 6: Data Mining and Society; Future Directions 15-16 Text: Chapter
6.1. Data Mining and Society: Ethics, Privacy, and Security
issues
6.2. Future directions for data mining, web mining, and text mining
Teaching Strategy
Assessment Criteria
Assessment Forms % of credit allotted
Lecture (100%)
Practice (100%)
Role of Instructor(s)
Role of Students
Required software and/or hardware: Weka, RapidMiner, KNIME, Orange,
Oracle Data Mining, Rattle (R language), etc.
Reference Textbooks:
1. Max Bramer, Principles of Data Mining, Springer.
2. Ian H. Witten, Eibe Frank, and Mark A. Hall, Data Mining: Practical
Machine Learning Tools and Techniques, Elsevier.
3. P.-N. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining,
Addison Wesley, 2006.
4. J. Han and M. Kamber, Data Mining: Concepts and Techniques, 3/E,
Morgan Kaufmann, 2011.
5. J. Han and M. Kamber, Data Mining: Concepts and Techniques,
Morgan Kaufmann, 2012.
6. Alex Berson and Stephen J. Smith, Data Warehousing, Data Mining,
and OLAP, McGraw-Hill, 1998.
7. P.-N. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining,
Addison Wesley, 2005.
8. Mohammed J. Zaki and Wagner Meira Jr., Data Mining and Analysis:
Fundamental Concepts and Algorithms, Cambridge University Press, 2014.
9. Arun K. Pujari, Data Mining Techniques, 2nd Edition, Universities
Press.
10. Sam Anahory and Dennis Murray, Data Warehousing in the Real World,
Pearson Education Asia.
11. K.P. Soman, V. Vijay, Insight into Data Mining, PHI, 2008.
12. Paulraj Ponniah, Data Warehousing Fundamentals, Wiley Student Edition.