Chapter 6
Foundation of Business Intelligence:
Database & Information Management
Management Information System _ Lai Vinh Phuc, MBA
Learning Objectives
• What are the problems of managing data resources in a traditional
file environment?
• What are the major capabilities of database management systems
(DBMS), and why is a relational DBMS so powerful?
• What are the principal tools and technologies for accessing
information from databases to improve business performance and
decision making?
• Why are information policy, data administration, and data quality
assurance essential for managing the firm’s data resources?
Content:
I. Problem of Managing Data Resources
II. Database Management System (DBMS)
III. Tools & Technologies for Accessing Information in Databases
IV. Information Policy & Data Quality
I. Problem of Managing Data Resources
1. File Organization Terms & Concepts:
• Database: Group of related files
• File: Group of records of same type
• Record: Group of related fields
• Field: Group of characters as word(s) or number(s)
• Entity: Person, place, or thing about which we store information
• Attribute: Each characteristic, or quality, describing an entity
What are entities and attributes in a university database?
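One possible answer to the question above, written as a minimal Python sketch. The entities (Student, Course), their attribute names, and the sample values are illustrative assumptions, not taken from the slides.

```python
from dataclasses import dataclass, fields

# Hypothetical entities for a university database; all names are illustrative.
@dataclass
class Student:          # entity: a person we store information about
    student_id: int     # each attribute is stored as a field
    name: str
    major: str

@dataclass
class Course:           # entity: a thing we store information about
    course_id: str
    title: str
    credits: int

# A record is a group of related fields describing one instance of an entity.
record = Student(student_id=1001, name="An Nguyen", major="Business Administration")

# A file is a group of records of the same type.
student_file = [record, Student(1002, "Binh Tran", "Information Technology")]

# The attributes of the Student entity are the names of its fields.
student_attributes = [f.name for f in fields(Student)]
print(student_attributes)  # ['student_id', 'name', 'major']
```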
Figure 6.1: The Data Hierarchy
2. Problems with the Traditional File Environment:
• Files maintained separately by different departments
• Data redundancy
• Data inconsistency
• Program-data dependence
• Lack of flexibility
• Poor security
• Lack of data sharing and availability
Figure 6.2: Traditional File Processing
a. Data Redundancy & Data Inconsistency:
• Data redundancy: the presence of duplicate data in multiple data files stored in different locations.
• Occurs when different groups independently collect the same piece of data.
• Wastes resources and leads to data inconsistency (the same attribute has
different values).
• E.g.: In 6.1, the same attribute name “Date” holds the course registration date
in the BA database but the course end date in the IT database.
• Other examples: Student_ID vs. ID as names for the same attribute; clothing size
recorded as “extra large” vs. “XL”.
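The inconsistency described above can be made concrete with a small Python sketch. It assumes two departments keep separate files for the same students; the file names, records, and the helper `find_inconsistencies` are hypothetical.

```python
# Two departments keep their own copy of the same student data.
# All names and values below are made up for illustration.
ba_file = [{"Student_ID": 1001, "Size": "extra large"},
           {"Student_ID": 1002, "Size": "M"}]
it_file = [{"ID": 1001, "Size": "XL"},
           {"ID": 1002, "Size": "M"}]

def find_inconsistencies(ba_records, it_records):
    """Return (id, BA value, IT value) for students whose 'Size' differs."""
    it_by_id = {row["ID"]: row for row in it_records}
    conflicts = []
    for row in ba_records:
        # Same attribute, but each file uses a different field name for it.
        other = it_by_id.get(row["Student_ID"])
        if other is not None and other["Size"] != row["Size"]:
            conflicts.append((row["Student_ID"], row["Size"], other["Size"]))
    return conflicts

conflicts = find_inconsistencies(ba_file, it_file)
print(conflicts)  # [(1001, 'extra large', 'XL')]
```

Both values describe the same size, yet a program comparing the files sees them as different, which is why redundant copies tend to drift apart.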
b. Program-Data Dependence:
• Data in files is coupled to the specific programs that update and maintain
those files, so a change in a program requires a change to the data it accesses.
c. Lack of Flexibility:
• A traditional file system can deliver routine scheduled reports but cannot
respond quickly to unanticipated information requests.
d. Poor Security:
• Little control over or management of data
e. Lack of Data Sharing & Availability:
• Information cannot flow freely across separate files; when the same piece of
information has different values in two systems, users cannot trust its accuracy.
II. Database Management System (DBMS)
1. Database Management System (DBMS):
• Database: Serves many applications by centralizing data and
controlling redundant data
• Database management system (DBMS):
• Interfaces between applications and physical data files
• Separates logical and physical views of data
• Solves problems of traditional file environment
- Controls redundancy
- Eliminates inconsistency
- Uncouples programs and data
- Enables organization to centrally manage data and data security
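As a rough illustration of separating logical views from the physical data, the sketch below uses Python's built-in sqlite3 module to store one employee table and expose two different views of it. The table, column, and view names and the sample rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One physical table holds all employee data (hypothetical schema).
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT,
        ssn         TEXT,
        salary      REAL,
        health_plan TEXT
    );
    INSERT INTO employee VALUES (1, 'An Nguyen', '000-00-0001', 52000, 'Plan A');
    INSERT INTO employee VALUES (2, 'Binh Tran', '000-00-0002', 61000, 'Plan B');

    -- Logical view for a benefits specialist: no salary data.
    CREATE VIEW benefits_view AS
        SELECT name, ssn, health_plan FROM employee;

    -- Logical view for a payroll clerk: no health-plan data.
    CREATE VIEW payroll_view AS
        SELECT name, ssn, salary FROM employee;
""")

benefits_rows = conn.execute("SELECT * FROM benefits_view ORDER BY name").fetchall()
payroll_rows = conn.execute("SELECT * FROM payroll_view ORDER BY name").fetchall()
print(benefits_rows)
print(payroll_rows)
```

Each application sees only the columns it needs, while the data itself is stored once.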
Figure 6.3: Human Resources Database with Multiple Views
2. Relational DBMS:
• Represents data as two-dimensional tables
• Each table contains data on an entity and its attributes
• Table: grid of columns and rows
- Rows (tuples): Records for different entities
- Fields (columns): Represent attributes of the entity
- Key field: Field used to uniquely identify each record
- Primary key: The key field that uniquely identifies each record in a table
- Foreign key: Primary key used in second table as look-up field to identify records
from original table
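A minimal sketch of primary and foreign keys, using Python's built-in sqlite3 module. The supplier and part tables, their columns, and the sample values are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (
        supplier_number INTEGER PRIMARY KEY,   -- primary key of supplier
        supplier_name   TEXT
    );
    CREATE TABLE part (
        part_number     INTEGER PRIMARY KEY,   -- primary key of part
        part_name       TEXT,
        -- Foreign key: the supplier's primary key reused as a look-up field.
        supplier_number INTEGER REFERENCES supplier(supplier_number)
    );
    INSERT INTO supplier VALUES (8259, 'Supplier A');
    INSERT INTO part VALUES (137, 'Door latch', 8259);
""")

# Follow the foreign key from a part back to its supplier record.
supplier_name = conn.execute(
    "SELECT supplier_name FROM supplier WHERE supplier_number = "
    "(SELECT supplier_number FROM part WHERE part_number = ?)",
    (137,),
).fetchone()[0]
print(supplier_name)  # Supplier A
```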
Figure 6.4: Relational Database Tables
3. Operations of a Relational DBMS:
• Three basic operations used to develop useful sets of data
- SELECT: Creates subset of data of all records that meet stated criteria
- JOIN: Combines relational tables to provide user with more
information than available in individual tables
- PROJECT: Creates subset of columns in table, creating tables with only
the information specified
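The three operations can be sketched in SQL through Python's built-in sqlite3 module. The tables, columns, and rows below are hypothetical; note that the relational PROJECT operation corresponds to the column list of an SQL query, while the relational SELECT corresponds to the WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (supplier_number INTEGER PRIMARY KEY, supplier_name TEXT);
    CREATE TABLE part (part_number INTEGER PRIMARY KEY, part_name TEXT,
                       unit_price REAL, supplier_number INTEGER);
    INSERT INTO supplier VALUES (1, 'Supplier A'), (2, 'Supplier B');
    INSERT INTO part VALUES (137, 'Door latch', 22.0, 1),
                            (145, 'Side mirror', 12.0, 2),
                            (150, 'Door molding', 6.0, 1);
""")

# SELECT: keep only the rows (records) that meet the stated criteria.
selected = conn.execute(
    "SELECT * FROM part WHERE part_number IN (137, 150) ORDER BY part_number"
).fetchall()

# JOIN: combine the two tables on their shared supplier_number field.
joined = conn.execute("""
    SELECT p.part_number, p.part_name, s.supplier_name
    FROM part AS p
    JOIN supplier AS s ON p.supplier_number = s.supplier_number
    ORDER BY p.part_number
""").fetchall()

# PROJECT: keep only the columns that are needed.
projected = conn.execute(
    "SELECT part_number, part_name FROM part ORDER BY part_number"
).fetchall()

print(selected)
print(joined)
print(projected)
```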
Figure 6.5: Relational Database Tables
4. Capabilities of Database Management Systems:
• Data definition capability: specifies the structure of the content of the database
• Data dictionary: automated or manual file that stores definitions of data
elements and their characteristics
E.g.: name, description, size, type, ownership, security, users.
• Querying and reporting:
- Data manipulation language: used to add, change, delete, and retrieve data
- Structured Query Language (SQL)
- Many DBMS have report generation capabilities for creating polished
reports (e.g., Microsoft Access)
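The capabilities listed above can be tried out with Python's built-in sqlite3 module. This is only a sketch: the customer table and its values are invented, and SQLite's `PRAGMA table_info` stands in for a much richer data dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: specify the structure of the database content.
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        phone       TEXT
    )
""")

# Data dictionary: ask the DBMS for each column's name and type.
# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk).
data_dictionary = [(row[1], row[2])
                   for row in conn.execute("PRAGMA table_info(customer)")]

# Data manipulation with SQL: add, change, retrieve, and delete data.
conn.execute("INSERT INTO customer (customer_id, name, phone) "
             "VALUES (1, 'An Nguyen', '555-0100')")
conn.execute("UPDATE customer SET phone = '555-0199' WHERE customer_id = 1")
retrieved = conn.execute("SELECT name, phone FROM customer").fetchall()
conn.execute("DELETE FROM customer WHERE customer_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]

print(data_dictionary)
print(retrieved, remaining)
```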
Figure 6.6: Access Data Dictionary Features
Figure 6.7: Example of an SQL Query
Figure 6.8: An Access Query
5. Designing Databases:
• Conceptual design: abstract model of database from business perspective
• Physical design: how database is arranged on storage devices
• Normalization
- Streamlining complex groupings of data to minimize redundant data
elements and awkward many-to-many relationships
• Referential integrity
- Rules used by RDBMS to ensure relationships between tables remain
consistent
• Entity-relationship diagram
• A correct data model is essential for a system serving the business well
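To make normalization and referential integrity more tangible, here is a small sketch with Python's built-in sqlite3 module. The three-table layout (part, orders, line_item) is an assumed example of a normalized design, and SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Normalized design (hypothetical): each fact is stored once, and the
# many-to-many link between orders and parts lives in line_item.
conn.executescript("""
    CREATE TABLE part (part_number INTEGER PRIMARY KEY, part_name TEXT);
    CREATE TABLE orders (order_number INTEGER PRIMARY KEY, order_date TEXT);
    CREATE TABLE line_item (
        order_number INTEGER REFERENCES orders(order_number),
        part_number  INTEGER REFERENCES part(part_number),
        quantity     INTEGER,
        PRIMARY KEY (order_number, part_number)
    );
    INSERT INTO part VALUES (137, 'Door latch');
    INSERT INTO orders VALUES (1, '2024-01-15');
    INSERT INTO line_item VALUES (1, 137, 10);
""")

# Referential integrity: a line item may not point to a part that does not exist.
try:
    conn.execute("INSERT INTO line_item VALUES (1, 999, 5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

line_item_count = conn.execute("SELECT COUNT(*) FROM line_item").fetchone()[0]
print(rejected, line_item_count)
```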
Figure 6.9: An Unnormalized Relation for Order
Figure 6.10: Normalized Tables Created from ORDER
Figure 6.11: An Entity-Relationship Diagram
6. Non-relational Databases and Databases in the Cloud:
• Non-relational databases: “NoSQL”
- More flexible data model
- Data sets stored across distributed machines
- Easier to scale
- Handle large volumes of unstructured and structured data
• Databases in the cloud
- Appeal to start-ups, smaller businesses
- Amazon Relational Database Service, Microsoft SQL Azure
- Private clouds
Discussion Question
Why would an organization want to store social media data in a
non-relational database?
ANSWER: Social media involves many different file types, and the
information is not easily organized into tables of columns and rows
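The answer above can be illustrated with a toy document collection in plain Python. This is not a real NoSQL product; the posts, their fields, and the helper `find` are invented to show how records with different shapes can sit side by side without a fixed table schema.

```python
# Each post is a self-describing document; fields vary from post to post.
# All data below is made up for illustration.
posts = [
    {"id": 1, "user": "an", "text": "Great service!", "hashtags": ["happy"]},
    {"id": 2, "user": "binh", "photo_url": "photo_001.jpg", "likes": 12},
    {"id": 3, "user": "chi", "video": {"length_sec": 30, "format": "mp4"}},
]

def find(documents, field):
    """Return the documents that contain the given field."""
    return [doc for doc in documents if field in doc]

with_photos = find(posts, "photo_url")

# A single relational table would need a column for every field seen so far,
# most of them empty in any given row.
all_fields = sorted({key for doc in posts for key in doc})

print([doc["id"] for doc in with_photos])  # [2]
print(all_fields)
```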
III. Tools & Technologies for Accessing Information in Databases
1. The Challenge of Big Data:
• Big data
- Massive sets of unstructured/semi-structured data from web traffic, social
media, sensors, and so on
- Volumes too great for a typical DBMS
- Petabytes, exabytes of data
- Can reveal more patterns, relationships, and anomalies
- Requires new tools and technologies to manage and analyze
E.g.: Analyzing data from customer credit card purchases shows what customers
buy, which can be used to improve the menu (veggies, seafood, or drinks)
2. Business Intelligence Infrastructure:
• Array of tools for obtaining information from separate systems and
from big data
• Data warehouse:
– Stores current and historical data from many core operational transaction
systems
– Consolidates and standardizes information for use across enterprise, but data
cannot be altered
– Provides analysis and reporting tools
• Data marts
– Subset of data warehouse
– Typically focus on single subject or line of business
• Hadoop
- Open-source software framework that enables distributed parallel processing of big
data across inexpensive computers
- Key services:
+ Hadoop Distributed File System (HDFS): data storage
+ MapReduce: breaks down the processing of large data sets and assigns the work to nodes in a cluster
+ HBase: NoSQL database
- Used by Yahoo, NextBio
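The MapReduce idea can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce steps, not the Hadoop API; the function names and the sample text chunks are assumptions.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: combine the values for each key into one result."""
    return {key: sum(values) for key, values in grouped.items()}

# In a real cluster each chunk would be processed on a different node.
chunks = ["big data big tools", "data tools data"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)  # {'big': 2, 'data': 3, 'tools': 2}
```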
• In-memory computing
- Used in big data analysis
- Uses the computer's main memory (RAM) for data storage to avoid delays in
retrieving data from disk storage
- Can reduce hours/days of processing to seconds
- Requires optimized hardware
• Analytic platforms
- High-speed platforms using both relational and non-relational tools optimized for
large datasets
Figure 6.12: Contemporary Business Intelligence Infrastructure
3. Analytical Tools: Relationships, Patterns, Trends:
• Tools for consolidating, analyzing, and providing access to vast
amounts of data to help users make better business decisions
- Multidimensional data analysis (OLAP)
- Data mining
- Text mining
- Web mining
a. Online Analytical Processing (OLAP):
• Supports multidimensional data analysis
- Viewing data using multiple dimensions
- Each aspect of information (product, pricing, cost, region, time period)
is a different dimension
• OLAP enables rapid, online answers to ad hoc queries
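A rough sketch of multidimensional analysis in plain Python: the same sales facts are totaled along different dimensions on request. The sales figures, dimension names, and the helper `rollup` are hypothetical, and a real OLAP tool would precompute and index such totals.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, quarter, units_sold).
sales = [
    ("nuts", "East", "Q1", 50), ("nuts", "West", "Q1", 60),
    ("bolts", "East", "Q1", 40), ("bolts", "East", "Q2", 70),
    ("nuts", "East", "Q2", 30),
]
DIMENSIONS = {"product": 0, "region": 1, "quarter": 2}

def rollup(facts, *dims):
    """Total units sold for each combination of the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[DIMENSIONS[d]] for d in dims)
        totals[key] += fact[3]
    return dict(totals)

# The same data viewed along one dimension, then along two.
by_product = rollup(sales, "product")
by_product_region = rollup(sales, "product", "region")
print(by_product)         # {('nuts',): 140, ('bolts',): 110}
print(by_product_region)
```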
Figure 6.13: Multidimensional Data Model
b. Data Mining:
• Finds hidden patterns, relationships in datasets
- E.g.: customer buying patterns
• Infers rules to predict future behavior
• Types of information obtainable from data mining:
- Associations: occurrences linked to single event
- Sequences: events linked over time
- Classification: recognizes patterns that describe group to which item belongs
- Clustering: similar to classification when no groups have been defined; finds
groupings within data
- Forecasting: uses series of existing values to forecast what other values will be
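As one small example of the "associations" type above, the sketch below counts how often pairs of items appear in the same purchase. The baskets are invented, and the support measure shown is a simplified stand-in for full association-rule mining.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets, one set of items per transaction.
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips", "salsa"},
    {"cola", "bread"},
]

# Count every pair of items bought together in the same basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

def support(pair):
    """Fraction of all baskets that contain both items of the pair."""
    return pair_counts[tuple(sorted(pair))] / len(baskets)

print(pair_counts[("chips", "cola")])  # 2
print(support(("cola", "chips")))      # 0.5
```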
c. Text Mining:
• Extracts key elements from large unstructured data sets
• Sentiment analysis software
d. Web mining:
• Discovery and analysis of useful patterns and information from web
+ Web content mining
+ Web structure mining
+ Web usage mining
E.g.: Marketers use the Google Trends and Google Insights for Search
services to track the popularity of the words and phrases that consumers
search for.
4. Databases and the Web:
• Many companies use the web to make some internal databases
available to customers or partners
• Typical configuration includes:
- Web server
- Application server/middleware/CGI scripts
- Database server (hosting DBMS)
• Advantages of using the web for database access:
- Ease of use of browser software
- Web interface requires few or no changes to database
- Inexpensive to add web interface to system
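The role of the application server in the configuration above can be sketched as a single Python function that turns a browser request into a database query and the result into HTML. The product table, the URL format, and the function `handle_request` are assumptions; a real deployment would sit behind an actual web server.

```python
import sqlite3
from html import escape
from urllib.parse import urlparse, parse_qs

# Stand-in for the database server (hypothetical product table).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, price REAL);
    INSERT INTO product VALUES (1, 'Desk lamp', 19.5), (2, 'Office chair', 89.0);
""")

def handle_request(url):
    """Application-server step: request URL -> SQL query -> HTML fragment."""
    query = parse_qs(urlparse(url).query)
    product_id = int(query["id"][0])
    # A parameterized query keeps user input out of the SQL text.
    row = conn.execute(
        "SELECT name, price FROM product WHERE product_id = ?", (product_id,)
    ).fetchone()
    if row is None:
        return "<p>Product not found</p>"
    name, price = row
    return f"<p>{escape(name)}: ${price:.2f}</p>"

page = handle_request("/product?id=2")
print(page)  # <p>Office chair: $89.00</p>
```

The database itself is unchanged; only this thin layer is added to make it reachable from a browser.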
Figure 6.14: Linking Internal Databases to the Web
IV. Information Policy & Data Quality
1. Establishing an Information Policy:
• Firm’s rules, procedures, roles for sharing, managing, standardizing data
• Data administration
- Establishes policies and procedures to manage data
• Data governance
- Deals with policies and processes for managing availability, usability, integrity, and
security of data, especially regarding government regulations
• Database administration
- Creating and maintaining database
2. Ensuring Data Quality:
• More than 25 percent of critical data in Fortune 1000 company
databases are inaccurate or incomplete
E.g.: A wrong contact number or address
• Before new database is in place, a firm must:
- Identify and correct faulty data
- Establish better routines for editing data once the database is in operation
• Data quality audit
• Data cleansing
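A data quality audit and a cleansing pass can be sketched in plain Python. The customer records, the formatting rules, and the helper names (`clean`, `audit`, `deduplicate`) are assumptions made for illustration; commercial data cleansing tools apply far more rules.

```python
import re

# Hypothetical customer records with typical quality problems.
customers = [
    {"id": 1, "name": "An Nguyen", "phone": "(555) 010-0100"},
    {"id": 2, "name": "Binh Tran", "phone": ""},
    {"id": 3, "name": " an nguyen ", "phone": "555.010.0100"},
]

def clean(record):
    """Cleansing: standardize formatting so equivalent values compare equal."""
    return {
        "id": record["id"],
        "name": " ".join(record["name"].split()).title(),
        "phone": re.sub(r"\D", "", record["phone"]),  # keep digits only
    }

def audit(records):
    """Audit: list the IDs of records with a missing phone number."""
    return [r["id"] for r in records if not r["phone"]]

def deduplicate(records):
    """Keep the first record for each (name, phone) pair."""
    seen, unique = set(), []
    for r in records:
        key = (r["name"], r["phone"])
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique

cleaned = [clean(r) for r in customers]
incomplete_ids = audit(cleaned)
unique_customers = deduplicate(cleaned)
print(incomplete_ids)                       # [2]
print([r["id"] for r in unique_customers])  # [1, 2]
```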
Recap
Problem of Managing Data Resources
Database Management System (DBMS)
Tools & Technologies for Accessing Information in Databases
Information Policy & Data Quality
Next Week
Chapter 7: Telecommunications, the Internet, and Wireless Technology