Data & Business Intelligence

The document discusses databases and database management systems. It covers topics like data hierarchy, types of data models including hierarchical, network and relational models, database components, query languages, trends like distributed and object-oriented databases.

Uploaded by

Navajyoti Dhar

Databases
• A database is a collection of related data that can be stored in a central
location or in multiple locations
• Usually a group of files
• File: Group of records of same type
• Record: Group of related fields
• Field: Group of characters as word(s) or number(s)
• Entity: Person, place, thing on which we store information
• Attribute: Each characteristic, or quality, describing entity
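The terms above can be illustrated with a small sketch. The `Student` record type, its field names, and the sample values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Student:        # entity: the thing we store information about
    student_id: int   # each attribute of the entity is held in a field
    name: str
    major: str


# A record is a group of related fields describing one entity.
record = Student(student_id=1, name="Ada", major="Math")

# A file is a group of records of the same type.
student_file = [record, Student(2, "Grace", "Physics")]

# A database is a collection of related files.
database = {"students": student_file}

# The field names that make up every Student record.
field_names = [f.name for f in fields(Student)]
```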
Databases
• Data hierarchy is the structure and organization of data, which
involves fields, records, and files.
• A database management system (DBMS) is software for creating,
storing, maintaining, and accessing database files.
• A DBMS makes using databases more efficient.
Databases
Figure: Data hierarchy
Databases
• ADVANTAGES
• Complex requests can be handled more easily.
• Data redundancy is eliminated or minimized.
• Programs and data are independent, so more than one program
can use the same data.
• Data management is improved.
• A variety of relationships among data can be easily maintained.
• More sophisticated security measures can be used.
• Storage space is reduced
Types of Data in a Database
• Internal data
• Collected within organization
• Transaction records, sales records, personnel records
• External data
• Competitors, customers, and suppliers
• Distribution networks
• Economic indicators (e.g., the consumer price index)
• Government regulations
• Labor and population statistics
• Tax records
Methods for Accessing Files
• Sequential file structure
• Records organized and processed in numerical or sequential order
• Organized based on a “primary key”
• Usually used for backup and archive files
• Because they need updating only rarely
• Random access file structure
• Records can be accessed in any order
• Fast and very effective when a small number of records need to be
processed daily or weekly
Methods for Accessing Files
• Indexed sequential access method (ISAM)
• Records accessed sequentially or randomly
• Depending on the number being accessed
• Uses an index structure with two parts:
• Indexed value
• Pointer to the disk location of the record matching the indexed
value
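A minimal sketch of the two-part index described above, assuming fixed-width records held in an in-memory buffer that stands in for a disk file. The record layout and sample data are assumptions; a real ISAM implementation is considerably more involved.

```python
import io

RECORD_SIZE = 16  # assumed layout: 5-character key + 11-character name
rows = [(101, "Ada"), (205, "Grace"), (330, "Linus")]  # already in key order

storage = io.BytesIO()  # stands in for the data file on disk
index = {}              # indexed value -> pointer (byte offset of the record)
for key, name in rows:
    index[key] = storage.tell()
    storage.write(f"{key:>5}{name:<11}".encode("ascii"))


def read_at(offset):
    """Read and decode the single record stored at the given byte offset."""
    storage.seek(offset)
    raw = storage.read(RECORD_SIZE).decode("ascii")
    return int(raw[:5]), raw[5:].strip()


def random_read(key):
    """Random access: follow the index pointer straight to one record."""
    return read_at(index[key])


def sequential_read():
    """Sequential access: visit every record in key order."""
    return [read_at(offset) for _, offset in sorted(index.items())]
```

Looking up one key touches a single record, while `sequential_read()` walks the whole file, which is why the better choice depends on how many records are being accessed.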
Logical Database Design
• Physical view
• How data is stored on and retrieved from storage media
• Logical view
• How information appears to users
• How it can be organized and retrieved
• Can be more than one logical view
Logical Database Design
• Data model
• Determines how data is created, represented, organized
• Data structure—Describes how data is organized and the
relationship among records
• Operations—Describe methods, calculations that can be
performed on data, such as updating and querying data
• Integrity rules—Define the boundaries of a database, such as
maximum and minimum values allowed for a field, constraints
and access methods
Logical Database Design
• Hierarchical model
• Relationships between records form a treelike structure
• Records are called nodes
• Relationships among records are called branches.
• The node at the top is called the root
• Every other node (called a child) has a parent.
• Nodes with the same parents are called twins or siblings.
Logical Database Design
Figure: Hierarchical model
Logical Database Design
• Network model
• Similar to the hierarchical model
• Records are organized differently
• Each record in the network model can have multiple parent and
child records
Logical Database Design
Figure: Network model
Logical Database Design
• Relational model
• Uses a two-dimensional table of rows and columns of data
• Data dictionary
• Field name—Student name, admission date, age, and major
• Field data type—Character (text), date, and number
• Default value—The value entered if none is available; for example,
if no major is declared, the value is “undecided.”
• Validation rule—A rule determining whether a value is valid; for
example, a student’s age cannot be a negative number.
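The data dictionary entries above map naturally onto a table definition. The sketch below uses Python's built-in `sqlite3` module; the table name `student` and the column names are assumptions based on the slide's examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        name           TEXT NOT NULL,             -- field name and data type
        admission_date TEXT,                      -- date stored as ISO text
        age            INTEGER CHECK (age >= 0),  -- validation rule
        major          TEXT DEFAULT 'undecided'   -- default value
    )
""")

# No major is supplied, so the default value is stored.
conn.execute("INSERT INTO student (name, age) VALUES ('Ada', 19)")
major = conn.execute("SELECT major FROM student").fetchone()[0]

# A negative age violates the validation rule and is rejected.
try:
    conn.execute("INSERT INTO student (name, age) VALUES ('Bob', -1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```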
Logical Database Design
• Relational model
• Each table contains data on an entity and its attributes
• Table: grid of columns and rows
• Rows (tuples): Records, one for each instance of the entity
• Columns (fields): Each represents an attribute of the entity
• Key field: Field used to uniquely identify each record
• Primary key: The field (or combination of fields) chosen to
uniquely identify each record in a table
• Foreign key: A field in one table that holds the primary key of
another table, used as a look-up field to identify records there
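A short sketch of how primary and foreign keys behave in practice, again using `sqlite3`. The `department` and `employee` tables are hypothetical. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,                     -- primary key
        name    TEXT,
        dept_id INTEGER REFERENCES department(dept_id)   -- foreign key
    )
""")
conn.execute("INSERT INTO department VALUES (10, 'Engineering')")
conn.execute("INSERT INTO employee VALUES (1, 'Ada', 10)")

# A second record with the same primary key is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (1, 'Bob', 10)")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False

# A foreign key that matches no department record is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Eve', 99)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False
```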
Logical Database Design
Figure: Relational database tables
Logical Database Design
• Relational model
• Normalization improves database efficiency by eliminating
redundant data and ensuring that only related data is stored in a
table.
• Data retrieval
• Three basic operations are used to develop useful sets of data
• SELECT: Creates a subset of rows containing all records that meet
stated criteria
• JOIN: Combines relational tables to provide the user with more
information than is available in the individual tables
• PROJECT: Creates a subset of the columns in a table, producing a
table with only the information specified
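The three operations can be sketched in a few lines over tables represented as lists of dictionaries. The `students` and `majors` tables and the helper names are assumptions made for illustration, not part of any DBMS API.

```python
students = [
    {"id": 1, "name": "Ada", "major_id": 10},
    {"id": 2, "name": "Grace", "major_id": 20},
    {"id": 3, "name": "Linus", "major_id": 10},
]
majors = [
    {"major_id": 10, "major": "Math"},
    {"major_id": 20, "major": "Physics"},
]


def select(table, predicate):
    """SELECT: keep only the rows that meet the stated criteria."""
    return [row for row in table if predicate(row)]


def project(table, columns):
    """PROJECT: keep only the named columns of every row."""
    return [{c: row[c] for c in columns} for row in table]


def join(left, right, key):
    """JOIN: combine rows from two tables that share the same key value."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


math_students = select(students, lambda row: row["major_id"] == 10)
names_only = project(students, ["name"])
with_major = join(students, majors, "major_id")
```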
Logical Database Design
Figure: Relational database tables
Components of a DBMS
• Database Engine
• Heart of DBMS software
• Responsible for data storage, manipulation, and retrieval
• Converts logical requests from users into their physical equivalents
• Data Definition
• Create and maintain the data dictionary
• Define the structure of files in a database
• Adding fields
• Deleting fields
• Changing field size
• Changing data type
Components of a DBMS
• Data Manipulation
• Add, delete, modify, and retrieve records from a database
• Query language
• Structured Query Language (SQL)
• Standard fourth-generation query language used by many
DBMS packages
• The basic format of an SQL query is as follows:
SELECT field FROM table or file WHERE conditions
• For example:
SELECT NAME, EMPLOYEE.SSN, TITLE, GENDER, SALARY
FROM EMPLOYEE, PAYROLL
WHERE EMPLOYEE.SSN = PAYROLL.SSN AND
TITLE = 'ENGINEER'
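A runnable sketch of the example query, using Python's built-in `sqlite3` module. The slide gives only the query, so the table layouts and sample rows here are assumptions. `SSN` is qualified with its table name because the column exists in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY, NAME TEXT, TITLE TEXT, GENDER TEXT);
    CREATE TABLE PAYROLL  (SSN TEXT PRIMARY KEY, SALARY INTEGER);
    INSERT INTO EMPLOYEE VALUES ('111', 'Ada',   'ENGINEER', 'F');
    INSERT INTO EMPLOYEE VALUES ('222', 'Brian', 'ANALYST',  'M');
    INSERT INTO PAYROLL  VALUES ('111', 90000);
    INSERT INTO PAYROLL  VALUES ('222', 70000);
""")

# The WHERE clause both matches the two tables on SSN and filters by title.
engineers = conn.execute(
    """
    SELECT NAME, EMPLOYEE.SSN, TITLE, GENDER, SALARY
    FROM EMPLOYEE, PAYROLL
    WHERE EMPLOYEE.SSN = PAYROLL.SSN AND TITLE = ?
    """,
    ("ENGINEER",),
).fetchall()
```

Passing the title as a `?` parameter rather than pasting it into the SQL text is a common way to avoid quoting mistakes.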
Components of a DBMS
• Data Manipulation
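• Query by example (QBE)
• Request data by constructing a statement made up of query forms
• In a graphical interface, query forms are selected by clicking
• Fine-tune the query with the AND, OR, and NOT operators: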
• AND—Means that all conditions must be met.
• OR—Means only one of the conditions must be met.
• NOT—Searches for records that do not meet the condition.
Components of a DBMS
• Application Generation
• Design elements of an application using a database
• Data entry screens
• Interactive menus
• Interfaces with other programming languages
• Data Administration
• Used for backup and recovery, security, and change management
• Determines who has permission to create, read, update, and delete (CRUD) data
• Database administrator (DBA): the individual or department responsible for:
• Designing and setting up a database
• Establishing security measures to determine users’ access
rights
• Developing recovery procedures in case data is lost or
corrupted
• Evaluating database performance
• Adding and fine-tuning database functions
Recent Trends in Database Design and Use
• Data-driven Web sites
• Interface to a database
• Retrieves data and allows users to enter data
• Improves access to information
• Useful for:
• E-commerce sites that need frequent updates
• News sites that need regular updating of content
• Forums and discussion groups
• Subscription services, such as newsletters
Recent Trends in Database Design and Use
• Distributed database system
• Data is stored on multiple servers placed throughout an
organization
• Reasons for choosing
• Decrease response time/network traffic
• Minimize effect of computer failures
• Small integrated systems may cost less than one large server
Recent Trends in Database Design and Use
• Distributed database system
• Approaches for setup
• Fragmentation: addresses how tables are divided among multiple
locations.
• Horizontal fragmentation breaks a table into rows, storing all
fields (columns) of those rows in different locations.
• Vertical fragmentation stores a subset of columns in
different locations.
• Mixed fragmentation, which combines vertical and
horizontal fragmentation, stores only site-specific data in
each location.
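Horizontal and vertical fragmentation can be sketched over an in-memory table. The `customers` data, the site names, and the choice to repeat the key column in every vertical fragment (so rows can be matched up again) are assumptions made for illustration.

```python
customers = [
    {"id": 1, "name": "Ada", "region": "east", "balance": 120},
    {"id": 2, "name": "Grace", "region": "west", "balance": 80},
    {"id": 3, "name": "Linus", "region": "east", "balance": 0},
]


def horizontal(table, site_of):
    """Split by rows: each site receives whole rows with all columns."""
    sites = {}
    for row in table:
        sites.setdefault(site_of(row), []).append(row)
    return sites


def vertical(table, key, column_groups):
    """Split by columns: each site receives the key plus a subset of columns."""
    return {
        site: [{c: row[c] for c in (key, *cols)} for row in table]
        for site, cols in column_groups.items()
    }


by_region = horizontal(customers, lambda row: row["region"])
by_columns = vertical(
    customers, "id", {"sales": ["name", "region"], "finance": ["balance"]}
)
```

Mixed fragmentation would apply both steps, for example running `vertical` on each regional fragment produced by `horizontal`.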
Recent Trends in Database Design and Use
• Distributed database system
• Approaches for setup
• Replication: each site stores a copy of the data in the
organization’s database.
• Allocation: each site stores the data it uses most often.
• Security issues because of multiple access points from both inside
and outside the organization.
• Security policies, scope of user access, and user privileges must
be clearly defined, and authorized users must be identified.
Recent Trends in Database Design and Use
• Object-oriented database
• This data model represents real-world entities with database
objects.
• An object consists of attributes (characteristics describing an
entity) and methods (operations or calculations) that can be
performed on the object’s data.
Recent Trends in Database Design and Use
• Object-oriented database
• Encapsulation: Grouping an object's attributes and methods
together into a class
• Inheritance: New objects can be created faster and more easily by
reusing an existing class and entering new data in its attributes
• Interaction with an object-oriented database takes place via
methods
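These ideas can be shown with ordinary Python classes. This is a language-level sketch only, not an object-oriented database. The `Vehicle` and `Truck` classes and their attributes are hypothetical.

```python
class Vehicle:
    """Encapsulation: attributes and the methods that use them live together."""

    def __init__(self, vin, year, daily_rate):
        self.vin = vin
        self.year = year
        self.daily_rate = daily_rate

    def rental_cost(self, days):
        """A method: a calculation performed on the object's own data."""
        return self.daily_rate * days


class Truck(Vehicle):
    """Inheritance: Truck reuses everything in Vehicle and adds one attribute."""

    def __init__(self, vin, year, daily_rate, payload_kg):
        super().__init__(vin, year, daily_rate)
        self.payload_kg = payload_kg


# Creating a new object only requires entering data for its attributes.
truck = Truck("VIN123", 2020, 50, 1000)

# Interaction with the object happens through its methods.
cost = truck.rental_cost(3)
```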
Recent Trends in Database Design and Use
• Cloud Databases
• Special appeal for businesses seeking database capabilities at a
lower cost than in-house database products.
• Example: Microsoft Azure SQL Database
Recent Trends in Database Design and Use
• Blockchain
• Distributed database technology to create and verify transactions
on a network nearly instantaneously without a central authority.
• Distributed ledgers in a peer-to-peer distributed database
• Maintains a growing list of records and transactions shared by all
• Encryption used to identify participants and transactions
• Used for financial transactions, supply chain, and medical records
• Foundation of Bitcoin and other cryptocurrencies
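The "growing list of records" can be sketched as a chain of blocks in which each block stores the hash of the one before it, so altering an old record is detectable. This is a single-machine illustration only: it leaves out the peer-to-peer network, consensus, and participant signatures, and the block layout is an assumption.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, prev_hash, transactions):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps([index, prev_hash, transactions])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, transactions):
    """Append a block that is linked to the current end of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append({
        "index": index,
        "prev_hash": prev_hash,
        "transactions": transactions,
        "hash": block_hash(index, prev_hash, transactions),
    })


def is_valid(chain):
    """Check every link and recompute every hash."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i else GENESIS_PREV
        recomputed = block_hash(
            block["index"], block["prev_hash"], block["transactions"]
        )
        if block["prev_hash"] != expected_prev or block["hash"] != recomputed:
            return False
    return True


ledger = []
add_block(ledger, ["alice pays bob 5"])
add_block(ledger, ["bob pays carol 2"])
valid_before = is_valid(ledger)

ledger[0]["transactions"] = ["alice pays bob 500"]  # tamper with history
valid_after = is_valid(ledger)
```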
Recent Trends in Database Design and Use
Figure: How blockchain works
Business Intelligence Infrastructure
• Array of tools for obtaining information from separate systems and
from big data
• Data warehouse
• Stores current and historical data from many core operational
transaction systems
• Supports decision-making applications and generates business
intelligence
• Stores multidimensional data, called hypercubes
Business Intelligence Infrastructure
• Data warehouse
• Characteristics
• Subject oriented: Focused on a specific area
• Integrated: Comes from a variety of sources
• Time variant: Categorized based on time
• Type of data: Captures aggregated data
• Purpose: Used for analytical purposes
Business Intelligence Infrastructure
• Data warehouse
• Components
• Input: External data sources, databases, transaction files, ERP systems,
and CRM systems
• Extraction, transformation, and loading (ETL)
• Extraction
• Collecting data from a variety of sources
• Transformation processing
• Make sure data meets the data warehouse’s needs
• Loading
• Process of transferring data to the data warehouse
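The three ETL steps can be sketched as three small functions. The source rows, the cleaning rules, and the `sales` table are all assumptions made for illustration. An in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical source data from two operational systems.
crm_rows = [{"customer": " Ada ", "amount": "120.50", "date": "2024-01-05"}]
erp_rows = [
    {"customer": "GRACE", "amount": "80", "date": "2024-01-06"},
    {"customer": "", "amount": "n/a", "date": "2024-01-07"},  # unusable row
]


def extract():
    """Extraction: collect data from a variety of sources."""
    return crm_rows + erp_rows


def transform(rows):
    """Transformation: clean the data so it meets the warehouse's needs."""
    clean = []
    for row in rows:
        name = row["customer"].strip().title()
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount is not a number
        if name:
            clean.append((name, amount, row["date"]))
    return clean


def load(conn, rows):
    """Loading: transfer the cleaned data into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract()))
loaded = warehouse.execute(
    "SELECT customer, amount FROM sales ORDER BY customer"
).fetchall()
```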
Business Intelligence Infrastructure
• Data warehouse
• Components
• Storage
• Raw data: information in its original form
• Summary data: gives users subtotals of various categories
• Metadata: information about data—its content, quality,
condition, origin, and other characteristics.
Business Intelligence Infrastructure
• Data warehouse
• Components
• Output
• Online analytical processing (OLAP)
• Generates business intelligence
• Uses multiple sources of information and provides
multidimensional analysis
• Hypercube
• Drill down and drill up
• Data-mining analysis
• Discover patterns and relationships
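Drill up and drill down amount to summarizing the same facts over fewer or more dimensions. The sketch below uses a tiny hypothetical fact table. The dimension names and sales figures are assumptions, and a real OLAP engine would precompute and index such aggregates.

```python
from collections import defaultdict

DIMENSIONS = ("year", "quarter", "region", "product")

# Each fact: one value per dimension, followed by the sales amount.
facts = [
    (2023, "Q1", "east", "laptop", 100),
    (2023, "Q1", "west", "laptop", 60),
    (2023, "Q2", "east", "phone", 40),
    (2024, "Q1", "east", "laptop", 90),
]


def aggregate(facts, dims):
    """Total the sales amount for each combination of the chosen dimensions."""
    positions = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(int)
    for fact in facts:
        totals[tuple(fact[p] for p in positions)] += fact[-1]
    return dict(totals)


by_year = aggregate(facts, ["year"])                     # drill up: less detail
by_year_quarter = aggregate(facts, ["year", "quarter"])  # drill down: more detail
```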
Business Intelligence Infrastructure
• Data warehouse
• Components
• Output
• Reports
• Cross-reference segments of an organization’s
operations for comparison purposes
• Find patterns and trends that can’t be found with
databases
• Analyze large amounts of historical data quickly
Business Intelligence Infrastructure
• Data mart
• Smaller version of data warehouse
• Used by single department or function
• Advantages over data warehouses: typically faster and less expensive to build and query
• Trade-off: more limited scope than data warehouses
Business Intelligence Infrastructure
• Business analytics
• Uses data and statistical methods to gain insight into the data
• Provide decision makers with information they can act on
• Leverages and explores the data in a database, data warehouse, or
data mart system
• Descriptive analytics
• Reviews past events
• Analyzes the data
• Provides a report indicating what happened in a given period and
how to prepare for the future
• It is a reactive strategy.
Business Intelligence Infrastructure
• Business analytics
• Predictive analytics
• It is a proactive strategy
• It prepares a decision maker for future events
• Prescriptive analytics
• Recommending a course of action that a decision maker
should follow
• Shows the likely outcome of each decision
• Examples of analytics tools: Amazon Analytics, Google Analytics, and Twitter Analytics
• Web analytics: measures the efficiency and effectiveness of a Web site
• Mobile analytics: measures traffic among mobile devices and all
the apps used by these mobile devices
Big Data Era
• The Challenge of Big Data
• Massive sets of unstructured/semi-structured data from web
traffic, social media, sensors, and so on
• Can reveal more patterns, relationships and anomalies
• Requires new tools and technologies to manage and analyze
• Volume: Petabytes, exabytes of data
• Variety: Structured/Unstructured Data
• Velocity: Speed of gathering and processing data
• Veracity: Trustworthiness and accuracy of data
• Value: Value for the decision making process
Big Data Era
• Benefits from Big Data
• Retail
• Financial services
• Advertising and public relations
• Government
• Manufacturing
• Media and telecommunications
• Energy
• Healthcare
Big Data Era
• Tools and Technologies of Big Data
• Open-source Apache Hadoop
• Hadoop Distributed File System (HDFS) to manage storage.
• Distributed NoSQL databases, such as Apache Cassandra
• Examples of big data commercial platforms
• SAP Big Data Analytics (www.sap.com/BigData)
• Tableau (www.tableausoftware.com)
• SAS Big Data Analytics (www.sas.com/big-data)
• QlikView (www.qlikview.com)
Big Data Era
• Big Data Privacy Risks
• Discrimination
• Privacy breaches and embarrassments
• Unethical actions based on interpretations
• Loss of anonymity
Database Marketing
• Uses an organization’s database of customers and potential customers to
promote the products or services that the organization offers.
• Implement marketing strategies that eventually increase profits
and enhance competitiveness
• Use multivariate analysis, data segmentation, and automated
tools
• Loyalty programs, such as grocery chain club cards, airline mileage
programs, My Starbucks Rewards.
Database Marketing
• Successful database marketing campaigns rely on:
• Calculating customer lifetime value (CLTV): what the lifetime
relationship of a typical customer will be worth to a business.
• Recency, frequency, and monetary analysis (RFM): how
valuable a customer is based on the recentness of purchases,
frequency of purchases, and how much the customer spends
• Customer communications: techniques for communicating
effectively with customers increase loyalty, customer
retention, and sales.
• Analytical software: tools for monitoring
customers’ behavior across a number of retail channels,
including Web sites, mobile apps, and social media.
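RFM analysis from the list above can be sketched as a pass over a purchase history. The customers, dates, and amounts are hypothetical. Real campaigns usually convert the raw values into ranked scores, which is omitted here.

```python
from datetime import date

# (customer, purchase date, amount spent)
purchases = [
    ("ada", date(2024, 5, 20), 40.0),
    ("ada", date(2024, 6, 1), 60.0),
    ("grace", date(2024, 1, 15), 200.0),
]
TODAY = date(2024, 6, 11)  # fixed reference date so the result is reproducible


def rfm(purchases, today):
    """Return recency (days), frequency (count), and monetary (total) per customer."""
    summary = {}
    for customer, day, amount in purchases:
        s = summary.setdefault(
            customer, {"last": day, "frequency": 0, "monetary": 0.0}
        )
        s["last"] = max(s["last"], day)
        s["frequency"] += 1
        s["monetary"] += amount
    return {
        customer: {
            "recency_days": (today - s["last"]).days,
            "frequency": s["frequency"],
            "monetary": s["monetary"],
        }
        for customer, s in summary.items()
    }


scores = rfm(purchases, TODAY)
```

A low `recency_days` together with high `frequency` and `monetary` values marks the customers a campaign would usually prioritize.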
