Itc 601 Dmbi-Module 6 Notes
6.1 WHAT IS BI ?
Define Business Intelligence with examples
BI (Business Intelligence) is a set of processes, architectures, and technologies that convert raw
data into meaningful information that drives profitable business actions. It is a suite of software and
services to transform data into actionable intelligence and knowledge.
BI has a direct impact on an organization's strategic, tactical and operational business decisions. BI
supports fact-based decision making using historical data rather than assumptions and gut feeling.
BI tools perform data analysis and create reports, summaries, dashboards, maps, graphs, and
charts to provide users with detailed intelligence about the nature of the business.
Business intelligence is defined as a set of mathematical models and analysis methodologies that
exploit the available data to generate information and knowledge useful for complex decision-
making processes.
A business intelligence system provides decision makers with information and knowledge extracted
from data.
Why is BI important ?
1. Measurement: creating KPI (Key Performance Indicators) based on historic data.
2. Identify and set benchmarks for varied processes.
3. With BI systems organizations can identify market trends and spot business problems that
need to be addressed.
4. BI helps with data visualization, which enhances data quality and thereby the quality of decision
making.
5. BI systems can be used not just by large enterprises but also by SMEs (Small and Medium Enterprises).
Types of BI users
1. The Professional Data Analyst : The data analyst is a statistician who always needs to drill
deep down into data. BI system helps them to get fresh insights to develop unique business
strategies.
2. The IT users : The IT user also plays a dominant role in maintaining the BI infrastructure.
3. The head of the company : CEO or CXO can increase the profit of their business by improving
operational efficiency in their business.
4. The Business Users : Business intelligence users can be found from across the organization.
BI System Advantages
1. Boost productivity : With a BI program, businesses can create reports with a single click, saving
lots of time and resources. It also allows employees to be more productive in their tasks.
2. To improve visibility : BI also helps to improve the visibility of business processes and makes it
possible to identify any areas that need attention.
3. Fix Accountability : BI system assigns accountability in the organization as there must be
someone who should own accountability and ownership for the organization's performance against
its set goals.
4. It gives a bird's eye view : BI system also helps organizations as decision makers get an
overall bird's eye view through typical BI features like dashboards and scorecards.
5. It streamlines business processes : BI takes out all complexity associated with business
processes. It also automates analytics by offering predictive analysis, computer modeling, bench-
marking and other methodologies.
6. It allows for easy analytics : BI software has democratized analytics, allowing even
non-technical, non-analyst users to collect and process data quickly. This puts the
power of analytics into the hands of many people.
BI System Disadvantages
1. Cost : Business intelligence can prove costly for small as well as for medium-sized enterprises.
The use of such systems may be expensive for routine business transactions.
2. Complexity : Another drawback of BI is its complexity in implementation of data warehouse. It
can be so complex that it can make business techniques rigid to deal with.
3. Limited use : Like many new technologies, BI was first designed with the buying capacity of
large, wealthy firms in mind. Therefore, BI systems are still not affordable for many small and
medium-sized companies.
4. Time Consuming Implementation : It typically takes about one and a half years for a data
warehousing system to be completely implemented, so it is a time-consuming process.
3) Embedded BI : Embedded BI allows the integration of BI software, or some of its features, into
another business application to enhance and extend its reporting functionality.
4) Cloud Analytics : BI applications will increasingly be offered in the cloud, and more businesses will be
shifting to this technology. As per industry predictions, within a couple of years spending on cloud-
based analytics will grow 4.5 times faster than spending on on-premises solutions.
Analysis : The organization's needs for the construction of a business intelligence system should
be thoroughly identified during the first step. This preparatory step is usually carried out through a
series of interviews with knowledge workers who execute various positions and activities inside the
company. It's important to spell out the project's overall goals and priorities, as well as the costs
and benefits of developing a business intelligence system.
Design : The second phase, which is divided into two sub-phases, aims to derive a tentative plan
for the overall architecture, taking into account any future developments as well as the system's
evolution in the mid - term. First and foremost, a review of existing information infrastructures is
required. Furthermore, in order to adequately evaluate the information requirements, the primary
decision-making processes that will be supported by the business intelligence system should be
examined. Later on, the project plan will be drawn out using traditional project management
approaches, including development phases, priorities, projected execution time frames and costs,
as well as the essential roles and resources. Analysis Identification of business needs Design
Infrastructure recognition Project macro planning Planning Detailed project requirements Definition
of the mathematical models needed Identification of the data Definition of data warehouses and
data marts Development of a prototype Implementation and control Development of data
Warehouses and data marts Development of metadata Development of ETL tools Development of
applications Release and testing
Planning : A sub-phase of the planning stage is dedicated to defining and describing the functions
of the business intelligence system in greater depth. Following that, existing data, as well as data
that could be collected from outside sources, is evaluated. This enables the business intelligence
architecture's information structures to be created, which include a central data warehouse and
potentially some satellite data marts. Simultaneously with the recognition of available data, the
mathematical models to be used should be defined, ensuring the availability of the data required to
feed each model and ensuring that the efficiency of the algorithms to be used will be adequate for
the magnitude of the problems that will result. Finally, a system prototype should be built at a low
cost and with limited capabilities to discover any discrepancies between actual needs and project
specifications ahead of time.
Implementation and control : There are five major sub-phases in the last phase.
1. The data warehouse and each individual data mart must first be built. These are the information
infrastructures that will feed the business intelligence system.
2. A metadata archive should be developed to explain the meaning of the data in the data
warehouse and the transformations made to the original data in advance.
3. Furthermore, ETL procedures are designed to extract and transform data from primary sources
before loading it into the data warehouse and data marts.
4. The next step is to create the main business intelligence applications that will enable for the
execution of the planned analyses.
5. Finally, the system should be made available for testing and use.
A business intelligence architecture is the framework for the various technologies an organization
deploys to run business intelligence and analytics applications. It includes the IT systems and
software tools that are used to collect, integrate, store and analyze BI data and then present
information on business operations and trends to corporate executives and other business users.
The underlying BI architecture is a key element in the implementation of a successful business
intelligence program that uses data analysis and reporting to help an organization track business
performance, optimize business processes, identify new revenue opportunities, improve strategic
planning and make more informed decisions overall. A BI architecture can be deployed in an on-
premises data center or the cloud. In either case, it contains a set of core components that
collectively support the different stages of the BI process, from data collection, integration, storage
and analysis to data visualization, information delivery and the use of BI data in business decision-
making.
2. Data Integration and cleaning tools : To effectively analyze the data collected for a BI program,
organizations integrate and consolidate different data sets to create unified views of them. The most
widely used data integration technology for BI applications is extract, transform and load (ETL)
software, which pulls data from source systems in batch processes. A variant of ETL is extract, load
and transform (ELT), in which data is extracted and loaded as is and transformed later for specific
BI uses. Other methods include real-time data integration, such as change data capture and
streaming integration to support real-time analytics applications, and data virtualization, which
combines data from different source systems virtually. A BI architecture typically also includes data
profiling and data cleansing tools that are used to identify and fix data quality issues. They help BI
and data management teams provide clean and consistent data that's suitable for BI uses.
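As a sketch of the batch ETL pattern described above, the toy pipeline below extracts rows from a hypothetical source, cleans and deduplicates them (the transform step), and loads them into an in-memory "warehouse". The data, field names, and cleansing rules are all illustrative assumptions, not part of any real ETL tool:

```python
# Minimal ETL sketch (illustrative in-memory data, not a real BI tool).

def extract(source_rows):
    """Extract: pull raw records from a source system (here, a list of dicts)."""
    return list(source_rows)

def transform(rows):
    """Transform: clean and standardize the data before loading.
    - drop records missing a customer id (data cleansing)
    - normalize names to title case (consistency)
    - deduplicate on customer id (consolidation)"""
    seen, clean = set(), []
    for row in rows:
        cid = row.get("customer_id")
        if cid is None or cid in seen:
            continue
        seen.add(cid)
        clean.append({"customer_id": cid,
                      "name": row["name"].strip().title(),
                      "amount": float(row["amount"])})
    return clean

def load(rows, warehouse):
    """Load: write the cleaned rows into the target store (a dict keyed by id)."""
    for row in rows:
        warehouse[row["customer_id"]] = row
    return warehouse

raw = [{"customer_id": 1, "name": " alice ", "amount": "100.5"},
       {"customer_id": None, "name": "bad row", "amount": "0"},
       {"customer_id": 1, "name": "ALICE", "amount": "100.5"},   # duplicate
       {"customer_id": 2, "name": "bob", "amount": "42"}]

warehouse = load(transform(extract(raw)), {})
print(len(warehouse))            # prints 2: clean, unique records only
print(warehouse[1]["name"])      # prints Alice
```

An ELT variant would simply call `load` before `transform`, deferring the cleaning until a specific BI use needs it.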
3. Analytics data stores : This encompasses the various repositories where BI data is stored and
managed. The primary one is a data warehouse, which usually stores structured data in a relational,
columnar or multidimensional database and makes it available for querying and analysis. An
enterprise data warehouse can also be tied to smaller data marts set up for individual departments
and business units with data that's specific to their BI needs. In addition, BI architectures often
include an operational data store that's an interim repository for data before it goes into a data
warehouse; an operational data store (ODS) can also be used to run analytical queries against
recent transaction data. Depending on the size of a BI environment, a data warehouse, data marts
and an ODS can be deployed on a single database server or separate systems. A data lake
running on a Hadoop cluster or other big data platform can also be incorporated into a BI
architecture as a repository for raw data of various types. The data can be analyzed in the data lake
itself or filtered and loaded into a data warehouse for analysis. A well-planned architecture should
specify which of the different data stores is best suited for particular BI uses.
4. BI and data visualization tools : The tools used to analyze data and present information to
business users include a suite of technologies that can be built into a BI architecture, for example,
ad hoc query, data mining and online analytical processing, or OLAP, software. In addition, the
growing adoption of self-service BI tools enables business analysts and managers to run queries
themselves instead of relying on the members of a BI team to do that for them. BI software also
includes data visualization tools that can be used to create graphical representations of data, in the
form of charts, graphs and other types of visualizations designed to illustrate trends, patterns and
outlier elements in data sets.
5. Dashboards, portals and reports : These information delivery tools give business users
visibility into the results of BI and analytics applications, with built-in data visualizations and, often,
self-service capabilities to do additional data analysis. For example, BI dashboards and online
portals can both be designed to provide real-time data access with configurable views and the
ability to drill down into data. Reports tend to present data in a more static format.
6. Other components : Other components that are increasingly part of a BI architecture
include data preparation software used to structure and organize data for analysis and a metadata
repository, a business glossary and a data catalog, which can all help users find relevant data and
understand its lineage and meaning.
Decision Support System (DSS)
It is an interactive computer-based application that combines data and mathematical models to help
decision makers solve complex problems faced in managing the public and private enterprises and
organizations. A decision support system (DSS) is a computerized program used to support
determinations, judgments, and courses of action in an organization or a business. A DSS sifts
through and analyzes massive amounts of data, compiling comprehensive information that can be
used to solve problems and in decision-making. Typical information used by a DSS includes target
or projected revenue, sales figures or past ones from different time periods, and other inventory- or
operations-related data. A decision support system gathers and analyzes data, synthesizing it to
produce comprehensive information reports. In this way, as an informational application, a DSS
differs from an ordinary operations application, whose function is just to collect data. The DSS can
either be completely computerized or powered by humans. In some cases, it may combine both.
The ideal systems analyze information and actually make decisions for the user. At the very least,
they allow human users to make more informed decisions at a quicker pace. The DSS can be
employed by operations management and other planning departments in an organization to
compile information and data and to synthesize it into actionable intelligence. In fact, these systems
are primarily used by mid- to upper-level management. For example, a DSS may be used to project
a company's revenue over the upcoming six months based on new assumptions about product
sales. Due to the large number of factors that surround projected revenue figures, this is not a
straightforward calculation that can be done manually. However, a DSS can integrate all the
multiple variables and generate an outcome and alternate outcomes, all based on the company's
past product sales data and current variables.
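A revenue projection of this kind can be sketched as a simple what-if model. All figures and the growth assumption below are hypothetical, chosen only to show how changing one variable produces an alternate outcome:

```python
# Hedged sketch of a DSS-style what-if revenue projection (figures are invented).

def project_revenue(base_monthly_sales, unit_price, monthly_growth, months=6):
    """Project monthly revenue for `months` ahead under an assumed growth rate."""
    revenue, sales = [], base_monthly_sales
    for _ in range(months):
        revenue.append(sales * unit_price)
        sales *= (1 + monthly_growth)       # compound the assumed growth
    return revenue

# What-if analysis: compare scenarios by changing one assumption at a time.
baseline   = project_revenue(1000, 20.0, 0.02)   # 2% monthly growth assumed
optimistic = project_revenue(1000, 20.0, 0.05)   # 5% monthly growth assumed

print(round(sum(baseline), 2))
print(round(sum(optimistic), 2))
```

Each run of `project_revenue` is one scenario; a real DSS would expose the assumptions as user-editable parameters and present the alternate outcomes side by side.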
Types of DSS
1. Communication-driven DSS which enables cooperation, supporting more than one person
working on a shared task; examples include integrated tools like Google Docs or Microsoft
Groove.
2. Document-driven DSS which manages, retrieves, and manipulates unstructured information in
a variety of electronic formats.
3. Knowledge-driven DSS provides specialized problem solving expertise stored as facts, rules,
procedures, or in similar structures
4. Model-driven DSS emphasizes access to and manipulation of a statistical, financial,
optimization, or simulation model. Model-driven DSS use data and parameters provided by users
to assist decision makers in analyzing a situation; they are not necessarily data intensive.
5. Data-driven DSS (or data-oriented DSS) emphasizes access to and manipulation of a time
series of internal company data and, sometimes, external data. This is the type we will
focus on. Simple file systems accessed by query and retrieval tools provide the
most elementary level of functionality. Data warehouse systems that allow the manipulation of
data by computerized tools tailored to a specific task and setting or by more general tools and
operators provide additional functionality. Data-driven DSS with online analytical processing
(OLAP) provide the highest level of functionality.
Types of Decisions
1. Structured : A structured decision is one in which the phases of the decision-making process
(intelligence, design, and choice) have standardized procedures, clear objectives, and clearly
specified input and output. There exists a procedure for arriving at the best solution.
2. Unstructured : An unstructured decision is one where not all of the decision-making phases are
structured and human intuition plays an important role.
3. Semi-structured : A semi structured decision has some, but not all, structured phases where
standardized procedures may be used in combination with individual judgment.
2. Tactical : Tactical decisions affect only parts of an enterprise and are usually restricted to a
single department. The time span is limited to a medium-term horizon, typically up to a year.
Made by middle managers.
3. Strategic : Decisions are strategic when they affect the entire organization or at least a
substantial part of it for a long period of time. They strongly influence the general objectives and
policies of an enterprise. Taken at a higher organizational level, usually by the company top
management.
The DSS database: It is a collection of data from a number of applications or groups. The DSS
database may be a small database residing on a PC or a large data warehouse.
The DSS software system : Contains the software tools that are used for analyzing the data,
including OLAP tools, data mining tools, or a collection of mathematical or analytical models.
The user interface : Controls the interaction between the users of the system and the DSS
software tools.
(i) Organizational information : You may want to use virtually any information available in the
organization for your Decision Support System. What you use, of course, depends on what you
need and whether it is available. You can design your Decision Support System to access this
information directly from your company's database and data warehouse.
(ii) External information : Some decisions require input from external sources of information.
Various branches of the federal government, and the internet, to mention just a few, can provide
additional information for use with a Decision Support System.
(iii) Personal information : You can incorporate your own insights and experience, that is, your
personal information, into your Decision Support System. You can design your Decision Support System
so that you enter this personal information only as needed, or you can keep the information in
a personal database that is accessible by the Decision Support System.
The model management component consists of both the Decision Support System models and the
Decision Support System model management system. A model is a representation of some event,
fact, or situation. As it is not always practical, or wise, to experiment with reality, people build
models and use them for experimentation. Models can take various forms.
(i) Businesses use models to represent variables and their relationships. For example, you
would use a statistical model called analysis of variance to determine whether newspaper, TV, and
billboard advertising are equally effective in increasing sales.
(ii) Decision Support Systems help in various decision making situations by utilizing models
that allow you to analyze information in many different ways. The models you use in a Decision
Support System depend on the decision you are making and, consequently, the kind of analysis
you require. For example, you would use what-if analysis to see what effect the change of one or
more variables will have on other variables, or optimization to find the most profitable solution given
operating restrictions and limited resources. Spreadsheet software such as Excel can be used as a
Decision Support System for what-if analysis.
(iii) The model management system stores and maintains the Decision Support System's
models. Its function of managing models is similar to that of a database management system. The
model management component cannot select the best model for a particular problem, as that
requires your expertise, but it can help you create and manipulate models quickly and easily.
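As one illustration of the optimization models mentioned above, the sketch below brute-forces the most profitable mix of two products sharing limited machine hours. The products, profit margins, and hours constraint are invented for illustration; a real DSS model would use a proper solver:

```python
# Illustrative optimization model for a DSS (hypothetical products/constraints):
# choose how many units of products A and B to make, limited by shared machine
# hours, to find "the most profitable solution given operating restrictions".

def best_mix(hours_available=100, hours_a=2, hours_b=5, profit_a=30, profit_b=80):
    best = (0, 0, 0)  # (profit, units_a, units_b)
    for a in range(hours_available // hours_a + 1):
        remaining = hours_available - a * hours_a
        b = remaining // hours_b            # fill leftover hours with product B
        profit = a * profit_a + b * profit_b
        if profit > best[0]:
            best = (profit, a, b)
    return best

profit, units_a, units_b = best_mix()
print(profit, units_a, units_b)   # prints 1600 0 20: B alone is most profitable
```

Changing any parameter (a what-if on the constraint or margins) immediately yields a new recommended mix, which is exactly the kind of experimentation models enable.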
(Phase 2) Analysis : Define detailed functions of DSS to be developed. Gather responses to the
questions like What should the DSS accomplish, and who will use it, when and how?
(Phase 3) Design : In this phase, entire architecture of the system is considered. The various
factors like hardware, network structure, software tools, technology, database and interaction tool
are also taken into consideration.
(Phase 4) Implementation : This phase includes the actual implementation of a DSS and its
installation. A DSS is also tested for any errors or bugs. Any changes can be backtracked using a
feedback mechanism and project management tools. An agile methodology can also be used to
speed up the implementation process.
(1) Integration : The design and development of a DSS necessitates the collaboration of a large
variety of approaches, tools, models, persons, and organizational processes.
(2) Involvement : During the design and development of a DSS, it is a common mistake to exclude
from the project team the knowledge workers who will actually use the system once it is deployed,
leaving them feeling isolated from it.
(3) Uncertainty : While the cost of implementing a DSS can be estimated fairly well, the benefit of
making more effective decisions is uncertain and hard to quantify in advance.
The telecommunications industry has expanded dramatically in the last few years with the
development of affordable mobile phone technology.
Fraud is an adaptive crime, so it needs special methods of intelligent data analysis to detect and
prevent it.
There are many different types of telecommunications fraud and these can occur at various levels.
The two most common types of fraud are subscription fraud and superimposed fraud.
In subscription fraud, fraudsters obtain an account without any intention to pay the bill. The fraud
thus occurs at the level of a phone number: all transactions from this number will be fraudulent. In
such cases abnormal usage occurs throughout the active period of the account. The account is
usually used for call selling or intensive self-usage.
In superimposed fraud, fraudsters take over a legitimate account. In such cases the abnormal
usage is superimposed upon the normal usage of the legitimate customers. There are several ways
to carry out superimposed fraud, including mobile phone cloning and obtaining calling card
authorization details. Examples of such cases include cellular cloning, calling card theft and cellular
handset theft. Superimposed fraud will generally occur at the level of individual calls; the fraudulent
calls will be mixed in with the justified ones.
Other types of telecommunications fraud include ghosting (technology that tricks the network in
order to obtain free calls) and insider fraud, where telecommunication company employees sell
information to criminals that can be exploited for fraudulent gain.
These methods exist in the areas of Knowledge Discovery in Databases (KDD), Data Mining,
Machine Learning and Statistics. They offer applicable and successful solutions in different areas of
fraud crimes.
At a low level, simple rule-based detection systems use rules such as the apparent use of the same
phone in two very distant geographical locations in quick succession, calls which appear to overlap
in time and very high value and very long calls.
At a higher level, statistical summaries of call distributions (often called profiles or signature at the
user level) are compared with thresholds determined either by experts or by application of
supervised learning methods to known fraud/non-fraud cases.
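The low-level rules above can be sketched roughly as follows. The thresholds, call records, and tuple format are illustrative assumptions, not values from any real fraud system:

```python
# Sketch of simple rule-based fraud checks on one phone's call records.
# Rules (from the text): very long calls, calls that overlap in time, and
# calls from distant locations in quick succession.

def flag_calls(calls, max_duration=4 * 3600, min_travel_gap=1800):
    """Each call: (start_sec, end_sec, city), sorted by start time.
    Returns the indices of calls flagged as suspicious."""
    flagged = set()
    for i, (start, end, city) in enumerate(calls):
        if end - start > max_duration:              # very long call
            flagged.add(i)
        if i > 0:
            prev_start, prev_end, prev_city = calls[i - 1]
            if start < prev_end:                    # calls overlap in time
                flagged.update({i - 1, i})
            elif city != prev_city and start - prev_end < min_travel_gap:
                flagged.update({i - 1, i})          # distant places, quick succession
    return sorted(flagged)

calls = [(0, 600, "Mumbai"),
         (300, 900, "Delhi"),        # overlaps the first call, different city
         (1000, 20000, "Mumbai")]    # over 4 hours long, soon after Delhi call
print(flag_calls(calls))             # prints [0, 1, 2]
```

Higher-level detection would replace these fixed thresholds with per-user profiles or thresholds learned from labeled fraud/non-fraud cases, as described above.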
Some forensic accountants specialize in forensic analytics which is the procurement and analysis
of electronic data to reconstruct, detect, and otherwise support a claim of financial fraud. The main
steps in forensic analytics are data collection, data preparation, data analysis, and reporting.
For example, forensic analytics may be used to review an employee's purchasing card activity to
assess whether any of the purchases were diverted or divertible for personal use.
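A minimal sketch of such a purchasing-card review follows, walking the collection, preparation, analysis, and reporting steps. The policy limit, blocked merchant categories, and transactions are all invented for illustration:

```python
# Hypothetical purchasing-card review (forensic analytics sketch).

BLOCKED = {"jewelry", "casino"}   # assumed categories suggesting personal use
LIMIT = 500.0                     # assumed single-purchase policy limit

def review(transactions):
    # data preparation: normalize category text;
    # data analysis: apply the policy rules to each transaction
    findings = []
    for t in transactions:
        if t["category"].lower() in BLOCKED or t["amount"] > LIMIT:
            findings.append(t["id"])
    # reporting: return the transaction ids an investigator should examine
    return findings

txns = [{"id": "T1", "category": "Office Supplies", "amount": 80.0},
        {"id": "T2", "category": "Jewelry", "amount": 120.0},
        {"id": "T3", "category": "Travel", "amount": 900.0}]
print(review(txns))   # prints ['T2', 'T3']
```

A real review would also apply statistical tests over the full transaction history rather than per-row rules alone.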
Techniques used for fraud detection fall into two primary classes: Statistical techniques and
Artificial intelligence.
1. Data pre-processing techniques for detection, validation, error correction, and filling up of
missing or incorrect data.
5. Clustering and classification to find patterns and association among groups of data.
1. Data mining to classify, cluster, and segment the data and automatically find associations
and rules in the data that may signify interesting patterns, including those related to fraud.
2. Expert systems to encode expertise for detecting fraud in the form of rules.
Recommendation System
A recommendation system is a business intelligence system used to deliver knowledge to the active
user for better decision making. Recommendation systems apply data mining techniques to the
problem of making personalized recommendations for information. The growth in the amount of
information and the number of users in recent years poses challenges for recommender systems.
Collaborative, content-based, demographic and knowledge-based are four different types of
recommendation systems.
This system works in three phases namely preprocessing, modeling and obtaining intelligence.
First, the users are filtered based on the user's profile and knowledge, such as needs and
preferences defined in the form of rules. This involves feature selection and data reduction on the
dataset.
Second, these filtered users are then clustered using k-means clustering algorithm as a modelling
phase.
Third, it identifies the nearest neighbours of the active user and generates recommendations by
finding the most frequent items from the identified cluster of users. This algorithm can be
experimentally tested with an e-commerce application for better decision making by recommending
the top n products to active users.
1. Identifying the dataset : To maintain the data systematically and efficiently, database and data
warehouse technologies are used. The data warehouse not only deals with the business activities
but also contains the information about the user that deals with the business.
2. Choose the columns consideration /features : Once the dataset D has been identified, the next
step of the system is to choose the consideration column or filtering columns/features. That is, from
the whole dataset, the columns/subset of features to be considered for our work are chosen. This
includes the elimination of irrelevant columns in the dataset. An irrelevant column/feature may
be one which provides little information about the dataset.
3. Filtering objects by defining rules: From the consideration dataset, the objects can be grouped
under stated conditions that are defined in terms of rules. That is, for each column that is
considered, specify the rule to extract the necessary domain from the original dataset. This rule is
considered to be the threshold value T. The domain can be chosen by identifying the frequent items
from the dataset.
4. Identifying frequent items : The frequent items can be identified by analyzing the repeated value
in the consideration column satisfying the support count and the confidence threshold. This will
create a new dataset D'.
5. Cluster objects using k-means clustering : Upon forming the new dataset D', the objects in D' are
clustered based on similarity of objects using k-means clustering. k-means clustering is a method of
classifying or grouping objects into k clusters (where k is the number of clusters). The clustering is
performed by minimizing the sum of squared distances between the objects and the corresponding
centroid. The result consists of cluster of objects with their labels/classes.
6. Find nearest neighbour of active user : In order to find the nearest neighbours of the active user,
the similarity between the active user and each cluster centroid is calculated based on a distance
measure. Then the cluster with the highest similarity is selected.
7. Generate recommendation dataset for active user : Recommendations are generated for the
active user from the most frequent items purchased by users in the selected cluster, subject to the
specified threshold T. This gives intelligence to users and the business for better decision making.
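Steps 5-7 above can be sketched compactly. The purchase histories, the choice of k, and the binary item vectors are illustrative assumptions; a production system would use a library k-means and real transaction data:

```python
# Sketch of steps 5-7: cluster users with a minimal k-means, find the active
# user's nearest cluster, recommend that cluster's most frequent items.
from collections import Counter

users = {  # hypothetical purchase histories (steps 1-4 assumed already done)
    "u1": {"milk", "bread", "eggs"},
    "u2": {"milk", "bread"},
    "u3": {"laptop", "mouse"},
    "u4": {"laptop", "mouse", "keyboard"},
}
items = sorted(set().union(*users.values()))

def vec(basket):
    """Binary item vector for one user's basket."""
    return tuple(1 if it in basket else 0 for it in items)

def dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=10):
    """Minimal k-means (step 5): naive initialization, fixed iteration count."""
    centroids = list(points[:k])
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for idx, p in enumerate(points):
            c = min(range(k), key=lambda j: dist(p, centroids[j]))
            groups[c].append(idx)
        centroids = [tuple(sum(points[i][d] for i in g) / len(g)
                           for d in range(len(points[0]))) if g else centroids[c]
                     for c, g in enumerate(groups)]
    return centroids, groups

names = list(users)
points = [vec(users[n]) for n in names]
centroids, groups = kmeans(points, k=2)

# Step 6: nearest cluster for the active user; step 7: recommend the cluster's
# most frequent items, excluding what the active user already bought.
active = {"milk"}
best = min(range(2), key=lambda j: dist(vec(active), centroids[j]))
counts = Counter(it for i in groups[best] for it in users[names[i]])
recs = [it for it, _ in counts.most_common() if it not in active]
print(recs[:2])   # prints ['bread', 'eggs']
```

The frequency threshold T from step 4 would simply filter `counts` before ranking; it is omitted here to keep the sketch short.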
Clickstream Mining
Clickstream mining analyzes the record of a user's activity on the internet: every website and every
page of every website that the user visits, how long the user stays on a page or site, the order in
which the pages were visited, any newsgroups the user participates in, and even the email
addresses of mail the user sends and receives.
Both ISPs and individual websites are capable of tracking a user's clickstream. Clickstream data is
becoming increasingly valuable to internet marketers and advertisers. Be aware of the large amount
of data a clickstream generates.
These 'footprints' that visitors leave at a site have grown wildly; large businesses may gather a
terabyte of them every day. But the ability to analyze such data hasn't kept pace with the ability to capture it.
The next frontier of web data analysis is better integration of clickstream data with other customer
information such as purchase history and even demographic profiles, to form what's often called a
"360-degree view" of a site visitor. .
Clickstream analysis can be seen as a four-stage process of collection, storage, analysis, and
reporting. The first two concentrate on gathering and formatting information, and the latter two on
making sense of it.
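The four stages can be sketched on a toy log. The log format, pages, and timestamps are illustrative assumptions:

```python
# Tiny sketch of the collection -> storage -> analysis -> reporting stages
# for clickstream data (log format and timestamps are invented).

raw_log = [          # collection: one (user, page, unix_time) event per click
    ("u1", "/home", 100), ("u1", "/products", 160), ("u1", "/checkout", 400),
    ("u2", "/home", 110), ("u2", "/home", 115),
]

# storage: group events per user, ordered by time
sessions = {}
for user, page, ts in sorted(raw_log, key=lambda e: e[2]):
    sessions.setdefault(user, []).append((page, ts))

# analysis: time spent on each page (gap to the next click in the session)
def time_on_pages(session):
    return [(page, nxt_ts - ts)
            for (page, ts), (_, nxt_ts) in zip(session, session[1:])]

# reporting: simple per-user summary
for user, session in sorted(sessions.items()):
    print(user, time_on_pages(session))
```

Integrating these per-user paths with purchase history and demographic data is what builds the "360-degree view" described above.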
(b) E-business feedback : The e-business analysis cycle is more sophisticated. This process
combines website activity with data from other sources, such as visitor profile information, sales
databases, and campaigns that include links to the website. It provides higher-level information,
more focused answers and information that can be used to enhance ecommerce activities across
the business as well as improving the website.
Market Segmentation
Market segmentation is a marketing concept which divides the complete market into smaller
subsets of consumers with similar tastes, demands and preferences. A market segment is a small
unit within a large market comprising like-minded individuals. One market segment is totally distinct
from another. A market segment comprises individuals who think along the same lines and have
similar interests. Individuals from the same segment respond in a similar way to fluctuations in the
market.
(b) Behaviouralistic Segmentation : The loyalties of customers towards a particular brand help
marketers classify them into smaller groups, each comprising individuals loyal to a particular
brand.
(c) Geographic Segmentation : Geographic segmentation refers to the classification of the market
into various geographical areas. A marketer can't have similar strategies for individuals living in
different places. Nestle promotes Nescafe all through the year in cold states of the country as
compared to places which have well-defined summer and winter season. McDonald's in India does
not sell beef products as it is strictly against the religious beliefs of the countrymen, whereas
McDonald's in USA freely sells and promotes beef products.
Retail Industry
Retail organizations thrive by providing quality products to customers in a convenient, timely, and
cost effective manner. Understanding emerging customer shopping patterns can assist retailers in
organizing their products, inventory, store layout, and web presence in order to delight their
customers, thereby increasing revenue and profits. Retailers generate a lot of transaction and
logistics data that can be used to solve problems.
Optimize inventory levels at different locations : Retailers must carefully manage their
inventories. Carrying too much inventory incurs carrying costs, whereas carrying too little inventory
can result in stockouts and missed sales opportunities. Dynamic sales trend prediction can assist
retailers in moving inventory to where it is most in demand. Online retailers can provide their
suppliers with real-time information about their items' sales, allowing the suppliers to deliver their
product to the right locations and reduce stock-outs.
Improve store layout and sales promotions : Using a market basket analysis, you can create
predictive models of which products frequently sell together. This understanding of product affinities
can assist retailers in co-locating those products. Alternatively, those affinity products could be
placed further apart in order to force the customer to walk the length and breadth of the store,
exposing them to other products. Promotional discounted product bundles can be created to
promote a non-selling item and also a group of products that sell well together.
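The product-affinity counting behind a market basket analysis can be sketched in a few lines; the
basket data below is purely illustrative:

```python
from collections import Counter
from itertools import combinations

def pair_counts(transactions):
    """Count how often each unordered pair of products is bought together."""
    counts = Counter()
    for basket in transactions:
        # sorted(set(...)) gives each unordered pair a single canonical form
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "butter", "eggs"],
]
top = pair_counts(baskets).most_common(1)[0]
print(top)  # (('bread', 'butter'), 3)
```

High-count pairs are candidates for co-location (or deliberate separation) and for promotional
bundles; a full analysis would also compute support, confidence, and lift for each pair.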
Optimize logistics for seasonal effects : Seasonal products provide extremely profitable short-
term sales opportunities, but they also pose the risk of unsold inventories at the end of the season.
Understanding which products are in season in which markets can assist retailers in dynamically
managing prices to ensure inventory is sold during the season. If it is raining in a specific area,
inventory of umbrellas and ponchos could be quickly moved there from non-rainy areas to help
increase sales.
Reduce losses due to limited shelf life : Perishable goods present difficulties in disposing of
inventory on time. Tracking sales trends allows perishable products that are at risk of not selling
before their sell-by date to be appropriately discounted and promoted.
Telecommunication Industry
BI in telecom can help with churn management, marketing/customer profiling, network failure, and
fraud detection.
(1) Management of churn : Telecom customers have shown a tendency to switch providers in
search of better deals. Telecom companies typically respond with incentives and discounts in order
to retain customers. However, they must determine which customers are truly at
risk of switching and which are simply bargaining for a better deal. The level of risk should be
considered when determining the type of deals and discounts to be offered. Every month, millions
of such customer calls are made. Telecom companies must provide a consistent and data-driven
method for predicting the risk of customer switching and then making an operational decision in real
time while the customer call is in progress. A decision-tree or a neural network based system can
be used to guide the customer-service call operator to make the right decisions for the company, in
a consistent manner.
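A real decision tree would be trained on historical churn data; as a minimal sketch, the function
below hand-codes a decision-tree-style policy, with feature names and thresholds invented for
illustration:

```python
def churn_risk(months_on_plan, complaints_last_90d, competitor_offer_mentioned):
    """Toy decision-tree-style rules for scoring churn risk during a call.

    The branching mirrors how a trained tree would be applied in real time,
    but these thresholds are illustrative, not derived from real data.
    """
    if competitor_offer_mentioned:
        # Callers citing competitor deals are bargaining or genuinely leaving
        return "high" if complaints_last_90d >= 2 else "medium"
    if months_on_plan < 6:
        # New customers have weaker loyalty
        return "medium"
    return "low"

print(churn_risk(24, 3, True))   # high
print(churn_risk(24, 0, False))  # low
```

The risk label can then drive which deal, if any, the call operator is allowed to offer, giving the
consistent data-driven behaviour described above.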
(2) Marketing and product creation : In addition to customer data, telecom companies also store
call detail records (CDRs), which precisely describe the calling behavior of each customer. This
unique data can be used to profile customers and then to create new product/service bundles for
marketing purposes. The American telecom company MCI created a
program called Friends & Family that allowed calls with one's friends and family on that network to
be totally free and thus, effectively locked many people into their network.
(3) Network failure management : The failure of telecommunications networks due to technical
failures or malicious attacks can have disastrous consequences for people, businesses, and
society. Some equipment in telecom infrastructure will most likely fail with a certain mean time
between failures. Modeling the failure pattern of various network components can aid in preventive
maintenance and capacity planning.
(4) Fraud control : There are numerous types of fraud in consumer transactions. When a customer
opens an account with the intent of never paying for the services, this is referred to as subscription
fraud. Superimposition fraud is defined as unauthorised activity by someone other than the
legitimate account holder. Decision rules can be developed to analyze each CDR in real time to
identify chances of fraud and take effective action.
Banking
Banks make loans and offer credit cards to millions of customers. They are most concerned with
improving loan quality and reducing bad debts. They also want to keep more of their current
customers and sell them more services.
(1) Automate the loan application process : Decision models that predict the likelihood of a
loan's success can be generated from historical data. These models can be integrated into
business processes to automate the loan application process.
(2) Detect fraudulent transactions : Every day, billions of financial transactions take place around
the world. Exception-seeking models detect fraudulent transaction patterns. For example, if money
is transferred for the first time to an unrelated account, it could be a fraudulent transaction.
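The first-time-transfer rule described above can be sketched as follows (account IDs and amounts
are invented for illustration):

```python
def flag_first_time_transfers(transfers):
    """Flag transfers to a counterparty the account has never paid before.

    An exception-seeking rule like this would be one of many signals in a
    real fraud engine, combined with amount and velocity checks.
    """
    seen = {}       # account -> set of counterparties already paid
    flagged = []
    for account, counterparty, amount in transfers:
        known = seen.setdefault(account, set())
        if counterparty not in known:
            flagged.append((account, counterparty, amount))
            known.add(counterparty)
    return flagged

transfer_log = [
    ("A1", "B7", 100.0),   # first transfer A1 -> B7: flagged
    ("A1", "B7", 250.0),   # repeat counterparty: not flagged
    ("A1", "C2", 900.0),   # new counterparty: flagged
]
print(flag_first_time_transfers(transfer_log))
```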
(3) Increase customer value (cross-selling, upselling) : Selling more products and services to
existing customers is frequently the simplest way to increase revenue. A checking account
customer in good standing may be offered better terms on home, auto, or educational loans than
other customers, thus, increasing the value generated by that customer.
(4) Optimize cash reserves through forecasting : Banks must maintain a certain level of liquidity
in order to meet the needs of depositors who may wish to withdraw funds. Banks can forecast how
much to keep and invest the rest to earn interest by using historical data and trend analysis.
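As a minimal sketch of such forecasting, a moving average of recent daily withdrawals, padded with
an assumed safety buffer, might look like this (real banks would use far richer trend and
seasonality models):

```python
def forecast_withdrawals(daily_withdrawals, window=3):
    """Naive moving-average forecast of tomorrow's withdrawal demand."""
    recent = daily_withdrawals[-window:]
    return sum(recent) / len(recent)

history = [120.0, 130.0, 110.0, 140.0, 150.0]   # illustrative daily totals
reserve = forecast_withdrawals(history) * 1.2    # 20% safety buffer (assumed)
print(round(reserve, 1))  # 160.0
```

Funds above the forecast reserve level can then be invested to earn interest.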
Finance
Stock brokerages make extensive use of Business Intelligence (BI) systems. Access to accurate
and timely information can mean the difference between making or losing a fortune.
Predict changes in bond and stock prices : Forecasting the price of stocks and bonds is a
favorite pastime of financial experts as well as lay people. Stock transaction data from the past,
along with other variables, can be used to predict future price patterns. This can help traders
develop long-term trading strategies.
Assess the impact of events on market movements : Decision trees can be used to create
decision models that assess the impact of events on changes in market volume and prices.
Monetary policy changes (such as a change in the Federal Reserve interest rate) or geopolitical
changes (such as a war in a particular region of the world) can be factored into the predictive model
to help take action with greater confidence and less risk.
Identify and prevent fraudulent activities in trading : There have unfortunately been many
cases of insider trading, leading to many prominent financial industry stalwarts going to jail. Fraud
detection models can identify and flag fraudulent activity patterns.
Customer Relationship Management (CRM)
(1) Maximize the return on marketing campaigns : Data-driven analysis of customer pain points
can ensure that marketing messages are fine-tuned to better resonate with customers.
(2) Improve customer retention (churn analysis) : Winning new customers is more difficult and
expensive than retaining existing customers. Scoring each customer based on their likelihood to
quit can assist businesses in developing effective interventions, such as discounts or free services,
to retain profitable customers in a cost-effective manner.
(3) Maximize customer value (cross-selling, upselling) : Every interaction with the customer
should be viewed as an opportunity to assess their current needs. Offering new products and
solutions to customers based on their presumed needs can help increase revenue per customer.
Even a customer complaint can be viewed as a chance to impress the customer. Using the
knowledge of the customer's history and value, the business can choose to sell a premium service
to the customer.
(4) Identify and delight highly valued customers : The best customers can be identified by
segmenting the customers. They can be proactively contacted and delighted with enhanced
attention and service. Loyalty programs can be more effectively managed.
(5) Manage brand image : A company can set up a listening post to monitor social media
conversations about itself. It can then perform sentiment analysis on the text in order to understand
the nature of the comments and respond appropriately to prospects and customers.
Fake News Detection
The major objective of watching or reading the news is to stay informed about what is happening
around us. In the modern era there are several social media platforms, such as Facebook and
Twitter, on which millions of users rely for day-to-day news. Then came fake news, which spreads
among people as fast as real news can. Fake news is a piece of fabricated or falsified information,
often aimed at misleading people or at damaging the reputation of a person or an entity.
As humans, when we read an article, we can understand its context by interpreting its words. Given
today's volume of news, computers can be taught to read and to distinguish real from fake news
using NLP techniques. All that is needed are appropriate machine learning algorithms and a
dataset.
(1) Data Collection : The process of gathering information from all possible sources regarding a
particular research problem. This information is stored in a file as the dataset and is subject to
various techniques such as testing, evaluation, etc.
(2) Data Cleaning : Identification and removal of any errors in the gathered information. This
process is carried out mainly to improve the dataset's quality, make it reliable, and support
accurate decision-making.
(3) Data Exploration Analysis : Various visualization techniques are carried out here to
understand the dataset in terms of its characteristics namely, size, quantity, etc. This process is
essential to better understand the nature of the dataset and get insights faster.
(4) Data Modelling : The process of training one or more ML algorithms on the dataset, tuning them
to the business need, and using them to predict or validate accordingly.
(5) Data Validation : The method of tuning the model's hyperparameters before final testing. A
held-out validation set provides an evaluation of the model fit obtained on the training dataset.
(6) Deployment : Integrating an ML model into an existing environment to make more practical
business decisions based on the dataset.
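Steps (4) and (5) assume the labelled dataset has been split into training, validation, and test
portions; a simple random split, with illustrative proportions and a toy dataset, can be sketched as:

```python
import random

def split_dataset(records, train=0.7, val=0.15, seed=42):
    """Shuffle labelled records and split them into train/validation/test sets.

    The 70/15/15 proportions are a common convention, assumed here for
    illustration; the seed makes the split reproducible.
    """
    rng = random.Random(seed)
    data = records[:]          # copy so the caller's list is untouched
    rng.shuffle(data)
    n_train = int(len(data) * train)
    n_val = int(len(data) * val)
    return (data[:n_train],
            data[n_train:n_train + n_val],
            data[n_train + n_val:])

records = [(f"article {i}", i % 2) for i in range(20)]  # (text, label) pairs
tr, va, te = split_dataset(records)
print(len(tr), len(va), len(te))  # 14 3 3
```

The model is fitted on the training set, its hyperparameters tuned on the validation set, and its
final quality measured once on the untouched test set.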
Cyberbullying
The use of social networking sites (SNS) has increased rapidly in recent years, providing a platform
for people all over the world to connect and share their interests. However, social networking sites
also provide opportunities for cyberbullying. Cyberbullying is harassing or insulting a person by
sending messages of a hurtful or threatening nature using electronic communication, and it poses a
significant threat to the physical and mental health of victims. Detection of
cyberbullying and the provision of subsequent preventive measures are the main courses of action
to combat cyberbullying. The detection method can identify the presence of cyberbullying terms
and classify cyberbullying activities in social networks, such as Flaming, Harassment, Racism and
Terrorism, using fuzzy logic and genetic algorithms.
The input dataset contains text, images, audio and video collected from social networks. The input
is sent to data pre-processing, which improves its quality. Social network datasets contain a great
deal of noisy and unwanted data, so pre-processing is applied to improve the accuracy of the input
data. This includes removing stop words and symbols. Stop words are words like "a", "as", "have",
"is", "the", "or", etc.; they consume memory space and slow down processing. After
pre-processing, the data is sent to the cyberbully detection module, which detects cyberbullying
content. The cyberbully detection techniques are explained below:
(a) Image Cyberbully detection : Cyberbullying using images is now widespread and has a large
effect on society, as such images spread through social networks very rapidly. Anti-social elements
can create further distress by spreading communal hatred through images. Cyberbully images can
be detected using computer vision techniques, including image similarity and Optical Character
Recognition (OCR).
(b) Video Cyberbully detection : Video cyberbullying also causes emotional and psychological
harm. Cyberbully videos are detected using shot boundary detection. Here, the video is broken into
scenes, shots and frames; a shot is a sequence of frames captured by a single camera in a single
continuous action. The content of the video is then analysed using shot boundary detection
algorithms such as pixel-based, histogram-based, and block-based shot boundary detection.
(c) Audio Cyberbully detection : Audio is another area where a great deal of cyberbullying occurs.
Here, the audio is converted into text using the CMU Sphinx tool, and cyberbullying in the
converted text is detected using a trained dataset.
Finally, the cyberbully content is classified into Physical bullying, Social bullying and Verbal bullying
using a Naïve Bayes classifier. The Naïve Bayes classifier is based on Bayes' theorem with the
assumption of independence between predictors.
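A minimal from-scratch sketch of such a Naïve Bayes text classifier, with Laplace smoothing and
invented training examples and labels, might look like this:

```python
import math
from collections import Counter, defaultdict

class TinyNaiveBayes:
    """Minimal multinomial Naive Bayes, illustrating the assumption of
    independence between word features."""

    def fit(self, docs, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.class_counts = Counter(labels)      # label -> document count
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        best, best_lp = None, -math.inf
        total = sum(self.class_counts.values())
        for label in self.class_counts:
            # log prior + sum of per-word log likelihoods (Laplace smoothed)
            lp = math.log(self.class_counts[label] / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in doc.split():
                lp += math.log((self.word_counts[label][w] + 1) / denom)
            if lp > best_lp:
                best, best_lp = label, lp
        return best

docs = ["you are stupid", "nobody likes you", "great game today", "nice work"]
labels = ["bully", "bully", "clean", "clean"]
model = TinyNaiveBayes().fit(docs, labels)
print(model.predict("you are stupid"))  # bully
```

A production classifier would be trained on a large labelled corpus and would output one of the
three bullying categories rather than this toy two-class distinction.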
(a) Social bullying : Social bullying involves spreading rumours about a person or purposely
embarrassing a person in public with the intention of hurting his or her feelings. Another form of
bullying in this category involves encouraging others to avoid a certain person or group. Social
bullying affects a person and their ability to relate to their environment as well as other people in a
social setting. Not only does it have a direct impact on a person's mental and emotional state, it can
also adversely affect their reputation in both personal and professional circles.
(b) Verbal bullying : Verbal bullying is one of the most common forms of bullying. Criticizing and
making fun of others are all forms of verbal bullying; the bully's main weapon is their voice. Verbal
bullying consists of negative statements made to or about the victim that belittle the target or treat
them as worthless. If the abuser does not promptly apologize and withdraw such statements, the
relationship is considered verbally abusive. Verbal bullying can create psychological disorders that
plague victims into and throughout adulthood.
(c) Physical bullying : Physical bullying harms a person's body or damages their personal
possessions. Widespread forms of physical bullying include stealing, shoving, hitting, pushing,
slapping, spitting and destroying property. Physical bullying is rarely the first form of bullying a
victim experiences; bullying frequently begins in another form and progresses to physical violence.
In physical bullying, the bully's foremost weapon is their body.
Sentiment Analysis
Sentiment analysis in business, also known as opinion mining, is the process of identifying and
cataloguing a piece of text according to the tone it conveys. The text can be tweets, comments,
feedback, or even random rants, each carrying a positive, negative or neutral sentiment. Automated
sentiment analysis is valuable for every business. Sentiment analysis involves steps such as
preprocessing, feature extraction and sentiment classification.
Preprocessing : We are interested in features of an object. For this, input data are preprocessed
using the following steps:
(a) Tokenization : White spaces, special characters and symbols are removed; the remaining words
are called tokens.
(b) Removal of Stop Words : Articles and common words like "a", "an", "the", "this", "that", "am",
"is", etc. are removed.
(c) Stemming : Reduces tokens or words to their root form.
(d) Case Normalization : Converts the whole document to either lowercase or uppercase letters.
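The four preprocessing steps above can be sketched in plain Python; the stop-word list and the
suffix-stripping "stemmer" below are deliberately crude stand-ins for real tools such as a Porter
stemmer:

```python
import re

STOP_WORDS = {"a", "an", "the", "this", "that", "am", "is", "are", "or", "as", "have"}
SUFFIXES = ("ing", "ly", "ed", "s")   # crude suffix stripping, not a real stemmer

def preprocess(text):
    """Apply the four steps above: case-normalize, tokenize, drop stop words, stem."""
    text = text.lower()                                  # (d) case normalization
    tokens = re.findall(r"[a-z]+", text)                 # (a) tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]  # (b) stop-word removal
    stemmed = []
    for t in tokens:                                     # (c) crude stemming
        for suf in SUFFIXES:
            if t.endswith(suf) and len(t) > len(suf) + 2:
                t = t[: -len(suf)]
                break
        stemmed.append(t)
    return stemmed

print(preprocess("The batteries are draining quickly!"))
```

Note how the toy stemmer leaves "batterie" rather than "battery"; that kind of over-stripping is
why real pipelines use dedicated stemmers or lemmatizers.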
Feature Extraction : Features for sentiment classification are handled in four stages:
(a) Feature Types : Deals with the types of features used for sentiment analysis, viz. term
frequency, term co-occurrence, sentiment words, negation, and syntactic dependency.
(b) Feature Selection : Deals with finding good features for sentiment classification, viz.
information gain, odds ratio, document frequency, and mutual information.
(c) Feature Weighting Mechanism : Calculates weights for ranking the features using term
frequency and inverse document frequency (TF-IDF).
(d) Feature Reduction : The dimensionality of features is reduced for better performance.
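The TF-IDF weighting mechanism in (c) can be sketched as follows, using one common form of the
formula (tf normalized by document length, idf as log of N over document frequency) on a toy
corpus:

```python
import math
from collections import Counter

def tf_idf(docs):
    """Weight each term by term frequency x inverse document frequency."""
    n = len(docs)
    df = Counter()                         # term -> number of docs containing it
    tokenized = [doc.split() for doc in docs]
    for tokens in tokenized:
        df.update(set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({w: (tf[w] / len(tokens)) * math.log(n / df[w])
                        for w in tf})
    return weights

docs = ["good camera good price", "bad battery", "good battery life"]
w = tf_idf(docs)
print(round(w[0]["good"], 3))  # 0.203
```

Terms that appear in every document get weight zero, while terms frequent in one document but
rare overall rank highest, which is exactly what feature ranking needs.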
Sentiment Classification : A posted opinion is classified as positive, negative or neutral. The three
levels of sentiment analysis are as follows.
(a) Document level : The whole document is considered for classifying the opinion as positive,
negative or neutral. The opinion about an object may be expressed without using any opinion word.
In this case natural language processing plays a vital role to mine the correct sentiments. The main
challenge is to extract subjective text for inferring the overall sentiment of the whole document.
(b) Sentence level : The documents in the collection are divided into sentences, and each
sentence is then classified as having positive, negative or neutral polarity. A document is a combination
of subjective and objective sentences. First the subjective sentences are determined and then the
opinion in those subjective sentences will be calculated. The sentence level polarity identification
can be done in either of the two ways: a grammatical syntactic approach or a semantic approach.
The grammatical syntactic approach takes grammatical structure of the sentence into account by
considering parts of speech tags.
(c) Word or phrase level : When a product feature is considered for sentiment analysis, it is
word- or phrase-level sentiment analysis. It uses adjectives and adverbs as features. Word-level
sentiment can be attained by a 'Dictionary Based Approach' or a 'Corpus Based Approach'.
(i) Dictionary based approach : Sometimes the opinion is not expressed by a popular
keyword; jargon may be used to express the sentiment. Here, WordNet, which contains
synonyms and antonyms, is used to find the polarity of a word.
(ii) Corpus based approach : In this method, the co-occurrence of a word with other words
whose polarity is known is taken into account. Adjectives joined by 'and' share the same
polarity, while adjectives joined by 'but' have opposite polarity.
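The 'and'/'but' conjunction rule can be sketched as a small polarity-propagation routine; the
adjective pairs and the seed word below are invented for illustration:

```python
def propagate_polarity(pairs, seed_polarity):
    """Corpus-based rule: adjectives joined by 'and' share polarity,
    those joined by 'but' take the opposite polarity (+1 / -1)."""
    polarity = dict(seed_polarity)
    changed = True
    while changed:                      # keep propagating until stable
        changed = False
        for a, conj, b in pairs:
            for known, unknown in ((a, b), (b, a)):
                if known in polarity and unknown not in polarity:
                    flip = 1 if conj == "and" else -1
                    polarity[unknown] = flip * polarity[known]
                    changed = True
    return polarity

# Conjunctions observed in a corpus, plus one seed word of known polarity.
pairs = [("reliable", "and", "fast"), ("fast", "but", "noisy")]
print(propagate_polarity(pairs, {"reliable": +1}))
```

Starting from a few seed words, repeated passes over corpus conjunctions label a growing set of
adjectives, which is the essence of the corpus-based approach.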
Finally, the sentiments are classified using machine learning approaches such as SVM, Naïve
Bayes, Decision Tree and Rule Based Classifiers, and lexicon-based approaches such as the
dictionary-based and corpus-based approaches.