Unit 1 BDT

This document provides an overview of Big Data, including its definition, characteristics, and applications across various industries such as banking, healthcare, and retail. It discusses the types of data (structured, unstructured, and semi-structured), the challenges of managing Big Data, and the analytics techniques used to derive insights. Additionally, it highlights the significance of Big Data in improving business processes and decision-making.

Uploaded by

Bhagya Battula

UNIT-1

Getting an Overview of Big Data: Introduction to Big Data, Structuring Big Data,
Elements of Big Data, Big Data Analytics. Exploring the use of Big Data in Business
Context: Use of Big Data in Social Networking, Use of Big Data Preventing Fraudulent
Activities, Use of Big Data in Retail Industry

Getting an Overview of Big Data

Introduction to Big Data

What is Big Data?

Big Data is a collection of data that is huge in volume and keeps growing exponentially with
time. Its size and complexity are so great that no traditional data
management tool can store or process it efficiently.

What is an Example of Big Data?

The following are some examples of Big Data:

The New York Stock Exchange generates about one
terabyte of new trade data per day.

A single jet engine can generate more than 10 terabytes of data in 30 minutes of flight time.
With many thousands of flights per day, the data generated reaches
many petabytes.
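The arithmetic behind the jet engine figure can be sketched as follows. Only the rate of 10 terabytes per 30 minutes comes from the text above; the number of flights per day and the flight duration are hypothetical values chosen for illustration, and one monitored engine per flight is assumed.

```python
# Rate quoted in the text: 10 TB per 30 minutes of flight time.
TB_PER_HALF_HOUR = 10

# Hypothetical inputs (assumptions, not figures from the text).
FLIGHTS_PER_DAY = 5_000
HOURS_PER_FLIGHT = 2


def daily_engine_data_tb(flights, hours_per_flight, tb_per_half_hour=TB_PER_HALF_HOUR):
    """Terabytes generated per day, assuming one monitored engine per flight."""
    half_hours = hours_per_flight * 2
    return flights * half_hours * tb_per_half_hour


total_tb = daily_engine_data_tb(FLIGHTS_PER_DAY, HOURS_PER_FLIGHT)
total_pb = total_tb / 1_000  # decimal units: 1 PB = 1,000 TB
print(f"{total_tb:,} TB per day = {total_pb:,.0f} PB per day")
```

Under these assumptions the daily total is in the hundreds of petabytes, which is consistent with the "many petabytes" claim.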

Every second, there are around 822 tweets on Twitter.

Every minute, nearly 510 comments are posted, 293,000 statuses are updated, and
136,000 photos are uploaded on Facebook.

Every hour, Walmart, a global discount department store chain, handles more
than 1 million customer transactions.

Every day, consumers make around 11.5 million payments by using PayPal.

Big Data can be structured, unstructured, semi-structured, or heterogeneous in nature. It
becomes difficult for computing systems to manage ‘Big Data’ because of the
immense speed and volume at which it is generated. Traditional data management,
warehousing, and analysis systems fail to analyze this type of data. Due to its
complexity, Big Data is stored in a distributed file system architecture.

Hadoop by Apache is widely used for storing and managing Big Data. Analyzing Big
Data is a challenging task as it involves large distributed file systems.

The process of capturing or collecting Big Data is known as ‘datafication.’ Big Data is
‘datafied’ so that it can be used productively.

Big data can be described by the following Features/Characteristics:

Figure: Features of Big Data

The Applications of Big Data are

• Banking and Securities

• Communications, Media and Entertainment

• Healthcare Providers

• Education

• Manufacturing and Natural Resources

• Government

• Insurance

• Retail and Wholesale trade

• Transportation

• Energy and Utilities

The Uses of Big Data are

• Location Tracking

• Precision Medicine

• Fraud Detection & Handling

• Advertising

• Entertainment & Media

Real World Big Data Examples

• Discovering consumer shopping habits.

• Personalized marketing.

• Fuel optimization tools for the transportation industry.

• Monitoring health conditions through data from wearables.

• Live road mapping for autonomous vehicles.

• Streamlined media streaming.

• Predictive inventory ordering.

History of Data Management – Evolution of Big Data

Big Data is the latest stage in the evolution of data, driven by the enormous velocity, variety,
and volume of data. Velocity implies the speed with which the data flows in an
organization; variety refers to the varied forms of data, such as structured, semi-
structured, or unstructured; and volume defines the amount or quantity of data an
organization has to deal with.

Table: lists some major milestones in the evolution of Big Data:

Structuring Big Data

Structuring of data, in simple terms, is arranging the available data in such a manner
that it becomes easy to study, analyze, and draw conclusions from.

Today, various sources generate a variety of data, such as images, text, audio, etc.
All these different types of data can be structured only if they are sorted and organized in
some logical pattern. Thus, the process of structuring data requires one to first
understand the various types of data available today.

Types of Data

Data that comes from multiple sources, such as databases, Enterprise Resource
Planning (ERP) systems, weblogs, chat history, and GPS maps, varies in its format.

Data is obtained primarily from the following types of sources:

Internal sources, such as organizational or enterprise data

External sources, such as social data

Table compares the internal and external sources of data

On the basis of the data received from the sources mentioned in Table above, Big
Data comprises:

Structured data

Unstructured data

Semi-structured data

In a real-world scenario, unstructured data is typically larger in volume than
structured and semi-structured data: approximately 70% to 80% of data is in
unstructured form. The figure below illustrates the types of data that comprise Big Data:

Figure: Types of Big Data

Structured Data

Any data that can be stored, accessed, and processed in a fixed format is
termed ‘structured’ data. Structured data can be defined as data that has a
defined repeating pattern. This pattern makes it easier for any program to sort, read,
and process the data. Processing structured data is much easier and faster than
processing data without any specific repeating patterns.

Structured data:

Is organized data in a predefined format

Is stored in tabular form

Is the data that resides in fixed fields within a record or file

Is used to query and report against predetermined data types

Some sources of structured data include:

Relational databases (in the form of tables)

Flat files in the form of records (such as comma-separated values (CSV) and
tab-separated files)

Multidimensional databases (mainly used in data warehouse technology)

Table shows a sample of structured data in which the attribute data for every
customer is stored in the defined fields:
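As an illustration of fixed fields, here is a minimal sketch that assumes a hypothetical set of customer records in CSV form. The column names and values are invented for the example and are not taken from the table. Because every record carries the same fields, a query reduces to a simple filter.

```python
import csv
import io

# Hypothetical customer records; column names and values are illustrative only.
CSV_TEXT = """customer_id,name,city,purchase_amount
101,Asha,Hyderabad,2500
102,Ravi,Chennai,1800
103,Meera,Hyderabad,3200
"""

# DictReader maps each row onto the fixed field names from the header line.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Query against predetermined fields: total purchases for one city.
hyderabad_total = sum(
    int(row["purchase_amount"]) for row in rows if row["city"] == "Hyderabad"
)
print(hyderabad_total)
```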

Unstructured Data

Unstructured data is a set of data that might or might not have any logical or
repeating patterns. Any data whose form or structure is unknown is classified as
unstructured data.

Unstructured data:

Consists typically of metadata, i.e., the additional information related to data

Comprises inconsistent data, such as data obtained from files, social media
websites, satellites, etc.

Consists of data in different formats such as e-mails, text, audio, video, or images

Some sources of unstructured data include:

Text both internal and external to an organization: documents, logs, survey
results, feedback, and e-mails from both within and across the organization

Social media: data obtained from social networking platforms, including YouTube,
Facebook, Twitter, LinkedIn, and Flickr

Mobile data: data such as text messages and location information

Challenges Associated with Unstructured Data:

Working with unstructured data poses certain challenges, which are as follows:

Identifying the unstructured data that can be processed

Sorting, organizing, and arranging unstructured data in different sets and formats

Combining and linking unstructured data in a more structured format to derive any
logical conclusions out of the available information

Bearing the cost, in terms of storage space and human resources (data analysts and
scientists), of dealing with the exponential growth of unstructured data

Semi-structured data

Semi-structured data, also described as schema-less or self-describing,
refers to a form of structured data that contains tags or markup elements
to separate elements and generate hierarchies of records and fields in the
given data. Such data does not follow the strict structure of the data models used
in relational databases. In other words, the data does not fit consistently into the rows and
columns of a database.

Some sources for semi-structured data include:

File systems such as Web data in the form of cookies

Data exchange formats such as JavaScript Object Notation (JSON) data

Examples Of Semi-structured Data

Personal data stored in an XML file:

<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
<rec><name>Subrato Roy</name><sex>Male</sex><age>26</age></rec>
<rec><name>Jeremiah J.</name><sex>Male</sex><age>35</age></rec>
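
The self-describing tags in the records above are what make them machine-readable without a fixed table schema. A minimal sketch using Python's standard `xml.etree.ElementTree` module follows. The wrapping `<people>` root element is an assumption added only because an XML document needs a single root; it is not part of the original records.

```python
import xml.etree.ElementTree as ET

RECORDS = """
<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
<rec><name>Subrato Roy</name><sex>Male</sex><age>26</age></rec>
<rec><name>Jeremiah J.</name><sex>Male</sex><age>35</age></rec>
"""

# Wrap the records in a hypothetical root element so they form one document.
root = ET.fromstring(f"<people>{RECORDS}</people>")

# Each tag name becomes a field name; no schema is declared in advance.
people = [{child.tag: child.text for child in rec} for rec in root.findall("rec")]

ages = [int(person["age"]) for person in people]
average_age = sum(ages) / len(ages)
print(people[0]["name"], average_age)
```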

Elements of Big Data

According to Gartner, data is growing at the rate of 59% every year. This growth can
be depicted in terms of the following four Vs:

Volume

Velocity

Variety

Veracity

Volume

Volume is the amount of data generated by organizations or individuals. Today, the
volume of data in most organizations is approaching exabytes. Some experts predict
the volume of data to reach zettabytes in the coming years. Organizations are doing
their best to handle this ever-increasing volume of data. For example, according to
IBM, over 2.7 zettabytes of data is present in the digital universe today. Every minute,
over 571 new websites are being created. IDC estimates that by 2020, online
business transactions will reach up to 450 billion per day.

The Internet alone generates a huge amount of data. The following figures help us
to get an idea of the Internet traffic:

The Internet has around 14.3 trillion live Web pages; 48 billion Web pages are
indexed by Google Inc. and 14 billion Web pages are indexed by Microsoft Bing.

The Internet has around 672 exabytes of accessible data.

Total world-wide Internet traffic in the year 2013 was 43,639 petabytes.

Over 900,000 servers are owned by Google Inc., the largest number owned by any company in the world.

Total data stored on the Internet is over 1 yottabyte.

Velocity

The term ‘velocity’ refers to the speed of generation of data. How fast the data is
generated and processed to meet demand determines the real potential in the data.

Big Data velocity deals with the speed at which data flows in from sources like
business processes, application logs, networks, social media sites, sensors,
mobile devices, etc. The flow of data is massive and continuous.

The sources of high velocity data include the following:

IT devices, including routers, switches, firewalls, etc., constantly generate valuable
data.

Social media, including Facebook posts, tweets, and other social media activities,
create huge amounts of data, which must be analyzed almost instantly
because its value degrades quickly with time.

Portable devices, including mobile phones, PDAs, etc., also generate data at a high speed.

Variety

We all know that data is being generated at a very fast pace. Now, this data is
generated from different types of sources, such as internal, external, social, and
behavioral, and comes in different formats, such as images, text, videos, etc. Even a
single source can generate data in varied formats, for example, GPS and social
networking sites, such as Facebook, produce data of all types, including text, images,
videos, etc.

Veracity

Veracity generally refers to the uncertainty of data, i.e., whether the obtained data is
correct or consistent. Out of the huge amount of data that is generated in almost
every process, only the data that is correct and consistent can be used for further
analysis. Data when processed becomes information; however, a lot of effort goes into
processing the data. Big Data, especially in the unstructured and semi-structured
forms, is messy in nature, and it takes a good amount of time and expertise to clean
that data and make it suitable for analysis.
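
A minimal sketch of the kind of cleaning this implies is shown below. The records, the field names, and the validity rules (a non-empty name and an age between 0 and 120) are all hypothetical assumptions; real cleaning rules depend on the data set.

```python
# Hypothetical raw records, for example parsed from an external feed.
raw_records = [
    {"name": "Asha", "age": "34", "city": "Pune"},
    {"name": "", "age": "29", "city": "Delhi"},        # missing name
    {"name": "Ravi", "age": "-5", "city": "Chennai"},  # impossible age
    {"name": "Meera", "age": "abc", "city": "Kochi"},  # non-numeric age
    {"name": "Asha", "age": "34", "city": "Pune"},     # exact duplicate
]


def is_valid(record):
    """Assumed rule: a usable record has a name and a plausible integer age."""
    if not record["name"].strip():
        return False
    try:
        age = int(record["age"])
    except ValueError:
        return False
    return 0 <= age <= 120


def clean(records):
    """Drop invalid records and exact duplicates, keeping first occurrences."""
    seen = set()
    cleaned = []
    for record in records:
        key = tuple(sorted(record.items()))
        if is_valid(record) and key not in seen:
            seen.add(key)
            cleaned.append(record)
    return cleaned


cleaned_records = clean(raw_records)
print(len(raw_records), "->", len(cleaned_records))
```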

Big Data Analytics

The process of analysis of large volumes of diverse data sets, using advanced
analytic techniques is referred to as Big Data Analytics.

These diverse data sets include structured, semi-structured, and unstructured data,
from different sources, and in different sizes from terabytes to zettabytes.

The different types of data require different approaches. These different approaches to
analytics give rise to the different types of Big Data analytics.

There are mainly three types of analytics:

Descriptive Analytics

Descriptive analytics analyses a database to provide information on the trends of
past or current business events that can help managers, planners, leaders, etc. to
develop a road map for future actions.

Descriptive analytics performs an in-depth analysis of data to reveal details such as
frequency of events, operation costs, and the underlying reason for failures. It helps
in identifying the root cause of the problem.

Examples of descriptive analytics include summary statistics, clustering, and
association rules used in market basket analysis.

An example of the use of descriptive analytics is the Dow Chemical Company. The
company utilized its past data to increase its facility utilization across its offices and
labs.
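
The association rules used in market basket analysis, mentioned above, rest on two simple measures: support (how often items appear together) and confidence (how often the second item appears given the first). A minimal sketch follows, assuming a hypothetical set of five shopping baskets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (assumed data, for illustration only).
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]

# Count how many baskets contain each item and each unordered pair of items.
item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)


def support(item_a, item_b):
    """Fraction of all baskets that contain both items."""
    return pair_counts[tuple(sorted((item_a, item_b)))] / len(baskets)


def confidence(antecedent, consequent):
    """Fraction of baskets with the antecedent that also hold the consequent."""
    both = pair_counts[tuple(sorted((antecedent, consequent)))]
    return both / item_counts[antecedent]


print(support("bread", "milk"), confidence("bread", "milk"))
```

A rule such as "customers who buy butter also buy bread" is reported only when its support and confidence exceed thresholds chosen by the analyst.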

Predictive Analytics

Predictive Analytics, as can be discerned from the name itself, is concerned with
predicting future incidents. These future incidents can be market trends, consumer
trends, and many such market-related events.

This type of analytics makes use of historical and present data to predict future
events. This is the most commonly used form of analytics among businesses.
Predictive analytics doesn’t only work for the service providers but also for the
consumers. It keeps track of our past activities and based on them, predicts what we
may do next.

Predictive analytics uses techniques like data mining, AI, and machine learning to
analyze current data and forecast what might happen in specific scenarios.

Examples of Predictive analytics include next best offers, churn risk, and renewal risk
analysis.
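
The core idea of using historical data to forecast a future value can be sketched with a straight-line trend fitted by least squares. This is a deliberately simple stand-in for the data mining and machine learning models mentioned above, and the monthly sales figures are hypothetical.

```python
# Hypothetical monthly sales history (assumed data).
history = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]


def fit_line(values):
    """Ordinary least-squares line through the points (0, v0), (1, v1), ..."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(history)

# Forecast the next period by extending the fitted line one step ahead.
next_month = slope * len(history) + intercept
print(round(next_month, 2))
```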

We can take the example of PayPal to understand how businesses use predictive
analytics.

The company determines the steps it needs to take to protect its
clients against fraudulent transactions. It uses all past payment data and user behavior data
to predict fraudulent activities.

Prescriptive Analytics

Prescriptive analytics is the most valuable yet underused form of analytics. It is the
next step in predictive analytics. The prescriptive analysis explores several possible
actions and suggests actions depending on the results of descriptive and predictive
analytics of a given dataset.

Prescriptive analytics is a combination of data and various business rules. The data
of prescriptive analytics can be both internal (organizational inputs) and external
(social media insights).

Prescriptive analytics allows businesses to determine the best possible solution to a
problem. When combined with predictive analytics, it adds the benefit of
influencing a future occurrence, for example by mitigating a future risk.

Examples of prescriptive analytics for customer retention are next best action and
next best offer analysis.

A use case of prescriptive analytics is the Aurora Health Care system. It saved
$6 million by reducing readmission rates by 10%.

Prescriptive analytics has good use in the healthcare industry. It can be used to
enhance the process of drug development, finding the right patients for clinical trials,
etc.

For example, if we have to find the best way of shipping goods from a factory to a
destination so as to minimize costs, we use prescriptive analytics. The figure shows a
diagrammatic representation of the stages involved in prescriptive analytics:

Figure: Prescriptive analytics
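
The shipping example can be sketched as a small decision procedure: list the possible actions, discard those that break a constraint, and recommend the cheapest of the rest. The routes, costs, transit times, and capacities below are hypothetical, and real prescriptive systems typically use optimization solvers rather than plain enumeration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    cost_per_unit: float
    transit_days: int
    capacity_units: int


# Hypothetical shipping options (assumed data).
ROUTES = [
    Route("road", 4.0, 5, 1_000),
    Route("rail", 2.5, 8, 5_000),
    Route("air", 12.0, 1, 300),
    Route("sea", 1.5, 20, 20_000),
]


def recommend(routes, units, max_days):
    """Return the cheapest route that meets the deadline and capacity, or None."""
    feasible = [
        route
        for route in routes
        if route.transit_days <= max_days and route.capacity_units >= units
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda route: route.cost_per_unit * units)


choice = recommend(ROUTES, units=800, max_days=10)
print(choice.name if choice else "no feasible route")
```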

Table describes various analytical approaches typically associated with Big Data:

Advantages of Big Data Analytics:

The right analysis of the available data can improve major business processes in
various ways. For example, in a manufacturing unit, data analytics can improve the
functioning of the following processes:

Procurement: To find out which suppliers are more efficient and cost-effective in
delivering products on time

Product Development: To draw insights on innovative product and service formats
and designs for enhancing the development process and coming up with in-demand
products

Manufacturing: To identify machinery and process variations that may be
indicators of quality problems

Marketing: To identify which marketing campaigns will be the most effective in
driving and engaging customers and understanding customer behaviors and channel
behaviors

Price Management: To optimize prices based on the analysis of external factors

A closer look at some specific industries will help you to understand the application
of Big Data in these sectors.

Transportation

Big Data has greatly improved transportation services. The data containing traffic
information is analyzed to identify traffic jam areas. Suitable steps can then be taken,
on the basis of this analysis, to keep the traffic moving in such areas. Distributed
sensors are installed in handheld devices, on the roads and on vehicles to provide
real-time traffic information. This information is analyzed and
disseminated to commuters and also to the traffic control authority.

Education

Big Data has transformed the modern-day education processes through innovative
approaches, such as e-learning for teachers to analyze the students’ ability to
comprehend and thus impart education effectively in accordance with each
student’s needs. The analysis is done by studying the responses to questions,
recording the time consumed in attempting those questions, and analyzing other
behavioral signals of the students. Big Data also assists in analyzing the
requirements and finding easy and innovative ways of imparting education,
especially distance learning over vast geographical areas.

Travel

The travel industry also uses Big Data to conduct business. It maintains complete
details of all the customer records that are then analyzed to determine certain
behavioral patterns in customers. For example, in the airline industry, Big Data is
analyzed for identifying personal preferences or spotting which passengers like to
have window seats for short-haul flights and aisle seats for long-haul flights. This
helps airlines to offer similar seats to customers when they make a fresh
booking with the airline.

Big Data also helps airlines to track customers who regularly fly between specific
destinations so that they can make the right cross-sell and up-sell offers. Some airlines also apply
analytics to pricing, inventory, and advertising for improving customer experiences,
leading to more customer satisfaction, and hence, more business. Some airlines
even go to the length of evaluating customers who tend to miss their flights. They try
to help such customers by delaying the flights or booking them on another flight.

Government

Big Data has come to play an important role in almost all the undertakings and
processes of government. According to the UK free market, “the UK government
could save up to £33 billion a year by using public Big Data more effectively.”
Analysis of Big Data promotes clarity and transparency in various government
processes and helps in:

Taking timely and informed decisions about various issues

Identifying flaws and loopholes in processes and taking preventive or corrective
measures on time

Assessing the areas of improvement in various sectors, such as education, health,
defense, and research

Using budgets more judiciously and reducing unnecessary wastage and costs

Preventing fraudulent practices in various sectors

Healthcare

In healthcare, the pharmacy and medical device companies use Big Data to improve
their research and development practices, while health insurance companies use it
to determine patient-specific treatment therapy modes that promise the best results.
Big Data also helps researchers to work towards eliminating healthcare-related
challenges before they become real problems. Big Data helps doctors to analyze the
requirements and medical history of every patient and provide individualized services
to them, depending on their medical condition.

Telecom

The mobile revolution and the Internet usage on mobile phones have led to a
tremendous increase in the amount of data generated in the telecom sector.
Managing this huge pool of data has almost become a challenge for the telecom
industry.

For example, in Europe, telecom companies are required to keep
data of their customers for at least six months and up to a maximum of two years. Now,
all this collection, storage, and maintenance of data would just be a waste of time
and resources unless we could derive any significant benefits from this data. Big
Data analytics allows telecom industries to utilize this data for extracting meaningful
information that could be used to gain crucial business insights that help industries
in enhancing their performance, improving customer services, maintaining their hold
on the market, and generating more business opportunities.

Exploring the use of Big Data in Business Context

Use of Big Data in Social Networking

A human being lives in a social environment and gains knowledge and experience
through communication. Today, communication is not restricted to meeting in
person. The affordable and handy use of mobile phones and the Internet have made
communication and sharing data of all kinds possible across the globe. Some
popular social networking sites are Twitter, Facebook, and LinkedIn. These social
networking sites are also called social media.

We now analyze the effects of Big Data generated from social media on different
industries. Let’s first understand the meaning of social network data.

Social network data refers to the data generated from people socializing on social
media. On a social networking site, you will find different people constantly adding
and updating comments, statuses, preferences, etc. All these activities generate
large amounts of data. Analyzing and mining such large volumes of data shows
business trends with respect to the wants, preferences, likes, and dislikes of a
wide audience.

This data can be segregated on the basis of different age groups, locations, and
genders for the purpose of analysis. Based on the information extracted,
organizations design products and services specific to people’s needs.

The figure shows the social network data generated through various social media:

Figure: Social Network Data Generated Every Minute of the Day

Social Network Analysis (SNA) is the analysis performed on the data obtained from
social media. As the data generated is huge in volume, it results in the formation of a
Big Data pool.

Let’s understand the importance of social network data with the help of an example
of a Mobile Network Operator (MNO). The data captured by an MNO in a day, such as
the cell phone calls, text messages, and other related details of all its customers, is
huge in volume. This type of data is used daily for different purposes.

An MNO needs to record and analyze not only the calls of a customer but also the
entire network of calls related to that customer. The company must study the data of
the people whom the customer called and also of the people in the customer’s
network who called back the customer. Such a network is called a social network.
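
The network described above can be sketched as a graph built from call records, in which a customer's social network is the set of people reachable within a given number of calls. The call records and names below are hypothetical, and an actual operator would work with far larger data and richer attributes.

```python
from collections import defaultdict

# Hypothetical call records as (caller, callee) pairs (assumed data).
calls = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("dave", "alice"),
    ("bob", "erin"),
    ("frank", "grace"),
]


def build_network(call_records):
    """Undirected graph: two people are linked if either one called the other."""
    network = defaultdict(set)
    for caller, callee in call_records:
        network[caller].add(callee)
        network[callee].add(caller)
    return network


def contacts_within(network, customer, hops):
    """Everyone reachable from the customer in at most `hops` calls."""
    seen = {customer}
    frontier = {customer}
    for _ in range(hops):
        # Expand one hop at a time (breadth-first search).
        frontier = {
            neighbor
            for person in frontier
            for neighbor in network.get(person, set())
        } - seen
        seen |= frontier
    return seen - {customer}


network = build_network(calls)
print(sorted(contacts_within(network, "alice", hops=1)))
print(sorted(contacts_within(network, "alice", hops=2)))
```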

Some facts about big data and social media are listed as follows:

Facebook collects 500 times more data each day than the New York Stock
Exchange. (Source: BI Intelligence)

Twitter produces 12 times more data each day than the New York Stock Exchange.
(Source: BI Intelligence)

By 2016, there will be 18.9 billion network connections, i.e., 2.5 connections per
person. (Source: IBM Big Data Hub)

The following are the areas in which decision-making processes are influenced by
social network data:

Business intelligence

Marketing

Product design and development

Business Intelligence

Business intelligence is a data analysis process that converts a raw dataset into
meaningful information by using different techniques and tools for boosting
business performance. This system allows a company to collect, store, access, and
analyze data for adding value to decision making.

Marketing

Today, the preferences of consumers have changed due to their busy schedules.
They no longer have the time to read newspapers thoroughly, watch all the TV
commercials, or go through all the e-mails they receive in their inbox. Consumers can
now make their preferences clear and select the marketing messages they wish to
receive: when, where, and from whom. In today’s competitive scenario, marketers
aim to deliver what consumers want by using interactive communication across
digital channels such as e-mail, mobile, social, and the Web.

Product Design and Development

With the increasing popularity of social media and growing volume of data every
second, organizations competing to make it big in the market must not only identify
and extract the information relevant for their company, products, and services but
also comprehend and respond to the information on a continuous basis.

Use of Big Data in Preventing Fraudulent Activities

A fraud can be defined as the false representation of facts, leading to concealment
or distortion of the truth. Frauds that occur frequently in financial institutions, such
as banks and insurance and healthcare companies, or involve any type of monetary
transactions, such as in the retail industry, are called financial frauds. In such
fraudulent cases, online retailers, such as Amazon, eBay, and Groupon, tend to incur
huge expenses and losses. The following are some of the most common types of
financial frauds:

Credit card fraud—This type of fraud is quite common these days and is related to
the use of credit card facilities. In an online shopping transaction, the online retailer
cannot see the authentic user of the card and therefore, the valid owner of the card
cannot be verified. It is quite likely that a fake or a stolen card is used in the
transaction. In an online transaction, in spite of the security checks, such as address
verification or card security code, fraudsters manage to manipulate the loopholes in
the system.

Exchange or return policy fraud—An online retailer always has a policy allowing
the exchange and return of goods and sometimes, people take advantage of this
policy. These people buy a product online, use it, and then return it, claiming that they are
not satisfied with the product. Sometimes, they even report non-delivery of the
product and later attempt to sell it online. What leads to such a fraud is that retailers
encourage consumers to order products in bulk and later return the ones that they
don’t require. Such a fraud can be averted by charging a restocking fee on the
returned goods, getting customer’s signature on the delivery of the product, and
staying cautious of such customers who are known to commit such frauds.

Personal information fraud—In this type of fraud, people obtain the login
information of a customer and then log-in to the customer’s account, purchase a
product online, and then change the delivery address to a different location. The
actual customer keeps calling the retailer to refund the amount as he or she has not
made the transaction. Once the transaction is proved fraudulent, the retailer has to
refund the amount to the customer.

Such frauds can be curbed by studying the customer’s ordering patterns
and keeping track of out-of-line orders. Other aspects should also be taken into
consideration, such as any change in the shipping address, rush orders, sudden huge
orders, and suspicious billing addresses. By observing such precautions, the
frequency of such frauds can be reduced to a certain extent, but they
cannot be completely eliminated.
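
The checks listed above can be sketched as a small rule-based filter. The field names, the sample amounts, and the threshold of three standard deviations are assumptions made for illustration; a flagged order is a candidate for review, not proof of fraud.

```python
from statistics import mean, pstdev


def flag_order(order, past_amounts, usual_address, z_threshold=3.0):
    """Return the reasons, if any, why an order looks out of line."""
    reasons = []

    # Sudden huge order: amount far above the customer's usual spending.
    if len(past_amounts) >= 2:
        spread = pstdev(past_amounts)
        if spread > 0:
            z_score = (order["amount"] - mean(past_amounts)) / spread
            if z_score > z_threshold:
                reasons.append("unusually large order")

    # Change in the shipping address.
    if order["shipping_address"] != usual_address:
        reasons.append("shipping address changed")

    # Rush order.
    if order.get("rush"):
        reasons.append("rush order")

    return reasons


# Hypothetical customer history and incoming order (assumed data).
past = [40.0, 55.0, 50.0, 45.0, 60.0]
suspicious = {"amount": 900.0, "shipping_address": "PO Box 99", "rush": True}
print(flag_order(suspicious, past, usual_address="12 Lake Road"))
```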

Preventing Fraud Using Big Data Analytics

We have seen that one of the ways to prevent financial frauds is to study the
customer’s ordering pattern and other related data. However, this method works only
when the data to be analyzed is small in size. In order to deal with huge amounts of
data and gain meaningful business insights, organizations need to apply Big Data
analytics. Analyzing Big Data allows organizations to:

Keep track of and process huge volumes of data

Differentiate between real and fraudulent entries

Identify new methods of fraud and add them to the list of fraud-prevention checks

Verify whether a product has actually been delivered to the valid recipient

Determine the location of the customer and the time when the product was
actually delivered

Check the listings of popular retail sites, such as eBay, to find
whether the product is up for sale somewhere else

Fraud Detection in Real Time

Big Data also helps to detect frauds in real time. It compares live transactions with
different data sources to validate the authenticity of online transactions. For
example, in an online transaction, Big Data would compare the incoming IP address
with the geo-data received from the customer’s smart phone apps. A valid match
between the two confirms the authenticity of the transaction.
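
The comparison described above can be sketched as a distance check between two coordinates. The sketch assumes that the IP address has already been resolved to a latitude and longitude by some geolocation service (not shown), and the 50 km tolerance is an arbitrary illustrative threshold.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def locations_match(ip_location, phone_location, max_km=50.0):
    """True if the IP-derived and phone-reported locations are close enough."""
    return haversine_km(*ip_location, *phone_location) <= max_km


# Hypothetical coordinates (latitude, longitude) used only for illustration.
hyderabad = (17.3850, 78.4867)
secunderabad = (17.4399, 78.4983)
delhi = (28.6139, 77.2090)

print(locations_match(hyderabad, secunderabad))
print(locations_match(hyderabad, delhi))
```

A mismatch does not prove fraud (a customer may be using a VPN or travelling), so in practice it would be one signal among several.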

Big Data also examines the entire historical data to track suspicious patterns of the
customer order. These patterns are then used to create checks for avoiding real-
time fraud. Big Data analysis is performed in real time by retailers to know the actual
time when the products were delivered to customers. Costly products often have
sensors attached to them that transmit their location information. When such
products are delivered to customers, the streaming data obtained from these
sensors provides location information to the retailer, thereby, preventing frauds.

Centralization of Big Data takes place through Massively Parallel Processing (MPP) systems. Any organization that
aims at improving its analytic scalability needs an MPP system. With the continuous
increase in the volume of data, it is not practical to move data as part of the
analysis process except where it is absolutely required. MPP is a widely used
technique for storing and analyzing huge volumes of data.

Let us now understand what an MPP database is and what makes it so special and
preferred. An MPP database has several independent pieces of data stored on
multiple networks of connected computers. It eliminates the concept of one central
server having a single CPU and disk.

The data in an MPP database is divided into different disks managed by different
CPUs across different servers, as shown in Figure:

Figure: MPP System Data Storage
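
The division of data across independent nodes can be sketched as hash partitioning followed by a parallel aggregate. This is a toy simulation under stated assumptions: the four "nodes" are in-memory lists, threads stand in for separate servers, and the sales rows are hypothetical. It illustrates the idea only and is not how any particular MPP product is implemented.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

NUM_NODES = 4  # hypothetical number of servers

# Hypothetical sales rows as (customer_id, amount) pairs.
rows = [(f"cust{i}", i * 10) for i in range(1, 21)]


def node_for(key, num_nodes=NUM_NODES):
    """Pick a node from a stable hash of the key."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def partition(all_rows, num_nodes=NUM_NODES):
    """Split the rows so each node stores only its own slice of the data."""
    nodes = [[] for _ in range(num_nodes)]
    for key, amount in all_rows:
        nodes[node_for(key, num_nodes)].append((key, amount))
    return nodes


def local_total(node_rows):
    """Work done independently on one node, using only its local data."""
    return sum(amount for _, amount in node_rows)


nodes = partition(rows)

# Each node computes its partial result in parallel; only the small
# partial totals are combined, so the raw data never has to move.
with ThreadPoolExecutor(max_workers=NUM_NODES) as pool:
    partial_totals = list(pool.map(local_total, nodes))

grand_total = sum(partial_totals)
print(grand_total)
```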

Visually Analyzing Fraud

Image analytics is another emerging field that can help detect frauds. It refers to the
process of analyzing image data with the help of digital processing of the image.
Examples include the use of bar codes and QR codes. Some other examples include
complex solutions such as facial recognition and position-and-movement analysis.
Today, images and videos contribute to 80 percent of unstructured data. Analytical
systems that deal with Big Data are designed to integrate and understand images,
videos, text, and numbers.

Big Data can also help in creating maps and graphs for comparisons that can be
used to analyze situations and make decisions. A graphical analysis, for example,
can help identify the customers, areas, and products that display a high fraud
rate. Big Data can even show comparisons between products and regions, alerting
retailers to where a greater probability of fraud exists. The retailer can then
take appropriate action to mitigate the risk.
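
The numbers behind such a comparison chart could be computed roughly as follows. This is a minimal sketch under assumed inputs: the field names region, product, and is_fraud, the sample transactions, and the 0.4 threshold are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def fraud_rate_by(transactions, field):
    """Return the share of fraudulent transactions for each value of `field`."""
    totals = defaultdict(int)
    frauds = defaultdict(int)
    for tx in transactions:
        totals[tx[field]] += 1
        if tx["is_fraud"]:
            frauds[tx[field]] += 1
    return {key: frauds[key] / totals[key] for key in totals}


def high_risk(rates, threshold):
    """List the groups whose fraud rate meets or exceeds the threshold."""
    return sorted(key for key, rate in rates.items() if rate >= threshold)


if __name__ == "__main__":
    # Hypothetical sample transactions.
    transactions = [
        {"region": "North", "product": "phone", "is_fraud": False},
        {"region": "North", "product": "phone", "is_fraud": True},
        {"region": "North", "product": "tee", "is_fraud": False},
        {"region": "North", "product": "tee", "is_fraud": False},
        {"region": "South", "product": "phone", "is_fraud": True},
        {"region": "South", "product": "tee", "is_fraud": False},
    ]
    by_region = fraud_rate_by(transactions, "region")
    print(by_region)
    print(high_risk(by_region, threshold=0.4))
```

The same function can be called with "product" instead of "region" to compare products, and the resulting rates can be fed to any charting tool.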

Use of Big Data in Retail Industry

Big Data has huge potential for the retail industry as well. Considering the immense
number of transactions and their correlation, the retail industry offers a promising
space for Big Data to operate.

Seemingly simple questions, such as the following, are easy to answer when there is
a single retail location and a small customer base:

How many basic tees did we sell today?

What time of the year do we sell most leggings?

What else has customer X bought, and what kind of coupons can we send to
customer X?

However, with millions of transactions spread across multiple disconnected legacy
systems and IT teams, it becomes very difficult to find answers to such questions.
Business insights into customer behavior and company health can be obtained by
relating the organization's in-store sales to its online sales. It could, however,
be very difficult for a marketing analyst to reconcile the data obtained from
these systems and to understand the health and strength of different types of
products and campaigns. While omni-channel retailing solutions do exist, they
require both store managers and Web developers to learn entirely new systems. The
company-wide training and deployment of these systems would incur huge costs in
terms of time and money.

Many times, extracting data in real time is not feasible because systems run into
scaling issues. Suppose you want to know if a particular item is in stock in
another nearby store. This data cannot be found immediately; it takes some phone
calls or other ways of accessing the information, which prevents the immediate
sale of the item. Even if access to the data is possible, there may not be
anything particularly rich or useful about it. Raw transactional data can only
help a company understand its sales; it does not provide any relationships,
patterns, or other clues for deeper analysis. Also, the fact remains that most of
the Big Data is simply not required and not useful either. Some information in a
Big Data feed can have long-term strategic value, some will be used immediately,
and some will not be used at all. The main part of taming Big Data is to identify
which portions fall into which category.

Use of RFID Data in Retail

The introduction of Radio Frequency Identification (RFID) technology automated the
process of labeling and tracking products, thereby saving significant time, cost,
and effort. Walmart was one of the first retailers to implement RFID for its
merchandise.

RFID technology improves item tracking by distinguishing items that are truly out
of stock from items that are merely missing from the shelves. For instance, if an
item is not available on the shelves, it does not imply that the item is out of
stock altogether. With the help of an RFID reader and a mobile computer, the
inventory can be immediately verified and stocks replenished, if required.
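
The distinction between "not on the shelf" and "out of stock" can be sketched as below. The read format (a list of dictionaries with sku and location fields) and the location names are assumptions made for illustration; a real RFID reader would report tag identifiers that first have to be mapped to products.

```python
def stock_status(reads, sku):
    """Classify an item from the locations where its tags were last read."""
    locations = {read["location"] for read in reads if read["sku"] == sku}
    if "shelf" in locations:
        return "on shelf"
    if locations:
        # Tags were seen, just not on the shelf: replenish from there.
        return "replenish from " + ", ".join(sorted(locations))
    return "out of stock"


if __name__ == "__main__":
    # Hypothetical reads collected by a handheld reader.
    reads = [
        {"sku": "TEE-BASIC", "location": "shelf"},
        {"sku": "LEGGINGS", "location": "backroom"},
    ]
    for sku in ("TEE-BASIC", "LEGGINGS", "SCARF"):
        print(sku, "->", stock_status(reads, sku))
```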

Different types of RFID tags are available for different environments, such as
cardboard boxes and wooden, glass, or metal containers. Tags also come in various
sizes and capabilities, including read and write capability, memory, and power
requirements. They also vary widely in durability. Some varieties are paper-thin,
are typically meant for one-time use, and are called ‘smart labels.’ RFID tags can
also be customized to withstand heat, moisture, acids, and other extreme
conditions. Some RFID tags are also reusable, thus offering a Total Cost of
Ownership (TCO) benefit over bar code labels.

The use of RFIDs saves time, reduces labor, enhances the visibility of products
throughout the production-delivery cycle, and saves costs.

Some common benefits of using RFID are shown in Figure 2.4:

Asset Management:

Organizations can tag all their capital assets, such as pallets, vehicles, and tools, in
order to trace them anytime and from any location. Readers fixed at specific
locations can observe and record all movements of the tagged assets with great
accuracy. This mechanism also works as a security check: it alerts supervisors
and raises an alarm if anyone tries to take an asset outside the authorized
area.
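
The security check described above could be sketched as follows. This is only an illustration: the asset identifiers, zone names, and the check_read function are hypothetical, and a real deployment would receive reads from fixed readers rather than from function calls.

```python
def check_read(asset_id, reader_zone, authorized_zones):
    """Return an alert message if a tagged asset is read outside its zones."""
    allowed = authorized_zones.get(asset_id, set())
    if reader_zone in allowed:
        return None  # Normal movement: record it, no alarm needed.
    return f"ALERT: {asset_id} read in unauthorized zone '{reader_zone}'"


if __name__ == "__main__":
    # Hypothetical mapping of assets to the zones they may be in.
    authorized_zones = {
        "pallet-17": {"warehouse", "loading-dock"},
        "forklift-2": {"warehouse"},
    }
    reads = [
        ("pallet-17", "loading-dock"),
        ("forklift-2", "parking-lot"),
    ]
    for asset_id, zone in reads:
        alert = check_read(asset_id, zone, authorized_zones)
        print(alert or f"ok: {asset_id} in {zone}")
```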

When containers are loaded for shipment, tracking pallets with RFIDs are included in
them. These RFIDs contain records of what is stored in the container. This helps
production managers to have a complete view of the inventory level and location of
containers. This information can be used to locate items and fulfil rush orders
without any waste of time.

Shipping containers, pallets, cylinders, and reusable plastic bottles having RFID tags
can be easily identified at the dock entry as they leave with an outbound
consignment. After the database is matched with the shipping information, the
manufacturers of the products create a log of each shipping container with its
details and develop a procedure for tracking their goods. This information can be
utilized to reduce the time required for documentation and can be of great value in
resolving disputes of lost and damaged goods.

Inventory Control:

One of the primary benefits of using RFID is inventory tracking, especially in areas
where tracking has not been done or was not possible before. RFID tags can be read
even if the contents are packed and are not in the direct line of sight. This means
that an entire pallet with an assortment of goods can be read without disturbing the
arrangement of goods in the pallet. RFID tags can be built to withstand dirt,
moisture, heat, and contaminants. Bar codes, on the other hand, cannot handle
such conditions well and are prone to damage or read errors.

Using an RFID tracking system can result in an optimized inventory level, and thus
reduce the overall cost of stocking and labor. RFID allows manufacturers to track
inventory for raw materials, work in progress, or finished goods. Readers installed
on shelves can update inventory automatically and raise an alarm when restocking
is required.
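
A shelf reader that raises a restocking alarm can be approximated with a simple threshold check. The sketch below assumes the reader yields a count of tags per SKU and that each SKU has a reorder point; both the data and the restock_alerts function are hypothetical.

```python
def restock_alerts(shelf_counts, reorder_points):
    """Return, per SKU, how many units are missing to reach the reorder point."""
    alerts = {}
    for sku, reorder_point in reorder_points.items():
        count = shelf_counts.get(sku, 0)  # No tags read means zero on shelf.
        if count < reorder_point:
            alerts[sku] = reorder_point - count
    return alerts


if __name__ == "__main__":
    # Hypothetical tag counts reported by a shelf reader.
    shelf_counts = {"TEE-BASIC": 12, "LEGGINGS": 2}
    reorder_points = {"TEE-BASIC": 10, "LEGGINGS": 5, "SCARF": 3}
    print(restock_alerts(shelf_counts, reorder_points))
```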

Shipping and Receiving:

RFID tags can also be used to trigger automated shipment tracking applications.
Manufacturers use the readings obtained from these tags for generating a shipment
manifest, which is used for many tasks including:

Printing a shipping document

Recording the shipment automatically in the shipping system

Printing a 2D bar code for a shipping label

Nowadays, Serial Shipping Container Code (SSCC) is widely used in shipping labels.
An SSCC can easily be encoded in an RFID tag to enable automatic handling of the
shipment. The data contained in the RFID tag can be combined with the shipment
information, which the receiving organization can easily read to simplify the
receiving process and eliminate processing delays.
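
An SSCC is an 18-digit GS1 identifier whose last digit is a check digit, so a receiving system can reject a mistyped or misread code before looking up the shipment. The sketch below shows the standard GS1 modulo-10 calculation; the function names and the sample digits are made up for illustration and do not represent a real shipment.

```python
def sscc_check_digit(first_17_digits):
    """Compute the GS1 mod-10 check digit for the first 17 digits of an SSCC."""
    if len(first_17_digits) != 17 or not first_17_digits.isdigit():
        raise ValueError("expected exactly 17 digits")
    total = 0
    # Weights alternate 3, 1, 3, ... starting from the rightmost digit.
    for position, char in enumerate(reversed(first_17_digits)):
        weight = 3 if position % 2 == 0 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def is_valid_sscc(sscc):
    """Check that an 18-digit SSCC ends with the correct check digit."""
    if len(sscc) != 18 or not sscc.isdigit():
        return False
    return sscc_check_digit(sscc[:17]) == int(sscc[17])


if __name__ == "__main__":
    body = "12345678901234567"  # Made-up digits, not a real shipment.
    sscc = body + str(sscc_check_digit(body))
    print(sscc, is_valid_sscc(sscc))
```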

Regulatory Compliance:

Provided the RFID tag that travels with the material has been updated with all the
handling data, the entire custody trail can be produced before regulatory bodies
such as the Food and Drug Administration (FDA), Department of Transportation (DOT),
and Occupational Safety and Health Administration (OSHA), and can be used to meet
other regulatory requirements. This could be of great use for companies that work
with hazardous items, food, pharmaceuticals, and other regulated materials.

A logistics company brings various packages from different locations to a hub.
Thereafter, it separates the urgent ones meant for a morning delivery from the
regular delivery ones. This is where RFID can help in locating these packages or
pallets and loading them for quicker delivery.

Service and Warranty Authorizations:

A warranty card or document to request a warranty service would no longer be
necessary because an RFID tag can hold all this information. Once the repair or
service has been completed, the information can be written to the RFID tag as the
maintenance history, which then always remains with the product. If future
repairs are required, the technician can access this information without
accessing any external database, which helps in reducing calls and time-consuming
enquiries into documents.
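
The idea of a tag that carries its own warranty and maintenance history can be modeled with a small record. This is a sketch only: ProductTag and its fields are hypothetical, real tags have limited memory and their own data layouts, and dates are kept as ISO strings (YYYY-MM-DD) so that plain string comparison orders them correctly.

```python
from dataclasses import dataclass, field


@dataclass
class ProductTag:
    """Hypothetical model of the data written to a product's RFID tag."""
    serial: str
    warranty_expires: str  # ISO date, e.g. "2026-12-31"
    history: list = field(default_factory=list)

    def record_service(self, date, note):
        """Append a completed repair or service to the maintenance history."""
        self.history.append((date, note))

    def under_warranty(self, today):
        """ISO date strings compare in chronological order."""
        return today <= self.warranty_expires


if __name__ == "__main__":
    tag = ProductTag(serial="SN-001", warranty_expires="2026-12-31")
    tag.record_service("2025-03-10", "replaced power supply")
    print(tag.under_warranty("2025-06-01"), tag.history)
```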
