What Is Data Masking?

Data masking is a technique used to protect sensitive information by creating an obfuscated version of the data that cannot be reverse-engineered, ensuring compliance with privacy regulations. It is crucial for enterprises to mitigate risks associated with data exposure, insider threats, and cyberattacks while maintaining data usability. The guide discusses various data masking techniques, approaches, and the importance of implementing data masking in non-production environments to safeguard Personally Identifiable Information (PII).

What is Data Masking?

2023-01-15, 23:23

K2VIEW EBOOK

What is Data Masking?


A Reference Guide
This reference guide answers "What is data masking" by discussing
its benefits, challenges, and a data product approach that overcomes
all the obstacles.

[Figure: sample PII fields (Name, Email) displayed as masked data]

https://www.k2view.com/what-is-data-masking Page 1 of 25

INTRO

Data masking: An imperative for today’s enterprises

With the proliferation of personal data, collected by enterprises across all
industries, the need for protecting individual privacy is paramount. One way
to protect Personally Identifiable Information (PII) is by masking data (e.g.,
consistently changing names, or including only the last 4 digits of a credit
card or Social Security Number).

This reference guide explores today’s data masking techniques, the challenges
they pose for enterprises, and a novel approach, based on data products, that
addresses these challenges in the most comprehensive manner.

Table of Contents

01 What is data masking?
02 Data masking vs other data obfuscation methods
03 Why data masking?
04 Data masking approaches
05 Data masking techniques
06 Data masking challenges
07 Data masking with data products
08 Summary

CHAPTER 01

What is data masking?



Data masking protects sensitive data by creating a version of the data that
can’t be identified or reverse-engineered. It should assure data consistency
and usability across multiple databases.

Data masking substitutes real information with fictitious, but realistic,
values.

The most common types of data that require masking include:

PII: Personally Identifiable Information, in response to privacy regulations
such as GDPR and CPRA
PCI: payment card information, governed by the Payment Card Industry Data
Security Standard (PCI DSS)
PHI: Protected Health Information
IP: Intellectual Property

Data masking best practices call for its use in non-production environments,
such as software development, data science, and testing, that don’t
require the original production data.

Simply defined, data masking combines the processes and tools for making
sensitive data unrecognizable, but usable, by software or authorized
personnel.

CHAPTER 02

Data masking vs other data obfuscation methods

Data obfuscation refers to a variety of processes that transform data into
another form, in order to secure and protect it. The 3 most common data
obfuscation methods are data masking, data encryption, and data
tokenization. While data masking is irreversible, encryption and tokenization
are both reversible, in the sense that the original values can be derived
from the obscured data. Here’s a brief explanation of the 3 methods:

Data masking
Data masking substitutes realistic, but fake, data for the original values, to
ensure data privacy. Development, support, data science, business
intelligence, testing, and training teams use masked data in order to make
use of a dataset without exposing real data to any risk.

There are many techniques for masking data, such as data scrambling, data
blinding, or data shuffling, which will be explained in greater detail later on.
The process of permanently removing all Personally Identifiable Information
(PII) from sensitive data is also known as data anonymization, or data
sanitization. There is no algorithm to recover the original values of masked
data.

Data encryption
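Data encryption uses an algorithm and a key to convert data into unreadable
ciphertext, which can only be restored by someone holding the decryption key.
While data encryption is very secure, data teams can’t analyze or work with
encrypted data. The stronger the encryption algorithm, and the better
protected its keys, the safer the data will be from unauthorized access.
Encryption is ideal for storing or transferring sensitive data securely.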

Data tokenization
Data tokenization, which substitutes a sensitive data element with random
data (token), is a reversible process. The token can be mapped back to the
original data, which is stored in a secure “data vault”.

In a data masking vs tokenization comparison, tokenization supports
operations like processing a credit card payment without revealing the credit
card number. The real data never leaves the organization, and can’t be seen
or decrypted by a third-party processor.

Data tokenization supports compliance with the Payment Card Industry Data
Security Standard (PCI DSS).
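
The reversible nature of tokenization can be illustrated with a minimal Python sketch. The TokenVault class, its method names, and the "tok_" prefix are hypothetical, and the in-memory dictionaries merely stand in for a hardened, access-controlled data vault.

```python
import secrets


class TokenVault:
    """In-memory stand-in for a secure token vault (hypothetical)."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so one value always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)  # random, not derived from the value
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Only a party with vault access can map a token back to the original.
        return self._token_to_value[token]


vault = TokenVault()
token = vault.tokenize("4111111111111111")
original = vault.detokenize(token)
```

Because the token is random, it reveals nothing about the card number; the mapping held in the vault is the only route back to the original value.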

So, what is data masking? It's the most common form of data obfuscation.
The fact that data masking is not reversible makes it more secure, and less
costly, than encryption.

Another big plus is that data masking maintains data integrity across
systems and databases, which is critical in software testing and data
analysis. Minimizing the use of actual data protects an enterprise from
unnecessary risk.

In the case of obfuscated data, integrity means that the dataset maintains
its validity and consistency, despite undergoing data anonymization. For
example, a real credit card number can be replaced by any 16-digit value that
passes checksum validation (for payment cards, the Luhn check). Once a value
has been replaced by a new, anonymized value, the same (new) value must be
used consistently across all systems.
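
As a sketch of what validity and consistency can mean for a card number, the Python example below derives a fake number that passes the Luhn checksum and is always the same for the same input. The keyed-hash (HMAC) derivation, the SECRET_KEY value, and the function names are assumptions made for illustration, not a method described in this guide.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-change-me"  # hypothetical; keep real keys in a secrets manager


def luhn_check_digit(partial: str) -> str:
    """Return the digit that makes partial + digit pass the Luhn check."""
    total = 0
    # Walk right to left; the rightmost digit of `partial` is doubled first.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def luhn_is_valid(number: str) -> bool:
    return luhn_check_digit(number[:-1]) == number[-1]


def mask_card(card_number: str) -> str:
    """Map a card number to a fake, Luhn-valid number of the same length."""
    digest = hmac.new(SECRET_KEY, card_number.encode(), hashlib.sha256).hexdigest()
    body_len = len(card_number) - 1
    # Use the keyed hash as a repeatable source of digits.
    body = str(int(digest, 16)).zfill(body_len)[:body_len]
    return body + luhn_check_digit(body)


masked = mask_card("4111111111111111")
```

The same input and key always yield the same masked number, so every system that masks the value independently arrives at the same result. This sketch does not preserve issuer prefixes; generating a number that matches a card type would need an extra step.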

In short, there are 2 major differences between data masking and other data
obfuscation methods like encryption or tokenization:

1. Masked data remains usable in its obfuscated form.


2. Once data is masked, the original value can’t be recovered.

CHAPTER 03

Why data masking?


Data masking is important to enterprises because it enables them to:

Maintain compliance with privacy laws, like GDPR and CCPA, by
eliminating the risk of sensitive data exposure.
Protect data from cyberattacks, while preserving its usability and
consistency.
Reduce the risk of data sharing, e.g., in the case of cloud migrations, or
when integrating with third-party apps.

While data masking has been around for decades, it is now needed more
than ever to effectively protect sensitive data, and to address the following
challenges:

Regulatory compliance
Highly regulated industries, like financial services and healthcare, already
operate under strict privacy regulations, including the Payment Card Industry
Data Security Standard (PCI DSS), and the Health Insurance Portability and
Accountability Act (HIPAA). Since the introduction of Europe’s GDPR in 2018,
there has been a proliferation of privacy laws across the globe including
CCPA and CPRA in California, LGPD in Brazil, and PDPA in the Philippines and
Singapore. Such privacy laws seek to protect Personally Identifiable
Information (PII), and to restrict access to it whenever possible.

Insider threats
Many employees and third-party contractors access enterprise systems on a
regular basis. Non-production systems are particularly vulnerable, because
sensitive production data is often used in development, testing, and other
pre-production environments. With insider threats rising 47% since 2018,
according to a Ponemon Institute report, containing sensitive data costs
companies an average of more than $200,000 per year.

External threats
In 2020, personal data was compromised in 58% of the data breaches,
states a Verizon report. The study further indicates that in 72% of the cases,
the victims were large enterprises. With the vast volume, variety and velocity
of enterprise data, it is no wonder that breaches proliferate. Taking measures
to protect sensitive data in non-production environments will significantly
reduce the risk.

Data governance
Data masking is commonly used to control data access. While static data
masking obscures a single dataset, dynamic data masking provides more
granular controls. With dynamic data masking, permissions can be granted
or denied at many different levels. Only those with the appropriate access
rights can access the real data. Others will see only the parts that they have
to see.

Flexibility
Data masking is highly customizable. Data teams can choose which data
fields get masked, and how to select and format each substitute value. For
example, every Social Security Number (SSN) has the format xxx-xx-xxxx,
where “x” is a number from 0 to 9. They can substitute the first five digits
with the letter x, or all 9 numbers with other random numbers, according to
their needs.
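
The two SSN options above can be sketched in Python as follows. The function names and the validation pattern are illustrative assumptions; note that the randomized variant keeps the format but does not check whether the result falls within issuable SSN ranges.

```python
import random
import re

SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


def mask_ssn_prefix(ssn: str) -> str:
    """Replace the first five digits with 'x', keeping the last four."""
    if not SSN_PATTERN.fullmatch(ssn):
        raise ValueError("expected format xxx-xx-xxxx")
    return "xxx-xx-" + ssn[-4:]


def randomize_ssn(ssn: str, rng: random.Random) -> str:
    """Replace all nine digits with random digits, keeping the dashes."""
    if not SSN_PATTERN.fullmatch(ssn):
        raise ValueError("expected format xxx-xx-xxxx")
    return "".join(str(rng.randint(0, 9)) if ch.isdigit() else ch for ch in ssn)


partially_masked = mask_ssn_prefix("123-45-6789")   # "xxx-xx-6789"
fully_randomized = randomize_ssn("123-45-6789", random.Random())
```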

CHAPTER 04

Data masking approaches


Over time, a variety of data masking techniques have been devised.
Selecting the right approach is dependent on the intended data use. The goal
is to maximize data protection, while minimizing data exposure.

Static data masking


Non-production environments, such as those used for analytics, testing,
training, and development purposes, often source data from production
systems. In such cases, private data is protected with static data masking, a
one-way transformation ensuring that the masking process cannot be
undone. When it comes to testing and analytics, repeatability is a key
concept because using the same input data delivers the same results. This
requires the masked data values to persist, over time, and through multiple
extractions.

Static data masking is usually employed on a copy of a production database.
It makes data look real enough to permit accurate development, testing, and
training, without exposing the original data.
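
One way to obtain masked values that persist over time and across extractions is to derive each substitute from a keyed hash of the original, so that no lookup table has to be stored. The Python sketch below assumes a hypothetical secret key and a tiny illustrative name list; it is one possible design, not the approach of any particular product.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-change-me"  # hypothetical; rotate and protect in practice
FAKE_FIRST_NAMES = ["Alex", "Blake", "Casey", "Devon", "Emery", "Finley"]  # illustrative


def mask_first_name(real_name: str) -> str:
    """Pick a fake name deterministically from a keyed hash of the real one."""
    digest = hmac.new(SECRET_KEY, real_name.lower().encode(), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(FAKE_FIRST_NAMES)
    return FAKE_FIRST_NAMES[index]


# Mask a copy of the production rows; the originals are left untouched.
production_rows = [{"id": 1, "first_name": "Rick"}, {"id": 2, "first_name": "Dana"}]
masked_copy = [
    {**row, "first_name": mask_first_name(row["first_name"])}
    for row in production_rows
]
```

Running the extraction again produces the same masked names, which keeps test results repeatable. With a list this small, different real names will often share a substitute; a realistic name directory would be far larger.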

Dynamic data masking


Dynamic data masking is used to protect, obscure, or block access to,
sensitive data. While prevalent in production systems, it is also used when
testers or data scientists require real data. Dynamic data masking is
performed in real time, in response to a data request. When the data is
located in multiple source systems, masking consistency is difficult,
especially when dealing with disparate environments, and a wide variety of
technologies. Dynamic data masking protects sensitive data on demand.

Dynamic data masking automatically streams data from a production
environment, to avoid storing the masked data in a separate database. As a
rule, it’s used for role-based security for applications (such as handling
customer queries, or processing sensitive data, like health records) and in
read-only scenarios, so that the masked data doesn’t get written back to the
production system.
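
A minimal Python sketch of role-based, read-only masking applied at request time follows. The roles, field names, and masking rules are hypothetical; the point is that the stored record is never altered and each caller receives a view appropriate to its role.

```python
from typing import Callable, Dict

Record = Dict[str, str]

# Hypothetical policy: the fields each role may see unmasked.
UNMASKED_FIELDS = {
    "physician": {"name", "ssn", "diagnosis"},
    "support_agent": {"name"},
}


def redact(value: str) -> str:
    return "[REDACTED]"


# Hypothetical per-field masking rules; unknown fields fall back to redact().
MASKERS: Dict[str, Callable[[str], str]] = {
    "name": lambda v: v[:1] + "***",
    "ssn": lambda v: "xxx-xx-" + v[-4:],
}


def read_record(record: Record, role: str) -> Record:
    """Return a masked view for the role; the stored record is not modified."""
    allowed = UNMASKED_FIELDS.get(role, set())
    view = {}
    for field, value in record.items():
        if field in allowed:
            view[field] = value
        else:
            view[field] = MASKERS.get(field, redact)(value)
    return view


patient = {"name": "Rick Smith", "ssn": "123-45-6789", "diagnosis": "asthma"}
agent_view = read_record(patient, "support_agent")
```

Unknown roles and unlisted fields are masked by default, so a gap in the policy hides data rather than exposing it.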

On-the-fly data masking

When analytics or test data is extracted from production systems, staging
sites are often used to integrate, cleanse, and transform the data, before
masking it. The masked data is then delivered to the analytics or testing
environment. This multi-stage process is slow, cumbersome, and risky due
to the possible exposure of private data.

On-the-fly data masking is performed on data as it moves from one
environment to another, such as from production to development or test. It’s
ideal for enterprises engaging in continuous software development and
large-scale data integrations. A subset of the masked data is generally
delivered to authorized users upon request, because keeping a backup of all
the masked data is inefficient and impractical.

Statistical data masking


Production data holds statistical information, such as distributions and
aggregate patterns, which statistical data obfuscation techniques can
disguise. Differential privacy is one such technique: it lets you share
information about patterns in a dataset without revealing information about
the actual individuals in the dataset.
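
To make differential privacy more concrete, here is a minimal Python sketch of the Laplace mechanism applied to a counting query. The function names and the sample data are illustrative assumptions; epsilon is the usual privacy parameter, with smaller values adding more noise.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two exponential samples with mean `scale`
    # follows a Laplace distribution centered on 0 with that scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(values, predicate, epsilon: float, rng: random.Random) -> float:
    """Count matching records, then add noise calibrated to sensitivity 1."""
    true_count = sum(1 for v in values if predicate(v))
    # Adding or removing one person changes a count by at most 1,
    # so the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


ages = [23, 35, 41, 52, 67, 70]
noisy = private_count(ages, lambda age: age >= 40, epsilon=1.0, rng=random.Random())
```

Each answer is close to the true count on average, yet no single answer reveals whether a specific individual is in the dataset. Repeated queries consume privacy budget, which a production system would have to track.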

Test data masking


Applications, of any kind, require extensive testing before they can be
released into production. Test data management tools that provision
production data for testing must mask the test data to protect sensitive
information. For example, in a legacy modernization program, the
modernized software components must be tested continuously, making test
data masking a key component in the testing process. Masking data with
referential integrity, from production systems to the test environments, is
critical.

Unstructured data masking


Scanned documents and image files, such as insurance claims, bank
checks, and medical records, contain sensitive data stored as images. Many
different formats (e.g., pdf, png, csv, email, and Office docs) are used daily by
enterprises in their regular interactions with individuals. With the potential for
so much sensitive data to be exposed in unstructured files, the need for
unstructured data masking is obvious.

Masking of unstructured data is particularly important in financial services and healthcare industries

CHAPTER 05

Data masking techniques


There are several techniques associated with data masking, including:

Scrambling
Scrambling randomly orders characters and/or numbers to obscure the
original content. For example, when a shipment with tracking number
572918 in a production environment undergoes character scrambling, it
might read 125879 in a different environment. Although easy to implement,
scrambling can only be used on certain data types, and is not as secure as
other techniques.

Data scrambling assures that the data can’t be easily traced back to its
source.
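
Character scrambling is simple enough to sketch in a few lines of Python. The function name is an assumption, and the tracking number is the one used in the example above.

```python
import random


def scramble(value: str, rng: random.Random) -> str:
    """Randomly reorder the characters of a value."""
    chars = list(value)
    rng.shuffle(chars)
    return "".join(chars)


scrambled = scramble("572918", random.Random())
```

The output always contains exactly the same characters as the input, and can occasionally equal it by chance, which helps explain why scrambling is considered weaker than other techniques.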

Nullifying
Nullifying applies a null value to a data column so that unauthorized users
won’t be able to see the actual data in it. Despite its ease of implementation,
nullifying results in data with less integrity, which is often problematic in
development and testing environments.

Substitution
Substitution, which replaces the original data with another value, is one of
the most effective data masking techniques because it preserves the
original nature of the data. Although difficult to execute, substitution can be
applied to several types of data, and is excellent protection against data
breaches.
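
A minimal Python sketch of substitution, assuming small illustrative lookup lists and a hypothetical "name" column: each real name is replaced by a realistic fake one, and a repeated real name receives the same substitute within a run.

```python
import random

# Illustrative lookup lists; real deployments use much larger directories.
FAKE_FIRST_NAMES = ["Sam", "Alex", "Jordan", "Taylor"]
FAKE_LAST_NAMES = ["Jones", "Lee", "Garcia", "Patel"]


def substitute_name(rng: random.Random) -> str:
    return f"{rng.choice(FAKE_FIRST_NAMES)} {rng.choice(FAKE_LAST_NAMES)}"


def substitute_column(rows, rng: random.Random):
    """Return copies of the rows with the 'name' column substituted."""
    mapping = {}  # real name -> substitute, so repeats stay consistent
    masked = []
    for row in rows:
        real = row["name"]
        if real not in mapping:
            mapping[real] = substitute_name(rng)
        masked.append({**row, "name": mapping[real]})
    return masked


orders = [
    {"order_id": 1, "name": "Rick Smith"},
    {"order_id": 2, "name": "Dana White"},
    {"order_id": 3, "name": "Rick Smith"},
]
masked_orders = substitute_column(orders, random.Random())
```

The mapping lives only for the duration of the call. If it were stored, it would become a route back to the original values and would need the same protection as the data itself.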

Shuffling
Similar to substitution, shuffling takes its replacement values from the same
data column that is being masked, randomly reordering them across records.
For example, when patient name columns are shuffled across multiple patient
records, the results look accurate but don’t reveal any personal medical
information. However, anyone with access to the shuffling algorithm can
reverse-engineer the process.
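
Column shuffling can be sketched in Python as below. The record layout and function name are assumptions; only the chosen column is reordered, while every other field stays with its original row.

```python
import random


def shuffle_column(rows, column: str, rng: random.Random):
    """Return copies of the rows with one column's values randomly reassigned."""
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [{**row, column: value} for row, value in zip(rows, values)]


patients = [
    {"name": "Rick Smith", "diagnosis": "asthma"},
    {"name": "Dana White", "diagnosis": "diabetes"},
    {"name": "Lee Chan", "diagnosis": "migraine"},
]
shuffled = shuffle_column(patients, "name", random.Random())
```

Every name in the output is still a real name from the dataset, so shuffling suits cases where the set of values is not itself sensitive, only its link to the other columns. With a fixed seed, the permutation is reproducible and therefore reversible, which reflects the weakness noted above.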

Date/number variance
Date/number variance is used for masking important financial and
transaction date information. For example, masking the employee salaries
column by applying a random variance to each salary hides the actual values,
while keeping them within a realistic range between the highest- and
lowest-paid employees. Data integrity can be assured by applying a variance
of, say, +/- 5% to all salaries in the dataset.
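
A short Python sketch of number variance, assuming a hypothetical salary list: each value is moved by a random percentage within the chosen bound, here +/- 5%.

```python
import random


def apply_variance(amount: float, max_pct: float, rng: random.Random) -> float:
    """Shift an amount by a random percentage within +/- max_pct."""
    factor = 1 + rng.uniform(-max_pct, max_pct) / 100
    return round(amount * factor, 2)


salaries = [52000.00, 78500.00, 120000.00]
rng = random.Random()
masked_salaries = [apply_variance(s, 5, rng) for s in salaries]
```

Totals and averages computed on the masked column stay close to the real ones, which keeps the data useful for testing and analysis while hiding individual values.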

Date aging
Date aging increases or decreases a date field based on a pre-defined data
masking policy, within a specific date range. For example, decreasing the
date of birth field by 1,000 days would change the date 1-January-2023 to
6-April-2020.
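
Date aging maps directly onto date arithmetic. The Python sketch below uses the standard library, with an illustrative function name, and shifts 1 January 2023 back by 1,000 days.

```python
from datetime import date, timedelta


def age_date(value: date, days: int) -> date:
    """Shift a date by a fixed number of days (negative moves it earlier)."""
    return value + timedelta(days=days)


aged = age_date(date(2023, 1, 1), -1000)
print(aged.isoformat())  # 2020-04-06
```

Applying the same offset to every date in a record preserves the intervals between events, such as the gap between an order date and a delivery date.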


CHAPTER 06

Data masking challenges


To effectively answer the question "What is data masking?", the following
challenge must be addressed: not only must the altered data retain the
basic characteristics of the original data, it must also be transformed
enough to eliminate the risk of exposure, while retaining data integrity.

Enterprise IT landscapes typically have many production systems that are
deployed on premises and in the cloud, across a wide variety of
technologies. To mask data effectively, an organization needs to:

1. Identify the sensitive data and PII that require protection
2. Resolve identities to ensure data integrity across systems. For
example, if Rick Smith is masked as Sam Jones, that identity must be
consistent wherever it is used
3. Comply with company governance policies for role, location, and
permissions-based data access
4. Scale for real-time access and mass-batch data extraction
5. Manage growing volumes of unstructured data
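
The first step in the list, identifying sensitive data, is often approached with pattern-based scanning. The Python sketch below is a simplified illustration with two hypothetical patterns; real discovery tools combine far richer rules with metadata and data profiling.

```python
import re

# Hypothetical patterns covering just two kinds of PII.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def find_pii_columns(rows):
    """Return {column: set of PII kinds} for columns whose values match."""
    findings = {}
    for row in rows:
        for column, value in row.items():
            for kind, pattern in PII_PATTERNS.items():
                if pattern.search(str(value)):
                    findings.setdefault(column, set()).add(kind)
    return findings


sample_rows = [
    {"id": 1, "contact": "rick@example.com", "tax_id": "123-45-6789", "note": "called twice"},
    {"id": 2, "contact": "n/a", "tax_id": "987-65-4321", "note": "prefers email"},
]
pii_columns = find_pii_columns(sample_rows)
```

The result lists the columns that need a masking rule. Scanning values rather than column names catches sensitive data stored in unexpectedly named fields.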


CHAPTER 07

Data masking with data products


Enterprise data masking is powered by the Data Product Platform, which
ingests, organizes, processes, and delivers data from disparate systems by
business entities (customer, order, device, or anything else that’s important
to the business). Data masking is applied on the fly, in the context of the
business entity, and safely delivered to authorized data consumers.

By masking the data for a particular business entity as a singular unit,
regardless of the underlying source systems and their technologies, we can
ensure that the referential integrity of the masked data is maintained, and
that the masked data for that business entity is always consistent, and
complete.

A data product that is created and managed in the platform unifies
everything a company knows about a business entity, including all
interactions, transactions, and master data. The data for each specific
business entity is managed in its own Micro-Database, which is encrypted by
its own 256-bit encryption key. And the PII data in the Micro-Database is
masked in-flight, according to predefined business rules.

Enterprise data masking supports dynamic data masking for operational use
cases, like Customer 360, and static data masking for test data
management and legacy application modernization.


Benefits of data products


Enterprise data masking tools eliminate the need for slow, cumbersome,
and risk-prone staging areas, where unmasked data is exposed to potential
breaches.

Using a no-code data orchestration tool, data from multiple production
systems is integrated, cleansed, and masked on the fly.

A data product approach to data masking simplifies complexity, ensuring
that an individual’s customer data, which is fragmented across multiple
sources, is:

Consistent, across multiple sources
Persistent, over time and multiple extractions
Preserved, with referential integrity and formatting

Dynamic data masking


Enterprise dynamic data masking transforms, obscures, or blocks access to
sensitive information fields based on user roles and testing environment
privileges.

And with data orchestration, a wide variety of in-line masking functions can
be invoked to protect the data.


Unstructured data masking


Protect unstructured data including images, PDFs, XML, CSV, text-based
files, and more, with static and dynamic masking capabilities. Unstructured
data masking lets you:

Replace sensitive photos with fake alternatives
Use OCR to detect content and enable intelligent masking
Employ synthetic data generation, to create digital versions of receipts,
checks, contracts and other items for testing purposes.

By managing unstructured data within a data product schema,
referential integrity and consistency are ensured.

Extensive and extendible masking functions


Enterprise data masking comes with a comprehensive library of prebuilt
masking functions, designed to provide realistic, but fake, data.

The table below highlights a few examples, including masking that creates a
valid social security number (SSN), selecting (masked) names from name
directories, as well as generating random numbers, and address-based zip
codes. The library can be easily extended by custom Java functions that
implement additional masking functions.

Field                           Masking function

SSN/National ID                 Generate valid SSN
Credit card                     Generate valid number based on card type
First name/Last name/Zip code   Select from collection
DOB                             Shuffle (preserve statistical diversity)
Any string/number               Random string/number
Email                           Concatenation based on new first and last name
Const                           Static masking based on a pre-provided value
Address                         Based on the provided Zip
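
To show how several of these per-field rules can work together on one record, here is a Python sketch. It is not the platform's Java function library; the field names, collections, and the mask_customer function are all hypothetical.

```python
import random

FIRST_NAMES = ["Sam", "Alex", "Jordan"]   # illustrative collections
LAST_NAMES = ["Jones", "Lee", "Garcia"]


def mask_customer(record: dict, rng: random.Random) -> dict:
    """Apply one masking rule per field and return a masked copy."""
    first = rng.choice(FIRST_NAMES)   # select from collection
    last = rng.choice(LAST_NAMES)     # select from collection
    return {
        **record,
        "first_name": first,
        "last_name": last,
        # Email is concatenated from the new first and last name.
        "email": f"{first}.{last}@example.com".lower(),
        "account_note": "MASKED",                 # constant, pre-provided value
        "loyalty_points": rng.randint(0, 9999),   # random number
    }


customer = {
    "customer_id": 42,
    "first_name": "Rick",
    "last_name": "Smith",
    "email": "rick.smith@corp.example",
    "account_note": "VIP, call after 5pm",
    "loyalty_points": 1250,
}
masked_customer = mask_customer(customer, random.Random())
```

Deriving the email from the already-masked names keeps the fields of a record mutually consistent, which a set of independent per-field functions would not guarantee.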

CHAPTER 08

Summary
Data masking has become a pillar technology that global enterprises use to
comply with privacy protection regulations.

Although the practice of masking data has been around for years, the sheer
volume of data – structured and unstructured – and the ever-changing
regulatory environment, have increased the complexity of data masking at
enterprise scale.

The offerings of the current data masking vendors are proving to be
insufficient. However, a new approach, based on data products, is setting the
data masking standard at some of the world’s largest organizations.

© Copyright 2023 K2View
