Data Analysis Terms: A to Z Glossary
Written by Coursera • Updated on Aug 23, 2023
Common data analysis terms to know for certification prep,
interviewing, and resume writing.
Data analysis is the process of working with data to derive useful
information, which can then be used to make data-informed
decisions. Data analysis is generally a six-step process: ask a
question, prepare your raw data sets, process your data for analysis,
analyze your data, share your results, and act in accordance with your
data.
Data analysts are data professionals who gather, clean, study, or
interpret data in order to solve business problems. They tend to work
alongside other data analytics professionals, such as data scientists
and data engineers.
This beginner-friendly data analysis glossary can be a useful
reference if you are launching a new career in data or looking to
enhance your data skills.
professional certificate
Google Data Analytics
This is your path to a career in data analytics. In this program, you’ll
learn in-demand skills that will have you job-ready in less than 6
months. No degree or experience required.
4.8
(125,034 ratings)
0 already enrolled
BEGINNER level
Average time: 6 month(s)
Learn at your own pace
Skills you'll build:
Data Analysis, SQL, Business Communication, Spreadsheet Software,
Business Analysis, Data Visualization, Data Management, General
Statistics
Data analysis terms
You’ll find common data analysis terms in the glossary below.
Attribute
When working in a spreadsheet or database, an attribute is a common
descriptor used to label a column. Labeling columns clearly and
precisely can enable you to keep your data organized and ready for
analysis.
Changelog
A changelog is a list documenting all of the steps you took when
working with your data. This can be helpful in the event that you need
to return to your original data or recall how you prepared your data for
analysis.
Clean data
Clean data is data that is accurate, complete, and ready for analysis.
Data cleaning, an important step in the data analysis process,
involves checking your data for inaccuracies, inconsistencies,
irregularities, and biases.
CSV (comma-separated values) file
A CSV file is a text file that separates pieces of data with commas.
This is a common file type when downloading data files for analysis,
as it tends to be compatible with common spreadsheet and database
software.
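As a minimal sketch of what reading a CSV file can look like, the Python example below parses a small, made-up CSV string with the standard library's csv module. The column names and values are hypothetical; in practice the text would come from a downloaded file.

```python
import csv
import io

# Hypothetical CSV content, standing in for a downloaded file.
csv_text = "product,units_sold\nnotebook,12\npen,40\n"

# csv.DictReader treats the first row as the column names (attributes).
rows = list(csv.DictReader(io.StringIO(csv_text)))

for row in rows:
    # Every value is read as text; convert to numbers before doing math.
    print(row["product"], int(row["units_sold"]))
```

Note that each value arrives as text, so numeric columns usually need to be converted before analysis.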
Dashboard
A dashboard is a tool used to monitor and display live data.
Dashboards are typically connected to databases and feature
visualizations that automatically update to reflect the most current
data in the database.
Data analytics
Data analytics is the collection, transformation, and organization of
data in order to draw conclusions, make predictions, and drive
informed decision making. Data analytics encompasses data analysis
(the process of deriving information from data), data science (using
data to theorize and forecast) and data engineering (building data
systems). Data analysts, data scientists, and data engineers are all
data analytics professionals.
There are four key types of data analytics:
Descriptive analytics tell us what happened
Diagnostic analytics tell us why something happened
Predictive analytics tell us what will likely happen in the future
Prescriptive analytics tell us how to act
Data architecture
Data architecture, also called data design, is the plan for an
organization’s data management system. This can include all
touchpoints in the data lifecycle, including how the data is gathered,
organized, utilized, and discarded. Data architects design the
blueprints that organizations use for their data management systems.
Data cleaning
Data cleaning, cleansing, or scrubbing is the process of preparing raw
data for analysis. When cleaning your data, you verify that your data is
accurate, complete, consistent, and unbiased. It’s important to make
sure you have clean data prior to analysis because unclean or dirty
data can lead to inaccurate conclusions and misguided business
decisions.
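One possible cleaning pass, sketched in Python on made-up records: trim stray whitespace, standardize capitalization, drop incomplete rows, and remove duplicates. The field names and the choice to drop (rather than repair) incomplete rows are assumptions for illustration; the right rules depend on your data.

```python
# Hypothetical raw records with common problems.
raw_rows = [
    {"name": " Ana ", "age": "34"},   # stray whitespace
    {"name": "Ben", "age": ""},       # missing age (incomplete)
    {"name": "ana", "age": "34"},     # duplicate of the first row once normalized
    {"name": "Cy", "age": "29"},
]

def clean(rows):
    """Return rows with normalized names, numeric ages, and no duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        name = row["name"].strip().title()  # consistent spacing and capitalization
        age = row["age"].strip()
        if not name or not age.isdigit():   # skip incomplete or invalid rows
            continue
        key = (name, int(age))
        if key in seen:                     # skip duplicates
            continue
        seen.add(key)
        cleaned.append({"name": name, "age": int(age)})
    return cleaned

clean_rows = clean(raw_rows)
print(clean_rows)
```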
Data engineering
Data engineering is the process of making data accessible for
analysis. Data engineers build systems that collect, manage, and
convert raw data into usable information. Some common tasks
include developing algorithms to transform data into a more useful
form, building database pipeline architectures, and creating new data
analysis tools.
Data enrichment
Data enrichment is the process of adding data to your existing
dataset. You’d typically enrich your data during the data
transformation process as you are getting ready to begin your
analysis if you realize you need additional data in order to answer your
business question.
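As a small illustration, the Python sketch below enriches a made-up list of orders with a region looked up from a second, equally hypothetical dataset. The field names and the "Unknown" fallback are assumptions, not a prescribed method.

```python
# Existing dataset (hypothetical): orders with only a ZIP code.
orders = [
    {"order_id": 1, "zip_code": "10001"},
    {"order_id": 2, "zip_code": "94105"},
    {"order_id": 3, "zip_code": "00000"},
]

# Additional data (hypothetical): a lookup from ZIP code to sales region.
zip_to_region = {"10001": "Northeast", "94105": "West"}

# Build new records with the extra "region" field; the originals are left unchanged.
enriched = [
    {**order, "region": zip_to_region.get(order["zip_code"], "Unknown")}
    for order in orders
]
print(enriched)
```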
Data governance
Data governance is the formal plan for the way an organization
manages company data. Data governance encompasses rules for the
way data is accessed and used, and can include accountability and
compliance rules.
Data integrity
Data integrity encompasses the accuracy, reliability, and consistency
of data over time. It involves maintaining the quality and reliability of
data by implementing safeguards against unauthorized modifications,
errors, or data loss.
Data mining
Data mining is closely examining data to identify patterns and glean
insights. Data mining is a central aspect of data analytics; the insights
you find during the mining process will inform your business
recommendations.
Data science
Data science is the scientific study of data. Data scientists ask
questions and find ways to answer those questions with data. They
may work on capturing data, transforming raw data into a usable
form, analyzing data, and creating predictive models.
Data source
A data source refers to the origin of a specific set of information. As
businesses increasingly generate data year over year, data analysts
rely on different data sources to measure business success and offer
strategic recommendations.
Data visualization
Data visualization is the representation of information and data using
charts, graphs, maps, and other visual tools. With strong data
visualizations, you can foster storytelling, make your data accessible
to a wider audience, identify patterns and relationships, and explore
your data further.
Data wrangling
Data wrangling, also called data munging or data remediation, is the
process of converting raw data into a usable form. There are four
stages of the munging process: discovery, data transformation, data
validation, and publishing. The data transformation stage can be
broken down further into tasks like data structuring, data
normalization or denormalization, data cleaning, and data enrichment.
Database
A database is an organized collection of information that can be
searched, sorted, and updated. This data is often stored electronically
in a computer system called a database management system
(DBMS). Oftentimes, you’ll need to use a programming language, such
as Structured Query Language (SQL), to interact with your database.
Metadata
Metadata is data about data. It describes various characteristics of
your data, such as how it was collected, where it’s stored, its file type,
or creation date. Metadata can be particularly useful for verification
and tracking purposes.
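To make this concrete, the Python sketch below writes a small temporary file and reads some of its file-system metadata. The file contents are made up. Because creation time is not reported consistently across operating systems, the example uses the last-modified time instead.

```python
import os
import tempfile
from datetime import datetime, timezone

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as f:
    f.write(b"id,value\n1,10\n")
    path = f.name

info = os.stat(path)  # file-system metadata for the file
metadata = {
    "file_type": os.path.splitext(path)[1],  # file extension
    "size_bytes": info.st_size,
    "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
}
print(metadata)

os.remove(path)  # clean up the temporary file
```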
Open data
Open data, also called public data, is data that is available for anyone
to use. Exploring and analyzing open datasets is one way to practice
data analysis skills.
Qualitative data
Qualitative data is data that describes qualities or characteristics. It’s
generally non-numeric data and can be subjective, for example eye
color or emotions.
Quantitative data
Quantitative data is objective data with a specific numeric value. It’s
generally something you can count or measure, such as height or
speed.
Query
A query is a request for information. It’s essentially the question you
ask a database in order to return the data you want to retrieve. In data
analytics, you’ll formulate your database queries using a query
language, such as Structured Query Language (SQL).
Relational database
A relational database is a database that contains several tables with
related information. Even though data is stored in separate tables, you
can access related data across several tables with a single query. For
example, a relational database may have one table for inventory and
another table for customer orders. When you look up a specific
product in your relational database, you can retrieve both inventory
and customer order information at the same time.
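The inventory and orders example can be sketched with Python's built-in sqlite3 module and an in-memory database. The table names, column names, and values below are hypothetical; the point is that one SQL query with a JOIN returns related data from both tables.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (
    product_id INTEGER PRIMARY KEY,
    name       TEXT,
    in_stock   INTEGER
);
CREATE TABLE orders (
    order_id   INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES inventory(product_id),
    quantity   INTEGER
);
INSERT INTO inventory VALUES (1, 'notebook', 50), (2, 'pen', 200);
INSERT INTO orders VALUES (100, 1, 3), (101, 1, 2), (102, 2, 10);
""")

# A single query that combines inventory and order information for one product.
query = """
SELECT i.name, i.in_stock, SUM(o.quantity) AS total_ordered
FROM inventory AS i
JOIN orders AS o ON o.product_id = i.product_id
WHERE i.name = ?
GROUP BY i.product_id, i.name, i.in_stock
"""
result = conn.execute(query, ("notebook",)).fetchone()
print(result)  # ('notebook', 50, 5)

conn.close()
```

The shared product_id column is what relates the two tables; the JOIN clause tells the database how to match rows across them.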
Structured data
Structured data is formatted data, for example data that is organized
into rows and columns. Structured data is more readily analyzed than
unstructured data because of its tidy formatting.
Structured Query Language (SQL)
Structured Query Language, or SQL (pronounced “sequel”), is a
computer programming language used to manage relational
databases. It’s among the most common languages for database
management.
Unstructured data
Unstructured data is data that is not organized in any apparent way. In
order to analyze unstructured data, you’ll typically need to implement
some type of organization.
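As one illustration of imposing organization, the Python sketch below uses a regular expression to pull order details out of free-text notes and store them as rows with named fields. The notes, the pattern, and the field names are all made up, and real unstructured data usually needs more robust handling.

```python
import re

# Hypothetical free-text notes (unstructured).
notes = [
    "Order 1042 shipped to Lisbon on 2023-08-01.",
    "Customer called about a late delivery.",
    "Order 1043 shipped to Osaka on 2023-08-03.",
]

# Pattern capturing an order number, a one-word city, and an ISO-style date.
pattern = re.compile(r"Order (\d+) shipped to (\w+) on (\d{4}-\d{2}-\d{2})")

structured = []
for note in notes:
    match = pattern.search(note)
    if match:  # notes that do not fit the pattern are skipped
        order_id, city, ship_date = match.groups()
        structured.append(
            {"order_id": int(order_id), "city": city, "ship_date": ship_date}
        )

print(structured)
```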
Explore further
Learn more about data analysis from industry leaders on Coursera.
Strengthen your data analysis skills with Google's Data Analytics
Professional Certificate.