Introduction
A data source can be the original site where data is created or
where physical information is first digitized. However, even the
most refined data may serve as a source, as long as another process
accesses and uses it. A data source can be a database, a flat file,
real-time measurements from physical equipment, scraped web data,
or any of the numerous static and streaming data providers
available on the internet.
Data collectors gather data for a variety of purposes, but the
primary goal of data collection is to put a researcher in a position
to make predictions about future probabilities and trends.
Primary and secondary data are the two types of data collected. The
former is gathered by a researcher from first-hand sources, whereas
the latter is gathered by someone other than the user.
Based on Nature
Primary Data Sources
Data sources that provide primary data are known as primary data
sources, and information gathered directly from first-hand
experience is referred to as primary data. This is the information
you collect for the purpose of a particular research endeavour.
Primary data gathering is a straightforward method suited to a
company’s particular requirements. It’s a time-consuming
procedure, but it provides valuable first-hand knowledge in many
business situations.
Examples include census data collected by the government and stock
prices taken directly from the stock market.
Secondary Data Sources
These data sources provide secondary data: data that has previously
been gathered for another purpose but is relevant to your
investigation, and that was collected by someone other than the team
that needs it.
Secondary data is, in effect, secondhand information; it is called
secondary because it is not being used for the first time.
Secondary data sources contribute to the interpretation and analysis
of primary data. They may describe primary materials in depth and
frequently draw on them to support a particular thesis or point of view.
Identifying and Gathering Data
The first step in identifying data is deciding what information needs
to be gathered, which is determined by the goal we want to achieve.
Once we have identified the data, we need to choose the sources from
which we will extract the essential information and create a data
collection strategy. At this step, we decide on the timeframe over
which we want to collect the data, as well as how much data is
required to arrive at a viable analysis.
Data sources can be internal or external to the company, and they
can be primary, secondary, or third-party, depending on whether we
are getting data directly from the source, accessing it from external
data sources, or buying it from data aggregators.
Databases
Databases, the web, social media, interactive platforms, sensor
devices, data exchanges, surveys, and observation studies are some
of the data sources we may be using. Data from diverse data
sources is recognized and acquired, then merged using a range of
tools and methodologies to create a unified interface for querying
and manipulating data. The data we identify, the source of that
data, and the methods we use to collect it all raise quality,
security, and privacy concerns that must be considered at this stage.
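As a minimal sketch of combining data from two different sources
into one unified table, the snippet below uses pandas; the records,
column names, and sources are invented for the example, and a real
pipeline would read from files, databases, or APIs rather than
in-memory lists.

    import pandas as pd

    # Records standing in for two different sources: a transaction
    # system and an in-app survey (both invented for this sketch).
    sales = pd.DataFrame([
        {"customer_id": 1, "amount": 120.0},
        {"customer_id": 2, "amount": 80.0},
    ])
    survey = pd.DataFrame([
        {"customer_id": 1, "satisfaction": 4},
        {"customer_id": 2, "satisfaction": 5},
    ])

    # Merge on a shared key to create one unified view that can be
    # queried and manipulated as a single table.
    unified = sales.merge(survey, on="customer_id", how="left")
    print(unified)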
Relational database systems, such as SQL Server, Oracle, MySQL, and
IBM DB2, store data in an organized, structured manner. Data from
databases and data warehouses can be used as a source for analysis.
For example, data from a retail transaction system can be used to
analyze sales in different regions, while data from a customer
relationship management system can be used to forecast sales. There
are also publicly and privately available datasets outside of the
organization.
SQL stands for Structured Query Language, and it is a querying
language for extracting data from relational databases. SQL
provides simple commands for specifying which data should be
retrieved from the database and from which table, grouping records
with matching values, dictating the order in which query results
are displayed, and limiting the number of results the query can
return, among a variety of other features and functionalities.
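As a minimal sketch of these features, the query below runs against
a throwaway in-memory database using Python's built-in sqlite3
module; the table and column names are hypothetical.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("North", 120.0), ("North", 80.0), ("South", 200.0), ("East", 50.0)],
    )

    # SELECT names the columns, FROM names the table, GROUP BY groups
    # records with matching values, ORDER BY dictates display order,
    # and LIMIT caps how many rows the query returns.
    query = """
        SELECT region, SUM(amount) AS total_sales
        FROM sales
        GROUP BY region
        ORDER BY total_sales DESC
        LIMIT 2
    """
    for row in conn.execute(query):
        print(row)  # e.g. ('North', 200.0)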
APIs
APIs, or Application Programming Interfaces, and web services are
provided by many data providers and websites, allowing various
users or programmes to communicate with and access data for
processing or analysis. APIs and web services typically listen for
incoming requests from users or applications, which might be in the
form of web requests or network requests, and return data as plain
text, XML, HTML, JSON, or media files.
[Image source: https://www.browserstack.com/blog/diving-into-the-world-of-apis/]
APIs are widely used to retrieve data from a range of data sources.
Applications that need data use an API to access an end-point that
contains the data; databases, web services, and data marketplaces
are all examples of end-points. APIs are also used to validate data:
a data analyst might use an API to validate postal addresses and zip
codes, for example.
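As a minimal sketch, the snippet below requests data from a
hypothetical address-validation end-point with the widely used
requests library; the URL, path, and response fields are invented,
so substitute a real provider's documented end-point and any
required API key.

    import requests

    # Hypothetical end-point; real providers document their own URLs,
    # parameters, and authentication requirements.
    response = requests.get("https://api.example.com/v1/zipcodes/90210",
                            timeout=10)
    response.raise_for_status()   # fail loudly on HTTP errors

    data = response.json()        # many APIs return JSON
    print(data.get("city"), data.get("state"))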
Web Scraping
Web scraping is a technique for obtaining meaningful data from
unstructured sources. Also known as screen scraping, web harvesting,
and web data extraction, it allows you to retrieve particular data
from websites based on predefined parameters.
[Image source: https://www.toptal.com/python/web-scraping-with-python]
Web scrapers may harvest text, contact information, photos, videos,
product items, and other information from a website. Web scraping
is commonly used for a variety of purposes, including gathering
product details from retailers, manufacturers, and eCommerce
websites to provide price comparisons, generating sales leads from
public data sources, extracting data from posts and authors on
various forums and communities, and gathering training and testing
datasets for machine learning models. BeautifulSoup, Scrapy,
Pandas, and Selenium are some prominent web scraping tools.
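As a minimal sketch with two of the tools just mentioned, the
snippet below fetches a page with requests and extracts text with
BeautifulSoup; the URL and the CSS selector are hypothetical, and a
real scraper should also respect the target site's robots.txt and
terms of service.

    import requests
    from bs4 import BeautifulSoup

    html = requests.get("https://example.com/products", timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    # Retrieve particular data based on predefined parameters: here,
    # the text of every element matching a hypothetical selector.
    for tag in soup.select("span.product-name"):
        print(tag.get_text(strip=True))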
Data Streams
Data streams are another popular method for collecting continuous
streams of data from sources such as instruments, IoT devices and
apps, GPS data from cars, computer programmes, websites, and
social media posts. This information is often timestamped and
geotagged for geographic identification. Stock and market tickers
for financial trading, retail transaction streams for projecting
demand and supply chain management, surveillance and video
feeds for danger detection, social media feeds for emotion research,
and so on are examples of data streams and how they may be
utilized.
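As a minimal sketch of the idea, the snippet below simulates a
timestamped, geotagged sensor stream with a Python generator and
consumes it with a running average; a real deployment would read
from a broker or device SDK (for example Kafka or MQTT), and all
readings here are invented.

    import random
    import time
    from datetime import datetime, timezone

    def sensor_stream():
        """Yield timestamped, geotagged readings indefinitely."""
        while True:
            yield {
                "ts": datetime.now(timezone.utc).isoformat(),
                "lat": 40.7128, "lon": -74.0060,      # fixed geotag for the sketch
                "value": random.uniform(20.0, 25.0),  # e.g. a temperature reading
            }
            time.sleep(0.1)

    # Consume the stream, keeping a running average of the readings.
    total, count = 0.0, 0
    for reading in sensor_stream():
        total += reading["value"]
        count += 1
        print(reading["ts"], round(total / count, 2))
        if count >= 5:   # stop the demo after five readings
            break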