Introduction to Data Science
Unit 1
What is Data Science
⚫ Data science – getting insight from data.
⚫ Data Science is a blend of various tools, algorithms, and machine learning principles with the goal of discovering hidden patterns in raw data.
⚫ This aspect of data science is all about uncovering findings from data: diving in at a granular level to mine and understand complex behaviors, trends, and inferences. It's about surfacing hidden insight that can help companies make smarter business decisions.
⚫ "Data science is the discipline of making data useful."


Data All Around
⚫Lots of data is being collected and
warehoused
⚫Web data, e-commerce
⚫Financial transactions, bank/credit
transactions
⚫Online trading and purchasing
⚫Social Network
Types of Data We Have
⚫Relational Data (Tables/Transaction/Legacy
Data)
⚫Text Data (Web)
⚫Semi-structured Data (XML)
⚫Graph Data
⚫Social Network, Semantic Web (RDF), …
⚫Streaming Data
Big Data and Data Science
⚫ Over the last decade there's been a massive explosion in both the data generated and retained by companies. Sometimes we call this "big data," and we'd like to build something with it.
⚫ Data scientists are the people who make sense of all this data and figure out just what can be done with it.
⚫ Data Scientist has been called the "sexiest job title of the 21st century."
Data Science: A Multidisciplinary Field
⚫ Data science is an extension of various data analysis fields such as data mining, statistics, predictive analysis and many more.
⚫ Data Science is a huge field that uses a lot of methods and concepts which belong to other fields like information science, statistics, mathematics, and computer science.
⚫ Some of the techniques utilized in Data Science encompass machine learning, visualization, pattern recognition, probability models, data engineering, signal processing, etc.
A Brief History of Data Science
⚫ The term “Data Science” has emerged only recently to
specifically designate a new profession that is expected to
make sense of the vast stores of big data.
⚫ But making sense of data has a long history and has been
discussed by scientists, statisticians, librarians, computer
scientists and others for years. The following timeline traces
the evolution of the term “Data Science” and its use, attempts
to define it, and related terms.
⚫ Early Beginnings (1960s-1970s):
The roots of data science trace back to the world of statistics,
a discipline with a history dating back centuries. In the 1960s
and 1970s, as computers began to emerge, statisticians and
computer scientists started exploring ways to analyze and
visualize data using these early computing machines.
A Brief History of Data Science
⚫ Emergence of Data Mining (1980s-1990s):
Fast forward to the 1980s and 1990s, and we see the birth of data
mining as a distinct field. Researchers began developing algorithms to
uncover meaningful patterns and insights buried within vast datasets.
They started working on classification, clustering, and association rule
mining.
⚫ Growth of Data Warehousing (1990s):
Around the same time, data warehousing technologies became more
prominent. These systems allowed organizations to centralize and
manage large volumes of data effectively.

⚫ Rise of Machine Learning (1990s-Present):
Machine learning came into the limelight in the 1990s, with significant
progress in algorithms and techniques. This paved the way for
predictive modeling and data-driven decision-making. Advances in
computational power and the availability of labeled data fueled the
growth of machine learning.
A Brief History of Data Science
⚫ Big Data Era (2000s-Present):
The 2000s ushered in the era of big data, marked by the explosive
growth of data thanks to the internet, social media, and sensors.
Technologies like Hadoop and MapReduce were developed to
process and analyze these massive datasets.
⚫ Data Science as a Discipline (2000s-Present):
In the early 2000s, the term “data science” gained popularity as a
way to describe this interdisciplinary field that combines statistics,
computer science, domain expertise, and data analysis. Universities
and organizations started offering data science programs and
certifications.
⚫ Data Science in Industry (2010s-Present):
Data science became indispensable across various industries like
finance, healthcare, marketing, and e-commerce. Companies like
Google, Facebook, and Amazon played pivotal roles in shaping the
field and its applications.
A Brief History of Data Science
⚫ Tools and Frameworks (2010s-Present):
The open-source ecosystem for data science tools and
frameworks, including Python, R, TensorFlow, and scikit-
learn, expanded rapidly, making data analysis more
accessible to a broader audience.

The history of data science continues to evolve alongside
technology and the ever-growing ocean of data. Today, data
science is a crucial discipline in our modern world, with
applications spanning a wide array of domains. It's likely to
remain a dynamic and evolving field for years to come.
What do Data Scientists do?
⚫ Data scientists are the key to realizing the opportunities
presented by big data.
⚫ They bring structure to it, find compelling patterns in it, and
advise executives on the implications for products, processes,
and decisions.
⚫ Some of the application areas are:
⚫ Internet Search
⚫ Digital Advertisements
⚫ Recommender Systems like Amazon, Netflix
⚫ Image Recognition, e.g., tagging your friends
⚫ Speech Recognition, e.g., Google Voice, Siri, Cortana, etc.
⚫ Healthcare, e.g., disease prediction
⚫ Airline Route Planning
⚫ Fraud and Risk Detection
⚫ Delivery logistics e.g. DHL, FedEx
Data Science Use Cases
⚫ The evolution of data science and advanced forms of
analytics has given rise to a wide range of applications that
are providing better insights and business value in the
enterprise.

⚫ While many different types of organizations are implementing analytics applications driven by data
science, those applications are mostly focused on areas
that have proven their value over the past decade. By
digging deeper into them, businesses can gain benefits that
include competitive advantages over business rivals; better
service to customers, citizens, users and patients; and the
ability to respond more effectively to a rapidly changing
business environment that demands continuous adaptation.
Data Science Use Cases
Pattern recognition
⚫ Identifying patterns in data sets is a fundamental data science task.
For example, pattern recognition helps retailers and e-commerce
companies spot trends in customer purchasing behavior.
⚫ Companies such as Amazon and Walmart have long used data science
approaches to discover purchasing patterns. In one interesting early
example, Walmart noticed that many customers making purchases in
anticipation of a hurricane or tropical storm also bought strawberry Pop-
Tarts. Such correlations, often unexpected, can help drive more effective
purchasing, inventory management and marketing strategies.

Predictive modeling
⚫ While predictive analytics has been around for decades, data science
applies machine learning and other algorithmic approaches to large
data sets to improve decision-making capabilities by creating models
that better predict customer behavior, financial risks, market trends and
more.
Data Science Use Cases
⚫ Predictive analytics applications are used in a wide range of
industries, including financial services, retail, manufacturing,
healthcare, travel and government.
⚫ For example, manufacturers use predictive maintenance
systems to help reduce equipment breakdowns and improve
production uptime. Airplane makers Boeing and Airbus also
depend on predictive maintenance to improve their fleet
availability. Similarly, Chevron, BP and other companies in the
energy sector use predictive modelling to improve equipment
reliability in settings where maintenance is difficult and costly
to perform.

Recommendation engines and personalization systems


⚫ It traditionally has been very difficult to tailor products and
services to the specific needs of individuals; doing so was too
time-consuming and costly.
Data Science Use Cases
⚫ Fortunately, the combination of data science, machine learning
and big data now enables organizations to build a detailed profile
of individual customers. Over time, their systems can learn
people's preferences and match them with others who have
similar preferences -- an approach known as hyper-
personalization.
⚫ Companies such as Home Depot, Lowe's and Netflix use hyper-
personalization techniques driven by data science to better focus
their offerings to customers through recommendation engines
and personalized marketing.
In Delivery Logistics
⚫ Various Logistics companies like DHL, FedEx, etc. make use of
Data Science. Data Science helps these companies to find the
best route for the Shipment of their Products, the best time
suited for delivery, the best mode of transport to reach the
destination, etc.
Data Science Use Cases
Medicine and Drug Development
⚫ The process of creating medicine is very difficult and
time-consuming and has to be done with full discipline
because it is a matter of someone's life.
⚫ Without Data Science it takes lots of time, resources, and
finance to develop a new medicine or drug, but with the
help of Data Science it becomes easier because the
prediction of the success rate can be determined based on
biological data and factors. Algorithms based on data
science can forecast how a compound will react in the
human body without lab experiments.
Myths about Data Science
⚫ Ph.D is Mandatory to Become a Data Scientist
⚫ A Full-Time Data Science Degree is a Must for Making the Transition
⚫ All your previous Work Experience will Translate to the Data Science
Domain
⚫ Necessary to have a Computer
Science/Mathematics/Statistics/Programming Background
⚫ Learning a Tool is Enough to Become a Data Scientist
⚫ Deep Learning Requires Computational Power that Only Top
Companies Have
⚫ Once Built, AI Systems will Continue to Evolve and Generalize by
Themselves
⚫ Data Science is Only About Building Predictive Models
⚫ Participating in Data Science Competitions Translates to Real-Life
Projects
⚫ Data Collection is a Breeze, the Focus should be on Building Models
Myths about Data Science
⚫ Data Science is all about building machine learning and deep
learning models
⚫ Only people with a programming or mathematical
background can become Data Scientists
⚫ Data Analysts, Data Engineers, and Data Scientists all
perform the same tasks
Choosing a data science language
⚫ Choosing the correct tool makes your life easier.
⚫ Data scientists usually use only a few languages
because they make working with data easier.
⚫ With this in mind, here are the four top languages for
data science work in order of preference (used by
91 percent of the data scientists out there):
⚫ Python
⚫R
⚫ SAS
⚫ SQL
Outlining the core competencies of a
data scientist
⚫ A data scientist requires knowledge of a broad range of skills
in order to perform the required tasks. In fact, so many
different skills are required that data scientists often work
in teams.
⚫ Someone who is good at gathering data might team up with
an analyst and someone gifted in presenting information.
⚫ It would be hard to find a single person with all the
required skills.
⚫ The following list describes areas in which a data scientist
could excel:
⚫ Data capture
⚫ Analysis
⚫ Presentation
Outlining the core competencies of a data
scientist
⚫Data capture:
⚫ Capturing data begins by managing a data source using
database management skills. However, raw data isn’t
particularly useful in many situations— you must also
understand the data domain. Finally, you must have
data‐modeling skills so that you understand how the
data is connected and whether the data is structured.
⚫Analysis:
⚫ You perform some analysis using basic statistical tool
skills, much like those that just about everyone learns in
college. However, the use of specialized math tricks and
algorithms can make patterns in the data more obvious
or help you draw conclusions.
Outlining the core competencies of a data
scientist
⚫Presentation:
⚫ It’s important to provide a graphical presentation of
these patterns to help others visualize what the numbers
mean and how to apply them in a meaningful way. More
important, the presentation must tell a specific story so
that the impact of the data isn’t lost.
Understanding the role of programming
⚫ A data scientist may need to know several programming
languages in order to achieve specific goals.
⚫ Manually performing these tasks is time consuming and error
prone, so programming presents the best method for achieving
the goal.
⚫ Given the number of products that most data scientists use, it may
not be possible to use just one programming language.
⚫ Yes, Python can load data, transform it, analyze it, and even
present it to the end user, but it works only when the language
provides the required functionality.

⚫ The languages you choose depend on a number of criteria.


⚫ How you intend to use data science in your code
⚫ Your familiarity with the language
⚫ The need to interact with other languages
⚫ The availability of tools to enhance the development environment
⚫ The availability of APIs and libraries to make performing tasks easier
Creating the Data Science Pipeline
The data science pipeline requires the data scientist to follow
particular steps in the preparation, analysis, and presentation of the
data.

⚫ Preparing the data - The data that you access from various sources
doesn't come in an easily packaged form, ready for analysis. You may
also need to transform it to make all the data sources cohesive and
amenable to analysis. Transformation may require changing data
types, the order in which data appears, and even the creation of data
entries based on the information provided by existing entries.

⚫ Performing exploratory data analysis - Data science provides access
to a wealth of statistical methods and algorithms that help you
discover patterns in the data. A single approach doesn't ordinarily do
the trick. You typically use an iterative process to rework the data from
a number of perspectives.
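
As a rough illustration of the preparation and exploratory steps described above, here is a minimal pandas sketch. It assumes a hypothetical CSV file named sales.csv with order_date, region, and amount columns; the file name and columns are illustrative only.

    # A minimal data-preparation sketch, assuming a hypothetical
    # sales.csv file with order_date, region, and amount columns.
    import pandas as pd

    df = pd.read_csv("sales.csv")

    # Change data types so the columns are amenable to analysis.
    df["order_date"] = pd.to_datetime(df["order_date"])
    df["region"] = df["region"].astype("category")

    # Create a new entry derived from existing entries.
    df["order_month"] = df["order_date"].dt.to_period("M")

    # Reorder and sort so the data is cohesive and analysis-ready.
    df = df[["order_month", "region", "amount"]].sort_values("order_month")

    # A first exploratory pass: look at the data from several angles.
    print(df.describe())
    print(df.groupby("region")["amount"].mean())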
Creating the Data Science Pipeline
⚫ Learning from data - As you iterate through various statistical
analysis methods and apply algorithms to detect patterns, you begin
learning from the data. In fact, it's the fun part of data science
because you can't ever know in advance precisely what the data will
reveal to you. Of course, the imprecise nature of data and the finding
of seemingly random patterns in it means keeping an open mind.

⚫ Visualizing - Visualization means seeing the patterns in the data and
then being able to react to those patterns. It also means being able
to see when data is not part of the pattern. Think of yourself as a
data sculptor, removing the data that lies outside the patterns (the
outliers) so that others can see the masterpiece of information
beneath.

⚫ Obtaining insights and data products - The insights you obtain
from manipulating and analyzing the data help you to perform real-
world tasks. For example, you can use the results of an analysis to
make a business decision.
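
Continuing the hypothetical sales.csv example from the previous sketch, the snippet below hints at the learning, visualizing, and insight steps; the column names and the closing business rule are purely illustrative.

    # A sketch of the learning, visualizing, and insight steps,
    # reusing the hypothetical sales.csv data from the earlier example.
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_csv("sales.csv", parse_dates=["order_date"])

    # Learn from the data: aggregate it from another perspective.
    monthly = df.groupby(df["order_date"].dt.to_period("M"))["amount"].sum()

    # Visualize the pattern (and spot months that fall outside it).
    monthly.plot(kind="bar", title="Revenue by month")
    plt.tight_layout()
    plt.show()

    # Turn the analysis into a simple, data-driven business decision.
    best_month = monthly.idxmax()
    print(f"Consider increasing inventory ahead of {best_month}.")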
Understanding Python’s Role in Data Science

⚫ Many different ways are available for accomplishing data science
tasks.
⚫ Python represents one of the few single-stop solutions that you can
use to solve complex data science problems.
⚫ Instead of having to use a number of tools to perform a task, you can
simply use a single language, Python, to get the job done.
⚫ The Python difference is the large number of scientific and math
libraries created for it by third parties. Plugging in these libraries
greatly extends Python and allows it to easily perform tasks that other
languages could perform only with great difficulty.

⚫ A significant benefit of Python is that it supports four different coding styles:
⚫ Functional
⚫ Procedural
⚫ Object Oriented
⚫ Imperative
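
A small sketch contrasting the four styles on the same task (summing the squares of a list); it is an illustrative comparison, not an exhaustive treatment of each paradigm.

    # The same task, summing the squares of a list, in each style.
    numbers = [1, 2, 3, 4]

    # Imperative: change state step by step with explicit statements.
    total = 0
    for n in numbers:
        total += n * n

    # Procedural: the same logic wrapped in a reusable procedure.
    def sum_of_squares(values):
        result = 0
        for v in values:
            result += v * v
        return result

    # Object-oriented: data and behavior bundled together in a class.
    class SquareSummer:
        def __init__(self, values):
            self.values = values

        def total(self):
            return sum(v * v for v in self.values)

    # Functional: built from expressions and higher-order functions.
    functional_total = sum(map(lambda v: v * v, numbers))

    print(total, sum_of_squares(numbers),
          SquareSummer(numbers).total(), functional_total)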
Learning to Use Python Fast
⚫ Data science involves many different tasks. Python can
be used to perform various tasks of the data science
pipeline, such as:
⚫ Loading data
⚫ Training model
⚫ Visualizing data
Performing Rapid Prototyping and
Experimentation
⚫ Python is all about creating applications quickly and
then experimenting with them to see how things work.
⚫ The act of creating an application design in code
without necessarily filling in all the details is called
prototyping, and Python makes it fast.
⚫ Data science doesn’t rely on static solutions. You may
have to try multiple solutions to find the particular
solution that works best.
⚫ After you create a prototype, you use it to experiment
with various algorithms to determine which algorithm
works best in a particular situation.
Considering Speed of Execution
⚫The following factors control the speed of
execution for your data science application:
⚫Dataset size
⚫Loading technique
⚫Coding style
⚫Machine capability
⚫Analysis algorithm
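
As a rough, hedged illustration of how coding style alone can change execution speed, the snippet below times a plain Python loop against a vectorized NumPy call on the same array; the exact numbers depend on dataset size and machine capability.

    # Timing a Python-level loop versus a vectorized NumPy sum.
    import timeit
    import numpy as np

    data = np.random.rand(1_000_000)

    loop_time = timeit.timeit(lambda: sum(float(x) for x in data), number=3)
    vector_time = timeit.timeit(lambda: data.sum(), number=3)

    print(f"Python loop: {loop_time:.3f} s  NumPy vectorized: {vector_time:.3f} s")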
Using the Python Ecosystem for Data Science
This section provides an overview of the libraries you use for the
data science examples.
⚫ Accessing scientific tools using SciPy
The SciPy stack (http://www.scipy.org/) contains a host of other
libraries that you can also download separately. These libraries
provide support for mathematics, science, and engineering.
⚫ These libraries are
⚫ NumPy
⚫ SciPy
⚫ matplotlib
⚫ IPython
⚫ Sympy
⚫ Pandas
⚫ SciPy is a general‐purpose library that provides
functionality for multiple problem domains. It also provides
support for domain‐specific libraries, such as Scikit‐learn,
Scikit‐image, and statsmodels.
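
As a small, illustrative example of the kind of domain-specific functionality SciPy offers, the sketch below runs a two-sample t-test from scipy.stats on made-up measurements.

    # A two-sample t-test using scipy.stats on made-up measurements.
    import numpy as np
    from scipy import stats

    group_a = np.array([5.1, 5.3, 5.5, 5.0, 5.2])
    group_b = np.array([5.8, 6.0, 5.9, 6.1, 5.7])

    t_statistic, p_value = stats.ttest_ind(group_a, group_b)
    print(f"t = {t_statistic:.2f}, p = {p_value:.4f}")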
Using the Python Ecosystem for Data Science
⚫ Performing fundamental scientific computing
using NumPy
The NumPy library (http://www.numpy.org/) provides the
means for performing n-dimensional array manipulation.
NumPy functions include support for linear algebra,
Fourier transforms, and random-number generation.
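
A short sketch of those NumPy capabilities; the values are arbitrary examples.

    # N-dimensional arrays, linear algebra, FFT, and random numbers.
    import numpy as np

    a = np.array([[1.0, 2.0], [3.0, 4.0]])             # a 2-D array
    inverse = np.linalg.inv(a)                          # linear algebra
    spectrum = np.fft.fft([0.0, 1.0, 0.0, -1.0])        # Fourier transform
    samples = np.random.default_rng(0).normal(size=3)   # random numbers

    print(inverse, spectrum, samples, sep="\n")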

⚫ Performing data analysis using pandas


The pandas library (http://pandas.pydata.org/) provides
support for data structures and data analysis tools. The
basic principle behind pandas is to provide data analysis
and modeling support for Python.
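
A minimal pandas sketch, using a small made-up table, of the data structures and analysis tools mentioned above.

    # Building a DataFrame and applying basic analysis tools.
    import pandas as pd

    scores = pd.DataFrame({
        "student": ["Asha", "Ben", "Chen", "Dina"],
        "subject": ["Math", "Math", "Stats", "Stats"],
        "marks": [78, 85, 91, 64],
    })

    print(scores.describe())                          # summary statistics
    print(scores.groupby("subject")["marks"].mean())  # grouped analysis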
Using the Python Ecosystem for Data Science
⚫ Implementing machine learning using Scikit‐learn
⚫ The Scikit‐learn library (http://scikit‐learn.org/stable/) is
one of a number of Scikit libraries that build on the
capabilities provided by NumPy and SciPy to allow Python
developers to perform domain‐specific tasks. It provides
access to the following sorts of functionality:
⚫ Classification
⚫ Regression
⚫ Clustering
⚫ Dimensionality reduction
⚫ Model selection
⚫ Preprocessing
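
A minimal, illustrative Scikit-learn sketch touching several of these areas (classification, model selection via a train/test split, and preprocessing) on the library's built-in iris dataset.

    # Classification with preprocessing and a train/test split.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline
    from sklearn.linear_model import LogisticRegression

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)

    # Chain preprocessing and the classifier into one model.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
    model.fit(X_train, y_train)

    print("Test accuracy:", model.score(X_test, y_test))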
Using the Python Ecosystem for Data Science
⚫ Plotting the data using matplotlib
The matplotlib library (http://matplotlib.org/) provides you with a
MATLAB‐like interface for creating data presentations of the
analysis you perform.
The library focuses primarily on 2D output, but it still provides
you with the means to express graphically the data patterns you
see in the data you analyze. Without this library, you couldn’t
create output that people outside the data science community
could easily understand.
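
A minimal matplotlib sketch producing a simple 2D line plot; the data is an arbitrary sine curve used only for illustration.

    # A simple 2D line plot as a data presentation.
    import matplotlib.pyplot as plt
    import numpy as np

    x = np.linspace(0, 10, 100)
    y = np.sin(x)

    plt.plot(x, y, label="sin(x)")
    plt.xlabel("x")
    plt.ylabel("y")
    plt.title("A simple data presentation")
    plt.legend()
    plt.show()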

⚫ Parsing HTML documents using Beautiful Soup


The Beautiful Soup library (http://www.crummy.com/software/BeautifulSoup/)
download is actually found at https://pypi.python.org/pypi/beautifulsoup4/4.3.2.
This library provides the means for parsing HTML or XML data in a manner
that Python understands. It allows you to work with tree-based data.
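
A tiny Beautiful Soup sketch parsing an inline HTML snippet and walking its tree; the snippet itself is made up for illustration.

    # Parsing a small HTML snippet and navigating its tree.
    from bs4 import BeautifulSoup

    html = """
    <html><body>
      <h1>Report</h1>
      <ul><li>alpha</li><li>beta</li></ul>
    </body></html>
    """

    soup = BeautifulSoup(html, "html.parser")
    print(soup.h1.text)                             # -> Report
    print([li.text for li in soup.find_all("li")])  # -> ['alpha', 'beta']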
Working at the command line or in the IDE
⚫ Anaconda is a product that makes using Python even
easier. It comes with a number of utilities that help you
work with Python in a variety of ways.
⚫ Anaconda is actually a compilation of several open source
applications. You can use these applications individually or
in cooperation with each other to achieve specific coding
goals.

⚫ Jupyter Notebook/IPython Notebook


⚫ Spyder
⚫ IPython QT Console
⚫ Anaconda Command Prompt
⚫ Python Interpreter
⚫ IPython Console
Creating new sessions with Anaconda
Command Prompt
⚫ Only one of the Anaconda utilities provides direct access to
the command line, Anaconda Command Prompt.
⚫ When you start this utility, you see a command prompt at
which you can type commands.
⚫ Using the Anaconda Command Prompt you can start the various
utilities through commands:
⚫ jupyter notebook to open Jupyter Notebook
⚫ spyder to open Spyder
⚫ ipython to open the IPython Console
⚫ python to open the Python interpreter
⚫ ipython qtconsole to open the IPython QT Console
Python Interpreter
⚫ The Python interpreter is in interactive mode when it reads
commands from a tty. The primary prompt is the following:
>>>
⚫ When it shows this prompt, it is waiting for the developer's
next command. This is the REPL (read-eval-print loop). Before it
prints the first prompt, the Python interpreter prints a welcome
message that also states its version number and a copyright
notice.
⚫ This is the secondary prompt, which denotes continuation lines:
...
⚫ You quit the interpreter by typing quit() and pressing Enter.
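
A short illustrative interactive session showing both prompts; the commands themselves are arbitrary.

    >>> total = 0
    >>> for n in range(3):
    ...     total += n
    ...
    >>> total
    3
    >>> quit()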


Entering the IPython environment
⚫ The Interactive Python (IPython) environment
provides enhancements to the standard Python
interpreter.
⚫ Run Native Shell Commands
⚫ Syntax Highlighting
⚫ Proper Indentation
⚫ Tab Completion
⚫ Documentation like str.capitalize?
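
For example, an IPython session might look roughly like the following (output abbreviated); the ? suffix displays documentation and ! runs a native shell command.

    In [1]: str.capitalize?
    Signature: str.capitalize(self, /)
    Docstring: Return a capitalized version of the string. ...

    In [2]: !echo hello
    hello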
Entering IPython QTConsole environment
⚫ It adds a GUI on top of IPython that makes using the
enhancements that IPython provides a lot easier.
Editing scripts using Spyder
⚫ Spyder is a fully functional Integrated Development
Environment (IDE). You use it to load scripts, edit
them, run them, and perform debugging tasks.
