
PANDAS

Pandas is a Python library used for data analysis and manipulation of structured data. It allows data to be stored and manipulated in the form of labeled arrays called Series (1D) and DataFrames (2D). Series are homogeneous 1D structures with an associated array of indexes while DataFrames are heterogeneous 2D structures that allow different data types. The main author of Pandas is Wes McKinney and it supports reading/writing different data formats, selecting and combining data subsets, and time-series analysis.

Uploaded by

Isha Bohra

PANDAS

• Python's library for data analysis
• Name derived from "panel data", a term for multidimensional, structured data sets
• Data analysis: the process of evaluating big data sets using analytical and statistical tools
• Main author of pandas: Wes McKinney

Why pandas

• Can read and write data in many different file formats (CSV, Excel, JSON, SQL, etc.) and handles many data types (integer, float, string, etc.)
• Can calculate along every way the data is organized, i.e. across rows and down columns
• Can easily select subsets of data from bulky data sets and even combine multiple data sets together
• Allows you to apply operations to independent groups within the data
• Supports reshaping of data
• Supports advanced time-series functionality
• Supports visualization
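
A few of the points above can be sketched in a handful of lines. This is only an illustration: the `sales` and `managers` tables, their column names, and their values are invented for the example.

```python
import pandas as pd

# Hypothetical sales records, invented for this sketch.
sales = pd.DataFrame({
    "region": ["North", "South", "North", "South"],
    "units": [10, 4, 7, 12],
    "price": [2.5, 3.0, 2.5, 3.0],
})

# Calculate across rows (revenue per record) and down a column (total units).
sales["revenue"] = sales["units"] * sales["price"]
total_units = sales["units"].sum()

# Select a subset of rows with a boolean condition.
big_orders = sales[sales["units"] > 5]

# Apply an operation to independent groups within the data.
units_by_region = sales.groupby("region")["units"].sum()

# Combine two data sets on a shared column.
managers = pd.DataFrame({"region": ["North", "South"],
                         "manager": ["Asha", "Ravi"]})
combined = sales.merge(managers, on="region")

print(total_units)
print(big_orders)
print(units_by_region)
print(combined)
```

Reading and writing files follows the same pattern through functions such as `pd.read_csv` and `DataFrame.to_csv`, which are left out here because they need a file on disk.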

DATA STRUCTURE: a specific way of storing and organizing data in a computer to suit a specific purpose, so that the data can be accessed and worked with easily.

SERIES: 1-dimensional data structure of pandas

It has 2 main components: a) an array of actual data, b) an associated array of indexes (data labels)
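
The two components can be seen directly on a small Series. This is a minimal sketch; the names and marks are made up for illustration.

```python
import pandas as pd

# A Series built from data plus an explicit array of index labels.
marks = pd.Series([78, 92, 85], index=["Amit", "Bela", "Chen"])

print(marks.values)   # component (a): the array of actual data
print(marks.index)    # component (b): the associated array of indexes
print(marks["Bela"])  # a value is looked up through its label

# If no index is given, pandas assigns the positions 0, 1, 2, ...
plain = pd.Series([78, 92, 85])
print(plain.index)
```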

DATAFRAME: 2-dimensional data structure of pandas

S.NO. | SERIES         | DATAFRAME
1     | 1D             | 2D
2     | HOMOGENEOUS    | HETEROGENEOUS
3     | VALUE MUTABLE  | VALUE MUTABLE
4     | SIZE IMMUTABLE | SIZE MUTABLE
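
A short sketch of some of these rows follows, using invented names and marks. It shows a DataFrame holding columns of different types, values being changed in place in both structures, and a DataFrame growing by one column. It does not attempt to demonstrate the size rule for Series.

```python
import pandas as pd

# DataFrame: 2D, and each column may have its own data type.
df = pd.DataFrame({"name": ["Amit", "Bela"], "marks": [78, 92]})
print(df.dtypes)

# Value mutable: an existing cell can be changed in place.
df.loc[0, "marks"] = 80

# Size mutable: a new column can be added to the same DataFrame.
df["grade"] = ["B", "A"]
print(df)

# Series: 1D with a single data type; its values are also mutable.
s = pd.Series([1, 2, 3])
s[0] = 10
print(s)
```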

S.NO. | SERIES OBJECT                      | LISTS
1     | 1D                                 | CAN BE 1D AND MULTI-D BOTH
2     | CAN HAVE NUMERIC AND LABEL INDEXES | ONLY NUMERIC INDEXES
3     | SUPPORTS EXPLICIT INDEXING         | ONLY SUPPORTS IMPLICIT INDEXING
4     | INDEXES CAN BE DUPLICATED          | INDEXES CANNOT BE DUPLICATED
5     | HOMOGENEOUS ELEMENTS               | HETEROGENEOUS ELEMENTS
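
The contrast with lists can be tried out as below. This is a sketch with made-up values; the variable names are not from the notes.

```python
import pandas as pd

# A list is indexed only by implicit positions 0, 1, 2, ...
scores_list = [10, 20, 30]
print(scores_list[1])

# A Series accepts explicit labels, and a label may repeat.
scores = pd.Series([10, 20, 30], index=["a", "b", "a"])
print(scores["b"])  # a single value
print(scores["a"])  # both entries that carry the label "a"

# A list keeps each element's own type; a Series settles on one dtype.
mixed_list = [1, 2.5]
mixed = pd.Series([1, 2.5])
print(type(mixed_list[0]), type(mixed_list[1]))
print(mixed.dtype)
```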

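SERIES V/S DICTIONARY

S.NO. | SERIES                                        | DICTIONARY
1     | 1D                                            | 1D AND MULTI-D WITH NESTED DICTIONARIES
2     | INDEX LABELS ARE PAIRED WITH VALUES (ELEMENTS) | KEYS ARE PAIRED WITH VALUES; KEYS CORRESPOND TO SERIES INDEXES AND VALUES TO SERIES ELEMENTS
3     | INDEXES CAN BE NUMBERS AND LABELS             | HAS ONLY KEYS, WHICH MUST BE IMMUTABLE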
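
The correspondence between the two is easiest to see by building a Series from a dictionary. A minimal sketch with invented stock figures:

```python
import pandas as pd

# A dictionary maps keys to values.
stock = {"pens": 40, "books": 15}

# Building a Series from it: keys become index labels, values become elements.
s = pd.Series(stock)
print(s.index)
print(s.values)
print(s["pens"])

# A dictionary can nest other dictionaries, giving a multi-level structure.
nested = {"shop1": {"pens": 40}, "shop2": {"pens": 25}}
print(nested["shop2"]["pens"])
```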

SERIES V/S NDARRAY


S.NO. | SERIES                                                                                 | NDARRAY
1     | HOMOGENEOUS                                                                            | HOMOGENEOUS
2     | SUPPORTS EXPLICIT INDEXING                                                             | DOES NOT SUPPORT EXPLICIT INDEXING
3     | INDEXES CAN BE NUMERIC OR STRING TYPE                                                  | INDEXES ARE ONLY NUMERIC
4     | PERFORMS VECTORIZED OPERATIONS EVEN IF SHAPES DIFFER, USING NaN FOR NON-MATCHING LABELS | PERFORMS VECTORIZED OPERATIONS ONLY IF THE SHAPES MATCH (OR CAN BE BROADCAST)
5     | TAKES MORE MEMORY                                                                      | TAKES LESS MEMORY
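
Row 4 is the least obvious difference, so here is a small sketch of it with made-up values. Two Series of different lengths are added by label, while two ndarrays of incompatible shapes are not added at all.

```python
import numpy as np
import pandas as pd

# Series are aligned on their labels before the operation.
a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20], index=["x", "y"])
total = a + b
print(total)  # "z" has no partner in b, so its result is NaN

# ndarrays of shapes (3,) and (2,) cannot be combined element by element.
arr_a = np.array([1, 2, 3])
arr_b = np.array([10, 20])
try:
    arr_a + arr_b
    shapes_ok = True
except ValueError as err:
    shapes_ok = False
    print("ndarray addition failed:", err)
```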

DATAFRAME V/S NDARRAY

S.NO. | DATAFRAME                                                    | NDARRAY
1     | 2D                                                           | CAN BE 2D (OR ANY NUMBER OF DIMENSIONS)
2     | HETEROGENEOUS                                                | HOMOGENEOUS
3     | CAN HAVE BOTH INDEXES AND LABELS FOR ROWS AND COLUMNS        | INDEXED BY A TUPLE OF NON-NEGATIVE INTEGERS FOR BOTH AXES
4     | CONSUMES MORE MEMORY                                         | TAKES LESS MEMORY
5     | EXPANDABLE                                                   | NOT EXPANDABLE
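
The indexing and expandability rows can be illustrated as follows. The labels `r1`, `r2`, `c1`, `c2`, `c3` are hypothetical names chosen for this sketch.

```python
import numpy as np
import pandas as pd

# A 2D ndarray is addressed by a tuple of integer positions.
grid = np.array([[1, 2], [3, 4]])
print(grid[1, 0])

# A DataFrame over the same data can carry row and column labels as well.
table = pd.DataFrame(grid, index=["r1", "r2"], columns=["c1", "c2"])
print(table.loc["r2", "c1"])  # lookup by labels
print(table.iloc[1, 0])       # lookup by integer positions

# The DataFrame can be expanded, here with a column of a different type.
table["c3"] = ["x", "y"]
print(table)
print(grid.shape)  # the ndarray keeps the shape it was created with
```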
