Credit goes to www.scribd.com


Data Science Notes Unit 1

Data Science

Data science combines math and statistics, specialized programming, advanced analytics, artificial intelligence (AI) and machine learning with specific subject matter expertise to uncover actionable insights hidden in an organization's data. These insights can be used to guide decision making and strategic planning.

• Data Science is a combination of multiple disciplines that uses statistics, data analysis, and machine learning to analyse data and to extract knowledge and insights from it.
• Data Science is about data gathering, analysis and decision-making.
• Data Science is about finding patterns in data, through analysis, and making future predictions.
• By using Data Science, companies are able to make:
  + Better decisions (should we choose A or B?)
  + Predictive analysis (what will happen next?)
  + Pattern discoveries (find patterns, or maybe hidden information, in the data)

Where is Data Science Needed?

Data Science is used in many industries in the world today, e.g. banking, consultancy, healthcare, and manufacturing.

Examples of where Data Science is needed:
  + For route planning: to discover the best routes to ship
  + To foresee delays for flight/ship/train etc. (through predictive analysis)
  + To create promotional offers
  + To find the best suited time to deliver goods
  + To forecast the next year's revenue for a company
  + To analyze the health benefit of training
  + To predict who will win elections

Data Science can be applied in nearly every part of a business where data is available. Examples are:
  + Consumer goods
  + Stock markets
  + Industry
  + Politics
  + Logistic companies
  + E-commerce

The accelerating volume of data sources, and subsequently data, has made data science one of the fastest growing fields across every industry.

The data science lifecycle involves various roles, tools, and processes, which enable analysts to glean actionable insights. Typically, a data science project undergoes the following stages:

Data ingestion: The lifecycle begins with the data collection, both raw structured and unstructured data from all relevant sources using a variety of methods. These methods can include manual entry, web scraping, and real-time streaming data from systems and devices. Data sources can include structured data, such as customer data, along with unstructured data like log files, video, audio, pictures, the Internet of Things (IoT), social media, and more.

Data storage and data processing: Since data can have different formats and structures, companies need to consider different storage systems based on the type of data that needs to be captured. Data management teams help to set standards around data storage and structure, which facilitate workflows around analytics, machine learning and deep learning models. This stage includes cleaning data, deduplicating, transforming and combining the data using ETL (extract, transform, load) jobs or other data integration technologies. This data preparation is essential for promoting data quality before loading into a data warehouse, data lake, or other repository.

Data analysis: Here, data scientists conduct an exploratory data analysis to examine biases, patterns, ranges, and distributions of values within the data. This data analytics exploration drives hypothesis generation for A/B testing. It also allows analysts to determine the data's relevance for use within modelling efforts for predictive analytics, machine learning, and/or deep learning. Depending on a model's accuracy, organizations can become reliant on these insights for business decision making, allowing them to drive more scalability.
Communicate: Finally, insights are presented as reports and other data visualizations that make the insights, and their impact on business, easier for business analysts and other decision-makers to understand. A data science programming language such as R or Python includes components for generating visualizations; alternatively, data scientists can use dedicated visualization tools.

How Does a Data Scientist Work?

A Data Scientist requires expertise in several backgrounds:
• Machine Learning
• Statistics
• Programming (Python or R)
• Mathematics
• Databases

A Data Scientist must find patterns within the data. Before they can find the patterns, they must organize the data in a standard format.

Here is how a Data Scientist works:
1. Ask the right questions - To understand the business problem.
2. Explore and collect data - From databases, web logs, customer feedback, etc.
3. Extract the data - Transform the data to a standardized format.
4. Clean the data - Remove erroneous values from the data.
5. Find and replace missing values - Check for missing values and replace them with a suitable value (e.g. an average value).
6. Normalize data - Scale the values in a practical range (e.g. 140 cm is smaller than 1.8 m; however, the number 140 is larger than 1.8, so scaling is important).
7. Analyze data, find patterns and make future predictions.
8. Represent the result - Present the result with useful insights in a way the "company" can understand.

Data Preprocessing
Data preprocessing involves cleaning and transforming raw data into a usable format for accurate and reliable analysis.
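Steps 5 and 6 of the workflow above (replacing missing values with an average, then scaling values into a practical range) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample heights are hypothetical, mean imputation is just one of several possible fill strategies, and min-max scaling is one common way to normalize.

```python
from statistics import mean


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: nothing to spread out, map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample: heights already converted to one unit (cm),
# with one missing measurement.
heights_cm = [140.0, None, 180.0, 160.0]

filled = fill_missing_with_mean(heights_cm)   # [140.0, 160.0, 180.0, 160.0]
scaled = min_max_scale(filled)                # [0.0, 0.5, 1.0, 0.5]
```

Converting everything to a single unit before scaling matters here: mixing 140 (cm) with 1.8 (m) in one column would make the scaled values meaningless.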
Data Analysis
Data analysis is the process of inspecting data to discover meaningful insights and trends to make informed decisions.

Data Visualization
Data visualization uses graphical representations such as charts and graphs to understand and interpret complex data.

Machine Learning
Machine learning focuses on developing algorithms that help computers learn from data and make predictions or decisions without explicit programming.

What is Data Analytics?

Data analytics is the process of collecting, organizing and studying data to find useful information, understand what is happening and make better decisions. In simple words, it helps people and businesses learn from data: what worked in the past, what is happening now and what might happen in the future.

People often mix up data analytics and data analysis, but they are not exactly the same. Data analysis is just one part of data analytics; it focuses on finding meaning in data. Data analytics, on the other hand, includes more than just analysis. It also involves things like coming up with ideas, making predictions from data and building the tools and systems needed to handle large amounts of data.

Importance and Usage of Data Analytics

Data analytics is used in many fields like banking, farming, shopping, government and more. It helps in many ways:
• Helps in Decision Making: It gives clear facts and patterns from data, which help people make smarter choices.
• Helps in Problem Solving: It points out what is going wrong and why, making it easier to fix problems.
• Helps Identify Opportunities: It shows trends and new chances for growth that might not be obvious.
• Improved Efficiency: It helps reduce waste, saves time and makes work smoother by finding better ways to do things.

Process of Data Analytics

Data analysts, data scientists and data engineers together create data pipelines, which help to set up the model and do further analysis. Data analytics can be done in the following steps:

1. Data Collection: Data collection is the first step, where raw information is gathered from different places like websites, apps, surveys or machines. Sometimes data comes from many sources and needs to be joined together. Other times only a small useful part of the data is selected.
2. Data Cleansing: Once the data is collected it usually contains mistakes like wrong entries, missing values or repeated rows. In this step the data is cleaned to fix those problems and remove anything that isn't needed. Clean data makes the results more accurate and trustworthy.
3. Data Analysis and Data Interpretation: After cleaning, the data is studied using tools like Excel, Python, R or SQL. Analysts look for patterns, trends or useful information that can help solve problems or answer questions. The goal here is to understand what the data is telling us.
4. Data Visualization: Data visualization is the process of creating visual representations of data using plots, charts and graphs, which helps to analyze the patterns and trends and get valuable insights from the data. By comparing and analyzing the datasets, data analysts find the useful data in the raw data.

Types of Data Analytics

There are different types of data analysis in which raw data is converted into valuable insights. Some of the types of data analysis are mentioned below:

1. Descriptive Data Analytics: Descriptive data analytics helps to summarize and understand past data. It shows what has happened by using tables, charts and averages. Companies use it to compare results, find strengths and weaknesses and spot any unusual patterns.
2. Diagnostic Data Analytics: Diagnostic data analytics looks at why something happened in the past. It uses tools like correlation, regression or comparison to find the cause of a problem.
This helps companies understand the reason behind a drop in sales or a sudden change in performance.
3. Predictive Data Analytics: Predictive data analytics is used to guess what might happen in the future. It looks at current and past data to find patterns and make forecasts. Businesses use it to predict things like customer behavior, future sales or possible risks.
4. Prescriptive Data Analytics: Prescriptive data analytics helps to choose the best action or solution. It looks at different options and suggests what should be done next. Companies use it for things like loan approval, pricing decisions and managing machines or schedules.

Methods of Data Analytics

There are two types of methods in data analytics, which are mentioned below:

1. Qualitative Data Analytics
Qualitative data analysis doesn't use statistics and derives data from words, pictures and symbols. Some common qualitative methods are:
• Narrative analytics is used for working with data acquired from diaries, interviews and so on.
• Content analytics is used for the analysis of verbal data and behaviour.
• Grounded theory is used to explain a given event by studying the data collected about it.

2. Quantitative Data Analytics
Quantitative data analytics is used to collect data and then process it into numerical data. Some of the quantitative methods are mentioned below:
• Hypothesis testing assesses a given hypothesis about the data set.
• Sample size determination is the method of taking a small sample from a large group of people and then analysing it.
• The average or mean of a list is found by dividing the sum of the numbers in the list by the number of items present in that list.

Machine Learning

• Supervised learning
  + Training data is labelled, and the model gets direct feedback.
  + Training data contains both input (i/p) and output (o/p).
  + Typical task: classification (e.g. a Naive Bayes model).
• Unsupervised learning
  + Only input (i/p) is given.
  + Clustering (based on similarity), e.g. k-means.
• Reinforcement learning
  + Learning through reward / penalty (agent).

Artificial Neural Networks

• Modelled on the brain, whose basic unit is the neuron.
• Neurons accept input (i/p) and take action. Example: when heat is felt, a neuron sends the signal to the brain, which triggers an action.
• Node: replica of a neuron.
• $x_1, x_2, \dots, x_n$: inputs; $w_1, w_2, \dots, w_n$: weights associated with them.
• A node computes the weighted sum $\sum_i x_i w_i$ plus a constant (bias) $b$, for example $z = 0.5x_1 + 0.9x_2 + b$, and passes it through an activation function. ReLU passes positive values through unchanged, so $z > 0$ gives an output of $z$.
• Layers: input, hidden, output.

Activation Function
Linear function: $f(v) = a + v$, where $v = \sum_i w_i x_i$ is the weighted sum of the inputs.

Sigmoid function: $f(v) = \frac{1}{1 + e^{-v}}$
• S-shaped curve that squashes any input into the range (0, 1).
• Vanishing gradient problem: as the input grows, the output hardly changes, so the gradient becomes very small and learning slows down.
• Non-zero-centred function.

ReLU (Rectified Linear Unit): $f(x) = x$ if $x > 0$, and $f(x) = 0$ if $x \le 0$.

Artificial Neural Network: numerical backpropagation example using sigmoid activations

We will perform the following steps:
(i) Forward pass
(ii) Compute the total error
(iii) Backpropagate to compute the gradients
(iv) Calculate and update the weights

Given:
• Input layer has 2 nodes: $x_1 = 0.05$, $x_2 = 0.10$
• Weights (input → hidden): $w_1 = 0.15$, $w_2 = 0.20$, $w_3 = 0.25$, $w_4 = 0.30$
• Weights (hidden → output): $w_5 = 0.40$, $w_6 = 0.45$, $w_7 = 0.50$, $w_8 = 0.55$
• Targets: $t_1 = 0.01$, $t_2 = 0.99$
• Biases: hidden bias $b_1 = 0.35$, output bias $b_2 = 0.60$
• Activation function (sigmoid): $\sigma(z) = \frac{1}{1 + e^{-z}}$
• Error: $E = \frac{1}{2}(t - o)^2$, where $t$ is the target output and $o$ is the obtained output

Forward Pass

Compute hidden-layer net inputs and activations:
$net_{h1} = w_1 x_1 + w_2 x_2 + b_1 = 0.15 \times 0.05 + 0.20 \times 0.10 + 0.35 = 0.0075 + 0.02 + 0.35 = 0.3775$
$net_{h2} = w_3 x_1 + w_4 x_2 + b_1 = 0.25 \times 0.05 + 0.30 \times 0.10 + 0.35 = 0.0125 + 0.03 + 0.35 = 0.3925$

Apply the sigmoid activation function:
$out_{h1} = \sigma(net_{h1}) = \frac{1}{1 + e^{-0.3775}} = 0.5933$
$out_{h2} = \sigma(net_{h2}) = \frac{1}{1 + e^{-0.3925}} = 0.5969$
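The three activation functions and the hidden-layer step of the example can be checked with a short Python sketch. The function names are illustrative choices rather than anything fixed by the notes, and `sigmoid_grad` is added only to show the vanishing gradient numerically.

```python
import math


def linear(v, a=0.0):
    """Linear activation f(v) = a + v."""
    return a + v


def sigmoid(v):
    """Sigmoid activation, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def relu(v):
    """Rectified linear unit: v if v > 0, else 0."""
    return max(0.0, v)


def sigmoid_grad(v):
    """Derivative of the sigmoid; it shrinks towards 0 for large |v|."""
    s = sigmoid(v)
    return s * (1.0 - s)


# Hidden layer of the worked example.
x1, x2 = 0.05, 0.10
w1, w2, w3, w4, b1 = 0.15, 0.20, 0.25, 0.30, 0.35

net_h1 = w1 * x1 + w2 * x2 + b1   # about 0.3775
net_h2 = w3 * x1 + w4 * x2 + b1   # about 0.3925
out_h1 = sigmoid(net_h1)          # about 0.5933
out_h2 = sigmoid(net_h2)          # about 0.5969
```

Evaluating `sigmoid_grad(0.0)` gives 0.25, while `sigmoid_grad(10.0)` is below 0.0001, which is the saturation behind the vanishing gradient problem.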
Compute output-layer net inputs and activations:
$net_{o1} = out_{h1} w_5 + out_{h2} w_6 + b_2 = 0.593 \times 0.40 + 0.596 \times 0.45 + 0.60 = 1.105$
$net_{o2} = out_{h1} w_7 + out_{h2} w_8 + b_2 = 0.593 \times 0.50 + 0.596 \times 0.55 + 0.60 = 0.2965 + 0.3278 + 0.60 = 1.2243$

Apply the sigmoid activation function to get the outputs:
$out_{o1} = \sigma(net_{o1}) = 0.7513$
$out_{o2} = \sigma(net_{o2}) = 0.7729$

Compute the total error (sum of squared errors):
$E_{o1} = \frac{1}{2}(t_1 - out_{o1})^2 = \frac{1}{2}(0.01 - 0.7513)^2 = 0.2748$
$E_{o2} = \frac{1}{2}(t_2 - out_{o2})^2 = \frac{1}{2}(0.99 - 0.7729)^2 = 0.0236$
$E_{total} = E_{o1} + E_{o2} = 0.2984$

Backpropagation (gradients and updates)

• Backpropagation is the algorithm used to train neural networks. It adjusts the weights in the network so that the network's predictions get closer to the desired outputs.
• Gradient: the partial derivative of the error function with respect to a weight; for weight $w_i$ it is $\frac{\partial E}{\partial w_i}$.
• Weight update rule: $w_{new} = w_{old} - \alpha \frac{\partial E}{\partial w}$, where $\alpha$ is the learning rate.

In the given numerical example, $E_{total} = 0.2984$. By the chain rule, the gradient of the total error with respect to $w_5$ is

$\frac{\partial E_{total}}{\partial w_5} = \frac{\partial E_{total}}{\partial out_{o1}} \times \frac{\partial out_{o1}}{\partial net_{o1}} \times \frac{\partial net_{o1}}{\partial w_5}$

that is: how the total error changes with $out_{o1}$, times how $out_{o1}$ changes with $net_{o1}$, times how $net_{o1}$ changes with $w_5$.

$\frac{\partial E_{total}}{\partial out_{o1}} = out_{o1} - t_1 = 0.7513 - 0.01 = 0.7413$
$\frac{\partial out_{o1}}{\partial net_{o1}} = out_{o1}(1 - out_{o1}) = 0.7513(1 - 0.7513) = 0.1868$
$\frac{\partial net_{o1}}{\partial w_5} = out_{h1} = 0.5933$
$\frac{\partial E_{total}}{\partial w_5} = 0.7413 \times 0.1868 \times 0.5933 = 0.0822$

New value of $w_5$, with learning rate $\alpha = 0.6$:
$w_5^{new} = w_5 - \alpha \frac{\partial E_{total}}{\partial w_5} = 0.40 - 0.6 \times 0.0822 = 0.40 - 0.0493 = 0.3507$

Similarly we find $w_6$, $w_7$ and $w_8$.

Adjusting weights in the hidden layer

$\frac{\partial E_{total}}{\partial w_1} = \frac{\partial E_{total}}{\partial out_{h1}} \times \frac{\partial out_{h1}}{\partial net_{h1}} \times \frac{\partial net_{h1}}{\partial w_1}$

Because $out_{h1}$ feeds both output nodes, both errors contribute:

$\frac{\partial E_{total}}{\partial out_{h1}} = \frac{\partial E_{o1}}{\partial out_{h1}} + \frac{\partial E_{o2}}{\partial out_{h1}} = \left(\frac{\partial E_{o1}}{\partial out_{o1}} \times \frac{\partial out_{o1}}{\partial net_{o1}} \times w_5\right) + \left(\frac{\partial E_{o2}}{\partial out_{o2}} \times \frac{\partial out_{o2}}{\partial net_{o2}} \times w_7\right)$
$= (0.7413 \times 0.1868 \times 0.40) + (-0.2171 \times 0.1755 \times 0.50) = 0.05539 + (-0.01904) = 0.03635$
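The whole cycle of this example (forward pass, total error, gradients, one weight update) can be reproduced with a small Python sketch. It assumes the network layout used above (2 inputs, 2 hidden nodes, 2 outputs, sigmoid everywhere) and the learning rate 0.6 from the worked update; the variable names are illustrative, and the biases are left unchanged because the example only updates weights.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


x = [0.05, 0.10]                          # inputs x1, x2
t = [0.01, 0.99]                          # targets t1, t2
w_hidden = [[0.15, 0.20], [0.25, 0.30]]   # rows: h1 (w1, w2), h2 (w3, w4)
w_out = [[0.40, 0.45], [0.50, 0.55]]      # rows: o1 (w5, w6), o2 (w7, w8)
b1, b2 = 0.35, 0.60
lr = 0.6                                  # learning rate (alpha)

# (i) Forward pass
out_h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b1) for row in w_hidden]
out_o = [sigmoid(sum(w * h for w, h in zip(row, out_h)) + b2) for row in w_out]

# (ii) Total error: sum of 1/2 (t - o)^2 over the outputs
total_error = sum(0.5 * (ti - oi) ** 2 for ti, oi in zip(t, out_o))

# (iii) Gradients for the output weights: dE/dout * dout/dnet * dnet/dw
delta_o = [(oi - ti) * oi * (1.0 - oi) for oi, ti in zip(out_o, t)]
grad_w_out = [[d * h for h in out_h] for d in delta_o]

# Gradients for the hidden weights; each hidden node collects error
# from both outputs through the ORIGINAL (not yet updated) output weights.
delta_h = [
    sum(delta_o[k] * w_out[k][j] for k in range(2)) * out_h[j] * (1.0 - out_h[j])
    for j in range(2)
]
grad_w_hidden = [[d * xi for xi in x] for d in delta_h]

# (iv) Update every weight: w_new = w_old - lr * gradient
new_w_out = [[w - lr * g for w, g in zip(wr, gr)]
             for wr, gr in zip(w_out, grad_w_out)]
new_w_hidden = [[w - lr * g for w, g in zip(wr, gr)]
                for wr, gr in zip(w_hidden, grad_w_hidden)]
```

Here `new_w_out[0][0]` is the updated $w_5$ (about 0.3507) and `new_w_hidden[0][0]` is the updated $w_1$; the remaining entries are the "similarly" computed weights.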
Calculations:
$\frac{\partial E_{o2}}{\partial out_{o2}} = out_{o2} - t_2 = 0.7729 - 0.99 = -0.2171$
$\frac{\partial out_{o2}}{\partial net_{o2}} = out_{o2}(1 - out_{o2}) = 0.7729(1 - 0.7729) = 0.1755$
$\frac{\partial E_{o1}}{\partial out_{o1}} = out_{o1} - t_1 = 0.7513 - 0.01 = 0.7413$
$\frac{\partial out_{o1}}{\partial net_{o1}} = out_{o1}(1 - out_{o1}) = 0.7513(1 - 0.7513) = 0.1868$

Now we need to calculate
$\frac{\partial out_{h1}}{\partial net_{h1}} = out_{h1}(1 - out_{h1}) = 0.5933(1 - 0.5933) = 0.2413$

For $\frac{\partial net_{h1}}{\partial w_1}$ we differentiate eq. 1 partially with respect to $w_1$:
$net_{h1} = w_1 x_1 + w_2 x_2 + b_1$ (eq. 1)
$\frac{\partial net_{h1}}{\partial w_1} = x_1 = 0.05$

Now all components are calculated:
$\frac{\partial E_{total}}{\partial w_1} = \frac{\partial E_{total}}{\partial out_{h1}} \times \frac{\partial out_{h1}}{\partial net_{h1}} \times \frac{\partial net_{h1}}{\partial w_1} = 0.03635 \times 0.2413 \times 0.05 \approx 0.0004386$ (0.000438568 at full precision)

Now we calculate the updated value of $w_1$:
$w_1^{new} = w_1 - \alpha \frac{\partial E_{total}}{\partial w_1}$, where $w_1^{new}$ is the updated $w_1$, $w_1$ is the old $w_1$ and $\alpha$ is the learning rate.
So $w_1^{new} = 0.15 - 0.6 \times 0.000438568 = 0.1497368592$

Similarly we calculate $w_2$, $w_3$ and $w_4$, then propagate these changes through the network. One forward and backward cycle is called an epoch (one full pass through the neural network). Continue this process until the error is minimal or equal to zero.

Convolutional Neural Network (CNN)

Before the input reaches the ANN:
• Input layer: the image (size of the image, in pixels)
• Convolution
• Activation: ReLU

Convolution: the important element is the filter (feature detector).
• A loop feature is common in 6, 8, 9, 0.
• A vertical line is another such feature.
• These are features that we need to identify. To detect a loop, we place the loop filter on the image, compute how well it matches the region it overlaps, then slide the filter and place it again, until the whole image is covered.

CNNs are ideal for tasks like image classification, object detection, and segmentation.
• Convolutional networks use a process called convolution, which combines two functions to show how one changes the shape of the other.
• The role of the convolutional network is to reduce the image into a form that is easier to process, without losing features that are critical for getting a good prediction.

How does a CNN work?
• First we need to understand what an image is and how it is represented. An RGB image is nothing but a matrix of pixel values having three planes, whereas a grayscale image is the same, but it has a single plane.
• For simplicity, let us consider grayscale images.

[Figure: input data, kernel, convolved feature]

• The figure shows what a convolution is: we take a kernel (a 3×3 matrix) and apply it to the input image to get the convolved feature. This convolved feature is passed on to the next layer.
• The number of parameters in a CNN layer depends on the size of the receptive fields (filter kernels) and the number of filters.
• Each neuron in a CNN layer receives inputs from a local region of the previous layer, known as its receptive field.
• The receptive field moves over the input, creating a feature map as the output.
• This operation is followed by a rectified linear unit (ReLU) activation function.
• Classic CNN architectures like LeNet and more modern ones like ResNet employ this fundamental principle.

Convolutional neural networks are composed of multiple layers of artificial neurons. When we input an image into a ConvNet, each layer generates several activation maps that are passed on to the next layer for feature extraction.
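The kernel-sliding step described above can be sketched in plain Python. This is a minimal illustration under stated assumptions: the 5×5 image and the 3×3 kernel are hypothetical sample values, the stride is 1 with no padding, and, as in most CNN libraries, the kernel is applied without flipping (strictly speaking a cross-correlation).

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and
    return the convolved feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_rows = len(image) - kh + 1
    out_cols = len(image[0]) - kw + 1
    feature = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Element-wise product of the kernel and the window it covers.
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        feature.append(row)
    return feature


def relu_map(feature_map):
    """Apply ReLU to every value of a feature map."""
    return [[max(0, v) for v in row] for row in feature_map]


# Hypothetical 5x5 grayscale image (single plane) and 3x3 kernel.
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

convolved = relu_map(convolve2d(image, kernel))
# convolved == [[4, 3, 4], [2, 4, 3], [2, 3, 4]]
```

A 5×5 input with a 3×3 kernel gives a 3×3 feature map, which is the size reduction the notes refer to.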
Feature Extraction in CNN
• The first layer usually extracts basic features such as horizontal or diagonal edges.
• This output is passed on to the next layer, which detects more complex features such as corners or combinations of edges.
• As we move deeper into the network, it can identify even more complex features such as objects, faces, etc.
• ConvNets are feed-forward networks that process the input data in a single pass.
• Based on the activation map of the final convolution layer, the classification layer outputs a set of confidence scores (values between 0 and 1) that specify how likely the image is to belong to a class.
• Gradient descent is commonly used as the optimization algorithm during training to adjust the weights of the input layer and subsequent layers.
• CNNs were first developed and used around the 1980s.

What is a pooling layer?
• The pooling layer is responsible for reducing the spatial size of the convolved feature. This decreases the computational power required to process the data by reducing its dimensions.
• There are two types of pooling:
  i) average pooling
  ii) max pooling

Max pooling
• Returns the maximum value from the portion of the image covered by the kernel.
• Max pooling also acts as a noise suppressant: it discards the noisy activations altogether and performs de-noising along with dimensionality reduction.

Average pooling
• Returns the average of all the values from the portion of the image covered by the kernel.
• In practice, max pooling often performs better than average pooling.

What happens after pooling in a CNN?
• Many CNNs stack multiple convolutional and pooling layers to extract increasingly abstract features. Example: the first layer detects edges; deeper layers detect shapes, objects, etc.

Flatten layer
• Converts the 2D feature maps (after pooling) into a 1D vector.
• This prepares the data for input into a fully connected layer.
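Max pooling, average pooling and flattening can be sketched as follows. The 4×4 feature map is a hypothetical example, and the sketch assumes a 2×2 window with stride 2 (non-overlapping windows), which is a common but not the only choice.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Pool non-overlapping size x size windows using max or average."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            if mode == "max":
                row.append(max(window))
            else:
                row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


def flatten(feature_map):
    """Turn a 2D feature map into a 1D vector, row by row."""
    return [v for row in feature_map for v in row]


# Hypothetical 4x4 convolved feature.
feature_map = [
    [1, 3, 2, 1],
    [4, 6, 5, 0],
    [7, 2, 9, 8],
    [1, 0, 3, 4],
]

max_pooled = pool2d(feature_map, mode="max")   # [[6, 5], [7, 9]]
avg_pooled = pool2d(feature_map, mode="avg")   # [[3.5, 2.0], [2.5, 6.0]]
vector = flatten(max_pooled)                   # [6, 5, 7, 9]
```

Both pooling modes shrink the 4×4 map to 2×2; the flattened vector is what a fully connected layer would receive.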
Fully connected layer (dense network)
• Every neuron is connected to the neurons of the neighbouring layers.
• We use a fully connected network to classify the image into categories, after extracting features from the image using convolution and pooling.
• The number of neurons in the output layer is the same as the number of categories we have.
• If we are doing binary classification, use the sigmoid activation function and just 1 neuron in the output layer.
• If we have several categories (for example, 4), use the softmax activation function and one output neuron per category.
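The two output-layer choices above can be illustrated with a short sketch. The raw scores (logits) are hypothetical numbers standing in for what a fully connected layer might produce; subtracting the maximum inside `softmax` is a standard numerical-stability trick, not something the notes prescribe.

```python
import math


def sigmoid(z):
    """Single-neuron output for binary classification: P(class = 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    """One probability per category; the probabilities sum to 1."""
    m = max(logits)                       # subtract max to avoid overflow
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Binary case: one output neuron.
p_positive = sigmoid(0.0)                 # 0.5: the model is undecided

# Multi-class case: 4 categories, so 4 output neurons (hypothetical logits).
probs = softmax([2.0, 1.0, 0.1, -1.0])
predicted_class = probs.index(max(probs))  # index of the most likely category
```

The predicted category is simply the output neuron with the highest probability, here index 0.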
