

DM Total Notes

The document provides an overview of data warehousing and data mining systems, detailing the types of storage environments such as file systems, DBMS, and data warehouses. It discusses the processes involved in data mining, including data preprocessing, integration, and transformation, as well as the challenges faced in handling large datasets. Additionally, it outlines the functionalities of data mining tasks, differentiating between descriptive and predictive tasks, and covers various types of data attributes.

Uploaded by

videosp312
Introduction to Data Warehousing and Data Mining

A data mining system requires input of data from three types of storage environments:
(1) File system
(2) DBMS (Database Management System) (OLTP)
(3) Data warehouse (OLAP system)

(1) File system:
In a file system the data is organized as a set of files. Generally, in a mining system the input is given only through flat files such as .csv and .arff (attribute-relation file format). Data can also be reached through connectivity interfaces such as Open Database Connectivity (ODBC) and Object Linking and Embedding Database (OLE DB) connectivity.
The main drawbacks of a file system are less security, no validation of data, and a lack of management operations on data.

(2) DBMS:
In a DBMS the data is organized as a set of databases. Generally, databases are used to store large amounts of operational data.
The main advantages of a DBMS are high security, validations, and efficient management operations on data.

(3) Data warehouse:
In a data warehouse the data is organized from different sources of data, which means it can store a heterogeneous collection of data.
Generally, data warehouses are used for storing non-operational or historical data.

The input data requires four types of preprocessing activities. They are:
(1) Data cleaning - remove noisy data, fill in the missing values, and remove inconsistency in the data.
(2) Data transformation - the process of converting the data into a suitable form for mining applications.
(3) Data integration - the process of combining multiple data sources into a single data source.
(4) Data reduction - the large volume of data is converted into a small set of representations without losing any information.

UNIT: Introduction to Data Mining

(1) What is data mining? Origin and motivational challenges (Knowledge Discovery in Databases, KDD)
[Figure: the KDD process, ending in pattern evaluation and knowledge representation (visualization techniques)]

Data mining is a process to discover knowledge from large amounts of data held in databases, data warehouses, or files. Data mining is also called "Knowledge Discovery in Databases" (KDD).

The above diagram shows the working process of a data mining system, which includes both preprocessing and post-processing steps. They are:
(1) Data cleaning - remove noisy data, fill in the missing values, and remove inconsistency in the data.
(2) Data transformation - convert the data into a suitable form for the mining process.
(3) Data integration - combine the data from multiple sources into a single source.
(4) Data reduction - reduce a large volume of data to a small set of representations.
(5) Pattern evaluation - once the knowledge is extracted, it is evaluated to identify the interesting patterns.
(6) Knowledge representation - the knowledge is presented using different visualization techniques such as tables, cross tables, decision trees, graphs, pie charts, etc.

The following are the basic origins of a data mining system:
(1) Statistical approaches
(2) Artificial intelligence
(3) Machine learning
(4) Deep learning
(5) Pattern recognition

The following are the motivational challenges of a data mining system:
(1) Scalability - handling large amounts of data.
(2) High dimensionality - handling a large set of dimensions.
(3) Complex data - which includes multimedia data, web data, etc.
(4) Data distribution - collecting the data from different geographical locations.
(5) Non-traditional statistical analysis - which includes regression, the Bayesian approach, hypothesis test paradigms, etc.
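The preprocessing steps described above (integration, cleaning, transformation, reduction) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the records, the field names, and the specific techniques chosen for each step (mean imputation for cleaning, min-max scaling for transformation, attribute selection for reduction). The notes do not prescribe these techniques, and attribute selection in particular is a simplification, since it does discard information.

```python
from statistics import mean

# Hypothetical records from two sources; None marks a missing value.
source_a = [
    {"id": 1, "age": 25, "income": 30000},
    {"id": 2, "age": None, "income": 52000},
]
source_b = [
    {"id": 3, "age": 41, "income": 78000},
    {"id": 4, "age": 35, "income": None},
]

def integrate(*sources):
    """Data integration: combine multiple sources into a single list (rows are copied)."""
    return [dict(row) for source in sources for row in source]

def clean(rows, fields):
    """Data cleaning: fill each missing value with the mean of the known values."""
    for field in fields:
        known = [r[field] for r in rows if r[field] is not None]
        fill = mean(known)
        for r in rows:
            if r[field] is None:
                r[field] = fill
    return rows

def transform(rows, fields):
    """Data transformation: min-max scale each field into the range [0, 1]."""
    for field in fields:
        lo = min(r[field] for r in rows)
        hi = max(r[field] for r in rows)
        span = (hi - lo) or 1  # avoid division by zero for constant fields
        for r in rows:
            r[field] = (r[field] - lo) / span
    return rows

def reduce_columns(rows, keep):
    """Data reduction (simplified): keep only the attributes needed for mining."""
    return [{k: r[k] for k in keep} for r in rows]

rows = integrate(source_a, source_b)
rows = clean(rows, ["age", "income"])
rows = transform(rows, ["age", "income"])
rows = reduce_columns(rows, ["id", "age"])
```

Integration runs first here so that the fill value and the scaling range are computed over the combined data; a different order may suit other data sets.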
Data Mining Tasks (or) Functionalities:

There are two categories of data mining tasks:
(1) Descriptive data mining tasks
(2) Predictive data mining tasks

(1) Descriptive data mining tasks - focus on finding human-interpretable patterns describing the data.
(2) Predictive data mining tasks - focus on estimating unknown data or future data based on current data.

According to the above categories, the data mining functionalities are:
(1) Association analysis
(2) Classification
(3) Clustering
(4) Outlier analysis

(1) Association analysis:
Association analysis is mainly used for market basket analysis. This analysis is used to find the frequent patterns in the given data set. It is the process of discovering association rules that show the attribute-value conditions that occur frequently together.
Generally, association rules are described in the format A => B. The expanded format of the above rule is A1 ^ A2 ^ ... ^ An => B1 ^ B2 ^ ... ^ Bm.
ex: student => buys(X, laptop)

(2) Classification:
Classification is the process of deriving a model that describes and distinguishes the data with a set of classes, for the purpose of using the model to predict the class of objects whose class label is unknown.

(3) Clustering:
Clustering is the process of analyzing the data set without consulting class labels. It follows two principles: maximizing the intra-class similarity and minimizing the inter-class similarity between the objects. The objects are grouped based on their similarity or dissimilarity.

(4) Outlier analysis:
Outlier analysis is also called outlier mining or anomaly mining. Generally, outlier analysis is used for identifying the misbehaviour of data objects in a data set. Its applications include credit card fraud detection and intrusion detection.
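To make the association analysis described above more concrete, here is a minimal sketch that counts item pairs occurring together and scores one rule of the form A => B. The transactions are invented, and the `min_count` threshold and the `confidence` measure are assumptions brought in for illustration; the notes only state that frequent patterns and rules A => B are discovered, not how they are scored.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket transactions (invented for illustration).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
    {"bread", "milk", "butter"},
]

def frequent_pairs(transactions, min_count):
    """Count every item pair and keep those occurring at least min_count times."""
    counts = Counter()
    for basket in transactions:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}

def confidence(transactions, antecedent, consequent):
    """Fraction of baskets containing the antecedent that also contain the consequent."""
    with_a = [b for b in transactions if antecedent <= b]
    if not with_a:
        return 0.0
    return sum(1 for b in with_a if consequent <= b) / len(with_a)

pairs = frequent_pairs(transactions, min_count=3)
conf = confidence(transactions, {"bread"}, {"milk"})  # rule: bread => milk
```

With these invented baskets, the pair (butter, milk) appears only twice and is filtered out, while three of the four baskets containing bread also contain milk.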
Types of Data

(i) Types of attributes:

Attribute
    Qualitative
        Nominal
        Ordinal
        Binary
            Symmetric
            Asymmetric
    Quantitative
        Numeric
            Interval scale
            Ratio scale
        Discrete
        Continuous

(1) Qualitative:
(a) Nominal: Nominal attributes are also called categorical attributes, where the attribute values are names of categories without any meaningful order.
ex: profession, hair colour, department
(b) Ordinal: This is a subset of nominal attributes where the values are represented in different categories and the values have a meaningful order.
ex: drink size, credit rating
(c) Binary: where the attribute values have only two outcomes.
ex: gender, result, marital status, tossing a coin, etc.
There are two types of binary attributes. They are:
(i) Symmetric: where the outcomes of the two values have equal importance or priority.
ex: gender
(ii) Asymmetric: where the outcomes of the two values do not have equal importance or priority.
ex: result, marital status, medical diagnosis reports

(2) Quantitative:
(a) Numeric:
(i) Interval scale: where the attribute values have a categorical nature, a meaningful order, and equal spacing, but no true zero point is defined.
ex: temperature in Celsius or Fahrenheit, calendar dates, satisfaction levels, etc.
(ii) Ratio scale: This is a subset of interval scale attributes. These attribute values also have a categorical value, a meaningful order, and equal spacing, but there is a true zero point.
ex: temperature in Kelvin, weight, etc.
(b) Discrete: Discrete attribute values are defined in a finite or countable set, where the attribute values are defined in a concrete approach.
That means the attribute values are defined in a specific or fixed set of values.
ex: number of children, number of hours worked, number of languages spoken, etc.
(c) Continuous: This attribute is the opposite of a discrete attribute. The values are defined over an infinite range, but the values vary according to the application of data processing.
ex: height, weight, age, etc.

(ii) Types of data sets:
There are different types of data sets.
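Returning to the interval scale and ratio scale attributes defined earlier, the practical effect of a true zero point can be shown with a short worked example. The two temperature readings are invented for illustration; the Celsius-to-Kelvin offset of 273.15 is the standard conversion.

```python
def celsius_to_kelvin(celsius):
    """Convert an interval-scale reading (Celsius) to the ratio-scale Kelvin."""
    return celsius + 273.15

# Hypothetical readings, chosen only to illustrate the point.
cool_c, warm_c = 10.0, 20.0

# Differences are meaningful on both scales and agree with each other.
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios are meaningful only on the ratio scale, because 0 K is a true zero.
naive_ratio = warm_c / cool_c  # arithmetic gives 2.0, but 20 C is not "twice as hot" as 10 C
true_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)
```

The difference of 10 degrees is the same on both scales, while the ratio computed in Kelvin is only slightly above 1, which is why statements such as "twice as much" are reserved for ratio scale attributes.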
