Decision Tree Learning (Supervised Learning)
Decision tree learning is a method for approximating discrete-valued target functions, in which the learned function is represented by a decision tree.
Important Terms in a Decision Tree
• Root Node: It represents the entire population, which gets further divided into two or more sets.
• Splitting: It is the process of dividing a node into two or more sub-nodes to grow the tree.
• Decision Node: When a sub-node splits into further sub-nodes, it is called a decision node.
• Leaf / Terminal Node: Nodes which do not split are called leaf or terminal nodes.
• Pruning: Removing sub-nodes of a decision node is called pruning; it is the opposite of splitting.
• Branch / Sub-Tree: A subsection of the entire decision tree is called a branch or sub-tree.
• Parent and Child Nodes: A node that divides into sub-nodes is called a parent node. The sub-nodes of a parent node are called child nodes.
• Entropy: Entropy measures the randomness (uncertainty) in the outcome of a random variable or an event. It also describes the homogeneity of the data. For example, when a coin is tossed there are only two possible outcomes, so its entropy is lower compared to a dice, which has six possible outcomes.
Entropy(S) = - Σ P(xi) · log2 P(xi)
• Information Gain: Information gain is the reduction (decrease) in entropy obtained by splitting the set S on an attribute A:
IG(S, A) = Entropy(S) - Σ (|Sv| / |S|) · Entropy(Sv), where v ranges over the values of A and Sv is the subset of instances of S for which A has value v.
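To make the two formulas concrete, here is a minimal Python sketch; the function names and structure are illustrative, not from the notes:

import math
from collections import Counter

def entropy(labels):
    """Entropy(S) = - sum over classes of P(xi) * log2 P(xi)."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(examples, labels, attribute):
    """IG(S, A) = Entropy(S) - sum over values v of (|Sv| / |S|) * Entropy(Sv)."""
    total = len(labels)
    # Partition the class labels by the value each example has for the attribute.
    partitions = {}
    for example, label in zip(examples, labels):
        partitions.setdefault(example[attribute], []).append(label)
    weighted = sum((len(part) / total) * entropy(part) for part in partitions.values())
    return entropy(labels) - weighted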
ID3 Algorithm and C4.5 Algorithm are the common decision tree learning algorithms.
Steps of the ID3 Algorithm:
Step 1: Find the information gain for each attribute of the data set.
Step 2: Choose the attribute for which the information gain is the highest.
Step 3: Declare this attribute as the best split attribute.
Step 4: The best split attribute is placed at the root node.
Step 5: The root node is branched into subtrees, with each subtree corresponding to one outcome of the test on the root node attribute.
Step 6: Recursively apply the same operations to each subset of the training data, with the remaining attributes, until a leaf node is derived.
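A rough Python sketch of these six steps, reusing the entropy and information_gain helpers above (a teaching sketch of ID3, not a full implementation; attributes are referred to by their column index):

from collections import Counter

def id3(examples, labels, attributes):
    """Build a decision tree (nested dict) following the steps above."""
    if len(set(labels)) == 1:            # pure node -> leaf with that class
        return labels[0]
    if not attributes:                   # no attributes left -> majority class leaf
        return Counter(labels).most_common(1)[0][0]
    # Steps 1-3: pick the attribute with the highest information gain.
    best = max(attributes, key=lambda a: information_gain(examples, labels, a))
    tree = {best: {}}
    # Steps 4-6: branch on each value of the best attribute and recurse.
    for value in set(example[best] for example in examples):
        subset = [(e, l) for e, l in zip(examples, labels) if e[best] == value]
        sub_examples = [e for e, _ in subset]
        sub_labels = [l for _, l in subset]
        remaining = [a for a in attributes if a != best]
        tree[best][value] = id3(sub_examples, sub_labels, remaining)
    return tree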
Question: Make a decision tree from the given Play Tennis data (table below).
Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No
For a set S containing positive (+) and negative (-) examples:
Entropy(S) = - p(+) · log2 p(+) - p(-) · log2 p(-)
Root Node Selection
Attributes = Outlook, Temperature, Humidity, Wind
Target = Play Tennis
The attribute which gives the highest information gain is selected as the root node.
S = [9+, 5-]          Entropy(S) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.94
S_Sunny = [2+, 3-]    Entropy(S_Sunny) = -(2/5) log2(2/5) - (3/5) log2(3/5) = 0.971
S_Overcast = [4+, 0-] Entropy(S_Overcast) = 0
S_Rain = [3+, 2-]     Entropy(S_Rain) = 0.971

Gain(S, Outlook) = Entropy(S) - Σ (|Sv| / |S|) · Entropy(Sv), v ∈ {Sunny, Overcast, Rain}
                 = Entropy(S) - (5/14) · Entropy(S_Sunny) - (4/14) · Entropy(S_Overcast) - (5/14) · Entropy(S_Rain)
                 = 0.94 - 0.347 - 0 - 0.347
                 = 0.246

This is the information gain (IG) of the Outlook attribute, i.e. 0.246.
Similarly, we can calculate the information gain of the remaining three attributes, i.e. Temperature, Humidity and Wind.
Values(Temperature) = Hot, Mild, Cool
S = [9+, 5-]        Entropy(S) = 0.94
S_Hot = [2+, 2-]    Entropy(S_Hot) = 1.0
S_Mild = [4+, 2-]   Entropy(S_Mild) = 0.918
S_Cool = [3+, 1-]   Entropy(S_Cool) = 0.811

Gain(S, Temperature) = Entropy(S) - (4/14) · Entropy(S_Hot) - (6/14) · Entropy(S_Mild) - (4/14) · Entropy(S_Cool)
                     = 0.94 - 0.286 - 0.393 - 0.232 = 0.029
Values(Wind) = Strong, Weak
S = [9+, 5-]         Entropy(S) = 0.94
S_Strong = [3+, 3-]  Entropy(S_Strong) = 1.0
S_Weak = [6+, 2-]    Entropy(S_Weak) = 0.811

Gain(S, Wind) = Entropy(S) - (6/14) · Entropy(S_Strong) - (8/14) · Entropy(S_Weak)
              = 0.94 - 0.429 - 0.464 = 0.048

Similarly, Gain(S, Humidity) = 0.151.
Since Gain(S, Outlook) = 0.246 is the highest of the four gains, Outlook is selected as the root node.
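These hand calculations can be cross-checked with the helpers sketched earlier; the list below is simply the Play Tennis table typed in, with attributes in the order Outlook, Temperature, Humidity, Wind:

data = [
    (("Sunny", "Hot", "High", "Weak"), "No"),     (("Sunny", "Hot", "High", "Strong"), "No"),
    (("Overcast", "Hot", "High", "Weak"), "Yes"), (("Rain", "Mild", "High", "Weak"), "Yes"),
    (("Rain", "Cool", "Normal", "Weak"), "Yes"),  (("Rain", "Cool", "Normal", "Strong"), "No"),
    (("Overcast", "Cool", "Normal", "Strong"), "Yes"), (("Sunny", "Mild", "High", "Weak"), "No"),
    (("Sunny", "Cool", "Normal", "Weak"), "Yes"), (("Rain", "Mild", "Normal", "Weak"), "Yes"),
    (("Sunny", "Mild", "Normal", "Strong"), "Yes"), (("Overcast", "Mild", "High", "Strong"), "Yes"),
    (("Overcast", "Hot", "Normal", "Weak"), "Yes"), (("Rain", "Mild", "High", "Strong"), "No"),
]
examples = [row for row, _ in data]
labels = [label for _, label in data]
for i, name in enumerate(["Outlook", "Temperature", "Humidity", "Wind"]):
    print(name, round(information_gain(examples, labels, i), 3))
# Prints roughly: Outlook 0.247, Temperature 0.029, Humidity 0.152, Wind 0.048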
Tree after the root split:
                      Outlook [9+, 5-]
        Sunny              Overcast               Rain
(D1, D2, D8, D9, D11)  (D3, D7, D12, D13)   (D4, D5, D6, D10, D14)
      [2+, 3-]         [4+, 0-] -> Yes            [3+, 2-]
The Sunny branch [2+, 3-] must be split further. Testing each remaining attribute:
Values(Temperature) = Hot, Mild, Cool
S_Sunny = [2+, 3-]        Entropy(S_Sunny) = 0.971
S_Sunny,Hot = [0+, 2-]    Entropy(S_Sunny,Hot) = 0
S_Sunny,Mild = [1+, 1-]   Entropy(S_Sunny,Mild) = 1.0
S_Sunny,Cool = [1+, 0-]   Entropy(S_Sunny,Cool) = 0

Gain(S_Sunny, Temperature) = Entropy(S_Sunny) - (2/5) · Entropy(S_Sunny,Hot) - (2/5) · Entropy(S_Sunny,Mild) - (1/5) · Entropy(S_Sunny,Cool)
                           = 0.971 - 0 - 0.4 - 0 = 0.571
Values(Humidity) = High, Normal
S_Sunny = [2+, 3-]          Entropy(S_Sunny) = 0.971
S_Sunny,High = [0+, 3-]     Entropy(S_Sunny,High) = 0
S_Sunny,Normal = [2+, 0-]   Entropy(S_Sunny,Normal) = 0

Gain(S_Sunny, Humidity) = Entropy(S_Sunny) - (3/5) · Entropy(S_Sunny,High) - (2/5) · Entropy(S_Sunny,Normal)
                        = 0.971 - 0 - 0 = 0.971 (highest)
Values(Wind) = Strong, Weak
S_Sunny = [2+, 3-]          Entropy(S_Sunny) = 0.971
S_Sunny,Strong = [1+, 1-]   Entropy(S_Sunny,Strong) = 1.0
S_Sunny,Weak = [1+, 2-]     Entropy(S_Sunny,Weak) = 0.918

Gain(S_Sunny, Wind) = Entropy(S_Sunny) - (2/5) · Entropy(S_Sunny,Strong) - (3/5) · Entropy(S_Sunny,Weak)
                    = 0.971 - 0.4 - 0.551 = 0.020

Since Gain(S_Sunny, Humidity) = 0.971 is the highest, Humidity is placed at the Sunny branch.
Humidity (under Sunny)
  High   -> (D1, D2, D8)  -> No
  Normal -> (D9, D11)     -> Yes
Similarly, for the Rain branch, Wind gives the highest gain (Gain(S_Rain, Wind) = 0.971), so Wind is placed at the Rain branch.

Final decision tree:
Outlook
  Sunny    -> Humidity
                 High   -> No
                 Normal -> Yes
  Overcast -> Yes
  Rain     -> Wind
                 Strong -> No
                 Weak   -> Yes
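The finished tree can also be written down directly as a nested dictionary and used to classify a new day; a small sketch (the classify helper and the dictionary layout are my own, but the branches mirror the tree above):

play_tennis_tree = {
    "Outlook": {
        "Sunny":    {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain":     {"Wind": {"Strong": "No", "Weak": "Yes"}},
    }
}

def classify(tree, example):
    """Follow the branches of the nested dict until a 'Yes'/'No' leaf is reached."""
    while isinstance(tree, dict):
        attribute = next(iter(tree))          # attribute tested at this node
        tree = tree[attribute][example[attribute]]
    return tree

print(classify(play_tennis_tree, {"Outlook": "Rain", "Wind": "Strong"}))  # -> No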
Issues in Decision Tree Learning
• Avoiding overfitting the data (reduced-error pruning, rule post-pruning)
• Determining how deeply to grow the decision tree
• Incorporating continuous-valued attributes (see the sketch after this list)
• Alternative measures for selecting attributes
• Handling attributes with differing costs
• Handling training examples with missing attribute values
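For the continuous-valued-attribute issue, the usual trick (as in C4.5) is to sort the attribute values and evaluate candidate thresholds lying between adjacent examples with different labels; a minimal sketch of that idea, reusing the entropy helper from earlier:

def best_threshold(values, labels):
    """Return the binary split threshold on a continuous attribute with the highest gain."""
    pairs = sorted(zip(values, labels))
    parent, n = entropy(labels), len(labels)
    best_gain, best_t = 0.0, None
    for (v1, l1), (v2, l2) in zip(pairs, pairs[1:]):
        if l1 == l2 or v1 == v2:
            continue                          # candidate cuts lie between differing labels
        t = (v1 + v2) / 2
        left = [l for v, l in pairs if v <= t]
        right = [l for v, l in pairs if v > t]
        gain = parent - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if gain > best_gain:
            best_gain, best_t = gain, t
    return best_t, best_gain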
Inductive Bias in Decision Tree Learning
In decision tree learning, if only one tree fits the training data there is no ambiguity; but if more than one tree can be used to classify the new examples, there is the problem of which tree to prefer. The preference that resolves this problem is called the inductive bias of the ID3 algorithm:
(a) Shorter trees are preferred over longer trees.
(b) A closer approximation to the inductive bias of ID3: shorter trees are preferred over longer trees, and trees that place high-information-gain attributes close to the root are preferred over those that do not.
Instance-Based Learning
Instance-based learning relies on the examples in the training data set. Similarity-based classifiers, or instance-based classifiers, use similarity measures to match a new instance with the stored training instances (its nearest neighbours) and classify the test instance accordingly, unlike other learning mechanisms such as decision trees.
The advantage of this kind of learning is that processing occurs only when a sample to classify arrives, so it is useful when the whole data set is not available in the beginning but is collected in an incremental manner.
The disadvantage of this learning is that it requires a large amount of memory, since all the training data must be stored until a new instance arrives to be classified.
Approaches to Instance-Based Learning:
• k-Nearest Neighbour (k-NN) learning
• Weighted k-Nearest Neighbour
• Radial Basis Function networks
• Case-Based Reasoning

The classification operation is performed only after comparing the current instance with the previously stored instances, so this is also called lazy learning or memory-based learning: no action is taken until a new instance arrives.
K-Nearest Neighbor (KNN) Algorithm for Machine
Learning
© K-Nearest Neighbour is one of the simplest Machine Learning algorithms based
on the Supervised Learning technique.
© K-NN algorithm assumes the similarity between the new case/data and available
cases and puts the new case into the category that is most similar to the available
categories.
K-NN algorithm stores all the available data and classifies a new data point
based on the similarity. This means when new data appears then it can be easily
classified into a well-suited category by using the K-NN algorithm.
© K-NN algorithm can be used for Regression as well as for Classification but
mostly it is used for Classification problems.
o K-NN is a non-parametric algorithm, which means it does not make any
assumption about the underlying data.
© It is also called a lazy learner algorithm because it does not learn from the
training set immediately; instead, it stores the dataset and, at the time of
classification, it performs an action on the dataset.
© KNN algorithm at the training phase just stores the dataset and when it gets new
data, then it classifies that data into a category that is most similar to the new
data.
Example: Suppose we have an image of a creature that looks similar to both a cat and
a dog, but we want to know whether it is a cat or a dog. So for this identification, we
can use the KNN algorithm, as it works on a similarity measure. Our KNN model
will find the similar features of the new data set to the cats and dogs images and
based on the most similar features it will put it in either the cat or the dog category.
[Figure: KNN classifier — input value -> KNN classifier -> predicted output]
Why do we need a K-NN Algorithm?
Suppose there are two categories, i.e., Category A and Category B, and we have a new
data point x1; in which of these categories will this data point lie? To solve this type of
problem, we need a K-NN algorithm. With the help of K-NN, we can easily identify the
category or class of a particular data point. Consider the below diagram:
[Diagram: a new data point plotted among Category A and Category B before K-NN, and the same point assigned to a category after applying K-NN]
How does K-NN work?
The K-NN working can be explained on the basis of the below algorithm:
o Step-1: Select the number K of the neighbors.
o Step-2: Calculate the Euclidean distance of K number of neighbors.
o Step-3: Take the K nearest neighbors as per the calculated Euclidean distance.
o Step-4: Among these K neighbors, count the number of the data points in each
category.
o Step-5: Assign the new data point to that category for which the number of
neighbors is maximum.
o Step-6: Our model is ready.
Suppose we have a new data point and we need to put it in the required category. Consider
the below image:
[Image: a new data point plotted among the points of Category A and Category B]
o Firstly, we will choose the number of neighbors, so we will choose k = 5.
o Next, we will calculate the Euclidean distance between the data points. The
Euclidean distance is the distance between two points, which we have already
studied in geometry. It can be calculated as:
Euclidean distance between A(x1, y1) and B(x2, y2) = √((x2 - x1)² + (y2 - y1)²)
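In code, that distance is a one-liner (a small helper with illustrative names):

import math

def euclidean_distance(a, b):
    """Distance between two points a = (x1, y1, ...) and b = (x2, y2, ...)."""
    return math.sqrt(sum((q - p) ** 2 for p, q in zip(a, b)))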
o By calculating the Euclidean distance we get the nearest neighbors: three
nearest neighbors in Category A and two nearest neighbors in Category B.
Consider the below image:
[Image: the five nearest neighbors of the new data point — three from Category A and two from Category B]
o As we can see, the 3 nearest neighbors are from Category A, hence this new data
point must belong to Category A.
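Putting the six steps together, a bare-bones K-NN classifier could look like this; it reuses the euclidean_distance helper above and assumes each training example is a (point, label) pair:

from collections import Counter

def knn_classify(training_data, query, k=5):
    """Assign the majority label among the k training points nearest to the query."""
    # Steps 2-3: sort training points by distance to the query and keep the k nearest.
    nearest = sorted(training_data,
                     key=lambda item: euclidean_distance(item[0], query))[:k]
    # Steps 4-5: count the labels of the neighbours and return the most common one.
    return Counter(label for _, label in nearest).most_common(1)[0][0]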
How to select the value of K in the K-NN Algorithm?
Below are some points to remember while selecting the value of K in the K-NN algorithm:
© There is no particular way to determine the best value for "K", so we need to try some values to
find the best out of them. The most preferred value for K is 5.
o A very low value for K, such as K=1 or K=2, can be noisy and lead to the effects of outliers in the
model.
o Large values for K are good, but a very large K may include points from other categories and
makes the computation slower.
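One common way to pick K in practice is to try a range of values with cross-validation; a sketch using scikit-learn, assuming X and y are an already-prepared feature matrix and label vector (not defined in these notes):

from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

scores = {}
for k in range(1, 21):                         # try K = 1 .. 20
    model = KNeighborsClassifier(n_neighbors=k)
    scores[k] = cross_val_score(model, X, y, cv=5).mean()   # 5-fold CV accuracy
best_k = max(scores, key=scores.get)           # K with the best average accuracy
print(best_k, scores[best_k])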
Advantages of KNN Algorithm:
o It is simple to implement.
o It is robust to the noisy training data.
o It can be more effective if the training data is large.
Disadvantages of KNN Algorithm:
o It always needs to determine the value of K, which may be complex sometimes.
o The computation cost is high because of calculating the distance between the data points for all
the training samples.
Question: Apply the KNN algorithm to the following dataset and predict the class (True / False) of the test example (A1 = 3, A2 = 7). Assume K = 3.
[Training table: attributes A1, A2 and a True/False class for each row]
Solution: For each row, compute the Euclidean distance of the test example from that row, e.g. √((A1 - 3)² + (A2 - 7)²). Arrange the rows in increasing order of distance, take the K = 3 nearest rows, and assign the test example the majority class among them.
2 [c= vio 4Dusstion of tui! Mgorttim,
@ Rot - eo arate « dratiregg,
knvss [Taapecanne = Monet = here aes
Row 3 = Al (20-60) *# (35-40) > 2 afer t(asye= Jiswt 3028 5 jes
Rows 2 [Ganley + (36-25)9™ * ol (rea [tet (loi > fiver = Meat
ee eee i shee Sa
Rew sz J (qo-To)>+ (35 To)” [oops (359% = [satinas = 6)
Row 6 = TC (20-to)>+ (35-10)” A Cert (250s, [Tht ase ATI
Rewo = [(ae-25y> + (26- Be)" 7 pf (574 (457% = fastans
= [m0 = 45-28
n
a
_ Jucuaaivg order
Row 4 Red v Tut majontty than With ia
Row | Red vw tw 5
neanest hela lator sa
Row 2, Biwe ~ Lag
how 7 Blue ~ nan cutry ip Re
Rew b Red ~
Row % Blue
Row 3 BlueWweets)
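The final vote in code — a tiny sketch that simply reuses the ranking obtained above:

from collections import Counter

# Rows in increasing order of distance from the new entry, with their classes.
ranked = [("Row 4", "Red"), ("Row 1", "Red"), ("Row 2", "Blue"),
          ("Row 7", "Blue"), ("Row 6", "Red"), ("Row 5", "Blue"), ("Row 3", "Blue")]
k = 5
votes = Counter(label for _, label in ranked[:k])
print(votes.most_common(1)[0][0])   # -> Red (3 Red vs 2 Blue among the 5 nearest)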
Fig. 12.7. Radial Basis Function (RBF) Network
12.5 CASE-BASED LEARNING OR CASE-BASED REASONING (CBR)
Case-based reasoning (CBR) is used for classification and regression. It is the process of solving new problems based on the solutions of similar past problems. CBR is an advanced instance-based learning method which is used to solve more complex problems. It does not use the Euclidean distance metric. When a new case arrives to be classified, first an identical case is checked in memory. If any similar case is found in the stored memory, then its solution is also retrieved.
Fig. 12.8. Case-based Reasoning (CBR) life cycle
12.5.1 Steps in CBR
e Retrieve: Gather data from memory. Check whether any previous solution is similar to the
current problem.
e Reuse: Suggest a solution based on the experience. Adapt it to meet the
demands of the new situation.
e Revise: Evaluate the use of the solution in the new context.
e Retain: Store this new problem-solving method in the memory system.
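A toy sketch of the Retrieve-Reuse-Revise-Retain cycle in Python; all names here are illustrative, and similarity, adapt and evaluate stand in for whatever domain-specific functions a real CBR system would use:

class CaseBase:
    """Minimal case memory implementing the four CBR steps."""

    def __init__(self):
        self.cases = []                                   # stored (problem, solution) pairs

    def retrieve(self, problem, similarity):
        """Retrieve: find the stored case most similar to the new problem."""
        return max(self.cases, key=lambda case: similarity(case[0], problem), default=None)

    def solve(self, problem, similarity, adapt, evaluate):
        case = self.retrieve(problem, similarity)
        if case is None:
            return None
        solution = adapt(case[1], problem)                # Reuse: adapt the old solution
        if evaluate(solution, problem):                   # Revise: check it in the new context
            self.cases.append((problem, solution))        # Retain: store the solved case
            return solution
        return None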
12.5.2 Applications of CBR
e Customer service helpdesk for diagnosis of problems.
e Engineering and law for technical design and legal rules.
e Medical science for patient case histories and treatment.
12.5.3 CBR Example (Smart Software Agent)
A common example of CBR is a helpdesk system. Here, the user calls for a computer-related
service problem. CBR is used by the software assistant to diagnose the problem.
Then the software assistant recommends some possible solutions to solve the current
problem, for example a printer problem, an internet connection problem, etc.
Case Based Reasoning (CBR)
As we know, Nearest Neighbour classifiers store training tuples as points
in Euclidean space. But Case-Based Reasoning classifiers (CBR) use a
database of problem solutions to solve new problems. It stores the tuples or
cases for problem-solving as complex symbolic descriptions. How CBR
works? When a new case arises to classify, a Case-based Reasoner (CBR)
will first check if an identical training case exists. If one is found, then the
accompanying solution to that case is returned. If no identical case is found,
then the CBR will search for training cases having components that are
similar to those of the new case. Conceptually, these training cases may be
considered as neighbours of the new case. If cases are represented as
graphs, this involves searching for subgraphs that are similar to subgraphs
within the new case. The CBR tries to combine the solutions of the
neighbouring training cases to propose a solution for the new case. If
incompatibilities arise with the individual solutions, then backtracking to search
for other solutions may be necessary. The CBR may employ background
knowledge and problem-solving strategies to propose a feasible
solution.
Applications of CBR include:
1. Problem resolution for customer service help desks, where cases
describe product-related diagnostic problems.
2. It is also applied to areas such as engineering and law, where cases are
either technical designs or legal rulings, respectively.
3. Medical education, where patient case histories and treatments are used
to help diagnose and treat new patients.
Challenges with CBR
• Finding a good similarity metric (e.g., for matching subgraphs) and suitable
methods for combining solutions.
• Selecting salient features for indexing training cases and the development
of efficient indexing techniques.
CBR becomes more intelligent as the number of stored cases grows, but a trade-off
between accuracy and efficiency evolves as the number of stored cases becomes
very large. After a certain point, the system's efficiency will suffer as the
time required to search for and process relevant cases increases.