Decision Tree Learning (Supervised Learning)
Decision tree learning is a method for approximating discrete-valued target functions, in which the learned function is represented by a decision tree.
Important Terms in a Decision Tree
• Root Node: It represents the entire population, which gets further divided into two or more sets.
• Splitting: It is the process of dividing a node into two or more sub-nodes to grow the tree.
• Decision Node: When a sub-node splits into further sub-nodes, it is called a decision node.
• Leaf / Terminal Node: Nodes which do not split are called leaf or terminal nodes.
• Pruning: Removing sub-nodes of a decision node is called pruning; it is the opposite of splitting.
• Branch / Sub-Tree: A subsection of the entire decision tree is called a branch or sub-tree.
• Parent and Child Nodes: A node that divides into sub-nodes is called a parent node. The sub-nodes of a parent node are called child nodes.
• Entropy: Entropy measures the randomness (uncertainty) in the outcome of a random variable or an event. It also describes the homogeneity of the data. For example, when a coin is tossed there are only two possible outcomes, so its entropy is lower compared to a dice, which has six possible outcomes.
Entropy(S) = - Σ P(xi) · log2 P(xi)
• Information Gain: Information gain is the reduction (decrease) in entropy obtained by splitting the set S on an attribute A:
IG(S, A) = Entropy(S) - Σ (|Sv| / |S|) · Entropy(Sv), where v ranges over the values of A and Sv is the subset of instances of S for which A has value v.
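To make the two formulas concrete, here is a minimal Python sketch; the function names and structure are illustrative, not from the notes:

import math
from collections import Counter

def entropy(labels):
    """Entropy(S) = - sum over classes of P(xi) * log2 P(xi)."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(examples, labels, attribute):
    """IG(S, A) = Entropy(S) - sum over values v of (|Sv| / |S|) * Entropy(Sv)."""
    total = len(labels)
    # Partition the class labels by the value each example has for the attribute.
    partitions = {}
    for example, label in zip(examples, labels):
        partitions.setdefault(example[attribute], []).append(label)
    weighted = sum((len(part) / total) * entropy(part) for part in partitions.values())
    return entropy(labels) - weighted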
ID3 Algorithm and C4.5 Algorithm are the common decision tree learning algorithms.
Steps of the ID3 Algorithm:
Step 1: Find the information gain for each attribute of the data set.
Step 2: Choose the attribute for which the information gain is the highest.
Step 3: Declare this attribute as the best split attribute.
Step 4: The best split attribute is placed at the root node.
Step 5: The root node is branched into subtrees, with each subtree corresponding to one outcome of the test on the root node attribute.
Step 6: Recursively apply the same operations to each subset of the training data, with the remaining attributes, until a leaf node is derived.
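A rough Python sketch of these six steps, reusing the entropy and information_gain helpers above (a teaching sketch of ID3, not a full implementation; attributes are referred to by their column index):

from collections import Counter

def id3(examples, labels, attributes):
    """Build a decision tree (nested dict) following the steps above."""
    if len(set(labels)) == 1:            # pure node -> leaf with that class
        return labels[0]
    if not attributes:                   # no attributes left -> majority class leaf
        return Counter(labels).most_common(1)[0][0]
    # Steps 1-3: pick the attribute with the highest information gain.
    best = max(attributes, key=lambda a: information_gain(examples, labels, a))
    tree = {best: {}}
    # Steps 4-6: branch on each value of the best attribute and recurse.
    for value in set(example[best] for example in examples):
        subset = [(e, l) for e, l in zip(examples, labels) if e[best] == value]
        sub_examples = [e for e, _ in subset]
        sub_labels = [l for _, l in subset]
        remaining = [a for a in attributes if a != best]
        tree[best][value] = id3(sub_examples, sub_labels, remaining)
    return tree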
Question: Make a decision tree from the given Play Tennis data (table below).
Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No
For a set S containing positive (+) and negative (-) examples:
Entropy(S) = - p(+) · log2 p(+) - p(-) · log2 p(-)
Root Node Selection
Attributes = Outlook, Temperature, Humidity, Wind
Target = Play Tennis
The attribute which gives the highest information gain is selected as the root node.
S = [9+, 5-]          Entropy(S) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.94
S_Sunny = [2+, 3-]    Entropy(S_Sunny) = -(2/5) log2(2/5) - (3/5) log2(3/5) = 0.971
S_Overcast = [4+, 0-] Entropy(S_Overcast) = 0
S_Rain = [3+, 2-]     Entropy(S_Rain) = 0.971

Gain(S, Outlook) = Entropy(S) - Σ (|Sv| / |S|) · Entropy(Sv), v ∈ {Sunny, Overcast, Rain}
                 = Entropy(S) - (5/14) · Entropy(S_Sunny) - (4/14) · Entropy(S_Overcast) - (5/14) · Entropy(S_Rain)
                 = 0.94 - 0.347 - 0 - 0.347
                 = 0.246

This is the information gain (IG) of the Outlook attribute, i.e. 0.246.
Similarly, we can calculate the information gain of the remaining three attributes, i.e. Temperature, Humidity and Wind.
Values(Temperature) = Hot, Mild, Cool
S = [9+, 5-]        Entropy(S) = 0.94
S_Hot = [2+, 2-]    Entropy(S_Hot) = 1.0
S_Mild = [4+, 2-]   Entropy(S_Mild) = 0.918
S_Cool = [3+, 1-]   Entropy(S_Cool) = 0.811

Gain(S, Temperature) = Entropy(S) - (4/14) · Entropy(S_Hot) - (6/14) · Entropy(S_Mild) - (4/14) · Entropy(S_Cool)
                     = 0.94 - 0.286 - 0.393 - 0.232 = 0.029
Values(Wind) = Strong, Weak
S = [9+, 5-]         Entropy(S) = 0.94
S_Strong = [3+, 3-]  Entropy(S_Strong) = 1.0
S_Weak = [6+, 2-]    Entropy(S_Weak) = 0.811

Gain(S, Wind) = Entropy(S) - (6/14) · Entropy(S_Strong) - (8/14) · Entropy(S_Weak)
              = 0.94 - 0.429 - 0.464 = 0.048

Similarly, Gain(S, Humidity) = 0.151.
Since Gain(S, Outlook) = 0.246 is the highest of the four gains, Outlook is selected as the root node.
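These hand calculations can be cross-checked with the helpers sketched earlier; the list below is simply the Play Tennis table typed in, with attributes in the order Outlook, Temperature, Humidity, Wind:

data = [
    (("Sunny", "Hot", "High", "Weak"), "No"),     (("Sunny", "Hot", "High", "Strong"), "No"),
    (("Overcast", "Hot", "High", "Weak"), "Yes"), (("Rain", "Mild", "High", "Weak"), "Yes"),
    (("Rain", "Cool", "Normal", "Weak"), "Yes"),  (("Rain", "Cool", "Normal", "Strong"), "No"),
    (("Overcast", "Cool", "Normal", "Strong"), "Yes"), (("Sunny", "Mild", "High", "Weak"), "No"),
    (("Sunny", "Cool", "Normal", "Weak"), "Yes"), (("Rain", "Mild", "Normal", "Weak"), "Yes"),
    (("Sunny", "Mild", "Normal", "Strong"), "Yes"), (("Overcast", "Mild", "High", "Strong"), "Yes"),
    (("Overcast", "Hot", "Normal", "Weak"), "Yes"), (("Rain", "Mild", "High", "Strong"), "No"),
]
examples = [row for row, _ in data]
labels = [label for _, label in data]
for i, name in enumerate(["Outlook", "Temperature", "Humidity", "Wind"]):
    print(name, round(information_gain(examples, labels, i), 3))
# Prints roughly: Outlook 0.247, Temperature 0.029, Humidity 0.152, Wind 0.048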
Tree after the root split:
                      Outlook [9+, 5-]
        Sunny              Overcast               Rain
(D1, D2, D8, D9, D11)  (D3, D7, D12, D13)   (D4, D5, D6, D10, D14)
      [2+, 3-]         [4+, 0-] -> Yes            [3+, 2-]
The Sunny branch [2+, 3-] must be split further. Testing each remaining attribute:
Values(Temperature) = Hot, Mild, Cool
S_Sunny = [2+, 3-]        Entropy(S_Sunny) = 0.971
S_Sunny,Hot = [0+, 2-]    Entropy(S_Sunny,Hot) = 0
S_Sunny,Mild = [1+, 1-]   Entropy(S_Sunny,Mild) = 1.0
S_Sunny,Cool = [1+, 0-]   Entropy(S_Sunny,Cool) = 0

Gain(S_Sunny, Temperature) = Entropy(S_Sunny) - (2/5) · Entropy(S_Sunny,Hot) - (2/5) · Entropy(S_Sunny,Mild) - (1/5) · Entropy(S_Sunny,Cool)
                           = 0.971 - 0 - 0.4 - 0 = 0.571
Values(Humidity) = High, Normal
S_Sunny = [2+, 3-]          Entropy(S_Sunny) = 0.971
S_Sunny,High = [0+, 3-]     Entropy(S_Sunny,High) = 0
S_Sunny,Normal = [2+, 0-]   Entropy(S_Sunny,Normal) = 0

Gain(S_Sunny, Humidity) = Entropy(S_Sunny) - (3/5) · Entropy(S_Sunny,High) - (2/5) · Entropy(S_Sunny,Normal)
                        = 0.971 - 0 - 0 = 0.971 (highest)
Values(Wind) = Strong, Weak
S_Sunny = [2+, 3-]          Entropy(S_Sunny) = 0.971
S_Sunny,Strong = [1+, 1-]   Entropy(S_Sunny,Strong) = 1.0
S_Sunny,Weak = [1+, 2-]     Entropy(S_Sunny,Weak) = 0.918

Gain(S_Sunny, Wind) = Entropy(S_Sunny) - (2/5) · Entropy(S_Sunny,Strong) - (3/5) · Entropy(S_Sunny,Weak)
                    = 0.971 - 0.4 - 0.551 = 0.020

Since Gain(S_Sunny, Humidity) = 0.971 is the highest, Humidity is placed at the Sunny branch.
Humidity (under Sunny)
  High   -> (D1, D2, D8)  -> No
  Normal -> (D9, D11)     -> Yes
Similarly, for the Rain branch, Wind gives the highest gain (Gain(S_Rain, Wind) = 0.971), so Wind is placed at the Rain branch.

Final decision tree:
Outlook
  Sunny    -> Humidity
                 High   -> No
                 Normal -> Yes
  Overcast -> Yes
  Rain     -> Wind
                 Strong -> No
                 Weak   -> Yes
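The finished tree can also be written down directly as a nested dictionary and used to classify a new day; a small sketch (the classify helper and the dictionary layout are my own, but the branches mirror the tree above):

play_tennis_tree = {
    "Outlook": {
        "Sunny":    {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain":     {"Wind": {"Strong": "No", "Weak": "Yes"}},
    }
}

def classify(tree, example):
    """Follow the branches of the nested dict until a 'Yes'/'No' leaf is reached."""
    while isinstance(tree, dict):
        attribute = next(iter(tree))          # attribute tested at this node
        tree = tree[attribute][example[attribute]]
    return tree

print(classify(play_tennis_tree, {"Outlook": "Rain", "Wind": "Strong"}))  # -> No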
Issues in Decision Tree Learning
• Avoiding overfitting the data (reduced-error pruning, rule post-pruning)
• Determining how deeply to grow the decision tree
• Incorporating continuous-valued attributes (see the sketch after this list)
• Alternative measures for selecting attributes
• Handling attributes with differing costs
• Handling training examples with missing attribute values
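For the continuous-valued-attribute issue, the usual trick (as in C4.5) is to sort the attribute values and evaluate candidate thresholds lying between adjacent examples with different labels; a minimal sketch of that idea, reusing the entropy helper from earlier:

def best_threshold(values, labels):
    """Return the binary split threshold on a continuous attribute with the highest gain."""
    pairs = sorted(zip(values, labels))
    parent, n = entropy(labels), len(labels)
    best_gain, best_t = 0.0, None
    for (v1, l1), (v2, l2) in zip(pairs, pairs[1:]):
        if l1 == l2 or v1 == v2:
            continue                          # candidate cuts lie between differing labels
        t = (v1 + v2) / 2
        left = [l for v, l in pairs if v <= t]
        right = [l for v, l in pairs if v > t]
        gain = parent - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if gain > best_gain:
            best_gain, best_t = gain, t
    return best_t, best_gain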
Inductive Bias in Decision Tree Learning
In decision tree learning, if only one tree fits the training data there is no ambiguity; but if more than one tree can be used to classify the new examples, there is the problem of which tree to prefer. The preference that resolves this problem is called the inductive bias of the ID3 algorithm:
(a) Shorter trees are preferred over longer trees.
(b) A closer approximation to the inductive bias of ID3: shorter trees are preferred over longer trees, and trees that place high-information-gain attributes close to the root are preferred over those that do not.
Instance-Based Learning
Instance-based learning relies on the examples in the training data set. Similarity-based classifiers, or instance-based classifiers, use similarity measures to match a new instance with the stored training instances (its nearest neighbours) and classify the test instance accordingly, unlike other learning mechanisms such as decision trees.
The advantage of this kind of learning is that processing occurs only when a sample to classify arrives, so it is useful when the whole data set is not available in the beginning but is collected in an incremental manner.
The disadvantage of this learning is that it requires a large amount of memory, since all the training data must be stored until a new instance arrives to be classified.
Approaches to Instance-Based Learning:
• k-Nearest Neighbour (k-NN) learning
• Weighted k-Nearest Neighbour
• Radial Basis Function networks
• Case-Based Reasoning

The classification operation is performed only after comparing the current instance with the previously stored instances, so this is also called lazy learning or memory-based learning: no action is taken until a new instance arrives.
K-Nearest Neighbor (KNN) Algorithm for Machine
Learning
© K-Nearest Neighbour is one of the simplest Machine Learning algorithms based
on the Supervised Learning technique.
© K-NN algorithm assumes the similarity between the new case/data and available
cases and puts the new case into the category that is most similar to the available
categories.
K-NN algorithm stores all the available data and classifies a new data point
based on the similarity. This means when new data appears then it can be easily
classified into a well-suited category by using the K-NN algorithm.
© K-NN algorithm can be used for Regression as well as for Classification but
mostly it is used for Classification problems.
o K-NN is a non-parametric algorithm, which means it does not make any
assumption about the underlying data.
© It is also called a lazy learner algorithm because it does not learn from the
training set immediately; instead, it stores the dataset and, at the time of
classification, it performs an action on the dataset.
© KNN algorithm at the training phase just stores the dataset and when it gets new
data, then it classifies that data into a category that is most similar to the new
data.
Example: Suppose we have an image of a creature that looks similar to both a cat and
a dog, but we want to know whether it is a cat or a dog. So for this identification, we
can use the KNN algorithm, as it works on a similarity measure. Our KNN model
will find the similar features of the new data set to the cats and dogs images and
based on the most similar features it will put it in either the cat or the dog category.
[Figure: KNN classifier — input value -> KNN classifier -> predicted output]
Why do we need a K-NN Algorithm?
Suppose there are two categories, i.e., Category A and Category B, and we have a new
data point x1; in which of these categories will this data point lie? To solve this type of
problem, we need a K-NN algorithm. With the help of K-NN, we can easily identify the
category or class of a particular data point. Consider the below diagram:
[Diagram: a new data point plotted among Category A and Category B before K-NN, and the same point assigned to a category after applying K-NN]
How does K-NN work?
The K-NN working can be explained on the basis of the below algorithm:
o Step-1: Select the number K of the neighbors.
o Step-2: Calculate the Euclidean distance of K number of neighbors.
o Step-3: Take the K nearest neighbors as per the calculated Euclidean distance.
o Step-4: Among these K neighbors, count the number of the data points in each
category.
o Step-5: Assign the new data point to that category for which the number of
neighbors is maximum.
o Step-6: Our model is ready.
Suppose we have a new data point and we need to put it in the required category. Consider
the below image:
[Image: a new data point plotted among the points of Category A and Category B]
o Firstly, we will choose the number of neighbors, so we will choose k = 5.
o Next, we will calculate the Euclidean distance between the data points. The
Euclidean distance is the distance between two points, which we have already
studied in geometry. It can be calculated as:
Euclidean distance between A(x1, y1) and B(x2, y2) = √((x2 - x1)² + (y2 - y1)²)
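In code, that distance is a one-liner (a small helper with illustrative names):

import math

def euclidean_distance(a, b):
    """Distance between two points a = (x1, y1, ...) and b = (x2, y2, ...)."""
    return math.sqrt(sum((q - p) ** 2 for p, q in zip(a, b)))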
o By calculating the Euclidean distance we get the nearest neighbors: three
nearest neighbors in Category A and two nearest neighbors in Category B.
Consider the below image:
[Image: the five nearest neighbors of the new data point — three from Category A and two from Category B]
o As we can see, the 3 nearest neighbors are from Category A, hence this new data
point must belong to Category A.
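Putting the six steps together, a bare-bones K-NN classifier could look like this; it reuses the euclidean_distance helper above and assumes each training example is a (point, label) pair:

from collections import Counter

def knn_classify(training_data, query, k=5):
    """Assign the majority label among the k training points nearest to the query."""
    # Steps 2-3: sort training points by distance to the query and keep the k nearest.
    nearest = sorted(training_data,
                     key=lambda item: euclidean_distance(item[0], query))[:k]
    # Steps 4-5: count the labels of the neighbours and return the most common one.
    return Counter(label for _, label in nearest).most_common(1)[0][0]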
How to select the value of K in the K-NN Algorithm?
Below are some points to remember while selecting the value of K in the K-NN algorithm:
© There is no particular way to determine the best value for "K", so we need to try some values to
find the best out of them. The most preferred value for K is 5.
o A very low value for K, such as K=1 or K=2, can be noisy and lead to the effects of outliers in the
model.
o Large values for K are good, but a very large K may include points from other categories and
makes the computation slower.
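One common way to pick K in practice is to try a range of values with cross-validation; a sketch using scikit-learn, assuming X and y are an already-prepared feature matrix and label vector (not defined in these notes):

from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

scores = {}
for k in range(1, 21):                         # try K = 1 .. 20
    model = KNeighborsClassifier(n_neighbors=k)
    scores[k] = cross_val_score(model, X, y, cv=5).mean()   # 5-fold CV accuracy
best_k = max(scores, key=scores.get)           # K with the best average accuracy
print(best_k, scores[best_k])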
Advantages of KNN Algorithm:
o It is simple to implement.
o It is robust to the noisy training data.
o It can be more effective if the training data is large.
Disadvantages of KNN Algorithm:
o It always needs to determine the value of K, which may be complex sometimes.
o The computation cost is high because of calculating the distance between the data points for all
the training samples.
Question: Apply the KNN algorithm to the following dataset and predict the class (True / False) of the test example (A1 = 3, A2 = 7). Assume K = 3.
[Training table: attributes A1, A2 and a True/False class for each row]
Solution: For each row, compute the Euclidean distance of the test example from that row, e.g. √((A1 - 3)² + (A2 - 7)²). Arrange the rows in increasing order of distance, take the K = 3 nearest rows, and assign the test example the majority class among them.
2 [c= vio 4Dusstion of tui! Mgorttim,
@ Rot - eo arate « dratiregg,
knvss [Taapecanne = Monet = here aes
Row 3 = Al (20-60) *# (35-40) > 2 afer t(asye= Jiswt 3028 5 jes
Rows 2 [Ganley + (36-25)9™ * ol (rea [tet (loi > fiver = Meat
ee eee i shee Sa
Rew sz J (qo-To)>+ (35 To)” [oops (359% = [satinas = 6)
Row 6 = TC (20-to)>+ (35-10)” A Cert (250s, [Tht ase ATI
Rewo = [(ae-25y> + (26- Be)" 7 pf (574 (457% = fastans
= [m0 = 45-28
n
a
_ Jucuaaivg order
Row 4 Red v Tut majontty than With ia
Row | Red vw tw 5
neanest hela lator sa
Row 2, Biwe ~ Lag
how 7 Blue ~ nan cutry ip Re
Rew b Red ~
Row % Blue
Row 3 BlueWweets)
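The final vote in code — a tiny sketch that simply reuses the ranking obtained above:

from collections import Counter

# Rows in increasing order of distance from the new entry, with their classes.
ranked = [("Row 4", "Red"), ("Row 1", "Red"), ("Row 2", "Blue"),
          ("Row 7", "Blue"), ("Row 6", "Red"), ("Row 5", "Blue"), ("Row 3", "Blue")]
k = 5
votes = Counter(label for _, label in ranked[:k])
print(votes.most_common(1)[0][0])   # -> Red (3 Red vs 2 Blue among the 5 nearest)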
Fig. 12.7. Radial Basis Function (RBF) Network
12.5 CASE-BASED LEARNING OR CASE-BASED REASONING (CBR)
Case-based reasoning (CBR) is used for classification and regression. It is the process of solving new problems based on the solutions of similar past problems. CBR is an advanced instance-based learning method which is used to solve more complex problems. It does not use the Euclidean distance metric. When a new case arrives to be classified, first an identical case is checked in memory. If any similar case is found in the stored memory, then its solution is also retrieved.
Fig. 12.8. Case-based Reasoning (CBR) life cycle
12.5.1 Steps in CBR
e Retrieve: Gather data from memory. Check whether any previous solution is similar to the
current problem.
e Reuse: Suggest a solution based on the experience. Adapt it to meet the
demands of the new situation.
e Revise: Evaluate the use of the solution in the new context.
e Retain: Store this new problem-solving method in the memory system.
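A toy sketch of the Retrieve-Reuse-Revise-Retain cycle in Python; all names here are illustrative, and similarity, adapt and evaluate stand in for whatever domain-specific functions a real CBR system would use:

class CaseBase:
    """Minimal case memory implementing the four CBR steps."""

    def __init__(self):
        self.cases = []                                   # stored (problem, solution) pairs

    def retrieve(self, problem, similarity):
        """Retrieve: find the stored case most similar to the new problem."""
        return max(self.cases, key=lambda case: similarity(case[0], problem), default=None)

    def solve(self, problem, similarity, adapt, evaluate):
        case = self.retrieve(problem, similarity)
        if case is None:
            return None
        solution = adapt(case[1], problem)                # Reuse: adapt the old solution
        if evaluate(solution, problem):                   # Revise: check it in the new context
            self.cases.append((problem, solution))        # Retain: store the solved case
            return solution
        return None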
12.5.2 Applications of CBR
e Customer service helpdesk for diagnosis of problems.
e Engineering and law for technical design and legal rules.
e Medical science for patient case histories and treatment.
12.5.3 CBR Example (Smart Software Agent)
A common example of CBR is a helpdesk system. Here, the user calls for a computer-related
service problem. CBR is used by the software assistant to diagnose the problem.
Then the software assistant recommends some possible solutions to solve the current
problem, for example a printer problem, an internet connection problem, etc.
Case Based Reasoning (CBR)
As we know, Nearest Neighbour classifiers store training tuples as points
in Euclidean space. But Case-Based Reasoning classifiers (CBR) use a
database of problem solutions to solve new problems. It stores the tuples or
cases for problem-solving as complex symbolic descriptions. How CBR
works? When a new case arises to classify, a Case-based Reasoner (CBR)
will first check if an identical training case exists. If one is found, then the
accompanying solution to that case is returned. If no identical case is found,
then the CBR will search for training cases having components that are
similar to those of the new case. Conceptually, these training cases may be
considered as neighbours of the new case. If cases are represented as
graphs, this involves searching for subgraphs that are similar to subgraphs
within the new case. The CBR tries to combine the solutions of the
neighbouring training cases to propose a solution for the new case. If
incompatibilities arise with the individual solutions, then backtracking to search
for other solutions may be necessary. The CBR may employ background
knowledge and problem-solving strategies to propose a feasible
solution.
Applications of CBR include:
1. Problem resolution for customer service help desks, where cases
describe product-related diagnostic problems.
2. It is also applied to areas such as engineering and law, where cases are
either technical designs or legal rulings, respectively.
3. Medical education, where patient case histories and treatments are used
to help diagnose and treat new patients.
Challenges with CBR
• Finding a good similarity metric (e.g., for matching subgraphs) and suitable
methods for combining solutions.
• Selecting salient features for indexing training cases and the development
of efficient indexing techniques.
CBR becomes more intelligent as the number of stored cases grows, but a trade-off
between accuracy and efficiency evolves as the number of stored cases becomes
very large. After a certain point, the system's efficiency will suffer as the
time required to search for and process relevant cases increases.