Parser for Simple Sanskrit Sentences
based on Pāṇinian Grammar Formalism
Master of Philosophy
in
Shabdabodha Systems and Language Technologies
By
N. Shailaja
I hereby declare that the work embodied in this dissertation entitled “Parser for Simple Sanskrit Sentences based on Pāṇinian Grammar Formalism” was carried out by me under the supervision of Amba Kulkarni, Department of Sanskrit Studies, University of Hyderabad, Hyderabad, and has not been submitted for any degree, in part or in full, to this university or any other university.
N Shailaja
08HSLS03
Date:
Place: Hyderabad
Department of Sanskrit Studies
University of Hyderabad
CERTIFICATE
This is to certify that N Shailaja has carried out the research work embodied in the present dissertation entitled “Parser for Simple Sanskrit Sentences based on Pāṇinian Grammar Formalism” at the University of Hyderabad. The dissertation represents her independent work and has not been submitted for any research degree of this university or any other university.
Amba Kulkarni
Supervisor
Amba Kulkarni
Head
Department of Sanskrit Studies
Mohan G Ramanan
Dean
School of Humanities
University of Hyderabad
ACKNOWLEDGMENTS
At the outset, I would like to thank God, who gave me the energy to complete this work. I would also like to thank Acharya Anand Prakash and Mrs. B. Neeraja for their precious and valuable guidance; they led me out of the darkness.
The single most important person who must be acknowledged here, without whom this dissertation would not exist, is Mrs. Amba Kulkarni, who not only suggested this topic but also supported me throughout the research.
I express my gratitude to my parents for their love and affection and for their continuous encouragement and moral support. I express my special thanks to my husband, Mr. Lakshman, who gave me the encouragement and moral support to finish this work.
I also wish to thank all those whom I could not list here but who helped me, directly or indirectly.
N.Shailaja
Contents
1 Introduction
2.1 kāraka
2.1.1 kartā
2.1.2 karma
2.1.3 karaṇa
2.1.4 sampradāna
2.1.5 apādāna
2.1.6 adhikaraṇa
3 śābdabodha
4 Earlier efforts
5 About CLIPS
5.1 Introduction
5.1.1 Facts
5.1.2 Deftemplate
5.3 Architecture
6.2 Rules
7 Conclusion
Chapter 1
Introduction
the form of sūtras, around 4000, divided into 8 chapters of 4 sections each.
“Pāṇini’s grammar is universally admired for its insightful analysis of Sanskrit” (Kiparsky, 2002).
some of the vārttikas. This rich tradition continued further with major contributions from Bhartṛhari in the field of language philosophy, and later by Nāgeśa Bhaṭṭa, Kauṇḍa Bhaṭṭa, etc.
It is evident from the scientific literature that Sanskrit was “Lingua franca”
the scientists who do not have enough exposure to Sanskrit. With the help
electronic form, the vast Sanskrit literature is now available to the scholars
easily. But still, those who do not know Sanskrit well cannot understand
it. The advances in the field of Machine Translation, and the availability
The aim of this thesis is to build a parser for simple Sanskrit sentences. This parser will analyse simple Sanskrit sentences and assign kāraka roles to the various nouns in a sentence. The parser will be based on Pāṇini’s grammar.
background of Pāṇini’s sūtras related to vibhakti and kāraka. The third chapter illustrates the process of śābdabodha, the anvaya of the given words. The fourth chapter gives a brief summary of earlier efforts in this area. The
is being used for developing the parser. The sixth chapter discusses the actual
for future work. In the seventh chapter we discuss the problem cases, and
discuss the possible ways of solving them. The simple parser has been tested
evaluation report is in the eighth chapter. The final chapter lists the future
tasks and resources that are needed to build a realistic parser for Sanskrit.
Chapter 2
pada[1] and “apadaṃ na prayuñjīta” is the verdict of a grammarian (one must not use a word unless it is inflected). There are seven vibhaktis (case-affixes) which make a crude form, viz. prātipadika, usable. These are called kāraka vibhaktis if they denote a relation between a noun and a verb. When these vibhaktis are governed by an upapada, they are known as upapada vibhaktis. In the following sections, we give a brief note on both of these. But before that, let us first
[1] suptiṅantaṃ padam (1.4.14)
2.1 kāraka
kāraka is the name given to the relation between a noun and a verb in a sentence. The literal meaning of the word is: any factor which contributes to
must be related with a verb: kriyānvayitvaṃ kārakatvam. There are six kārakas: kartā, karma, karaṇa, sampradāna, apādāna and adhikaraṇa. Appendix I lists all the sūtras that describe these kārakas. All these sūtras come under kārake. These
[2] kārakāṇi hi svasvavyāpāreṇa avāntarakriyādvāreṇa vā kenāpi rūpeṇa pradhānakriyotpattau sahāyakāni bhavanti.
[3] kriyāniṣpādakatvaṃ kārakatvam, arthāt kriyotpādakarūpārthayuktatvaṃ kārakatvam iti.
sūtras describe the semantic meaning of the kārakas. Pāṇini further, in the third pāda of the second chapter, gives sūtras which describe the vibhaktis used for the realisation of the various kāraka relations.
Pāṇini starts his kāraka section with the adhikāra sūtra “kārake” (1.4.23). We describe in brief each of the kārakas, as described by Pāṇini’s rule, followed by the rule governing the vibhakti of that kāraka, followed by an exam-
Sanskrit has three voices: kartṛ, karma and bhāva. In the case of kartṛvācya (roughly active voice), the kartā is said to be expressed through the verbal suffix. If a verb is sakarmaka (roughly transitive), it is the karma which is expressed through the verbal suffix, and finally, in the case of akarmaka (roughly intransitive) verbs, what is expressed through the verbal suffix is the action. Now let us see more
2.1.1 kartā
But if his intention is to express the fact that the vessel is big enough to cook
the rice, or the vessel has a particular capacity to hold the cooked rice, he
will say
sthālī pacati.
Patañjali in the Mahābhāṣya says ‘in the absence of a king the senior-most minister will enjoy the powers of the king’:[4] tathā amātyādīnāṃ rājñā saha samavāye pāratantryaṃ vyavāye svātantryam. Now, pada is defined as suptiṅantam, so a prātipadika expressing a kartā relation in kartṛvācya should have some nominal suffix, else it cannot be a pada. Therefore, to make such a prātipadika a pada, prathamā vibhakti is added. Pāṇini has a sūtra
prātipadikārthaliṅgaparimāṇavacanamātre prathamā. (2.3.46)
[4] evaṃ tarhi pradhānena samavāye sthālī paratantrā, vyavāye svatantrā.
Thus in the case of kartṛvācya the kartā will have prathamā vibhakti, and in the case of karmavācya the karma will have prathamā vibhakti. Such a kāraka, which is expressed by the verbal suffix, is called abhihita. In case a kartā is not expressed, i.e. is anabhihita, as in the case of karmavācya, it gets the third case suffix by the sūtra “kartṛkaraṇayos tṛtīyā”.
San: devadattena pacyate.
gloss: devadatta{3} cooks. (being cooked by Devadatta).
2.1.2 karma
kartur īpsitatamaṃ karma (1.4.49).
What the kartā seeks the most to attain by its action is the karma.
There are two vibhakti assignment rules:
In case the karma is not expressed (anabhihite 2.3.1), karmaṇi dvitīyā (2.3.2) assigns the second case suffix to the karma.
If the karma is expressed, then the karma, being abhihita, will get prathamā vibhakti.
San: devadattena pacyate.
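The vibhakti assignment described above for the kartā and the karma can be sketched in CLIPS, the rule language used later in this thesis. This is only an illustrative sketch: the template kAraka-relation, its slots and the sample facts are assumptions made for this example and are not part of the parser itself.

```clips
; Hypothetical template: one kAraka relation, and whether the verbal
; suffix already expresses it (aBihiwa yes) or not (aBihiwa no).
(deftemplate kAraka-relation
  (slot word)
  (slot kAraka)                    ; karwA or karma
  (slot aBihiwa)                   ; yes or no
  (slot viBakwiH (default none)))  ; filled in by the rules below

; An expressed (abhihita) kAraka takes the first case (2.3.46).
(defrule aBihiwa-takes-praWamA
  ?r <- (kAraka-relation (aBihiwa yes) (viBakwiH none))
  =>
  (modify ?r (viBakwiH 1)))

; An unexpressed karwA takes the third case (kartRkaraNayos tRtIyA).
(defrule anaBihiwa-karwA-takes-wqwIyA
  ?r <- (kAraka-relation (kAraka karwA) (aBihiwa no) (viBakwiH none))
  =>
  (modify ?r (viBakwiH 3)))

; An unexpressed karma takes the second case (2.3.2).
(defrule anaBihiwa-karma-takes-xviwIyA
  ?r <- (kAraka-relation (kAraka karma) (aBihiwa no) (viBakwiH none))
  =>
  (modify ?r (viBakwiH 2)))

; Sample karmavAcya sentence: the karma is expressed, the karwA is not.
(deffacts karmavAcya-example
  (kAraka-relation (word xevaxawwa) (kAraka karwA) (aBihiwa no))
  (kAraka-relation (word oxana) (kAraka karma) (aBihiwa yes)))
```

After (reset) and (run), the fact for xevaxawwa should carry viBakwiH 3 and the fact for oxana viBakwiH 1. Each rule tests for viBakwiH none, so a fact that has already been given a case does not trigger the rule again.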
2.1.3 karaṇa
San: lekhanyā likhati
gloss: lekhanī{3} writes. (writes with a pen)
2.1.4 sampradāna
karmaṇā yam abhipraiti sa sampradānam. (1.4.32)
That which the agent wants to connect with the object of the action of giv-
term sampradāna is meaningful. The two upasargas sam and pra give the qualified sense, viz. complete transfer.
2.1.5 apādāna
The sūtra “dhruvam apāye apādānam” (1.4.24) states that “when ablation or separation is to be effected by a verbal action, the point of separation is called the
The separation or departure of the calf from the cow-shed is marked by the fifth case suffix.
2.1.6 adhikaraṇa
agent or the object and is thereby the immediate site of action is adhikaraṇa. According to Bhartṛhari
kartṛkarmavyavahitām asākṣād dhārayat kriyām.
karaṇa.
San: kaṭe bālikā asti.
gloss: The girl is on the mat.
There are also certain indeclinable words which demand certain case suffixes. Such words are called upapadas (upoccāritaṃ padam: a word pronounced in close proximity). In other words, when a noun takes certain case-affixes by virtue of its being in composition with some other word in proximity to it, it
San: rāmeṇa saha sītā vanaṃ gacchati.
gloss: Rama with Sita forest goes. (Sita goes to the forest with Rama).
In this example, rāma has the third case suffix. This is because of the presence of the word saha. This saha is the word in close proximity (upapada). This upapada demands that rāma should have the third case suffix. Thus this third case suffix
There are 6 types of upapada vibhaktis corresponding to the 6 case suffixes, viz. 2, 3, 4, 5, 6 and 7.
The sūtras indicating the words demanding these vibhaktis are given in Appendix II.
Chapter 3
śābdabodha
What we have seen so far is how the kārakas are realised in a sentence through
verse process. This process is termed the process of śābdabodha. In India, the three schools, viz. vyākaraṇa, nyāya and mīmāṃsā, differ slightly in the process of śābdabodha. The process of śābdabodha basically involves the identification of the modified and the modifier (viśeṣya and viśeṣaṇa). According to the vaiyākaraṇas, the dhātvartha (meaning of a verb) is the mukhya viśeṣya. The naiyāyikas take the prathamānta (the one in first case suffix) as the mukhya viśeṣya. There is a very subtle difference between the approaches of the mīmāṃsakas and the vaiyākaraṇas. The mīmāṃsakas take bhāvanā as the mukhya viśeṣya.
three schools and their consequences, etc. Our main aim is to develop intelligence in the system so that the computer can assist human beings in under-
If we look at Sanskrit books on various topics, we see that the original saṃhitā
The sūtras stated in chapter 2 above are directly relevant and useful for the
s/he gets is the vibhaktis. From these vibhaktis, the listener should analyse and get the kārakas. This is not an easy task. We give here some examples to illustrate the complexity.
The morphological analyser does the analysis of each of these words, and
A human being while reading does not even ‘see’ these ambiguities. But when we analyse using a machine, the machine, since it does not use common sense, world knowledge, etc., shows all possible analyses. So the machine has two tasks:
The question is whether there is any way to rule out the possibility of rā being a dhātu. The following sūtra by Pāṇini comes to our rescue:
“yasya ca bhāvena bhāvalakṣaṇam” (2.3.37)
This sūtra rules out the possibility of the following two analyses of gacchati.
gacchati : gam + śatṛ + {7, napuṃ., ekavacanam}
gacchati : gam + śatṛ + {7, puṃ., ekavacanam}
This now leaves only one analysis of gacchati, as a verb. Further, the verb gam has an expectancy of two kārakas, viz. kartā and karma. Further, the word gacchati is kartṛvācya. Hence the kartā, being abhihita, should be in prathamā vibhakti. But there are two padas, viz. rāma and vana, in prathamā vibhakti. Hence the question is which one should be taken as the kartā. We postpone this decision, and look at the word with the second case suffix to decide the karma. We get only one word, viz. vanam. Hence it is assigned the karma role. No noun can have more than one role, because all the sūtras describing kāraka saṃjñās are governed by the sūtra “ā kaḍārād ekā saṃjñā” (1.4.1). Hence vanam can get only one saṃjñā, viz. karma. This leaves rāma with the kartā kāraka role. Thus we see that various sūtras of Pāṇini come into play in assigning the kāraka roles to the nouns.
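The order of decisions in this example (settle the karma first, then give the kartā role to the remaining first-case word) can be sketched in CLIPS as follows. This is a simplified illustration under stated assumptions: the template assigned and the two rule names are hypothetical, the ambiguity of vanam between first and second case is written by hand as two sup facts, and the verb facts are left out.

```clips
(deftemplate sup (slot id) (slot mid) (slot word) (slot viBakwiH))

; Hypothetical template recording the single role (saMjFA) given to a word.
(deftemplate assigned (slot id) (slot kAraka))

; Step 1: a word with a second-case analysis becomes the karma.
; The higher salience makes this decision come first.
(defrule decide-karma
  (declare (salience 20))
  (sup (id ?i) (viBakwiH 2))
  (not (assigned (id ?i)))
  =>
  (assert (assigned (id ?i) (kAraka karma))))

; Step 2: a first-case word that has no role yet becomes the karwA.
; The (not ...) test enforces "one word, one role".
(defrule decide-karwA
  (declare (salience 10))
  (sup (id ?i) (viBakwiH 1))
  (not (assigned (id ?i)))
  =>
  (assert (assigned (id ?i) (kAraka karwA))))

; rAmaH vanam gacCawi: vanam is ambiguous between cases 1 and 2.
(deffacts rAmaH-vanam-gacCawi
  (sup (id 1) (mid 1) (word rAmaH) (viBakwiH 1))
  (sup (id 2) (mid 1) (word vanam) (viBakwiH 1))
  (sup (id 2) (mid 2) (word vanam) (viBakwiH 2)))
```

After (reset) and (run), word 2 (vanam) should be recorded as karma and word 1 (rAmaH) as karwA. Once vanam has a role, its first-case analysis can no longer activate decide-karwA.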
We try to follow this process mechanically, taking the help of Pāṇini’s sūtras. To implement this mechanically, what is needed is a mechanism to solve the
Earlier efforts
There has been much enthusiasm in the field of Sanskrit computation since
computational tools for processing Sanskrit texts. The few notable efforts of the first type are by the Akshar Bharati group (Bharati, 1994). The latter
few. There are also ongoing efforts to model the Pāṇinian process of anal-
dle morphology reasonably well. Only a few of them have been handling the
parser was developed following the integer programming approach, and was later improved by adopting the bipartite graph model. This parser was developed for Hindi and not for Sanskrit. The parser required kāraka charts for each verb, describing the various kāraka roles that are mandatory for the verb (i.e. the expectancy) and the vibhaktis that the corresponding nouns take.
mandatory kārakas are kartā and karma in the case of transitive verbs, and kartā in the case of intransitive verbs. sampradāna is also mandatory for certain verbs, whose list is available through Pāṇini’s sūtras. Further, the desirable kārakas are karaṇa and apādāna. Most of the verbs have an expectancy for these kārakas. adhikaraṇa, being the location or ādhāra for either the kartā or the karma, is a kāraka which any verb can have. At the same time, unless necessary, one does not mention it. Hence it is optional.
refer to XawuprakASaH (in press) for the karmākāṅkṣā information. Next, the list of verbs desiring some kārakas such as sampradāna, apādāna, etc. is also listed separately. So with this information, we decided to use only Pāṇini’s sūtras and develop a rule-based parser, instead of using the kāraka charts.
Chapter 5
About CLIPS
5.1 Introduction
system consists of
IF
Here the section of the rule between IF and THEN viz. ‘the noun
part or Left Hand Side (LHS). The part after THEN is called the
satisfied, or the pattern is matched with any of the existing facts from
• Facts
The facts are the actual working memory. They vary with the inputs, and
• Inference Engine
The inference engine makes inferences by deciding which rules are sat-
isfied by the facts. All the rules for which the facts match the LHS, are
In such cases, the inference engine should choose one of the rules. This
selected rule then is fired. The selection of the rules is called conflict
the rules, say by assigning a salience, or implicit, based on the specificity of conditions, etc. In case the rules are provided with a salience,
The expert system language that I have used in my thesis is called CLIPS
of providing high portability, low cost, and easy integration with external
these objectives.
5.1.1 Facts
a name with zero or more fields for associated values. For example, the following fact defines a name or a field called ‘sentence’, whose value is ‘rāmaḥ grāmam gacchati.’
(sentence (rAmaH grAmam gacCati))
of the word rāmeṇa consists of several facts, viz. rāma is the prātipadika, its gender is masculine, the word is in the singular number, and it is in third
(word rAmeNa)
(number singular)
(vibhakti third)
(gender masculine)
The problem with this kind of representation is that, when there are several words, it no longer records which word is singular or which one is masculine: the association of a word with its features is lost. To avoid this, we make use of structured facts. To use structured facts, CLIPS must first be informed about the new structure. This is done through the deftemplate construct.
5.1.2 Deftemplate
Before facts can be constructed, CLIPS must be informed of the list of valid
facts sharing the same relation name and contain common information. The
(deftemplate sup
(slot id)
(slot word)
(slot rt)
(slot lingam)
(slot viBaktiH)
(slot vacanam)
(slot kAraka))
The morphological analysis for a word, say rāmebhyaḥ, will now be a fact. This analysis will have the template structure of sup. The fact is then de-
(deffacts (sup
bahu)) )
Facts with a relation name defined using deftemplate are called deftemplate
facts.
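As an illustration of deftemplate facts, the sketch below declares the sup template from this section and two analyses of the word rāmebhyaḥ. The slot values and the deffacts name are assumptions made for this example (the form is ambiguous between the fourth and fifth case); they are not the actual output of the morphological analyser.

```clips
(deftemplate sup
  (slot id)
  (slot word)
  (slot rt)
  (slot lingam)
  (slot viBaktiH)
  (slot vacanam)
  (slot kAraka))

; Two deftemplate facts for the same word: one per possible analysis.
; The kAraka slot is left unset; the rules fill it in later.
(deffacts sample-analysis
  (sup (id 1) (word rAmeByaH) (rt rAma) (lingam puM) (viBaktiH 4) (vacanam bahu))
  (sup (id 1) (word rAmeByaH) (rt rAma) (lingam puM) (viBaktiH 5) (vacanam bahu)))
```

After (reset), the (facts) command lists both analyses, each keeping the word together with its own features.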
1. Once CLIPS has been installed, the command ‘clips’ should
3. To exit CLIPS: the normal mode of leaving CLIPS is with the
• Displaying Facts:
The facts command can be used to display the facts in the fact-list.
stored in the fact-list. To add a new fact to the fact-list, we can use
Just as facts can be added to the fact-list, they can also be removed.
Removing facts from the fact-list is called retraction and is done with
• Modifying Facts:
Slot values of deftemplate facts can be modified using the modify com-
mand.
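The three operations on facts (adding, modifying and removing) can be tried together in a small sketch. The function name demo-fact-commands and the sample slot values are assumptions for this illustration only.

```clips
(deftemplate sup
  (slot id) (slot word) (slot rt) (slot lingam)
  (slot viBaktiH) (slot vacanam) (slot kAraka))

(deffunction demo-fact-commands ()
  ; assert adds a fact and returns its address
  (bind ?f (assert (sup (id 1) (word rAmeNa) (rt rAma)
                        (lingam puM) (viBaktiH 3) (vacanam eka))))
  (facts)
  ; modify changes the named slot; the returned address is kept
  ; so that the updated fact can be referred to afterwards
  (bind ?f (modify ?f (kAraka karaNa)))
  (facts)
  ; retract removes the fact from the fact-list
  (retract ?f)
  (facts))
```

Calling (demo-fact-commands) at the CLIPS prompt displays the fact-list after each step.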
• Rules: Rules can be typed directly into CLIPS or they can be loaded
cond
=>
action
Thus, explicitly the words IF and THEN are not written. The part
before the => is the condition and the part after it is the action.
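A minimal complete rule in this form is sketched below. The rule name report-third-case and the sample fact are assumptions for illustration.

```clips
(deftemplate sup (slot id) (slot word) (slot viBaktiH) (slot kAraka))

(defrule report-third-case
  ; condition part (LHS): any sup fact in the third case
  (sup (word ?w) (viBaktiH 3))
  =>
  ; action part (RHS): report the matching word
  (printout t ?w " is in the third case" crlf))

(deffacts one-word
  (sup (id 1) (word rAmeNa) (viBaktiH 3)))
```

After (reset) the rule is placed on the agenda, and (run) fires it once for the single matching fact.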
After specifying the rules and providing the action, to execute the rules
When the program runs, and there is a conflict, then the rule with the
All the rules that satisfy the conditions are on the agenda. To display the
• Reset:
The reset command is the key method for starting or restarting. Facts
asserted by a reset satisfy the patterns of one or more rules and place
disk.
• save-facts will save all the facts in the fact-list to a file, and load-facts will load them back.
5.3 Architecture
The flow diagram of the system is given in figure 5.5. The input for our parser is a Sanskrit text with a single finite verb (tiṅ), and the output is its kāraka analysis. The main purpose of this exercise is to do only kāraka and upapada vibhakti analysis. The input sentence is passed through the morphological analyser to get the word-level analysis. This analysis is then converted into CLIPS facts. The facts, which vary with the sentence, along with the rules are then passed to the CLIPS interpreter for the kāraka analysis.
In this chapter we describe the templates used for declaring the facts, fol-
morphological analyser output. Pāṇini classifies words into two types: subanta and tiṅanta. But this classification is not sufficient for our purpose. Consider a ktvānta word, say ‘gatvā’. If we just mark it as a subanta or, to be more specific, as an avyaya, it is not sufficient. The reason is that the underlying verb ‘gam’ has its own expectancies, and to know this the machine should know that there is an underlying verb with certain expectancies. Hence we define templates for subanta, tiṅanta, avyaya, taddhita and kṛdanta. We give below the slots available in each of them. These slots correspond to the features associated with that form.
We also make a provision for the slot corresponding to the kāraka analysis, which will be filled in by the CLIPS inference engine. In the case of verbs, slots
(deftemplate sup
(deftemplate tin
(slot id)
(slot mid)
(slot word)
(slot rt)
(slot dhatuH)
(slot lakAraH)
(slot prayogaH)
(slot purushaH)
(slot vacanam)
(slot padI)
(slot gaNaH)
(slot karttA_pos)
(multislot karma_pos)
(slot karaNa_pos)
(slot sampradAna_pos)
(slot apAdAna_pos)
(slot adhikaraNa_pos))
(slot id)
(slot mid)
(slot word)
(slot krt_pratyayaH)
(slot lingam)
(slot vibhaktiH)
(slot vacanam)
(slot rt)
(slot dhatuH)
(slot gaNaH)
(slot karttA_pos)
(slot karma_pos))
(deftemplate avy
(slot id)
(slot mid)
(slot word))
(deftemplate taddhita
(slot id)
(slot mid)
(slot word)
(slot rt)
(slot lingam)
(slot vibhaktiH)
(slot vacanam))
6.2 Rules
We process the input in two steps. In the first step, wherever possible, we try
the search space, and also, if the machine fails, it helps the human being to rule out some of the possibilities. For example, as explained earlier, in the sentence ‘rāmaḥ vanam gacchati’, the analysis of the word gacchati as a subanta is irrelevant. Hence we remove or retract it. Similarly, there are many cases where a word is ambiguous between the second case or the first case and also the sambodhana. The sambodhana will be irrelevant only if certain special conditions are met. For example, consider the sentence
bālaḥ khādati.
In the sentence bālaḥ khādati, the word khādati can have the following analysis: khādati strī 8 eka., where khādati is the feminine śatṛ form. But in this context, since there is no verb with loṭ lakāra or vidhiliṅ lakāra (see the appendix Assign Karaka Rules for more details), this analysis is ruled out.
The second step is to actually assign the kāraka roles. Now the vibhaktis may be either upapada or kāraka. We first mark the words with upapada vibhaktis, since these are just next to the given words. After this, we assign the kāraka vibhaktis. Assigning kāraka vibhaktis is not an easy task, because a word often has more than one possible morphological analysis. In such cases, we have to
earlier.
We describe below three rules: one for removing the irrelevant morphological analysis, the second for marking the upapada vibhaktis and the third for marking the kāraka vibhaktis.
In this section we describe a rule that retracts the seventh case analysis when it is irrelevant.
Hence these analyses may be ruled out on the basis of Pāṇini’s sūtra “yasya ca bhāvena bhāvalakṣaṇam” (2.3.37). The meaning of this sūtra is:
If an activity aims at another activity, then the verb denoting the first activity will have the seventh case (naturally after a kṛdanta). In such cases either the kartā or the karma or both of this first activity will then also be in the seventh case.
• Rule in CLIPS
(defrule sati-saptami
=>
(do-for-all-facts
((?w kqw))
(= ?w:vibhaktiH 7)
(retract ?w)
• Explanation
Thus we observe that, in the sentence ‘rāmaḥ vanaṃ gacchati’, there is a kṛdanta (viz. gacchati) in the seventh vibhakti. Therefore the condition of the rule is fulfilled and hence the rule is fired. The action part checks every kṛdanta with a śatṛ+7 vibhakti analysis. In the given sentence there is only one kṛdanta, gacchati, satisfying this condition. Further, the next condition, that there be no other word with seventh vibhakti, is also satisfied, and finally this word has one more analysis, viz. as a tiṅanta. Hence the śatṛ+7 analysis of the word gacchati is deleted.
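The three conditions in this explanation can also be written as ordinary rule patterns. The following is a simplified sketch and not the rule used in the parser: the rule name, the reduced templates and the value Sawq for the kqw_prawyayaH slot are assumptions made for this illustration.

```clips
(deftemplate sup (slot id) (slot mid) (slot word) (slot viBakwiH))
(deftemplate wif (slot id) (slot mid) (slot word))
(deftemplate kqw (slot id) (slot mid) (slot word)
                 (slot kqw_prawyayaH) (slot lifgam) (slot viBakwiH))

(defrule retract-Sawq-sapwamI
  ; a kqxanwa analysis with Sawq in the seventh case
  ?k <- (kqw (id ?i) (kqw_prawyayaH Sawq) (viBakwiH 7))
  ; the same word also has a finite-verb analysis
  (wif (id ?i))
  ; no other word of the sentence is in the seventh case
  (not (sup (id ?j&~?i) (viBakwiH 7)))
  (not (kqw (id ?m&~?i) (viBakwiH 7)))
  =>
  (retract ?k))

; rAmaH vanam gacCawi, with the two Sawq+7 analyses of gacCawi
(deffacts rAmaH-vanam-gacCawi
  (sup (id 1) (mid 2) (word rAmaH) (viBakwiH 1))
  (sup (id 2) (mid 1) (word vanam) (viBakwiH 1))
  (sup (id 2) (mid 2) (word vanam) (viBakwiH 2))
  (kqw (id 3) (mid 1) (word gacCawi) (kqw_prawyayaH Sawq) (lifgam puM) (viBakwiH 7))
  (kqw (id 3) (mid 2) (word gacCawi) (kqw_prawyayaH Sawq) (lifgam napuM) (viBakwiH 7))
  (wif (id 3) (mid 3) (word gacCawi)))
```

After (reset) and (run), both kqw facts for gacCawi should be retracted, leaving only its wif analysis.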
• Input
axAxiH) )
1)(vacanam eka) )
1)(vacanam eka) )
2)(vacanam eka) )
)
8)(vacanam eka) )
BvAxiH) )
• Execution
When we execute the rules with the given facts, and check the new
(agenda)
0 sati-saptami: f-0
CLIPS> (run)
CLIPS> (facts)
f-0 (initial-fact)
f-1 (wif (id 1) (mid 1) (word rAmaH) (rt rA1) (XAwuH
(paxI parasmEpaxI) (gaNaH axAxiH) (karwA_pos nil) (karma_pos nil) (karaNa_pos nil) (sampraxAna_pos nil) (apAxAna_pos nil) (aXikaraNa_pos nil))
f-2 (sup (id 1) (mid 2) (word rAmaH) (rt rAma) (lifgam puM) (viBak-
f-3 (sup (id 2) (mid 1) (word vanam) (rt vana) (lifgam napuM) (viBak-
f-4 (sup (id 2) (mid 2) (word vanam) (rt vana) (lifgam napuM) (viBak-
f-7 (kqw (id 3) (mid 3) (word gacCawi) (kqw_prawyayaH nil) (lifgam
f-8 (wif (id 3) (mid 4) (word gacCawi) (rt gam1) (XAwuH gamLz)
parasmEpaxI) (gaNaH BvAxiH) (karwA_pos nil) (karma_pos nil) (karaNa_pos nil) (sampraxAna_pos nil) (apAxAna_pos nil) (aXikaraNa_pos nil))
Thus we see that the rule has deleted the following two facts corresponding to the kqxanwas.
)
Similarly, I have written rules for retracting the sambodhana analysis whenever it is irrelevant. The actual code is available in Appendix “(assignkAraka)”
There are six rules corresponding to the six upapada vibhaktis. We look at a very frequent case, that of ‘saha’. Pāṇini’s sūtra is “sahayukte 'pradhāne” (2.3.19). The implementation of this rule is given below.
rāmeṇa saha sītā vanaṃ gacchati.
Here in this sentence saha assigns the third case suffix to rāma. We write the rule as follows.
• Rule in CLIPS
(defrule assign_trtiyA-upapada
(delayed-do-for-all-facts
; If the subanta has thritiyA vibhakti, the avyaya is adjacent to it and the a
(printout bar "(" ?s:id " " ?a:word " kI upapaxa_viBakwiH)" crlf)))
• Explanation
so that when there is a conflict, this rule gets priority over the other
rules.
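A self-contained sketch of such an upapada rule for ‘saha’ is given below. It is an illustration under stated assumptions, not the rule used in the parser: the word saha is written directly into the pattern instead of being looked up in a word list, and the rule name, the reduced templates and the sample facts are hypothetical.

```clips
(deftemplate sup (slot id) (slot word) (slot viBakwiH) (slot kAraka))
(deftemplate avy (slot id) (slot word))

(defrule mark-wqwIyA-upapaxa
  ; high salience, so that this rule gets priority over the kAraka rules
  (declare (salience 100))
  ; the indeclinable saha ...
  (avy (id ?a) (word saha))
  ; ... immediately preceded by a third-case word with no role yet
  ?s <- (sup (id ?i&:(= (- ?a ?i) 1)) (viBakwiH 3) (kAraka nil))
  =>
  (modify ?s (kAraka wqwIyA-upapaxa_viBakwiH))
  (printout t "(" ?i " saha kI upapaxa_viBakwiH)" crlf))

; rAmeNa saha sIwA vanam gacCawi (noun and avyaya facts only)
(deffacts saha-example
  (sup (id 1) (word rAmeNa) (viBakwiH 3))
  (avy (id 2) (word saha))
  (sup (id 3) (word sIwA) (viBakwiH 1))
  (sup (id 4) (word vanam) (viBakwiH 2)))
```

After (reset) and (run), the fact for rAmeNa should carry the upapada marking in its kAraka slot. The test (kAraka nil) keeps the rule from firing again on the modified fact.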
information that bāṇa is an instrument. Since our system does not have such knowledge yet, we mark both the words as ‘kartā-karaṇa-vā’.
We give below the rule that marks the abhihita as a karma, and the words
comments.
(karma_pos ?s2:id))
(printout bar "(" ?s1:id " " ?w:word " kA karwA_vA_karaNa)" crlf)
(printout bar "(" ?s2:id " " ?w:word " kA karma)" crlf)
))
Chapter 7
Conclusion
various kinds of sentences with a single tiṅanta. These sentences included examples of both upapada vibhaktis and kāraka vibhaktis. I used the in-house morphological analyser for my work. I also decided not to handle the ṇijanta, as the morphological analyser was not yet fully functional for it. Further, I did not consider the ṣaṣṭhī vibhakti, since it is used to denote a kāraka as well as noun-noun relations.
I implemented approximately 15 rules for the analysis, and could run all types of sentences satisfactorily. The appendix lists all the sentences that
To have a full-fledged, realistic parser for Sanskrit, there is still a long way
(reset)
;(facts)
; Get the number of sup entries with praWamA
(deffunction count-praWamAnwa (?template)
(length (find-all-facts ((?fct ?template))
(= (fact-slot-value ?fct viBakwiH) 1))))
;check karwari
(deffunction karwari-vA (?template)
(any-factp ((?fct ?template))
(eq (fact-slot-value ?fct prayogaH) karwari)))
;check karmaNi
(deffunction karmaNi-vA (?template)
(any-factp ((?fct ?template))
(eq (fact-slot-value ?fct prayogaH) karmaNi)))
;===================================================
;rAmaH vexam paTawi
; rl1
(defrule assign_karwA_karma_karwqvAcy
(declare (salience 100))
(test (> (count-praWamAnwa sup) 0))
(test (> (count-xviwIyAnwa sup) 0))
(test (eq (karwari-vA wif) TRUE))
(or (test (= (count-xvikarmaka-AkAfkRA wif) 0)) (test (< (count-xviwIyAnwa sup) 2)))
=>
(delayed-do-for-all-facts
((?s1 sup) (?s2 sup) (?w wif))
(and (= ?s1:viBakwiH 1) (= ?s2:viBakwiH 2) (<> ?s1:id ?s2:id ?w:id) (eq ?w:prayogaH karwari))
(modify ?w (karwA_pos ?s1:id) (karma_pos ?s2:id))
(modify ?s2 (kAraka karma))
(modify ?s1 (kAraka karwA))
(printout bar "(" ?s1:id " " ?w:word " kA karwA rl1)" crlf)
)
)
;========================================================
;rAmeNa bANena vAliH hanyawe
(defrule assign_karaNa_karmavAcy
(test (> (count-wqwIyAnwa sup) 0))
=>
(delayed-do-for-all-facts
((?s1 sup) (?s2 sup) (?s3 sup) (?w wif))
(and (= ?s1:viBakwiH 3) (= ?s2:viBakwiH 3) (= ?s3:viBakwiH 1) (<> ?s1:id ?s2:id ?s3:id ?w:id) (eq ?w:prayogaH karmaNi))
(modify ?w (karwA_pos ?s1:id) (karaNa_pos ?s2:id) (karma_pos ?s3:id))
(modify ?s1 (kAraka karwA))
(modify ?s2 (kAraka karaNa))
(modify ?s3 (kAraka karma))
(printout bar "(" ?s1:id " " ?w:word " kA karwA )" crlf)
; the labels printed below follow the roles set by the modify calls above
(printout bar "(" ?s2:id " " ?w:word " kA karaNa )" crlf)
(printout bar "(" ?s3:id " " ?w:word " kA karma )" crlf)
))
;========================================================
;rl5
(defrule assign_sampraxAna
(declare (salience 100))
(test (> (count-cawurWyAnwa sup) 0))
(or (test (> (count-xvikarmaka-AkAfkRA wif) 0)) (test (> (count-sampraxAna-AkAfkRA wif) 0)))
=>
(delayed-do-for-all-facts
((?s1 sup) (?s2 sup) (?w wif))
(and (<> ?s1:id ?s2:id ?w:id) (= ?s1:viBakwiH 4) (or (eq (gdbm_lookup "sampraxAna_XAwu_list.gdbm" ?w:XAwuH) "1") (eq (gdbm_lookup "xvikarmaka_XAwu_list.gdbm" ?w:XAwuH) "1")))
;(= ?s1:viBakwiH 4)
(modify ?w (sampraxAna_pos ?s1:id))
(modify ?s1 (kAraka sampraxAna))
(printout bar "(" ?s1:id " " ?w:word " kA sampraxAna rl5)" crlf)
)
)
;===================================================
;rl6
(defrule assign_apAxAna
(declare (salience 100))
(test (> (count-paFcamyanwa sup) 0))
=>
(delayed-do-for-all-facts
((?s1 sup) (?w wif))
(and (<> ?s1:id ?w:id) (= ?s1:viBakwiH 5))
(modify ?w (apAxAna_pos ?s1:id))
(modify ?s1 (kAraka apAxAna))
(printout bar "(" ?s1:id " " ?w:word " kA apAxAna rl6)" crlf)
)
)
;===================================================
;rl7
=>
(delayed-do-for-all-facts
((?s1 sup) (?w wif))
(and (= ?s1:viBakwiH 7) (<> ?s1:id ?w:id))
(modify ?w (aXikaraNa_pos ?s1:id))
(modify ?s1 (kAraka aXikaraNa))
(printout bar "(" ?s1:id " " ?w:word " kA aXikaraNa rl7)" crlf)
)
)
;========================================================
;rl8
(defrule assign_xviwIyA-upapaxa
(test (> (count-xviwIyAnwa sup) 0))
(test (> (count-xviwIyA-upapaxa avy) 0))
=>
(delayed-do-for-all-facts
((?a avy) (?s sup))
(and (= ?s:viBakwiH 2) (= (- ?a:id ?s:id) 1) (eq (gdbm_lookup "xviwIyA_upapaxa_list.gdbm" ?a:word) "1"))
(modify ?s (kAraka xviwIyA-upapaxa_viBakwiH))
(printout bar "(" ?s:id " " ?a:word " kA xviwIyA-upapaxa_viBakwiH rl8)" crlf)
)
)
;===================================================
;rl9
(defrule assign_wqwIyA-upapaxa
(test (> (count-wqwIyAnwa sup) 0))
(test (> (count-wqwIyA-upapaxa avy sup) 0))
=>
(delayed-do-for-all-facts
((?a avy) (?s sup))
(and (= ?s:viBakwiH 3) (= (- ?a:id ?s:id) 1) (eq (gdbm_lookup "wqwIyA_upapaxa_list.gdbm" ?a:word) "1"))
(modify ?s (kAraka wqwIyA-upapaxa_viBakwiH))
(printout bar "(" ?s:id " " ?a:word " kA wqwIyA-upapaxa_viBakwiH rl9)" crlf)))
;===================================================
;rl10
(defrule assign_cawurWI-upapaxa
;(declare (salience 100))
(test (> (count-cawurWyAnwa sup) 0))
(test (> (count-cawurWI-upapaxa avy) 0))
=>
(delayed-do-for-all-facts
((?a avy) (?s sup))
(and (= ?s:viBakwiH 4) (= (- ?a:id ?s:id) 1) (eq (gdbm_lookup "cawurWI_upapaxa_list.gdbm" ?a:word) "1"))
(modify ?s (kAraka cawurWI-upapaxa_viBakwiH))
(printout bar "(" ?s:id " " ?a:word " kA cawurWI-upapaxa_viBakwiH rl10)" crlf)))
;===================================================
;rl11
(defrule assign_paFcamI-upapaxa
(test (> (count-paFcamyanwa sup) 0))
(test (> (count-paFcamI-upapaxa avy sup) 0))
=>
(delayed-do-for-all-facts
((?a avy) (?s sup))
(and (= ?s:viBakwiH 5) (= (- ?a:id ?s:id) 1) (eq (gdbm_lookup "paFcamI_upapaxa_list.gdbm" ?a:word) "1"))
(modify ?s (kAraka paFcamI-upapaxa_viBakwiH))
(printout bar "(" ?s:id " " ?a:word " kA paFcamI-upapaxa_viBakwiH rl11)" crlf)))
;========================================================
;rl12
(defrule assign_RaRTI-upapaxa
(test (> (count-RaRTanwa sup) 0))
(test (> (count-RaRTI-upapaxa avy sup) 0))
=>
(delayed-do-for-all-facts
((?a avy) (?s sup))
(and (= ?s:viBakwiH 6) (= (- ?a:id ?s:id) 1) (eq (gdbm_lookup "RaRTI_upapaxa_list.gdbm" ?a:word) "1"))
(modify ?s (kAraka RaRTI-upapaxa_viBakwiH))
(printout bar "(" ?s:id " " ?a:word " kA RaRTI-upapaxa_viBakwiH rl12)" crlf)))
;========================================================
;rl13
(defrule assign_sapwamI-upapaxa
;========================================================
(agenda)
(run)
(facts)
(close bar)
(exit)
(reset)
;(facts)
; Get the number of sup entries with samboXana
(deffunction count-supkqw-samboXana (?template)
(length (find-all-facts ((?fct ?template))
(= (fact-slot-value ?fct viBakwiH) 8))))
;
; yasya ca BAvena BAvalakRaNam
(open "foo.txt" foo "a")
(open "for_kAraka.txt" bar "w")
(defrule sawi-sapwami
(test (>= (count-sawi-sapwami kqw) 1))
=>
; repeat for all
(do-for-all-facts
; kqxanwas
((?w kqw))
; and the word under consideration has at least one non-Sawq+7 analysis
(any-factp ((?w1 sup wif)) (= ?w1:id ?w:id))
)
; then retract such analysis and also save this info in a file
(retract ?w)
(printout foo "(" ?w:id " " ?w:mid ") yasya ca BAvena BAvalakRaNam" crlf
)
)
)
;========================================================
(do-for-all-facts
((?w wif)(?s sup kqw))
(and (<> ?w:id ?s:id) (or (eq ?w:lakAraH lot) (eq ?w:lakAraH viXilif)) (=
?s:viBakwiH 8) (eq ?w:vacanam ?s:vacanam) (neq ?w:puruRaH u) (or (=
(count-supkqw-samboXana sup) 1) (= (count-supkqw-samboXana kqw) 1)))
=>
; For each of the avy sup pair
(do-for-all-facts
((?a avy)(?s sup kqw))
(and (<> ?a:id ?s:id) (gdbm_lookup "samboXana_avy_wrds_list.gdbm" ?a:word) (= (count-supkqw-samboXana sup) 1) (= (count-supkqw-samboXana kqw) 1))
(printout bar "(wif (id " ?w:id ") (mid " ?w:mid ") (word " ?w:word ") (rt " ?w:rt ")(XAwuH " ?w:XAwuH ")(lakAraH " ?w:lakAraH ")(prayogaH " ?w:prayogaH ")(puruRaH " ?w:puruRaH ")(vacanam " ?w:vacanam ")(paxI " ?w:paxI ")(gaNaH " ?w:gaNaH "))" crlf)
)
(do-for-all-facts
((?w kqw))
(printout bar "(kqw (id " ?w:id ") (mid " ?w:mid ") (word " ?w:word ") (kqw_prawyayaH " ?w:kqw_prawyayaH ") (lifgam " ?w:lifgam ") (viBakwiH " ?w:viBakwiH ") (vacanam " ?w:vacanam ") (rt " ?w:rt ") (XAwuH " ?w:XAwuH ") (gaNaH " ?w:gaNaH "))" crlf)
)
(do-for-all-facts
((?w avy))
(printout bar "(avy (id " ?w:id ") (mid " ?w:mid ") (word " ?w:word "))" crlf)
)
(do-for-all-facts
((?w waxXiwa))
(printout bar "(waxXiwa (id " ?w:id ") (mid " ?w:mid ") (word " ?w:word ") (rt " ?w:rt ")(lifgam " ?w:lifgam ")(viBakwiH " ?w:viBakwiH ")(vacanam " ?w:vacanam "))" crlf)
)
(close foo)
(close bar)
(exit)
LIST OF EXAMPLES:
pācakaḥ pacati
paraśunā chinatti
prāsādāt patati
pīṭhe upaviśati
pācakaḥ odanaṃ pacati
pācakaḥ odanaṃ pacate
pācakena taṇḍulaḥ pacyate
sūdaḥ pacati
bālaḥ khādati
vaṭuḥ likhati
sūdaḥ taṇḍulaṃ pacati
bālakaḥ grāmaṃ gacchati
caitraḥ kūpaṃ khanati
bhaktaḥ hariṃ bhajati
rāmaḥ bāṇena rāvaṇaṃ hanti
devadattaḥ paraśunā vṛkṣaṃ chinatti
bālaḥ pādābhyāṃ gṛhaṃ gacchati
caitraḥ rajakāya vastraṃ dadāti
rājā viprāya gāṃ dadāti
parṇaṃ vṛkṣāt patati
kṛṣṇaḥ gokulāt āgacchati
pārthaḥ parvatāt avarohati
khagaḥ vṛkṣāt ḍayate
bālaḥ kaṭe upaviśati
sūdaḥ odanaṃ sthālyāṃ pacati
sūdaḥ taṇḍulaṃ pacati
sūdena taṇḍulaḥ pacyate
gauḥ goṣṭham āgacchati
chātreṇa ślokaḥ paṭhyate
pācakaḥ taṇḍulān pacati
bhaktaḥ gaṅgāṃ spṛśati
gopālaḥ gāṃ dogdhi payaḥ
vāmanaḥ baliṃ yācate vasudhām
pārthaḥ māṇavakaṃ panthānaṃ pṛcchati
corāt bibheti
vyāghrāt rakṣati
śiṣyaḥ upādhyāyāt adhīte
tantubhyaḥ paṭaḥ bhavati
māsāt ārabhya meghaḥ varṣati
tileṣu tailam asti
guruṇā pāṭhaḥ kriyate
chātrāḥ śālāṃ praviśanti
viśvāsaḥ mitrāya pustakaṃ dadāti
bālaḥ vāhanāt patati
sarve kaṭe upaviśanti
mokṣe icchā asti
gopena gāṃ vrajaḥ avarudhyate
pārthena māṇavakaḥ panthānaṃ pṛcchyate
pitrā māṇavakaḥ dharmam ucyate
sureṇa sudhāṃ kṣīranidhiḥ mathyate
caitreṇa devadattaḥ śataṃ muṣyate
kṛṣakeṇa grāmam ajā nīyate
gopena vṛṣabhaḥ goṣṭhaṃ hriyate
kṛṣakeṇa ajā nagaram uhyate
vaṭuḥ vṛkṣāt phalāni avacinoti
gopaḥ vṛṣabhaṃ goṣṭhaṃ harati
gopālena gauḥ payaḥ duhyate
vāmanena baliḥ vasudhāṃ yācyate
bhṛtyaḥ bhāraṃ vahati
bhṛtyaḥ bhāraṃ nayati
kaṃsaḥ kṛṣṇāya krudhyati
sarvebhyaḥ svasti
grāmāt uttaraḥ vidyālayaḥ
vaiśākhāt pūrvaḥ caitraḥ
grāmāt dūraṃ nagaram
grāmāt antikaṃ goṣṭham
grāmasya dūrāt vanam asti
vṛkṣasya nikaṭāt gauḥ asti
grāmasya dūraṃ vanam asti
Bibliography
• Subba Rao Veluri, ‘The Philosophy of a Sentence and its Parts’, Munshiram Manoharlal, New Delhi, 1969.
• Ānandaprakāśa Medhārthī, Vārttika-prakāśaḥ, caukhamba samskrita samsthaan, Varanasi, 1993.