Spectral Learning of Semantic Units in a
Sentence Pair to Evaluate Semantic
Textual Similarity

Akanksha Mehndiratta and Krishna Asawa

Jaypee Institute of Information Technology, Noida, India

Abstract. Semantic Textual Similarity (STS) measures the degree of semantic equivalence between two snippets of text. It is applicable to a variety of Natural Language Processing (NLP) tasks. Because STS has such a wide range of applications, there is a constant demand for new methods as well as for improvements to current ones. A surge of unsupervised and supervised systems has been proposed in this field, but they are limited in terms of scale. The restraints are caused either by complex, non-linear supervised learning models or by unsupervised learning models that employ a lexical database for word alignment. The model proposed here provides a spectral learning-based approach that is linear, scale-invariant, scalable, and fairly simple. The work focuses on finding semantic similarity by identifying semantic components from both sentences that maximize the correlation between the sentences of a pair. We introduce an approach based on Canonical Correlation Analysis (CCA), using cosine similarity and Word Mover’s Distance (WMD) as calculation metrics. The model performs on par with sophisticated supervised techniques such as LSTM and BiLSTM and adds a layer of semantic components that can contribute meaningfully to NLP tasks.

Keywords: Semantic Textual Similarity · Natural Language Processing · Spectral learning · Semantic units · Canonical Correlation Analysis · Word Mover’s Distance

1 Introduction
Semantic Textual Similarity (STS) determines the similarity between two pieces of text. It is applicable to a variety of Natural Language Processing (NLP) tasks, including textual entailment, paraphrasing, machine translation, and many more. It aims at providing a uniform structure for the generation and evaluation of various semantic components that, conventionally, were considered independently and with a superficial understanding of their impact on various NLP applications.
The SemEval STS task was an annual event held as part of the SemEval/*SEM family of workshops. From 2012 to 2017 [1–6] it was one of the most awaited STS events and attracted a large number of participating teams every year. The organizers have made the datasets publicly available; they contain up to 16,000 sentence pairs for training and testing, each annotated by humans with a rating from 0 to 5, where 0 indicates highly dissimilar and 5 highly similar.

© Springer Nature Switzerland AG 2020
L. Bellatreche et al. (Eds.): BDA 2020, LNCS 12581, pp. 49–59, 2020.
https://doi.org/10.1007/978-3-030-66665-1_4

Generally, the techniques under the umbrella of STS can be classified into the following two categories:

1. Supervised Systems: Techniques in this category generate results after training on an adequate amount of data using a machine learning or deep learning based model [9,10]. Deep learning has gained a lot of popularity in NLP tasks. Deep learning models are extremely powerful and expressive but are also complex and non-linear. The increased model complexity makes such models much slower to train on larger datasets.
2. Unsupervised Systems: Surprisingly, the basic approach of representing a sentence by plain averaging [11] or weighted averaging [12] of its word vectors, and computing the degree of similarity as the cosine similarity of the resulting vectors, has outperformed LSTM based techniques. Examples like these encourage researchers who lean towards the simpler side to exploit techniques that are scalable and can process a large amount of text, rather than increasing model complexity. Some of the techniques in this category were proposed even before the STS shared task [19,20], while others were proposed during it. Some of these techniques rely on a lexical database such as the paraphrase database (PPDB) [7,8] or WordNet [21] to determine contextual dependencies amongst words.
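
The plain-averaging baseline mentioned in the second category can be sketched as follows. This is a minimal illustration, not the implementation of [11] or [12]: the three-dimensional vectors in `TOY_VECTORS` are invented for the example, and the helper names `average_vector` and `cosine` are hypothetical.

```python
import math

# Toy 3-dimensional "embeddings"; invented for illustration, not real word2vec values.
TOY_VECTORS = {
    "bird": [0.9, 0.1, 0.0],
    "bathing": [0.1, 0.8, 0.1],
    "washing": [0.2, 0.9, 0.0],
    "guitar": [0.0, 0.1, 0.9],
}

def average_vector(tokens, vectors):
    """Element-wise mean of the vectors of all tokens found in `vectors`."""
    known = [vectors[t] for t in tokens if t in vectors]
    dim = len(next(iter(vectors.values())))
    if not known:
        return [0.0] * dim
    return [sum(v[d] for v in known) / len(known) for d in range(dim)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

s1 = average_vector(["bird", "washing"], TOY_VECTORS)
s2 = average_vector(["bird", "bathing"], TOY_VECTORS)
s3 = average_vector(["guitar"], TOY_VECTORS)
# The two "bird" sentences should score higher than the unrelated pair.
print(cosine(s1, s2), cosine(s1, s3))
```

With real embeddings the dictionary would be replaced by a pre-trained model lookup; the averaging and cosine steps stay the same.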

The technique proposed in this study is based on spectral learning and is fairly simple. The idea behind the approach stems from the fact that semantically equivalent sentences depend on a similar context. Hence the goal here is to identify semantic components from both sentences that can be used to frame that context. To achieve this, we propose a model that identifies such semantic units in a sentence based on their correlation with the words of the other sentence. The proposed method, a spectral learning-based approach for measuring the strength of similarity between two sentences based on Canonical Correlation Analysis (CCA) [22], uses cosine similarity and Word Mover’s Distance (WMD) as calculation metrics. The model is fast, scalable, and scale-invariant. The model is also linear and has the potential to perform on par with non-linear supervised learning architectures such as LSTM and BiLSTM. It also adds another layer by identifying semantic components from both sentences based on their correlation. These components can help develop a deeper level of language understanding.

2 Canonical Correlation Analysis

Given two sets of variables, canonical correlation analysis studies the linear relationship between the two sets. The linear relation is captured by studying latent variables (variables that are not observed directly but inferred) that represent the observed variables. It is similar to correlation analysis but multivariate. In statistical analysis, the term can be found in multivariate discriminant analysis and multiple regression analysis. It is an analog of Principal Component Analysis (PCA) for a set of outputs. PCA finds a direction of maximal variance amongst the elements of a single matrix, whereas CCA finds a pair of directions of maximal correlation between the elements of a pair of matrices, in other words for a multivariate input and a multivariate output.
Consider two multivariate random variables $x$ and $y$. Given $C_{xx}$ and $C_{yy}$, the within-set covariance matrices of $x$ and $y$, and $C_{yx}$, the between-set covariance matrix, with $C_{xy}$ the transpose of $C_{yx}$, CCA tries to generate projections $CV_1$ and $CV_2$, a pair of linear transformations, by solving the optimization problem given by Eq. 1.

$$\max_{CV_1,\, CV_2} \; \frac{CV_1^{T} C_{xy}\, CV_2}{\sqrt{CV_1^{T} C_{xx}\, CV_1}\;\sqrt{CV_2^{T} C_{yy}\, CV_2}} \tag{1}$$

Given $x$ and $y$, the canonical correlations are found by solving the corresponding eigenvalue equations. Here the eigenvalues are the squared canonical correlations and the eigenvectors are the normalized canonical correlation basis vectors. Besides computing eigenvalues and eigenvectors, another integral step in solving Eq. 1 is computing the inverse of the covariance matrices. CCA uses singular value decomposition (SVD) or eigendecomposition to invert a matrix. Efficient algorithms for these decompositions [24] make such problems tractable on a larger scale, which is what makes CCA fast and scalable.
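
For reference, the eigenvalue equations mentioned above take the following standard textbook form, written with the symbols of Eq. 1. The symbol $\rho$ for a canonical correlation is introduced here for illustration and does not appear in the original text; the covariance matrices $C_{xx}$ and $C_{yy}$ are assumed to be invertible.

```latex
\begin{aligned}
C_{xx}^{-1} C_{xy} C_{yy}^{-1} C_{yx}\, CV_1 &= \rho^{2}\, CV_1 \\
C_{yy}^{-1} C_{yx} C_{xx}^{-1} C_{xy}\, CV_2 &= \rho^{2}\, CV_2
\end{aligned}
```

Each eigenvalue $\rho^{2}$ is a squared canonical correlation, which is why the inverses of the within-set covariance matrices are needed.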
As an illustration, consider a group of people who have been selected to participate in two different surveys. To determine the correlation between the two surveys, CCA finds a linear transformation of the questions from survey 1 and of the questions from survey 2 that maximizes the correlation between the projections. CCA terminology identifies the questions in the surveys as the variables and the projections as variates. Hence the variates are a linear transformation, or a weighted sum, of the original variables. Let the questions in survey 1 be represented as $x_1, x_2, x_3, \dots, x_n$; similarly, the questions in survey 2 are represented as $y_1, y_2, y_3, \dots, y_m$. The first variate for survey 1 is generated using the relation given by Eq. 2.

$$CV_1 = a_1 x_1 + a_2 x_2 + a_3 x_3 + \dots + a_n x_n \tag{2}$$

And the first variate for survey 2 is generated using the relation given by Eq. 3.

$$CV_2 = b_1 y_1 + b_2 y_2 + b_3 y_3 + \dots + b_m y_m \tag{3}$$

where $a_1, a_2, a_3, \dots, a_n$ and $b_1, b_2, b_3, \dots, b_m$ are weights chosen so as to maximize the correlation between $CV_1$ and $CV_2$. CCA can generate a second pair of variates using the residuals of the first pair, and further pairs in the same way, such that the variates are independent of each other, i.e. the projections are orthogonal.
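
The survey illustration can be sketched numerically. The following is one common way to compute CCA (whitening the covariance matrices and taking an SVD), offered as an assumption-laden sketch rather than the authors' implementation: the function name `cca`, the small ridge term `reg`, and the synthetic data are all introduced here for illustration, and NumPy is assumed to be installed.

```python
import numpy as np

def cca(X, Y, reg=1e-8):
    """Canonical correlations and weights for X (n x p) and Y (n x q)."""
    X = X - X.mean(axis=0)  # centre each variable
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Within-set and between-set covariance matrices (Cxx, Cyy, Cxy).
    # `reg` is a tiny ridge term (an assumption) that keeps the inverses stable.
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)

    def inv_sqrt(C):
        # Inverse matrix square root via eigendecomposition of a symmetric matrix.
        vals, vecs = np.linalg.eigh(C)
        return vecs @ np.diag(vals ** -0.5) @ vecs.T

    Kx, Ky = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Kx @ Cxy @ Ky)
    k = min(X.shape[1], Y.shape[1])  # number of variate pairs
    return s[:k], Kx @ U[:, :k], Ky @ Vt.T[:, :k]

# Synthetic "survey" data: both surveys are noisy views of one hidden factor z.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = z + 0.5 * rng.normal(size=(500, 3))  # survey 1: three questions
Y = z + 0.5 * rng.normal(size=(500, 2))  # survey 2: two questions

rho, A, B = cca(X, Y)
cv1 = (X - X.mean(axis=0)) @ A[:, 0]  # first variate of survey 1 (Eq. 2)
cv2 = (Y - Y.mean(axis=0)) @ B[:, 0]  # first variate of survey 2 (Eq. 3)
print(rho)                           # canonical correlations, largest first
print(np.corrcoef(cv1, cv2)[0, 1])   # agrees with rho[0] up to numerical error
```

The columns of `A` and `B` play the role of the weights $a$ and $b$ in Eqs. 2 and 3, and the number of variate pairs equals the smaller of the two variable counts.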
When applying CCA, the following fundamentals need to be taken care of:

1. Determine the minimum number of variate pairs to be generated.
2. Analyze the significance of a variate from two perspectives: one is the magnitude of relatedness between the variate and the original variables from which it was transformed, and the other is the magnitude of relatedness between the two variates of the corresponding pair.

2.1 CCA for Computing Semantic Units

Given two views $X = (X^{(1)}, X^{(2)})$ of the input data and a target variable $Y$ of interest, Foster et al. [23] exploit CCA to generate a projection of $X$ that reduces the dimensionality without compromising its predictive power. The authors assume, as represented by Eq. 4, that the views are independent of each other conditioned on a hidden state $h$, i.e.

$$P(X^{(1)}, X^{(2)} \mid h) = P(X^{(1)} \mid h)\, P(X^{(2)} \mid h) \tag{4}$$

Here CCA utilizes the multi-view nature of the data to perform dimensionality reduction.
STS estimates the potential of a candidate sentence to be considered a semantic counterpart of another sentence. Measuring text similarity has a long history and has contributed widely to applications designed for text processing and related areas. Text similarity has been used for machine translation, text summarization, semantic search, word sense disambiguation, and many more. While making such an assessment is trivial for humans, building algorithms and computational models that mimic human-level performance poses a challenge. Consequently, natural language processing applications such as generative models typically assume a Hidden Markov Model (HMM) as a learning function. An HMM also has a multi-view nature. Hence, two sentences that share one or more semantic units $c$ provide two natural views, and CCA can be used, as shown in Eq. 5, to extract this relationship.

$$P(S_1, S_2 \mid c) = P(S_1 \mid c)\, P(S_2 \mid c) \tag{5}$$

where $S_1$ and $S_2$ denote sentence one and sentence two, which are supposed to share some semantic unit(s) $c$. As discussed in the previous section, CCA is fast and scalable. Also, CCA requires neither that the views be of a fixed length nor that they be of the same length; hence it is scale-invariant with respect to the observations.

3 Model
3.1 Data Collection

We test our model on three textual similarity tasks, all of which were published as part of the SemEval semantic textual similarity (STS) tasks (2012–2017). The first dataset considered for experimenting is from SemEval-2017 Task 1 [6], part of an ongoing series of evaluations of computational semantic analysis systems, with a total of 250 sentence pairs. The second is the SemEval 2012 textual similarity dataset named “OnWN” [4], whose sentence pairs are generated from OntoNotes and the corresponding WordNet definitions. The last is the SemEval 2014 textual similarity dataset named “headlines” [2], which contains sentences taken from news headlines. The latter two datasets have 750 sentence pairs each. In all three datasets, a sentence pair is accompanied by a rating between 0 and 5, with 0 indicating highly dissimilar and 5 highly similar. An example of a sentence pair available in the SemEval semantic textual similarity (STS) task is shown in Table 1.

Table 1. A sample of sentence pairs from the publicly available SemEval semantic textual similarity (STS) task dataset.

| | Example 1 | Example 2 |
|---|---|---|
| Sentence 1 | Birdie is washing itself in the water basin | The young lady enjoys listening to the guitar |
| Sentence 2 | The bird is bathing in the sink | The woman is playing the violin |
| Similarity score | 5 (The two sentences mean the same thing and hence are completely equivalent) | 1 (The two sentences may be around the same topic but are not equivalent) |

3.2 Data Preprocessing

It is important to pre-process the input data to improve the learning and elevate
the performance of the model. Before running the similarity algorithm the data
collected is pre-processed based on the following steps.

1. Tokenization - Each sentence from the dataset is processed in turn and broken into a list of words, which is essential for creating word embeddings.
2. Removing punctuation - Punctuation, exclamation marks, and other marks are removed from the sentence using a regular expression and replaced with empty strings, as there is no vector representation available for such marks.
3. Replacing numbers - Numerical values are converted to their corresponding words, which can then be represented as embeddings.
4. Removing stop words - In this step the stop words are removed from each sentence. A stop word is a very commonly used word (such as “the”, “a”, “an”, “in”) that does not add any valuable semantic information to a sentence. The list of stop words used is obtained from the NLTK package in Python.
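
The four steps above can be sketched in a few lines of standard-library Python. Several details are assumptions made only for this sketch: `STOP_WORDS` is a tiny illustrative subset rather than the NLTK list used in the paper, `DIGIT_WORDS` handles single digits only, whitespace splitting stands in for the tokenizer, and tokens are lower-cased so that stop words match regardless of case.

```python
import re

# Illustrative subset only (an assumption); the paper takes its list from NLTK.
STOP_WORDS = {"the", "a", "an", "in", "is", "of", "at", "while"}

# Hypothetical digit lookup; a full number-to-words converter is assumed in practice.
DIGIT_WORDS = {"0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
               "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine"}

def preprocess(sentence):
    tokens = sentence.split()                              # 1. tokenization on whitespace
    tokens = [re.sub(r"[^\w\s]", "", t) for t in tokens]   # 2. remove punctuation marks
    tokens = [DIGIT_WORDS.get(t, t) for t in tokens]       # 3. single digits -> words
    tokens = [t.lower() for t in tokens if t]              # lower-case (assumption), drop empties
    return [t for t in tokens if t not in STOP_WORDS]      # 4. remove stop words

print(preprocess("The group is eating while taking in a breath-taking view."))
```

Because hyphens count as punctuation here, "breath-taking" becomes the single token "breathtaking".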

3.3 Identifying Semantic Units

Our contribution to the STS task adds another layer by identifying semantic units in a sentence. These units are identified based on their correlation with the semantic units identified in the paired sentence. Each sentence $s_i$ is represented as a list of word2vec embeddings, where each word is represented in the $m$-dimensional space using Google’s word2vec: $s_i = (w_{i1}, w_{i2}, \dots, w_{im})$, where each element is the embedding counterpart of its corresponding word. Given two sentences $s_i$ and $s_j$, CCA projects variates as linear transformations of $s_i$ and $s_j$. The number of projections to be generated is limited to the length, i.e. the number of words, of the shorter of $s_i$ and $s_j$. E.g. if the lengths of $s_i$ and $s_j$ are 8 and 5 respectively, the maximum number of correlation variates output is 5. Conventionally, word vectors were considered independently and with a superficial understanding of their impact on various NLP applications, but the components obtained here can contribute substantially to an NLP task. A sample of semantic units identified on a sentence pair is shown in Table 2.

Table 2. A sample of semantic units identified on a sentence pair in the SemEval dataset.

| Sentence | The group is eating while taking in a breath-taking view. | A group of people take a look at an unusual tree. |
|---|---|---|
| Pre-processed tokens | [‘group’, ‘eating’, ‘taking’, ‘breathtaking’, ‘view’] | [‘group’, ‘people’, ‘take’, ‘look’, ‘unusual’, ‘tree’] |
| Correlation variates | [‘group’, ‘taking’, ‘view’, ‘breathtaking’, ‘people’] | [‘group’, ‘take’, ‘look’, ‘unusual’, ‘people’] |

3.4 Formulating Similarity

The correlation variates projected by CCA are used to generate a new representation for each sentence $s_i$ as a list of word2vec vectors, $s_i = (w_{i1}, w_{i2}, \dots, w_{in})$, where each element is the Google word2vec embedding of the corresponding variate identified by CCA.
Given a range of variate pairs, there are two ways of generating a similarity score for sentences $s_i$ and $s_j$:

1. Cosine similarity: It is a very common and popular measure of similarity. Given a pair of sentences represented as $s_i = (w_{i1}, w_{i2}, \dots, w_{im})$ and $s_j = (w_{j1}, w_{j2}, \dots, w_{jm})$, the cosine similarity measure is defined by Eq. 6.

$$\mathrm{sim}(s_i, s_j) = \frac{\sum_{k=1}^{m} w_{ik}\, w_{jk}}{\sqrt{\sum_{k=1}^{m} w_{ik}^{2}}\;\sqrt{\sum_{k=1}^{m} w_{jk}^{2}}} \tag{6}$$

The similarity score is calculated as the mean of the cosine similarity over the variate pairs.
2. Word Mover’s Distance (WMD): WMD is a method that allows us to assess the “distance” between two documents in a meaningful way. It harnesses the results of advanced word-embedding generation techniques like GloVe [13] or word2vec, as embeddings generated by these techniques are semantically superior; semantically related words are expected to have similar vectors. Let $T = (t_1, t_2, \dots, t_m)$ represent a set of $m$ different words from a document A. Similarly, let $P = (p_1, p_2, \dots, p_n)$ represent a set of $n$ different terms from a document B. The minimum cumulative distance that the words of document A need to travel to reach the word cloud of document B becomes the distance between the two documents.
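
The two scoring options can be sketched as follows. Both functions are illustrative assumptions rather than the authors' code. `mean_pairwise_cosine` assumes the k-th variate of one sentence is paired with the k-th variate of the other. `relaxed_wmd` is not the exact WMD, which requires solving an optimal transport problem; it is the simpler relaxed lower bound in which every word moves all of its mass to its nearest neighbour in the other document, with uniform word weights.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors (Eq. 6); 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mean_pairwise_cosine(variates_i, variates_j):
    """Mean cosine similarity over aligned variate pairs (non-empty lists assumed)."""
    pairs = list(zip(variates_i, variates_j))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def relaxed_wmd(doc_a, doc_b):
    """Relaxed lower bound on WMD with uniform word weights (not the exact WMD)."""
    def directed(src, dst):
        # Every source word travels to its closest word in the other document.
        return sum(min(euclidean(u, v) for v in dst) for u in src) / len(src)
    # The larger of the two directed costs is the tighter lower bound.
    return max(directed(doc_a, doc_b), directed(doc_b, doc_a))

# Toy 2-dimensional vectors, invented for illustration.
a = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0, 0.0], [0.0, 1.0]]
c = [[0.0, 1.0], [1.0, 0.0]]  # same vectors as `a`, in a different order
print(mean_pairwise_cosine(a, b), mean_pairwise_cosine(a, c))
print(relaxed_wmd(a, b), relaxed_wmd(a, c))
```

The toy data highlights one design difference: the pairwise cosine score depends on how the variates are aligned, whereas the word-mover style distance treats each sentence as an unordered cloud of vectors.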

A min-max normalization, given in Eq. 7, is applied to the similarity score generated by cosine similarity or WMD to scale the output similarity score to 5.

$$x_{\mathrm{scaled}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{7}$$
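
A minimal sketch of this normalization is given below. Eq. 7 maps scores to the range 0 to 1; multiplying by 5 to reach the 0–5 rating scale is an assumption based on the sentence above, as are the function name `min_max_scale` and the choice to return zeros when all scores are equal.

```python
def min_max_scale(scores, upper=5.0):
    """Apply Eq. 7 to a list of scores, then stretch the result to [0, upper]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores identical: Eq. 7 is undefined, so return zeros (an assumption).
        return [0.0 for _ in scores]
    return [upper * (x - lo) / (hi - lo) for x in scores]

print(min_max_scale([1.0, 2.0, 3.0]))
```

Note that the minimum and maximum are taken over the whole set of scored sentence pairs, so a pair's scaled score depends on the other pairs in the dataset.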

4 Results and Analysis

The key evaluation criterion is Pearson’s correlation coefficient between the predicted scores and the ground-truth scores. The results on the “OnWN” and “Headlines” datasets, published in the SemEval semantic textual similarity (STS) tasks of 2012 and 2014 respectively, are shown in Table 3. The first three results are from the official task rankings, followed by seven models proposed by Wieting et al. [11]. The last two columns show the results of the proposed model with cosine similarity and WMD respectively. The dataset published in the SemEval semantic textual similarity (STS) task 2017 is identified as the Semantic Textual Similarity Benchmark (STS-B) by the General Language Understanding Evaluation (GLUE) benchmark [16]. The results of the official task rankings for STS-B are shown in Table 4. Table 5 shows the results of the proposed model with cosine similarity and WMD respectively. Since the advent of GLUE, many models that produce results above 90% on the STS-B task have been proposed, such as XLNet [17], ERNIE 2.0 [18] and many more; details of these models are available on the official GLUE website (https://gluebenchmark.com/leaderboard). But the increased model complexity makes such models much slower to train on larger datasets. The work here focuses on finding semantic similarity by identifying semantic components using an approach that is linear, scale-invariant, scalable, and fairly simple.

Table 3. Results on the SemEval 2012 and 2014 textual similarity datasets (Pearson’s r × 100).

| Dataset | 50% | 75% | Max | PP | proj | DAN | RNN | iRNN | LSTM | LSTM (output gate) | CCA (CoSim) | CCA (WMD) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| OnWN | 60.8 | 65.9 | 72.7 | 70.6 | 70.1 | 65.9 | 63.1 | 70.1 | 65.2 | 56.4 | 60.5 | 37.1 |
| Headlines | 67.1 | 75.4 | 78.4 | 69.7 | 70.8 | 69.2 | 57.5 | 70.2 | 57.5 | 50.9 | 62.5 | 55.8 |

Table 4. Results on the STS-B task from the GLUE benchmark (Pearson’s r × 100).

| Model | STS-B |
|---|---|
| Single task training | |
| BiLSTM | 66.0 |
| +ELMo [14] | 64.0 |
| +CoVe [15] | 67.2 |
| +Attn | 59.3 |
| +Attn, ELMo | 55.5 |
| +Attn, CoVe | 57.2 |
| Multi-task training | |
| BiLSTM | 70.3 |
| +ELMo | 67.2 |
| +CoVe | 64.4 |
| +Attn | 72.8 |
| +Attn, ELMo | 74.2 |
| +Attn, CoVe | 69.8 |
| Pre-trained sentence representation models | |
| CBoW | 61.2 |
| Skip-Thought | 71.8 |
| InferSent | 75.9 |
| DisSent | 66.1 |
| GenSen | 79.3 |

Note. Adapted from “GLUE: A multi-task benchmark and analysis platform for natural language understanding” by Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., Bowman, S. R. (2019). In: International Conference on Learning Representations (ICLR).

Table 5. Results of the proposed spectral learning-based model on the SemEval 2017 dataset (Pearson’s r × 100).

| Model | STS-B |
|---|---|
| CCA (Cosine similarity) | 73.7 |
| CCA (WMD) | 76.9 |
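
The evaluation criterion used in Tables 3 to 5, Pearson’s coefficient between predicted and ground-truth scores, can be sketched as below. The function name `pearson_r` and the example scores are made up for illustration and are not results from the paper; the tables report this value multiplied by 100.

```python
import math

def pearson_r(predicted, gold):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_g = sum(gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(predicted, gold))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_g = sum((g - mean_g) ** 2 for g in gold)
    return cov / math.sqrt(var_p * var_g)

# Made-up example: model scores vs. human ratings on the 0-5 scale.
predicted = [4.6, 1.2, 3.1, 0.4]
gold = [5.0, 1.0, 3.5, 0.0]
print(round(100 * pearson_r(predicted, gold), 1))
```

Because Pearson’s r is invariant to positive linear rescaling, the min-max scaling of Eq. 7 does not change it; the scaling matters only for reporting scores on the 0–5 scale.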

5 Conclusion

We proposed a spectral learning based model, namely CCA using cosine similarity and WMD, and compared the model on three different datasets with various other competitive models. The proposed model utilizes a scalable algorithm; hence it can be included in any research that is inclined towards textual analysis. With the added bonus that the model is simple, fast and scale-invariant, it can be an easy fit for a study.
Another important takeaway from this study is the identification of semantic units. The first step in any NLP task is providing a uniform structure for the generation and evaluation of various semantic units that, conventionally, were considered independently and with a superficial understanding of their impact. Such components can help in understanding the development of context over the sentences in a document, user reviews, question-answer pairs and dialog sessions.
Even though our model did not give the best results, it still performed better than some models and gave competitive results against others, which shows that there is great scope for improvement. One of the limitations of the model is its inability to identify semantic units larger than a word, for instance a phrase. It would also be interesting to develop a model that combines this spectral model with a supervised or an unsupervised model. With further improvement, the model can be helpful in various ways and can be used in applications such as document summarization, word sense disambiguation, short answer grading, information retrieval and extraction, etc.

References
1. Agirre, E., et al.: SemEval-2015 task 2: semantic textual similarity, English, Spanish and pilot on interpretability. In: Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), pp. 252–263. Association for Computational Linguistics, June 2015
2. Agirre, E., et al.: SemEval-2014 task 10: multilingual semantic textual similarity. In: Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), pp. 81–91. Association for Computational Linguistics, August 2014
3. Agirre, E., et al.: SemEval-2016 task 1: semantic textual similarity, monolingual and cross-lingual evaluation. In: SemEval 2016, 10th International Workshop on Semantic Evaluation, San Diego, CA, Stroudsburg (PA), pp. 497–511. Association for Computational Linguistics (2016)
4. Agirre, E., Bos, J., Diab, M., Manandhar, S., Marton, Y., Yuret, D.: *SEM 2012: The First Joint Conference on Lexical and Computational Semantics-Volume 1: Proceedings of the Main Conference and the Shared Task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012), pp. 385–393. Association for Computational Linguistics (2012)
5. Agirre, E., Cer, D., Diab, M., Gonzalez-Agirre, A., Guo, W.: *SEM 2013 shared task: semantic textual similarity. In: Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity, pp. 32–43. Association for Computational Linguistics, June 2013
6. Cer, D., Diab, M., Agirre, E., Lopez-Gazpio, I., Specia, L.: SemEval-2017 task 1: semantic textual similarity-multilingual and cross-lingual focused evaluation. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval 2017), Vancouver, Canada, pp. 1–14. Association for Computational Linguistics (2017)
7. Sultan, M.A., Bethard, S., Sumner, T.: DLS@CU: sentence similarity from word alignment and semantic vector composition. In: Proceedings of the 9th International Workshop on Semantic Evaluation, pp. 148–153. Association for Computational Linguistics, June 2015
8. Wu, H., Huang, H.Y., Jian, P., Guo, Y., Su, C.: BIT at SemEval-2017 task 1: using semantic information space to evaluate semantic textual similarity. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval 2017), pp. 77–84. Association for Computational Linguistics, August 2017
9. Rychalska, B., Pakulska, K., Chodorowska, K., Walczak, W., Andruszkiewicz, P.: Samsung Poland NLP team at SemEval-2016 task 1: necessity for diversity; combining recursive autoencoders, WordNet and ensemble methods to measure semantic similarity. In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval 2016), pp. 602–608. Association for Computational Linguistics, June 2016
10. Brychcín, T., Svoboda, L.: UWB at SemEval-2016 task 1: semantic textual similarity using lexical, syntactic, and semantic information. In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval 2016), pp. 588–594. Association for Computational Linguistics, June 2016
11. Wieting, J., Bansal, M., Gimpel, K., Livescu, K.: Towards universal paraphrastic sentence embeddings. In: International Conference on Learning Representations (ICLR) (2015)
12. Arora, S., Liang, Y., Ma, T.: A simple but tough-to-beat baseline for sentence embeddings. In: International Conference on Learning Representations (ICLR) (2016)
13. Pennington, J., Socher, R., Manning, C.D.: GloVe: global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pp. 1532–1543, October 2014
14. Peters, M.E., et al.: Deep contextualized word representations. In: Proceedings of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2018)
15. McCann, B., Bradbury, J., Xiong, C., Socher, R.: Learned in translation: contextualized word vectors. In: Advances in Neural Information Processing Systems, pp. 6297–6308 (2017)
16. Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., Bowman, S.R.: GLUE: a multi-task benchmark and analysis platform for natural language understanding. In: International Conference on Learning Representations (ICLR) (2019)
17. Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R.R., Le, Q.V.: XLNet: generalized autoregressive pretraining for language understanding. In: Advances in Neural Information Processing Systems, pp. 5753–5763 (2019)
18. Sun, Y., et al.: ERNIE 2.0: a continual pre-training framework for language understanding. In: AAAI, pp. 8968–8975 (2020)
19. Islam, A., Inkpen, D.: Semantic text similarity using corpus-based word similarity and string similarity. ACM Trans. Knowl. Discov. Data (TKDD) 2(2), 1–25 (2008)
20. Li, Y., McLean, D., Bandar, Z.A., O’shea, J.D., Crockett, K.: Sentence similarity based on semantic nets and corpus statistics. IEEE Trans. Knowl. Data Eng. 18(8), 1138–1150 (2006)
21. Wu, H., Huang, H.: Sentence similarity computational model based on information content. IEICE Trans. Inf. Syst. 99(6), 1645–1652 (2016)
22. Hotelling, H.: Canonical correlation analysis (CCA). J. Educ. Psychol. 10 (1935)
23. Foster, D.P., Kakade, S.M., Zhang, T.: Multi-view dimensionality reduction via canonical correlation analysis (2008)
24. Golub, G.H., Reinsch, C.: Singular value decomposition and least squares solutions. In: Bauer, F.L. (ed.) Linear Algebra, pp. 134–151. Springer, Heidelberg (1971). https://doi.org/10.1007/978-3-662-39778-7_10

No ratings yet
A Survey of Text Similarity Approaches: Wael H. Gomaa Aly A. Fahmy
6 pages
Research Article
No ratings yet
Research Article
11 pages
Measure Term Similarity Using A Semantic Network Approach
No ratings yet
Measure Term Similarity Using A Semantic Network Approach
5 pages
A Soft Introduction To NLP - Semantic Similarity Calculations Using Python - Medium
No ratings yet
A Soft Introduction To NLP - Semantic Similarity Calculations Using Python - Medium
13 pages
Bagpack: A General Framework To Represent Semantic Relations
No ratings yet
Bagpack: A General Framework To Represent Semantic Relations
8 pages
Word-Level Neutrosophic Sentiment Similarity
No ratings yet
Word-Level Neutrosophic Sentiment Similarity
36 pages
Using Similarity Network Analysis To Improve Text Similarity Calculations
No ratings yet
Using Similarity Network Analysis To Improve Text Similarity Calculations
22 pages
Semeval-2024 Task 1: Semantic Textual Relatedness For African and Asian Languages
No ratings yet
Semeval-2024 Task 1: Semantic Textual Relatedness For African and Asian Languages
16 pages
Text Semantic Similarity
No ratings yet
Text Semantic Similarity
17 pages
UMA Literature Survey
No ratings yet
UMA Literature Survey
11 pages
Annotating Training Data For Conditional Semantic Textual Similarity Measurement Using Large Language Models
No ratings yet
Annotating Training Data For Conditional Semantic Textual Similarity Measurement Using Large Language Models
13 pages
NLP Text Similarity for Experts
No ratings yet
NLP Text Similarity for Experts
31 pages
2017 2nd and 4th Class Schedule-Final
No ratings yet
2017 2nd and 4th Class Schedule-Final
2 pages
Grade 8-Performing and Visual Arts Pva - Fetena - Net - 9aeb
100% (1)
Grade 8-Performing and Visual Arts Pva - Fetena - Net - 9aeb
115 pages
Mitiku Tamirat Profile
No ratings yet
Mitiku Tamirat Profile
1 page
2017 EC Academic Calendar
No ratings yet
2017 EC Academic Calendar
3 pages
2nd Yr Maths Summer Class Sechedule
No ratings yet
2nd Yr Maths Summer Class Sechedule
1 page
Dereje Mesfin: Sno College Department Time Lab # of Stud. Invigilator Supervisor
No ratings yet
Dereje Mesfin: Sno College Department Time Lab # of Stud. Invigilator Supervisor
1 page
Utilizing Semantic Textual Similarity For Clinical Survey Data Feature Selection
No ratings yet
Utilizing Semantic Textual Similarity For Clinical Survey Data Feature Selection
9 pages
Shimaa IsmailSemanticSimilarity
No ratings yet
Shimaa IsmailSemanticSimilarity
11 pages
Applsci 12 09691 v2
No ratings yet
Applsci 12 09691 v2
35 pages
Grade 8 CTE Student Guide
No ratings yet
Grade 8 CTE Student Guide
162 pages
Text Encoders Lack Knowledge: Leveraging Generative Llms For Domain-Specific Semantic Textual Similarity
No ratings yet
Text Encoders Lack Knowledge: Leveraging Generative Llms For Domain-Specific Semantic Textual Similarity
12 pages
Grade 8-Social Studies Fetena Net 1dc2
100% (6)
Grade 8-Social Studies Fetena Net 1dc2
213 pages
HDP Work Book Final
100% (2)
HDP Work Book Final
98 pages
Grade 8 IT Textbook Ethiopia
100% (1)
Grade 8 IT Textbook Ethiopia
115 pages
2-Lecture Two - (Back Ground of NLP)
No ratings yet
2-Lecture Two - (Back Ground of NLP)
65 pages
Network Design for IT Professionals
No ratings yet
Network Design for IT Professionals
141 pages
Kaiwartya 2016
No ratings yet
Kaiwartya 2016
17 pages
Collective Human Opinions in Semantic Textual Simi
No ratings yet
Collective Human Opinions in Semantic Textual Simi
17 pages
The Final Main Thesis-Compressed
No ratings yet
The Final Main Thesis-Compressed
85 pages
Semantic Relations in Linguistics
No ratings yet
Semantic Relations in Linguistics
239 pages
POS Tagging for NLP Students
No ratings yet
POS Tagging for NLP Students
36 pages
PVA Grade 10 Student Textbook Final Version V20220802 - Compressed
50% (2)
PVA Grade 10 Student Textbook Final Version V20220802 - Compressed
144 pages
Transformer Boosting for Text Similarity
No ratings yet
Transformer Boosting for Text Similarity
6 pages
Let2 W
No ratings yet
Let2 W
46 pages
7-Information Extraction (IE) and Machine Translation (MT)
No ratings yet
7-Information Extraction (IE) and Machine Translation (MT)
46 pages
Semantic Analysis in NLP
No ratings yet
Semantic Analysis in NLP
25 pages
9 Speech Recognition
No ratings yet
9 Speech Recognition
26 pages
8-Deep Learning For NLP
No ratings yet
8-Deep Learning For NLP
49 pages
HLLT 021 Slides
No ratings yet
HLLT 021 Slides
20 pages
Case Study Grading Rubric BUSI 2101
No ratings yet
Case Study Grading Rubric BUSI 2101
2 pages
Grammar and Adjective Exercises
No ratings yet
Grammar and Adjective Exercises
2 pages
Grammar Workshop Word Form
No ratings yet
Grammar Workshop Word Form
2 pages
Mid Exam Translation and Interpreting Practice 2024 (Autorecovered) 2
No ratings yet
Mid Exam Translation and Interpreting Practice 2024 (Autorecovered) 2
4 pages
Use of Literary Device in Advrtising
No ratings yet
Use of Literary Device in Advrtising
2 pages
Language Characteristics for Students
No ratings yet
Language Characteristics for Students
13 pages
Letters and Sounds
No ratings yet
Letters and Sounds
32 pages
Ehl 45%, 60%, 80% Strategy
No ratings yet
Ehl 45%, 60%, 80% Strategy
29 pages
Case Study 1
No ratings yet
Case Study 1
16 pages
Simple Present Tense: Compare Live To Leave. I Live in Seoul. I Have To Leave Now
No ratings yet
Simple Present Tense: Compare Live To Leave. I Live in Seoul. I Have To Leave Now
1 page
Structural Analysis of Words
No ratings yet
Structural Analysis of Words
13 pages
5th Grade - 3rd Trimester 2024 - CLASS PLAN - IGLESIAS
No ratings yet
5th Grade - 3rd Trimester 2024 - CLASS PLAN - IGLESIAS
3 pages
Avant Garde Definition
100% (2)
Avant Garde Definition
2 pages
Direct Speech
0% (2)
Direct Speech
8 pages
Student Text 15
No ratings yet
Student Text 15
238 pages
Transkrip Nilai Dewi
No ratings yet
Transkrip Nilai Dewi
6 pages
New Eng File Beginner Workbook
No ratings yet
New Eng File Beginner Workbook
2 pages
CircleOfLearning Rubric For Well Written Paragraph
No ratings yet
CircleOfLearning Rubric For Well Written Paragraph
1 page
Use of English (Grammar) : Key Word Transformation Word Formation Open Close Multiple Choice
No ratings yet
Use of English (Grammar) : Key Word Transformation Word Formation Open Close Multiple Choice
1 page
Application Letter
No ratings yet
Application Letter
15 pages
New Dhammapada
No ratings yet
New Dhammapada
183 pages
Ed205 Part 4 Chomsky Montessori Freobel Bloom
No ratings yet
Ed205 Part 4 Chomsky Montessori Freobel Bloom
19 pages
Year 9 English Workbook: Letter to the Editor
No ratings yet
Year 9 English Workbook: Letter to the Editor
38 pages
A - B - C - Mathematics 26-04-2023 (ES)
No ratings yet
A - B - C - Mathematics 26-04-2023 (ES)
39 pages
Key Concepts in ELT - Noticing
No ratings yet
Key Concepts in ELT - Noticing
1 page
Iconicity in Language - An Encyclopaedic Dictionary
100% (1)
Iconicity in Language - An Encyclopaedic Dictionary
479 pages
Present Simple Tense Exercises
50% (2)
Present Simple Tense Exercises
1 page
LL 219 - Module 4.0
No ratings yet
LL 219 - Module 4.0
15 pages
General Linguistics: Prepared By: Eram Amjed
No ratings yet
General Linguistics: Prepared By: Eram Amjed
11 pages


