C#.Net String Matching Algorithms

Welcome to the MatchingAlgos wiki!

Matching Algorithm Types

Deterministic Matching

An exact match on a data attribute. For example, if two records share the same social security number, they are treated as referring to the same patient.
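As a minimal sketch of the idea, assuming a hypothetical `PatientRecord` type and a made-up identifier format (neither is taken from this repository), an exact match on a single attribute could look like this in C# 9 or later:

```csharp
using System;
using System.Linq;

// Usage: the same (made-up) number written two ways still matches exactly.
var first = new PatientRecord("Julie Smith", "123-45-6789");
var second = new PatientRecord("J. Smith", "123 45 6789");
Console.WriteLine(DeterministicMatcher.IsMatch(first, second)); // True

// Hypothetical record type for illustration; the property names are assumptions.
public record PatientRecord(string Name, string SocialSecurityNumber);

public static class DeterministicMatcher
{
    // Keeps only the digits so that formatting differences do not block a match.
    private static string Normalize(string ssn) =>
        new string((ssn ?? string.Empty).Where(char.IsDigit).ToArray());

    // Records match only when the normalized identifiers are identical and non-empty.
    public static bool IsMatch(PatientRecord a, PatientRecord b)
    {
        string left = Normalize(a.SocialSecurityNumber);
        string right = Normalize(b.SocialSecurityNumber);
        return left.Length > 0 && string.Equals(left, right, StringComparison.Ordinal);
    }
}
```

Normalizing before comparing is a design choice of this sketch: the comparison itself stays exact, but punctuation and spacing are not allowed to cause a false mismatch.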

Probabilistic Matching

A statistical approach that evaluates the probability that two records represent the same person. Each data element is assigned a score, and the scores are added to produce a final score. If the final score meets a predetermined threshold, the records are declared a match with a corresponding degree of confidence.
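The score-and-threshold idea can be sketched as follows. The field names, weights, and threshold are illustrative assumptions, not values from this repository, and the sketch only rewards agreement; a fuller probabilistic model would typically also penalize disagreement.

```csharp
using System;
using System.Collections.Generic;

// Illustrative weights and threshold (assumptions made for this example).
var weights = new Dictionary<string, double>
{
    ["LastName"] = 4.0,
    ["FirstName"] = 2.0,
    ["DateOfBirth"] = 5.0,
    ["PostalCode"] = 1.5,
};
const double threshold = 8.0;

var recordA = new Dictionary<string, string>
{
    ["LastName"] = "Smith", ["FirstName"] = "Julie",
    ["DateOfBirth"] = "1980-02-14", ["PostalCode"] = "10001",
};
var recordB = new Dictionary<string, string>
{
    ["LastName"] = "smith", ["FirstName"] = "Julia",
    ["DateOfBirth"] = "1980-02-14", ["PostalCode"] = "10002",
};

// LastName and DateOfBirth agree, so the score is 4.0 + 5.0 = 9.0, above the threshold.
double score = ProbabilisticMatcher.Score(recordA, recordB, weights);
Console.WriteLine(score >= threshold); // True

public static class ProbabilisticMatcher
{
    // Adds the weight of every field on which both records agree (case-insensitive).
    public static double Score(
        IReadOnlyDictionary<string, string> a,
        IReadOnlyDictionary<string, string> b,
        IReadOnlyDictionary<string, double> weights)
    {
        double total = 0.0;
        foreach (var entry in weights)
        {
            if (a.TryGetValue(entry.Key, out var left) &&
                b.TryGetValue(entry.Key, out var right) &&
                string.Equals(left.Trim(), right.Trim(), StringComparison.OrdinalIgnoreCase))
            {
                total += entry.Value;
            }
        }
        return total;
    }
}
```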

Types of Probabilistic Matching Algorithms

Cosine similarity

Here are two very short texts to compare:

Text 1: Julie loves me more than Linda loves me
Text 2: Jane likes me more than Julie loves me

We want to know how similar these texts are, purely in terms of word counts (ignoring word order). We begin by listing the words that appear in either text:

me, Julie, loves, Linda, than, more, likes, Jane

Now we count the number of times each of these words appears in each text:

| Word  | Text 1 | Text 2 |
|-------|--------|--------|
| me    | 2      | 2      |
| Jane  | 0      | 1      |
| Julie | 1      | 1      |
| Linda | 1      | 0      |
| likes | 0      | 1      |
| loves | 2      | 1      |
| more  | 1      | 1      |
| than  | 1      | 1      |

We are not interested in the words themselves, only in the two columns of counts, read as vectors. For instance, 'me' appears twice in each text. We decide how close the two texts are by calculating one function of those two vectors, namely the cosine of the angle between them. The two vectors are, again:

a: [2, 0, 1, 1, 0, 2, 1, 1]
b: [2, 1, 1, 0, 1, 1, 1, 1]

The cosine of the angle between them is (a · b) / (|a| |b|) = 9 / (√12 × √10), which is about 0.822. These vectors are 8-dimensional. A virtue of cosine similarity is that it turns a comparison that is beyond human ability to visualise into one that is easy to picture: a single angle. In this case the angle is about 35 degrees, which is some 'distance' from zero, where zero would mean perfect agreement.
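The calculation above can be reproduced with a short C# sketch. The class and method names (`CosineSimilarity`, `Compare`) are hypothetical and not necessarily those used in this repository; words are split on spaces and compared case-sensitively, which is an assumption of this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string text1 = "Julie loves me more than Linda loves me";
string text2 = "Jane likes me more than Julie loves me";

double similarity = CosineSimilarity.Compare(text1, text2);        // about 0.822
double angleDegrees = Math.Acos(similarity) * 180.0 / Math.PI;     // about 35 degrees
Console.WriteLine(similarity);
Console.WriteLine(angleDegrees);

public static class CosineSimilarity
{
    // Counts how often each space-separated word occurs (case-sensitive).
    private static Dictionary<string, int> WordCounts(string text) =>
        text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .GroupBy(word => word)
            .ToDictionary(group => group.Key, group => group.Count());

    public static double Compare(string first, string second)
    {
        var a = WordCounts(first);
        var b = WordCounts(second);

        // Dot product: only words present in both texts contribute a non-zero term.
        double dot = a.Where(pair => b.ContainsKey(pair.Key))
                      .Sum(pair => (double)pair.Value * b[pair.Key]);

        // Euclidean length of each count vector.
        double normA = Math.Sqrt(a.Values.Sum(count => (double)count * count));
        double normB = Math.Sqrt(b.Values.Sum(count => (double)count * count));

        // A text with no words has no direction, so this sketch returns 0 for it.
        if (normA == 0 || normB == 0) return 0.0;
        return dot / (normA * normB);
    }
}
```

Storing the counts in dictionaries avoids building the full 8-dimensional vectors explicitly: a word missing from one text would contribute zero to the dot product anyway.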

Repository: amilathennakoon/String-Matching-Algorithms
