
Structured prediction

Structured prediction or structured output learning is an umbrella term for supervised machine
learning techniques that involve predicting structured objects, rather than discrete or real values.[1]

Similar to commonly used supervised learning techniques, structured prediction models are typically
trained by means of observed data in which the predicted value is compared to the ground truth, and this
is used to adjust the model parameters. Due to the complexity of the model and the interrelations of
predicted variables, the processes of model training and inference are often computationally infeasible, so
approximate inference and learning methods are used.

Applications
An example application is the problem of translating a natural language sentence into a syntactic
representation such as a parse tree. This can be seen as a structured prediction problem[2] in which the
structured output domain is the set of all possible parse trees. Structured prediction is used in a wide
variety of domains including bioinformatics, natural language processing (NLP), speech recognition, and
computer vision.

Example: sequence tagging


Sequence tagging is a class of problems prevalent in NLP in which input data are often sequential, for
instance sentences of text. The sequence tagging problem appears in several guises, such as part-of-
speech tagging (POS tagging) and named entity recognition. In POS tagging, for example, each word in a
sequence must be 'tagged' with a class label representing the type of word:

This DT
is VBZ
a DT
tagged JJ
sentence. NN

The main challenge of this problem is to resolve ambiguity: in the above example, the words "sentence"
and "tagged" in English can also be verbs.
While this problem can be solved by simply performing classification of individual tokens, this approach
does not take into account the empirical fact that tags do not occur independently; instead, each tag
displays a strong conditional dependence on the tag of the previous word. This fact can be exploited in a
sequence model such as a hidden Markov model or conditional random field[2] that predicts the entire tag
sequence for a sentence (rather than just individual tags) via the Viterbi algorithm.
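As a sketch of how such a sequence model decodes, the following toy hidden Markov model tags the example sentence with the Viterbi algorithm. The tag set, transition, and emission probabilities here are illustrative values chosen for this example, not estimates from a real corpus:

```python
import math

# Toy HMM for POS tagging: states are tags, observations are words.
# All probabilities below are illustrative, not learned from data.
tags = ["DT", "NN", "VBZ", "JJ"]
start = {"DT": 0.6, "NN": 0.1, "VBZ": 0.2, "JJ": 0.1}
trans = {
    "DT":  {"DT": 0.05, "NN": 0.55, "VBZ": 0.1, "JJ": 0.3},
    "NN":  {"DT": 0.1,  "NN": 0.2,  "VBZ": 0.5, "JJ": 0.2},
    "VBZ": {"DT": 0.4,  "NN": 0.3,  "VBZ": 0.1, "JJ": 0.2},
    "JJ":  {"DT": 0.05, "NN": 0.75, "VBZ": 0.1, "JJ": 0.1},
}
emit = {
    "DT":  {"this": 0.5, "a": 0.5},
    "NN":  {"sentence": 0.6, "this": 0.2, "tagged": 0.2},
    "VBZ": {"is": 0.7, "tagged": 0.3},
    "JJ":  {"tagged": 0.8},
}

def viterbi(words):
    """Return the most probable tag sequence for `words` (log-space)."""
    # Column 0: start probability times emission of the first word.
    V = [{t: math.log(start[t]) + math.log(emit[t].get(words[0], 1e-12))
          for t in tags}]
    back = []
    for w in words[1:]:
        col, ptr = {}, {}
        for t in tags:
            # Best previous tag for reaching tag t at this position.
            best_prev = max(tags, key=lambda p: V[-1][p] + math.log(trans[p][t]))
            col[t] = (V[-1][best_prev] + math.log(trans[best_prev][t])
                      + math.log(emit[t].get(w, 1e-12)))
            ptr[t] = best_prev
        V.append(col)
        back.append(ptr)
    # Backtrace from the best final tag.
    last = max(tags, key=lambda t: V[-1][t])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

print(viterbi(["this", "is", "a", "tagged", "sentence"]))
```

Because the model scores whole tag sequences, the transition probabilities let the tagger prefer JJ for "tagged" after a determiner even though the word could also be tagged as a verb.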

Techniques
Probabilistic graphical models form a large class of structured prediction models. In particular, Bayesian
networks and random fields are popular. Other algorithms and models for structured prediction include
inductive logic programming, case-based reasoning, structured SVMs, Markov logic networks,
Probabilistic Soft Logic, and constrained conditional models. The main techniques are:

Conditional random fields
Structured support vector machines
Structured k-nearest neighbours
Recurrent neural networks, in particular Elman networks
Transformers

Structured perceptron
One of the easiest ways to understand algorithms for general structured prediction is the structured
perceptron by Collins.[3] This algorithm combines the perceptron algorithm for learning linear classifiers
with an inference algorithm (classically the Viterbi algorithm when used on sequence data) and can be
described abstractly as follows:

1. First, define a function φ(x, y) that maps a training sample x and a candidate prediction y to
a feature vector of length n (x and y may have any structure; n is problem-dependent, but must be
fixed for each model). Let GEN(x) be a function that generates candidate predictions for an input x.
2. Then:

Let w be a weight vector of length n

For a predetermined number of iterations:

    For each sample x in the training set with true output t:

        Make a prediction ŷ = argmax over y in GEN(x) of w · φ(x, y)

        Update w (from ŷ towards t): w = w + c(φ(x, t) − φ(x, ŷ)), where c is
        the learning rate.

In practice, finding the argmax over GEN(x) is done using an algorithm such as Viterbi or max-sum,
rather than an exhaustive search through an exponentially large set of candidates.
The idea of learning is similar to that for multiclass perceptrons.
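The steps above can be sketched in code. The following minimal structured perceptron trains on the POS-tagging example, with a feature map φ counting (tag, word) emissions and (previous tag, tag) transitions. For brevity, GEN(x) here enumerates all tag sequences exhaustively, where a real implementation would use Viterbi; the tiny training set and tag set are assumptions made up for illustration:

```python
from itertools import product
from collections import Counter

def phi(words, tags):
    """Feature map phi(x, y): counts of (tag, word) emissions and (prev, tag) transitions."""
    feats = Counter()
    prev = "<s>"
    for w, t in zip(words, tags):
        feats[("emit", t, w)] += 1
        feats[("trans", prev, t)] += 1
        prev = t
    return feats

def score(w, feats):
    """Linear score w . phi(x, y)."""
    return sum(w.get(f, 0.0) * v for f, v in feats.items())

def predict(w, words, tagset):
    # GEN(x): all tag sequences; exhaustive argmax stands in for Viterbi here.
    return max(product(tagset, repeat=len(words)),
               key=lambda tags: score(w, phi(words, tags)))

def train(data, tagset, iters=10, c=1.0):
    w = Counter()
    for _ in range(iters):
        for words, gold in data:
            pred = predict(w, words, tagset)
            if list(pred) != list(gold):
                # Perceptron update: w = w + c * (phi(x, t) - phi(x, y_hat))
                for f, v in phi(words, gold).items():
                    w[f] += c * v
                for f, v in phi(words, pred).items():
                    w[f] -= c * v
    return w

# Toy training data (illustrative, not a real corpus).
data = [(["this", "is", "a", "tagged", "sentence"],
         ["DT", "VBZ", "DT", "JJ", "NN"]),
        (["a", "sentence"], ["DT", "NN"])]
tagset = ["DT", "NN", "VBZ", "JJ"]
w = train(data, tagset)
print(list(predict(w, ["this", "is", "a", "tagged", "sentence"], tagset)))
```

Note that the update touches only the features where the gold and predicted sequences disagree, which is what makes the weights converge once the model scores the correct sequence highest.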

References
1. Gökhan Bakır, Ben Taskar, Thomas Hofmann, Bernhard Schölkopf, Alex Smola and S. V. N.
Vishwanathan (2007), Predicting Structured Data
(https://mitpress.mit.edu/books/predicting-structured-data), MIT Press.
2. Lafferty, J.; McCallum, A.; Pereira, F. (2001). "Conditional random fields: Probabilistic
models for segmenting and labeling sequence data"
(http://www.cis.upenn.edu/~pereira/papers/crf.pdf) (PDF). Proc. 18th International Conf. on
Machine Learning. pp. 282–289.
3. Collins, Michael (2002). Discriminative training methods for hidden Markov models: Theory
and experiments with perceptron algorithms (http://acl.ldc.upenn.edu/W/W02/W02-1001.pdf)
(PDF). Proc. EMNLP. Vol. 10.

Further reading

Noah Smith, Linguistic Structure Prediction (https://www.cs.cmu.edu/~nasmith/LSP/), 2011.
Michael Collins, Discriminative Training Methods for Hidden Markov Models
(https://www.cs.columbia.edu/~mcollins/papers/tagperc.pdf), 2002.

External links
Implementation of Collins structured perceptron (https://github.com/ashish01/CollinsTagger)

Retrieved from "https://en.wikipedia.org/w/index.php?title=Structured_prediction&oldid=1273324430"
