Classification
Classification is the activity of assigning objects to some pre-existing classes or categories. This is
distinct from the task of establishing the classes themselves (for example through cluster
analysis).[1] Examples include diagnostic tests, identifying spam emails and deciding whether to
give someone a driving license.
As well as 'category', synonyms or near-synonyms for 'class' include 'type', 'species', 'form', 'order',
'concept', 'taxon', 'group', 'identification' and 'division'.
The word 'classification' (and its synonyms) may take on one of several related
meanings. It may encompass both classification and the creation of classes, as for example in 'the
task of categorizing pages in Wikipedia'; this overall activity is listed under taxonomy. It may refer
exclusively to the underlying scheme of classes (which otherwise may be called a taxonomy). Or it
may refer to the label given to an object by the classifier.
Classification is a part of many different kinds of activities and is studied from many different
points of view including medicine, philosophy,[2] law, anthropology, biology, taxonomy, cognition,
communications, knowledge organization, psychology, statistics, machine learning, economics and
mathematics.
Binary vs multiclass classification
Methodological work aimed at improving the accuracy of a classifier is commonly divided between
cases where there are exactly two classes (binary classification) and cases where there are three or
more classes (multiclass classification).
Evaluation of accuracy
Unlike in decision theory, it is assumed that a classifier repeats the classification task over and over.
And unlike a lottery, it is assumed that each classification can be either right or wrong; in the theory
of measurement, classification is understood as measurement against a nominal scale. Thus it is
possible to try to measure the accuracy of a classifier.
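As a minimal sketch of this idea, assuming a classifier is simply a function from an object to a class label, accuracy can be estimated as the fraction of repeated classifications that turn out right. The names `accuracy` and `toy_spam_rule`, and the four labelled messages, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Iterable, Tuple


def accuracy(classifier: Callable[[str], str],
             labelled_cases: Iterable[Tuple[str, str]]) -> float:
    """Fraction of cases for which the classifier's label matches the true label."""
    cases = list(labelled_cases)
    if not cases:
        raise ValueError("at least one labelled case is required")
    correct = sum(1 for obj, true_label in cases if classifier(obj) == true_label)
    return correct / len(cases)


def toy_spam_rule(message: str) -> str:
    """Hypothetical classifier: assigns one of two pre-existing classes."""
    return "spam" if "prize" in message.lower() else "not spam"


# Illustrative data only: (object, true class) pairs.
cases = [
    ("Claim your PRIZE now", "spam"),
    ("Meeting moved to 3pm", "not spam"),
    ("You won a prize", "spam"),
    ("Cheap loans available", "spam"),  # the toy rule gets this one wrong
]

toy_accuracy = accuracy(toy_spam_rule, cases)
print(toy_accuracy)
```

Each classification here is either right or wrong, which is what makes a single proportion meaningful; the classes themselves ('spam', 'not spam') are fixed in advance rather than discovered from the data.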
Measuring the accuracy of a classifier allows a choice to be made between two alternative
classifiers. This is important both when developing a classifier and in choosing which classifier to
deploy. There are, however, many different methods for evaluating the accuracy of a classifier and no
general method for determining which one should be used in which circumstances. Different
fields have taken different approaches, even in binary classification (see Evaluation of binary
classifiers). In pattern recognition, error rate is popular. The Gini coefficient and KS statistic are
widely used in the credit scoring industry. Sensitivity and specificity are widely used in epidemiology
and medicine. Precision and recall are widely used in information retrieval.[3]
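Several of the measures named above can be computed from the same four counts in a binary setting: true positives, false positives, false negatives and true negatives. The following is an illustrative sketch using the standard definitions; the function names and the ten toy labels are assumptions made for the example, not taken from any particular field's tooling.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(actual: Sequence[int], predicted: Sequence[int],
                     positive: int = 1) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for a binary classification task."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def _ratio(numerator: int, denominator: int) -> float:
    """Safe division: returns NaN when the measure is undefined."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(actual: Sequence[int], predicted: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    tp, fp, fn, tn = confusion_counts(actual, predicted, positive)
    return {
        "error_rate": _ratio(fp + fn, tp + fp + fn + tn),  # pattern recognition
        "sensitivity": _ratio(tp, tp + fn),   # epidemiology and medicine
        "specificity": _ratio(tn, tn + fp),   # epidemiology and medicine
        "precision": _ratio(tp, tp + fp),     # information retrieval
        "recall": _ratio(tp, tp + fn),        # same quantity as sensitivity
    }


# Illustrative labels only: 1 = positive class, 0 = negative class.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = binary_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Recall and sensitivity are the same ratio under different names, which illustrates how fields have arrived at overlapping but differently framed measures.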
Classifier accuracy depends greatly on the characteristics of the data to be classified. There is no
single classifier that works best on all given problems (a phenomenon that may be explained by the
no-free-lunch theorem).
See also
Class (disambiguation)
Classified (disambiguation)
Classifier (disambiguation)
Cognitive categorization
Data classification (disambiguation)
Classification theorem
Folk taxonomy
Fuzzy classification
References
1. "The Classification Society | Scientific Classification Organization" (https://www.theclassificationsociety.org/about/).
2. "Classification" (https://iep.utm.edu/classification-in-science/). Internet Encyclopedia of
Philosophy. Retrieved 10 January 2025.
3. David Hand (2012). "Assessing the Performance of Classification Methods". International
Statistical Review. 80 (3): 400–414. doi:10.1111/j.1751-5823.2012.00183.x (https://doi.org/10.1111%2Fj.1751-5823.2012.00183.x).