Creating one-class classifiers
The one-class classifiers should be trained on the datasets from the
previous chapter. Many one-class classifiers cannot make use of
example outliers in their training data. If you supply outliers, they
may complain, or silently ignore the outlier objects. For now, I leave
it to the user to supply suitable training data.
All one-class classifiers share the same characteristics:
- Their names end in dd.
- Their second argument is always the error they may make on the
target class (the fraction of false negatives).
- Their third argument characterizes the complexity of the
classifier. At one extreme of the parameter values the error on the
targets is low but the error on the outlier class is high (the model
is undertrained). At the other extreme the error on the target class
is high but the error on the outliers is low (the model is
overtrained). This complexity parameter can be optimized using
consistent_occ.
- The mapping should output the labels target and outlier.
- The mapping should contain a parameter threshold which defines the
separation between the target and outlier class. In practice, this
is only of interest to programmers who want to implement a
classifier themselves; see section 4.6.
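The calling conventions listed above can be sketched in a short session. This is an illustration, not an excerpt from the toolbox: it assumes dd_tools and Prtools are on the path, that kmeans_dd takes its arguments in the same (data, rejection fraction, complexity) order as the other classifiers, and the dataset sizes and parameter values are arbitrary choices.

```matlab
% Training data: target objects only (class '1' of the banana set).
x = target_class(gendatb([50 0]),'1');

% The same calling pattern for every classifier:
% (data, error on the target class, complexity parameter).
fracrej = 0.1;                  % accepted error on the target class
w1 = gauss_dd(x,fracrej);       % no complexity parameter supplied
w2 = kmeans_dd(x,fracrej,5);    % third argument: number of clusters k
w3 = svdd(x,fracrej,8);         % third argument: kernel width

% Apply a trained mapping to new data and read off the labels,
% which should be 'target' or 'outlier'.
z = oc_set(gendatb([20 20]),'1');
lab = z*w2*labeld
```

Changing only the third argument (for example the 5 in kmeans_dd) moves the classifier between the two extremes described in the list, while the second argument keeps the error on the training targets fixed.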
An example of training a one-class classifier:
>> x = target_class(gendatb([20 0]),'1');
>> w = gauss_dd(x,0.1)
This trains a classifier gauss_dd on data x (this particular
classifier just estimates a Gaussian density on the target class). A
threshold is set such that 10% of the training target objects will be
rejected and classified as outlier, so the fraction of false negatives
will be 0.1. (Note that this is optimized on the training data. This
means that the performance on an independent test set might deviate
significantly!) After this rejection threshold, other parameters can
be given (for instance, for the k-means clustering method, it is the
number of clusters k).
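How such a threshold ends up rejecting a fixed fraction of the training targets can be sketched in plain MATLAB, without the toolbox. This is a minimal sketch and not the implementation used by gauss_dd: the data are synthetic, and all variable names (X, thr, nkeep and so on) are hypothetical.

```matlab
% Synthetic target data (hypothetical, for illustration only).
X = randn(200,2);
fracrej = 0.1;                     % fraction of training targets to reject

% Fit a Gaussian: sample mean and covariance.
mu = mean(X,1);
S  = cov(X);

% Squared Mahalanobis distance of each training object to the mean;
% a larger distance means a lower Gaussian density.
D  = bsxfun(@minus,X,mu);
d2 = sum((D/S).*D,2);

% Place the threshold so that a fraction fracrej of the training
% objects falls outside it.
n      = size(X,1);
nkeep  = n - round(fracrej*n);     % number of training objects to accept
sorted = sort(d2);
thr    = sorted(nkeep);

isOutlier     = d2 > thr;
trainRejected = mean(isOutlier)    % equals fracrej on the training data (barring ties)

% On fresh data from the same distribution the rejected fraction
% is only approximately fracrej.
Xtest = randn(1000,2);
Dt    = bsxfun(@minus,Xtest,mu);
testRejected = mean(sum((Dt/S).*Dt,2) > thr)
```

The gap between trainRejected and testRejected illustrates the warning above: the rejection rate is exact only on the data used to set the threshold.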
These one-class classifiers are normal mappings in the Prtools
sense. They can therefore be plotted with plotc and combined with
other mappings using [], *, etc. To check whether a classifier is a
one-class classifier (i.e. it labels objects as target or outlier),
use isocc.
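>> x = oc_set(gendatb([50,10]),'1')       % 50 targets, 10 example outliers
>> scatterd(x)                            % scatter plot of the data
>> w = svdd(target_class(x),0.1,8);       % train on the target objects only
>> plotc(w)                               % draw the decision boundary
>> w = svdd(x,0.1,8);                     % train with the outliers included
>> plotc(w)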
![Scatter plot of the example data (examp_scat)](img6.png)
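Because the threshold is fitted on the training data, it can be useful to check a trained mapping on an independent set. The session below is a hedged sketch: it assumes dd_tools and Prtools are on the path and that dd_error accepts a dataset and a trained mapping and returns the error on the target class followed by the error on the outlier class. Check `help dd_error` before relying on this. The test set sizes are arbitrary.

```matlab
% Train on the target objects only, as in the example above.
a = oc_set(gendatb([50 10]),'1');
w = svdd(target_class(a),0.1,8);

% Confirm that w labels objects as target or outlier.
isocc(w)

% Independent test set containing both targets and outliers.
z = oc_set(gendatb([100 100]),'1');

% Assumed output: [fraction of targets rejected,
%                  fraction of outliers accepted].
e = dd_error(z,w)
```

The first entry of e can be compared with the 0.1 passed as second argument to svdd to see how far the test performance deviates from the training setting.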
David M.J. Tax
2006-07-26