Next: Inspecting one-class classifiers Up: Classifiers Previous: Prtools classifiers   Contents   Index


Creating one-class classifiers

One-class classifiers should be trained on datasets of the kind introduced in the previous chapter. Many one-class classifiers cannot make use of example outliers in their training data. If you supply outlier objects, they may complain or silently ignore them. For now, I call it the responsibility of the user...

All one-class classifiers share the same characteristics:

  1. Their names end in dd,
  2. Their second argument is always the error they may make on the target class (the fraction of false negatives),
  3. Their third argument should characterize the complexity of the classifier. At one extreme of the parameter range the error on the target class is low, but the error on the outlier class is high (the model is undertrained). At the other extreme the error on the target class is high, but the error on the outliers is low (the model is overtrained). This complexity parameter can be optimized using consistent_occ.
  4. The mapping should output the labels target and outlier.
  5. The mapping should contain a parameter threshold, which defines the separation between the target and the outlier class. In practice this is only of interest to programmers who want to implement a classifier themselves; see section 4.6.
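
The trade-off described in item 3 can be illustrated without the toolbox. The sketch below is only an illustration under stated assumptions: a plain kernel density estimate stands in for a one-class classifier, its kernel width plays the role of the complexity parameter, and the data are artificial. None of the names (xtr, xte, xout, sqd, dens, widths, err) are dd_tools functions or variables.

```matlab
% Sketch only: a plain kernel density estimate stands in for a one-class
% classifier. All names are hypothetical; nothing here is dd_tools code.
xtr  = randn(50, 2);             % training targets
xte  = randn(500, 2);            % independent targets
xout = 8 * rand(500, 2) - 4;     % artificial outliers, uniform on [-4,4]^2
fracrej = 0.1;                   % fraction of training targets to reject

% Squared Euclidean distances between the rows of a and the rows of b.
sqd  = @(a, b) bsxfun(@plus, sum(a.^2, 2), sum(b.^2, 2)') - 2 * a * b';
% Unnormalised kernel density of the rows of a, given the training set b.
dens = @(a, b, h) mean(exp(-sqd(a, b) / (2 * h^2)), 2);

widths = [0.05 0.5 5];           % the complexity parameter (kernel width)
err = zeros(numel(widths), 2);   % columns: target error, outlier error
for i = 1:numel(widths)
    h = widths(i);
    ptr = sort(dens(xtr, xtr, h));                  % training densities
    thr = ptr(round(fracrej * size(xtr, 1)) + 1);   % threshold on the density
    err(i, 1) = mean(dens(xte,  xtr, h) <  thr);    % targets rejected
    err(i, 2) = mean(dens(xout, xtr, h) >= thr);    % outliers accepted
end
disp([widths' err]);             % one row per width: width, target error, outlier error
```

A very small width fits the training objects tightly, so independent targets tend to be rejected while outliers are rarely accepted. A very large width gives a loose description that accepts most targets and many outliers. The exact numbers change from run to run because the data are random.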

An example of a one-class classifier:

  >> x = target_class(gendatb([20 0]),'1');
  >> w = gauss_dd(x,0.1)
This trains a classifier gauss_dd on data x (this particular classifier just estimates a Gaussian density on the target class). The threshold is set such that $10\%$ of the training target objects are rejected and classified as outlier, so the fraction of false negatives is $0.1$. (Note that the threshold is optimized on the training data, so the performance on an independent test set may deviate significantly!) Other parameters can be given after this rejection fraction; for the $k$-means clustering method, for instance, the next parameter is the number of clusters $k$.
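
To make the role of the rejection fraction concrete, here is a minimal sketch in base MATLAB, without Prtools or dd_tools. It assumes a Gaussian model with a threshold on the squared Mahalanobis distance, and it uses random data in place of the banana set. It is not the gauss_dd source, and all names (x, xt, fracrej, maha, thr) are hypothetical.

```matlab
% Sketch only: shows how a rejection threshold can be placed on a Gaussian
% model. It is not the gauss_dd source; all names are hypothetical.
x  = randn(20, 2);             % stand-in for 20 target objects in 2-D
xt = randn(1000, 2);           % independent target objects, same distribution
fracrej = 0.1;                 % fraction of training targets to reject

% Estimate the Gaussian on the training targets.
mu = mean(x, 1);
S  = cov(x);

% Squared Mahalanobis distance to the estimated mean; thresholding this
% distance is equivalent to thresholding the Gaussian density itself.
maha = @(a) sum(((a - repmat(mu, size(a, 1), 1)) / S) .* ...
                 (a - repmat(mu, size(a, 1), 1)), 2);
d2 = maha(x);

% Place the threshold so that round(fracrej*n) training objects lie outside.
n        = size(x, 1);
d2sorted = sort(d2);
thr      = d2sorted(n - round(fracrej * n));

rej_train = mean(d2 > thr);        % fraction of training targets rejected
rej_test  = mean(maha(xt) > thr);  % fraction of independent targets rejected
fprintf('rejected on training set: %.2f\n', rej_train);
fprintf('rejected on test set:     %.2f\n', rej_test);
```

On the training set the rejected fraction equals fracrej by construction. The fraction on the independent set varies from run to run, which is the deviation the note above warns about.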

These one-class classifiers are normal mappings in the Prtools sense, so they can be plotted with plotc and combined with other mappings using [], *, etc. To check whether a classifier is a one-class classifier (i.e. whether it labels objects as target or outlier), use isocc.

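  >> x = oc_set(gendatb([50,10]),'1')      % class '1' becomes target, the rest outlier
  >> scatterd(x)                           % scatter plot of the data
  >> w = svdd(target_class(x),0.1,8);      % train on the target objects only
  >> plotc(w)                              % draw the decision boundary
  >> w = svdd(x,0.1,8);                    % train again, now with the outliers included
  >> plotc(w)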
[Figure: examp_scat]


David M.J. Tax 2006-07-26