

Area under the ROC curve

In most cases we are not interested in a single threshold (in the previous example we took an error of $10\%$ on the target class); we want to estimate the whole ROC curve. This can be done with dd_roc:

  >> x = target_class(gendatb([50 0]),'1');   % training set: 50 target objects
  >> w = svdd(x,0.1,7);                       % train the SVDD on x
  >> z = oc_set(gendatb(200),'1');            % test set, class '1' is the target
  >> e = dd_roc(w,z)
  >> e = dd_roc(z*w)   % other possibility
  >> e = z*w*dd_roc    % other possibility
First the classifier is trained on x for a specific threshold. Then, for varying thresholds, the classifier is evaluated on dataset z. The result is a ROC curve, returned as a matrix e with two columns: the first gives the false negatives (target objects that are rejected), the second the false positives (outlier objects that are accepted).
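
The threshold sweep described above can be illustrated without the toolbox. The following is only a minimal sketch in plain Matlab, not the implementation of dd_roc: the scores are made-up numbers, the variable names (target_scores, outlier_scores, thr) are hypothetical, and it assumes that a higher score means "more target-like" and that an object is accepted when its score reaches the threshold.

```matlab
% Hypothetical classifier outputs (assumption: higher = more target-like).
% These numbers are invented for illustration only.
target_scores  = [0.9 0.8 0.7 0.4];   % scores of true target objects
outlier_scores = [0.6 0.3 0.2 0.1];   % scores of true outlier objects

% Candidate thresholds: every observed score, plus one value above the
% maximum so that the "reject everything" end of the curve is included.
all_scores = [target_scores outlier_scores];
thr = sort([unique(all_scores) max(all_scores)+1]);

% One row per threshold: [false negative fraction, false positive fraction].
e = zeros(numel(thr), 2);
for i = 1:numel(thr)
    e(i,1) = mean(target_scores  <  thr(i));  % targets that are rejected
    e(i,2) = mean(outlier_scores >= thr(i));  % outliers that are accepted
end
disp(e)
```

Each row of e corresponds to one threshold. Raising the threshold rejects more targets (first column grows) and accepts fewer outliers (second column shrinks), which is the trade-off the ROC curve displays.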

Figure 5.1: Receiver operating characteristic curve, with the operating point indicated by the dot.

The ROC curve can be plotted by:

  >> plotroc(e);
An example of such a ROC curve is shown in figure 5.1.

In the newest version of the toolbox, the ROC plot has been extended to also show the operating point of the classifier. To use this feature, you have to supply the mapping and the dataset separately:

  >> a = oc_set(gendatb,1);
  >> w = gauss_dd(a,0.1);
  >> h = plotroc(w,a)
By moving the mouse and clicking, the user can change the position of the operating point. A new mapping with this new operating point is then stored inside the figure. This mapping can be retrieved in the Matlab workspace by:
  >> w2 = getrocw(h)
To get a feeling for this, please try the demo dd_ex8.

Because it is very hard to compare the ROC curves of different classifiers, the AUC error (Area Under the ROC Curve) is often used instead. In my definition of the AUC error, the larger the value, the better the one-class classifier. It is computed from the ROC curve values using the function dd_auc:

  >> x = target_class(gendatb([50 0]),'1');
  >> w = svdd;
  >> z = oc_set(gendatb(200),'1');
  >> e = dd_roc(w,x,z);
  >> err = dd_auc(e);
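
To make the idea of an area under the ROC curve concrete, here is a minimal sketch in plain Matlab using the trapezoidal rule. It is not the code of dd_auc and may differ from it in details. It assumes a two-column matrix holding false negative and false positive fractions, and it integrates the fraction of accepted targets over the fraction of accepted outliers, which is one common convention in which larger values are better. The matrix e below is a made-up example.

```matlab
% Hypothetical ROC matrix (invented values):
% column 1 = false negative fraction, column 2 = false positive fraction.
e = [0.00 1.00;
     0.00 0.25;
     0.25 0.25;
     0.25 0.00;
     1.00 0.00];

fp  = e(:,2);        % fraction of outliers accepted
tpr = 1 - e(:,1);    % fraction of targets accepted

% Sort by increasing false positive fraction. Ties in fp are ordered by
% tpr so that vertical segments of the curve are traversed upwards.
[~, idx] = sortrows([fp tpr]);

% Trapezoidal integration of tpr over fp.
auc = trapz(fp(idx), tpr(idx));
disp(auc)
```

A classifier that separates targets and outliers perfectly would reach an area of 1 under this convention, while a curve along the diagonal gives 0.5.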

In many cases only a restricted range of the false negatives is of interest: for instance, when we want to reject less than half of the target objects. In these cases one may want to set bounds on the range over which the AUC error is computed:

  >> e = dd_roc(w,x,z);
  >> err = dd_auc(e,[0.05 0.5]);
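
A restricted-range area can be sketched in plain Matlab as follows. This is an illustration under stated assumptions, not the implementation of dd_auc: it assumes the bounds apply to the false negative fraction, it integrates the fraction of rejected outliers over that range with linear interpolation at the bounds, and the final rescaling by the width of the range is an extra assumption added here so that a perfect classifier scores 1. The matrix e and the names bnd, area and area_norm are hypothetical.

```matlab
% Hypothetical ROC matrix (invented values):
% column 1 = false negative fraction, column 2 = false positive fraction.
e = [0.00 1.00;
     0.00 0.25;
     0.25 0.25;
     0.25 0.00;
     1.00 0.00];
bnd = [0.05 0.5];            % bounds on the false negative fraction

fn  = e(:,1);
tnr = 1 - e(:,2);            % fraction of outliers rejected

% Order the curve by increasing fn (ties ordered by tnr).
[~, idx] = sortrows([fn tnr]);
fn  = fn(idx);
tnr = tnr(idx);

% Add up the trapezoids of every segment, clipped to [bnd(1), bnd(2)].
area = 0;
for i = 1:numel(fn)-1
    a = max(fn(i),   bnd(1));    % clipped start of the segment
    b = min(fn(i+1), bnd(2));    % clipped end of the segment
    if b > a                     % implies fn(i+1) > fn(i), so no 0/0
        slope = (tnr(i+1) - tnr(i)) / (fn(i+1) - fn(i));
        ya = tnr(i) + slope * (a - fn(i));
        yb = tnr(i) + slope * (b - fn(i));
        area = area + 0.5 * (ya + yb) * (b - a);
    end
end

% Assumption: rescale by the width of the range so the maximum is 1.
area_norm = area / (bnd(2) - bnd(1));
disp([area area_norm])
```

Restricting the range in this way ignores the parts of the curve where almost no or almost all target objects are rejected, so two classifiers are compared only in the operating region that matters for the application.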


David M.J. Tax 2006-07-26