

Generating artificial outliers

When you are not fortunate enough to have example outliers available for testing, you can create them yourself. Say that z is a set of test target objects. Artificial outliers can then be generated by:

  >> z_o = gendatblockout(z,100)
This creates a new dataset from z, containing both the target objects from z and 100 new artificial outliers. The outliers are drawn from a uniform box-shaped distribution around z.

This works well in practice for low-dimensional datasets. For higher dimensions, it becomes very inefficient: most of the volume of the box, and therefore most of the generated outliers, lies in the 'corners' of the box, far from the target data. In these cases it is better to generate data uniformly in a sphere.
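
The 'corners' effect can be made concrete with a small calculation. The sketch below is plain MATLAB and not part of the toolbox; it assumes, purely for illustration, that the target data roughly fills the sphere inscribed in the box [-1,1]^d. It computes the fraction of the box volume that lies inside that sphere and checks one value by simulation. The variable names are hypothetical.

```matlab
% Fraction of a box's volume that lies inside its inscribed sphere,
% for several dimensionalities d (plain MATLAB, no toolbox needed).
d = [2 5 10 20];
% Volume of the unit-radius d-ball divided by the volume of the cube [-1,1]^d.
frac = pi.^(d/2) ./ (2.^d .* gamma(d/2 + 1));

% Monte Carlo check for one dimensionality: draw points uniformly in the
% box and count how many fall inside the inscribed sphere.
n  = 1e5;
dd = 10;
x  = 2*rand(n, dd) - 1;          % uniform in [-1,1]^dd
inside = sum(x.^2, 2) <= 1;      % squared distance to the origin at most 1
fprintf('d = %d: analytic %.4g, simulated %.4g\n', ...
        dd, frac(d == dd), mean(inside));
```

In 2 dimensions the sphere covers about 79% of the box, but in 10 dimensions it covers only about 0.25%, so nearly all box-uniform outliers end up far away from the data.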

  >> z_o = gendatout(z,100)
In this version, the tightest hypersphere around the data is fitted. Given the center and radius of this sphere, data can be generated uniformly inside it by randsph (this is not trivial!).
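
To illustrate why uniform sampling in a hypersphere needs some care, here is a minimal sketch of one standard approach: draw isotropic Gaussian directions, normalize them, and scale the radii by U^(1/d). It is not necessarily how randsph is implemented, and the center c and radius R are assumed to be given (fitting the tightest sphere is not shown). All variable names are hypothetical; the implicit expansion used here requires MATLAB R2016b or newer.

```matlab
% Sketch: n points uniformly distributed inside a d-dimensional sphere
% with center c (1 x d) and radius R. Illustrative stand-in, not randsph.
n = 100;  c = [0 0 0];  R = 2;
d = numel(c);
g = randn(n, d);                   % isotropic Gaussian vectors
g = g ./ sqrt(sum(g.^2, 2));       % normalize: uniform directions on the unit sphere
r = R * rand(n, 1).^(1/d);         % radii with density proportional to r^(d-1)
x = c + r .* g;                    % scale the directions and shift to the center
```

Drawing the radius uniformly in [0,R] instead would put too many points near the center, because the volume of a thin shell grows as r^(d-1); the exponent 1/d corrects for this.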



David M.J. Tax 2006-07-26