When using DBIDRangeDatabaseConnection and not giving ELKI any actual data, the visualization cannot produce a particularly useful result (it doesn't have the actual data, after all). Nor can the data be evaluated automatically.
The "all in one trivial clustering" is an artifact of the automatic attempts to visualize the data, which cannot work for the reasons discussed above. This clustering is automatically added for unlabeled data, to allow some visualizations to work.
There are two things you can do:
1. Set an output handler. For example,
   -resulthandler ResultWriter
   will produce output similar to this:
   ID=0 lof-outlier=1.0
   where ID= is the object number and lof-outlier= is the LOF outlier score. Alternatively, you can implement your own output handler. An example is found here: http://elki.dbs.ifi.lmu.de/browser/elki/trunk/src/tutorial/outlier/SimpleScoreDumper.java
2. Fix DBIDRangeDatabaseConnection. You are bitten by a bug in ELKI 0.6.0~beta1: DBIDRangeDatabaseConnection doesn't initialize its parameters correctly in the constructor. The trivial bug fix is here: http://elki.dbs.ifi.lmu.de/changeset/11027/elki
Alternatively, you can create a dummy input file and use the regular text input. A file containing
0 1 2 ...
should do the trick. Use
-dbc.in numbers100.txt -dbc.filter FixedDBIDsFilter -dbc.startid 0
to load it. The last two arguments make your IDs start at 0 instead of the default 1. This workaround will produce a slightly different output format:
ID=0 0.0 lof-outlier=1.0
where the additional column comes from the dummy file. The dummy values do not affect the LOF result when an external distance function is used, but this approach uses some additional memory.
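
The dummy input file for this workaround can be generated with a few lines of Python. This is only a minimal sketch. It assumes one value per line, so that each line becomes one object, which matches the single extra column in the output above. It also assumes 100 objects, as the file name numbers100.txt suggests. The function name write_dummy_input is made up for illustration and is not part of ELKI.

```python
from pathlib import Path


def write_dummy_input(path, n):
    """Write the integers 0 .. n-1 to `path`, one value per line.

    Assumption: each line is read as one object with a single
    numeric column. Returns the number of lines written.
    """
    lines = [str(i) for i in range(n)]
    Path(path).write_text("\n".join(lines) + "\n", encoding="ascii")
    return len(lines)


if __name__ == "__main__":
    # 100 objects, matching the file name used in the -dbc.in argument.
    write_dummy_input("numbers100.txt", 100)
```

Adjust n to the number of objects covered by your external distance function, so that every ID it refers to has a corresponding line in the file.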