Unified Online-learning Library
Documentation
icl_loadDS
[data groundTruth dim] = icl_loadDS(mode, func, ND, NG, noise, minPath)
This module generates the data for learning tasks.

It supports several ways of obtaining benchmark data for experiments. Several synthetic sources generate data with specific properties, and a general interface loads datasets from files. For synthetic datasets, ground truth data (i.e. noise-free and evenly spaced) is generated in addition to the training data, so that results can be compared against it. For datasets loaded from files, some parameters are ignored because they only make sense for automatic generation (e.g. the number of training samples). All inputs are scaled to [-10,10] and all outputs to [-1,1].
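
The scaling mentioned above could look roughly like the following sketch. It assumes plain min-max rescaling, which this page does not confirm; the variable names and sample values are hypothetical and not part of the library.

```matlab
% Assumption: plain min-max rescaling; the library may scale differently.
X = [0 5; 2 9; 4 7];      % hypothetical raw inputs, one sample per row
y = [3; 11; 7];           % hypothetical raw targets

% Map each input column linearly onto [-10, 10].
lo = min(X, [], 1);
hi = max(X, [], 1);
Xs = 20 * bsxfun(@rdivide, bsxfun(@minus, X, lo), hi - lo) - 10;

% Map the targets linearly onto [-1, 1].
ys = 2 * (y - min(y)) / (max(y) - min(y)) - 1;
```

A constant input column would cause a division by zero in this sketch, so real data may need a guard for that case.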


Inputs:
mode : operating mode (regression=1, classification=2)
func : string describing the function to generate; you can choose from
- sine : Sine function, 1D, y = sin(x)
- linear : linear function, 1D, y = 0.1*x
- poly : 6th order polynomial function, 1D
- nonlin : nonlinear function, 1D, y = 2*abs(x).*exp(-abs(x/2))-1
- linear2 : linear function, 2D, y = 0.03*x(1)+0.07*x(2)
- twocircles : min distance to corners function, 2D, y = (min((norm(x-[-10 10])), (norm(x-[10 -10])))-11.39)/11.3
- crossedridge : crossed ridge function, 2D, y = 1.6211*max([exp(-0.3*x(1)^2), exp(-0.09*x(2)^2), 1.25*exp(-0.1*(x(1)^2+x(2)^2))])-1
- spiral : spiral loop, 2D (typical classification task)
- linear3 : linear function, 3D, y = 0.1*x(1)+0.3*x(2)-0.1*x(3)
- highdimlin : linear hyperplane, 20D, y = randn(1,dim)*x
- highdimnonlin : squareroot hyperplane, 20D, y = randn(1,dim)*sqrt(abs(x))
- relearn : 3rd-order polynomial changing after half of the training data, 1D
- drift : 7th-order polynomial drifting from one to another. The first third uses the first polynomial, the last third the second polynomial, with a gradual drift in between.
- dataset... : if the string starts with dataset (as a subfolder), the string is interpreted as a relative path to a file. The file should contain columns of inputs and one column of the target output, separated by spaces.
ND : integer > 0, number of data samples
NG : integer >= 0, number of ground truth data samples per dimension -> a regular grid is generated
noise : double >= 0, level of the additive Gaussian noise, given in standard deviations
minPath : boolean; if true, the data follows a fixed path through the training data with minimal distance between subsequent samples, starting at x_i=-10 for all i; otherwise no sorting is done, i.e. the data order is random
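
To make the roles of ND, NG, noise and minPath more concrete, here is a minimal sketch that builds a comparable dataset by hand for the documented 'linear2' function. It is not the library's implementation. It assumes that noise acts as a standard deviation, that the path is built by greedy nearest-neighbour ordering, and that the target is stored as the last column; the names dataSketch and groundTruthSketch are hypothetical.

```matlab
% Sketch only: mimics the documented parameters, not the library's code.
ND = 50;        % number of training samples
NG = 5;         % ground-truth samples per dimension
noise = 0.05;   % assumed here to be a standard deviation
f = @(X) 0.03 * X(:, 1) + 0.07 * X(:, 2);   % 'linear2' as documented

% Ground truth: noise-free targets on a regular NG-by-NG grid over [-10,10]^2.
g = linspace(-10, 10, NG);
[G1, G2] = ndgrid(g, g);
Xg = [G1(:), G2(:)];
groundTruthSketch = [Xg, f(Xg)];   % assumed layout: inputs, then target

% Training data: uniformly random inputs with additive Gaussian noise.
X = 20 * rand(ND, 2) - 10;
y = f(X) + noise * randn(ND, 1);

% minPath-like ordering (assumption: greedy nearest neighbour).
order = zeros(ND, 1);
visited = false(ND, 1);
current = [-10, -10];              % documented start point, x_i = -10
for k = 1:ND
    d = sum(bsxfun(@minus, X, current) .^ 2, 2);   % squared distances
    d(visited) = Inf;              % skip samples already on the path
    [~, idx] = min(d);
    order(k) = idx;
    visited(idx) = true;
    current = X(idx, :);
end
dataSketch = [X(order, :), y(order)];
```

With NG = 5 in two dimensions the grid holds 5^2 = 25 points, which shows why NG is counted per dimension: the ground truth grows exponentially with the input dimension.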

Outputs:
data : double matrix, training dataset
groundTruth : double matrix, ground truth dataset on a regular grid (only for synthetic data)
dim : integer > 0, number of input dimensions
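
A call using the signature above might look like the following sketch. The argument values are purely illustrative, and the guard is only there so the snippet runs even when the library is not on the MATLAB path.

```matlab
% Illustrative argument values; only the call signature comes from this page.
mode    = 1;        % regression
func    = 'sine';   % 1D sine task
ND      = 200;      % training samples
NG      = 100;      % ground-truth samples per dimension
noise   = 0.05;     % noise level
minPath = false;    % keep the random sample order

% Guard so the snippet also runs when the library is not on the path.
if exist('icl_loadDS', 'file') == 2
    [data, groundTruth, dim] = icl_loadDS(mode, func, ND, NG, noise, minPath);
    fprintf('training data: %d x %d\n', size(data, 1), size(data, 2));
    fprintf('ground truth:  %d x %d\n', size(groundTruth, 1), size(groundTruth, 2));
    fprintf('input dims:    %d\n', dim);
else
    disp('icl_loadDS not found on the MATLAB path');
end
```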

Calls:

Called by:

Authors:

Last change: 2013-1-29 - Version 1.6