The most
popular subsampling technique is crossvalidation. For an n-fold crossvalidation, the data are partitioned
into n equal parts. The first part is used as the test data set and the rest as the calibration data set. Then, the second part is used as the test data set and the rest for a new calibration. This procedure is repeated n times, and the predictions for the n test data sets are averaged. It is essential that no knowledge of the models is transferred from fold to fold.
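As an illustration of this procedure, the following minimal sketch (in Python with NumPy) performs an n-fold crossvalidation with a simple least-squares calibration model standing in for the actual calibration method; the function names and the choice of model are illustrative assumptions, not part of the original method.

```python
import numpy as np

def n_fold_crossvalidation(X, y, n_folds, fit, predict):
    """Sketch of n-fold crossvalidation: partition the samples into
    n_folds (nearly) equal parts, calibrate on all but one part,
    predict the held-out part, and repeat until every part has served
    once as test data. A fresh model is fitted in every fold, so no
    knowledge is carried over from fold to fold."""
    n_samples = len(y)
    indices = np.arange(n_samples)
    folds = np.array_split(indices, n_folds)            # n equal parts
    predictions = np.empty(n_samples)
    for test_idx in folds:
        calib_idx = np.setdiff1d(indices, test_idx)     # calibration data set
        model = fit(X[calib_idx], y[calib_idx])         # new calibration
        predictions[test_idx] = predict(model, X[test_idx])
    # combine the held-out predictions of all folds into one error measure
    rmsep = np.sqrt(np.mean((predictions - y) ** 2))
    return predictions, rmsep

# Placeholder calibration model: ordinary least squares via the
# pseudo-inverse (standing in for e.g. a PCR or PLS calibration).
def fit_ols(X, y):
    Xb = np.column_stack([np.ones(len(y)), X])
    return np.linalg.pinv(Xb) @ y

def predict_ols(coef, X):
    Xb = np.column_stack([np.ones(len(X)), X])
    return Xb @ coef
```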
There are no clear rules on how many folds to use for crossvalidation; the simplest and most clear-cut way of performing crossvalidation is to leave out one sample at a time. This special variant of crossvalidation is also called full crossvalidation, leave-one-out or jackknifing and gives a unique and therefore reproducible result. Yet, it has been shown that increasing the number of crossvalidation groups results in lower root mean square errors of prediction and thus overly optimistic estimates of predictivity [13]-[16]. This deficiency is known in the literature as asymptotic inconsistency [17].
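In terms of the sketch above, leave-one-out corresponds to setting the number of folds equal to the number of samples; the short usage example below, on purely synthetic and illustrative data, compares a 10-fold crossvalidation with this leave-one-out limit.

```python
# Illustrative use of the sketch above on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=40)

_, rmsep_10fold = n_fold_crossvalidation(X, y, 10, fit_ols, predict_ols)
_, rmsep_loo = n_fold_crossvalidation(X, y, len(y), fit_ols, predict_ols)  # leave-one-out
print(rmsep_10fold, rmsep_loo)
```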