liblinear-train (1)
NAME
liblinear-train - train a linear classifier and produce a model
SYNOPSIS
liblinear-train [options] training_set_file [model_file]
DESCRIPTION
liblinear-train trains a linear classifier using liblinear and produces a model suitable for use with liblinear-predict(1).
training_set_file is the file containing the data used for training. model_file is the file to which the model will be saved. If model_file is not provided, it defaults to training_set_file.model.
To obtain good performance, it is sometimes necessary to scale the data. This can be done with svm-scale(1).
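The following is a minimal sketch of how the two file arguments fit together. It assumes the sparse "label index:value" text format that LIBLINEAR shares with LIBSVM; the file name tiny_train.txt and the feature values are made up for illustration, and the trainer is only invoked if it is installed.

```shell
# Hypothetical four-example, two-class training set in sparse
# "label index:value" form (format assumed, values invented).
train_file=tiny_train.txt
cat > "$train_file" <<'EOF'
+1 1:0.5 3:1.0
-1 2:0.8 3:-0.2
+1 1:0.9 2:0.1
-1 1:-0.4 3:0.7
EOF

# Default model name as described above: training_set_file.model
model_file="${train_file}.model"

# Run the trainer only when it is available on this system.
if command -v liblinear-train >/dev/null 2>&1; then
    liblinear-train "$train_file" "$model_file"
fi
```

Passing the model name explicitly, as done here, is equivalent to omitting it when the default name is wanted.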
OPTIONS
A summary of options is included below.
-s type
 Set the type of the solver:

 0 ... L2-regularized logistic regression
 1 ... L2-regularized L2-loss support vector classification (dual) (default)
 2 ... L2-regularized L2-loss support vector classification (primal)
 3 ... L2-regularized L1-loss support vector classification (dual)
 4 ... multi-class support vector classification
 5 ... L1-regularized L2-loss support vector classification
 6 ... L1-regularized logistic regression
 7 ... L2-regularized logistic regression (dual)

-c cost
 Set the parameter C (default: 1)
-e epsilon
 Set the tolerance of the termination criterion
 For -s 0 and 2:
 |f'(w)|_2 <= epsilon*min(pos,neg)/l*|f'(w0)|_2, where f is the primal function and pos/neg are the number of positive/negative data (default: 0.01)

 For -s 1, 3, 4 and 7:
 Dual maximal violation <= epsilon; similar to libsvm (default: 0.1)

 For -s 5 and 6:
 |f'(w)|_inf <= epsilon*min(pos,neg)/l*|f'(w0)|_inf, where f is the primal function (default: 0.01)
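As a worked illustration of the primal criterion, the sketch below evaluates its right-hand side for a two-class problem. It assumes that l is the total number of training instances (pos + neg); the helper name primal_threshold and the initial gradient norm of 50 are hypothetical.

```shell
# primal_threshold EPS POS NEG GRAD0
# Prints eps * min(pos,neg) / l * |f'(w0)|, assuming l = pos + neg.
primal_threshold() {
    awk -v eps="$1" -v pos="$2" -v neg="$3" -v g0="$4" 'BEGIN {
        m = (pos + 0 < neg + 0) ? pos : neg   # size of the smaller class
        printf "%g\n", eps * m / (pos + neg) * g0
    }'
}

# Default epsilon 0.01, 300 positive and 700 negative examples,
# made-up initial gradient norm of 50.
primal_threshold 0.01 300 700 50
```

With these numbers the solver would stop once the gradient norm falls to 0.15 or below. The more unbalanced the classes, the smaller min(pos,neg)/l becomes and the stricter the effective tolerance.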
EXAMPLES
Train a linear SVM using the L2-loss function:

liblinear-train data_file
Train a logistic regression model:

liblinear-train -s 0 data_file
Do five-fold cross-validation using L2-loss SVM, using a smaller stopping tolerance 0.001 instead of the default 0.1 for more accurate solutions:

liblinear-train -v 5 -e 0.001 data_file
Run cross-validation repeatedly with L2-loss SVM to find the parameter C that achieves the best cross-validation accuracy:

liblinear-train -C datafile
For parameter selection by -C, users can specify other solvers (currently -s 0 and -s 2 are supported) and a different number of CV folds. Further, users can use the -c option to specify the smallest C value of the search range. This setting is useful when users want to rerun the parameter selection procedure from a specified C under a different setting, such as the stricter stopping tolerance -e 0.0001 in the example below:

liblinear-train -C -s 0 -v 3 -c 0.5 -e 0.0001 datafile
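To make the idea of the search concrete, the sketch below runs a manual search over a small grid of C values using -v. It is not how -C is implemented internally. It assumes the trainer reports a line of the form "Cross Validation Accuracy = 83.5%"; the helper names parse_cv_accuracy and best_c, the grid, and the data_file variable are hypothetical.

```shell
# parse_cv_accuracy: read trainer output on stdin and print the accuracy
# number from a "Cross Validation Accuracy = N%" line (format assumed).
parse_cv_accuracy() {
    sed -n 's/^Cross Validation Accuracy = \([0-9.]*\)%.*$/\1/p'
}

# best_c: read "C accuracy" pairs on stdin and print the C with the
# highest accuracy (the first one wins on ties).
best_c() {
    awk 'NR == 1 || $2 + 0 > best { best = $2 + 0; c = $1 } END { print c }'
}

# Manual grid search, run only if the trainer and a data file are present.
if command -v liblinear-train >/dev/null 2>&1 && [ -f "${data_file:-}" ]; then
    for c in 0.5 1 2 4 8; do
        acc=$(liblinear-train -s 0 -v 3 -c "$c" -e 0.0001 "$data_file" | parse_cv_accuracy)
        echo "$c $acc"
    done | best_c
fi
```

Setting data_file to the path of a training set before running the block prints the best C found on the grid.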
Train four classifiers:

positive   negative      Cp   Cn
class 1    class 2,3,4   20   10
class 2    class 1,3,4   50   10
class 3    class 1,2,4   20   10
class 4    class 1,2,3   10   10

liblinear-train -c 10 -w1 2 -w2 5 -w3 2 four_class_data_file
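The Cp column above can be reproduced by multiplying the base cost given with -c by the per-class weight given with -wi, with classes that have no -wi option keeping a weight of 1. The sketch below does this arithmetic; the helper name class_cost is hypothetical.

```shell
# class_cost C WEIGHT: print the effective penalty C * weight for one class.
class_cost() {
    awk -v c="$1" -v w="$2" 'BEGIN { printf "%g\n", c * w }'
}

# Weights from the command above: -w1 2 -w2 5 -w3 2; class 4 has no
# -w option, so its weight is taken to be 1.
for spec in 1:2 2:5 3:2 4:1; do
    label=${spec%%:*}
    weight=${spec##*:}
    echo "class $label Cp=$(class_cost 10 "$weight") Cn=10"
done
```

Each printed line corresponds to one row of the table, with Cn staying at the base cost of 10.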
If there are only two classes, we train ONE model. The C values for the two classes are 10 and 50:

liblinear-train -c 10 -w3 1 -w2 5 two_class_data_file
Output probability estimates (for logistic regression only) using liblinearpredict(1):

liblinear-predict -b 1 test_file data_file.model output_file
SEE ALSO
liblinear-predict(1), svm-predict(1), svm-train(1)
AUTHORS
liblinear-train was written by the LIBLINEAR authors at National Taiwan University for the LIBLINEAR Project.
This manual page was written by Christian Kastner <debian@kvr.at>, for the Debian project (and may be used by others).