[ANN] release of svmwrap: a wrapper around libsvm-tools

UnixJunkie · November 15, 2021, 12:59am

Hello,

[this post is not spam, stupid bot, don’t remove it]

I am pleased to announce the availability in opam of the svmwrap package,
a wrapper around libsvm’s svm-train and svm-predict executables.
Currently, only regression modeling is supported, using the linear, RBF, sigmoid or polynomial kernel.

The quite scary usage looks like this:

usage: svmwrap
  -i <filename>: training set or DB to screen
  --feats <int>: number of features
  [-o <filename>]: predictions output file
  [-np <int>]: ncores
  [--kernel <string>] choose kernel type {Lin|RBF|Sig|Pol}
  [-c <float>]: fix C
  [-e <float>]: epsilon in the loss function of epsilon-SVR;
  (0 <= epsilon <= max_i(|y_i|))
  [-g <float>]: fix gamma (for RBF and Sig kernels)
  [-r <float>]: fix r for the Sig kernel
  [--iwn]: turn ON instance-wise-normalization
  [--scale]: turn ON [0:1] scaling (NOT PRODUCTION READY)
  [--no-plot]: no gnuplot
  [{-n|--NxCV} <int>]: folds of cross validation
  [-q]: quiet
  [-v|--verbose]: equivalent to not specifying -q
  [--seed <int>]: fix random seed
  [-p <float>]: training set portion (in [0.0:1.0])
  [--pairs]: read from .AP files (atom pairs; will offset feat. indexes by 1)
  [--train <train.liblin>]: training set (overrides -p)
  [--valid <valid.liblin>]: validation set (overrides -p)
  [--test <test.liblin>]: test set (overrides -p)
  [{-l|--load} <filename>]: prod. mode; use trained models
  [{-s|--save} <filename>]: train. mode; save trained models
  [-f]: force overwriting existing model file
  [--scan-c]: scan for best C
  [--scan-e <int>]: epsilon scan #steps for SVR
  [--scan-g]: scan for best gamma
  [--regr]: regression (SVR); also, implied by -e and --scan-e
  [--e-range <float>:<int>:<float>]: specific range for e
  (semantic=start:nsteps:stop)
  [--c-range <float,float,...>] explicit scan range for C 
  (example='0.01,0.02,0.03')
  [--g-range <float,float,...>] explicit range for gamma 
  (example='0.01,0.02,0.03')
  [--r-range <float,float,...>] explicit range for r 
  (example='0.01,0.02,0.03')
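
As a side note on the -e option: the bound 0 <= epsilon <= max_i(|y_i|) makes sense once you look at the epsilon-insensitive loss that epsilon-SVR minimizes. Below is a minimal OCaml sketch of that loss. It is not code from svmwrap; the function names and the toy sample are made up for illustration.

```ocaml
(* Epsilon-insensitive loss of epsilon-SVR: errors smaller than
   [epsilon] cost nothing; larger ones are penalized linearly.
   Illustrative only; these names are hypothetical, not svmwrap's. *)
let eps_insensitive_loss ~epsilon y_true y_pred =
  Float.max 0.0 (Float.abs (y_true -. y_pred) -. epsilon)

(* Sum of the loss of [predict] over a list of (x, y) pairs. *)
let total_loss ~epsilon predict sample =
  List.fold_left
    (fun acc (x, y) -> acc +. eps_insensitive_loss ~epsilon y (predict x))
    0.0 sample

let () =
  (* Toy (x, y) sample, invented for this example. *)
  let sample = [ (0.0, 1.0); (1.0, -2.0); (2.0, 0.5) ] in
  let max_abs_y =
    List.fold_left (fun m (_, y) -> Float.max m (Float.abs y)) 0.0 sample
  in
  (* The constant predictor that always answers 0. *)
  let trivial _ = 0.0 in
  Printf.printf "loss with epsilon = max|y|: %g\n"
    (total_loss ~epsilon:max_abs_y trivial sample);
  Printf.printf "loss with epsilon = 0.5: %g\n"
    (total_loss ~epsilon:0.5 trivial sample)
```

Once epsilon reaches max_i(|y_i|), the predictor that always answers 0 already has zero loss, so scanning larger values cannot tell models apart.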

For people who know my linwrap opam package (a wrapper around liblinear tools), this is quite similar.

Regards,
F.

sid · November 15, 2021, 11:01am

Thanks for this. Seems like an interesting library to add to one’s OCaml numerical computation arsenal. When might we want to use SVM vs neural networks?

There are lots of resources on the internet about SVM vs neural networks, but I wanted to get your perspective. Can you share some applications where you are using SVM instead of neural networks?

UnixJunkie · November 15, 2021, 11:44pm

Svmwrap is not a library: this is an end-user command-line program.
If you want a library, maybe Owl has some support for SVM.
I am not an expert in neural networks, so I cannot really answer your question.
The sad truth with machine learning is that you need to try several methods on a given
dataset and task so that you can (eventually) find one which works for your problem.

pveber · November 16, 2021, 7:47pm

Not an expert myself, but a rough summary is that SVMs have very good generalization properties, but require memory space that is quadratic in the size of the learning sample (you need to store a similarity measure for all pairs of examples). So they are particularly suited when you have a small training set, but are too expensive for training sets with millions of points (except for linear SVMs, but then you don’t have the benefits of non-linearity). Also, they are very handy when you have a good idea of how to compare your patterns (and take advantage of it by defining a custom kernel function).
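
To make the quadratic-memory point concrete, here is a minimal OCaml sketch that builds a full kernel (Gram) matrix with an RBF kernel and estimates how large a dense matrix would get. It is only an illustration: the function names are made up, the 8-bytes-per-entry figure assumes double-precision floats, and real solvers such as libsvm keep a kernel cache rather than necessarily holding the whole matrix in memory.

```ocaml
(* RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2).
   Assumes x and y have the same length. *)
let rbf ~gamma x y =
  let d2 = ref 0.0 in
  Array.iteri
    (fun i xi ->
      let d = xi -. y.(i) in
      d2 := !d2 +. (d *. d))
    x;
  exp (-.gamma *. !d2)

(* Dense Gram matrix: one kernel value per pair of training examples.
   [kernel] can be any similarity function, including a custom one. *)
let gram_matrix kernel xs =
  let n = Array.length xs in
  Array.init n (fun i -> Array.init n (fun j -> kernel xs.(i) xs.(j)))

(* Bytes needed by a dense n x n matrix of 8-byte floats (assumption). *)
let gram_bytes n = 8 * n * n

let () =
  (* Three toy 2D points, invented for this example. *)
  let xs = [| [| 0.0; 0.0 |]; [| 1.0; 0.0 |]; [| 0.0; 2.0 |] |] in
  let k = gram_matrix (rbf ~gamma:0.5) xs in
  Array.iter
    (fun row ->
      Array.iter (Printf.printf "%.3f ") row;
      print_newline ())
    k;
  List.iter
    (fun n ->
      Printf.printf "n = %d -> %.3f GB\n" n
        (float_of_int (gram_bytes n) /. 1e9))
    [ 1_000; 100_000; 1_000_000 ]
```

Because gram_matrix takes the kernel as an argument, plugging in a domain-specific similarity measure only requires passing a different function.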

Neural networks, on the other hand, can be used on datasets with millions of examples (in particular thanks to stochastic gradient descent) and are very cheap when predicting on new inputs (compared to SVMs). However, the theory behind them is a lot less clear, and in practice they require a lot of know-how both for selecting a proper architecture and for training them (many tricks that may or may not work). In contrast, training SVMs is a lot more straightforward.

In case you’d like to try SVMs directly from OCaml, I maintain a library (initially written by Dominik Brugger and Oliver Gu) that provides access to libsvm. It is available on opam. There is also a library written by @UnixJunkie that uses R under the hood.

UnixJunkie · November 18, 2021, 8:45am

FTR, I tried to use the ocaml bindings to libsvm in the past.
My conclusion at the time was that those bindings are completely buggy / not ready for production use.
I wonder if anyone is using those bindings in the real world.

pveber · November 18, 2021, 3:40pm

Maybe you’re referring to this issue? If so I (hopefully) fixed it, and could use the binding without any problem (although it was just for a small project). The version released on OPAM includes the fix.
