Can I increase the number of SVM training labels in OpenCV?

I'm building an OCR application with Visual Studio 2010, C++, and the SVM in OpenCV. Training works when I use up to 181 distinct labels, but fails whenever there are more than 181. Below are the IDE and OpenCV error messages and my code. Please help me, thank you so much!

IDE error message

First-chance exception at 0x771e4b32 in OCR.exe: Microsoft C++ exception: cv::Exception at memory location 0x0081da74.
The thread 'Win32 Thread' (0xdac) has exited with code -1073741510 (0xc000013a).
The program '[2512] OCR.exe: Native' has exited with code -1073741510 (0xc000013a).

OpenCV error message

..\..\..\src\opencv\modules\core\src\datastructs.cpp:332: error: (-211) requested size is negative or too big

SVM's configuration

CvSVMParams params;
params.svm_type = CvSVM::C_SVC;     // n-class classification with a C penalty
params.kernel_type = CvSVM::LINEAR; // linear kernel, no mapping is done
params.term_crit = cvTermCriteria(CV_TERMCRIT_ITER, 100, 1e-6); // stop after 100 iterations (the epsilon is only used with CV_TERMCRIT_EPS)

SVM.train( training_vectors, training_labels, cv::Mat(), cv::Mat(), params );
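
For reference, here is a self-contained toy version of the call above; the sizes, random features, and cycling dummy labels are placeholders rather than my real OCR data, but CvSVM::train expects exactly this layout: one CV_32FC1 row per sample and a CV_32FC1 column holding each row's class id.

#include <opencv2/core/core.hpp>
#include <opencv2/ml/ml.hpp>

int main()
{
    // Placeholder data: one CV_32FC1 feature row per sample,
    // one CV_32FC1 label (class id) per row.
    const int num_samples = 1000, num_features = 256, num_classes = 200;
    cv::Mat training_vectors(num_samples, num_features, CV_32FC1);
    cv::Mat training_labels(num_samples, 1, CV_32FC1);
    cv::randu(training_vectors, cv::Scalar::all(0), cv::Scalar::all(1));
    for (int i = 0; i < num_samples; ++i)
        training_labels.at<float>(i) = (float)(i % num_classes); // >181 classes

    CvSVMParams params;
    params.svm_type = CvSVM::C_SVC;
    params.kernel_type = CvSVM::LINEAR;
    params.term_crit = cvTermCriteria(CV_TERMCRIT_ITER, 100, 1e-6);

    CvSVM SVM;
    SVM.train(training_vectors, training_labels, cv::Mat(), cv::Mat(), params);
    return 0;
}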
Tags: c++, opencv, memory, machine-learning, svm
asked on Stack Overflow Sep 11, 2013 by Tri Truong • edited Sep 18, 2013 by lejlot

1 Answer

libSVM represents a multi-class problem through a set of binary SVM classifiers. Its built-in scheme is actually "one vs one": for N (>2) labels it trains N(N-1)/2 distinct classifiers, each on a differently relabeled version of the data, so with 181 labels that is already more than 16,000 binary problems inside a single model. This can lead to the memory problems that you are experiencing. Other models, such as neural networks or kNN, can represent multi-class classification without such overhead. So if your data is too large to be treated the way libSVM treats it, you have at least three options:

  • Change SVM to some other model that can directly address multi-class classification
  • Try some other, lighter implementation of the library, especially since OpenCV does not use the most recent implementation of libSVM (this may help, but does not have to)
  • Do the "one vs all" decomposition manually and save each binary model separately (see the sketch after this list). This way you avoid the memory problems, because at any moment you allocate only as much memory as a single binary problem needs, and N labels require just N models instead of N(N-1)/2. At prediction time you load the models from file and apply a simple voting scheme. If the saved models are still too big, your model has overfitted: in SVMs, overfitting usually shows up as an excessive number of support vectors, which are in fact the only thing needed to define the model, so a model with too many support vectors to load into memory was most probably trained badly and you should try different parameters/kernels.
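
Here is a minimal sketch of that third option using the same OpenCV 2.x C++ API as in the question. It assumes training_vectors is a CV_32FC1 matrix with one sample per row and training_labels is a CV_32FC1 column of class ids 0..N-1; the file-name pattern and function names are illustrative, not a fixed recipe.

#include <opencv2/core/core.hpp>
#include <opencv2/ml/ml.hpp>
#include <cfloat>
#include <cstdio>

// Train one binary "class c vs rest" SVM at a time and save it to disk,
// so only a single binary model is ever held in memory.
void train_one_vs_all(const cv::Mat& training_vectors,
                      const cv::Mat& training_labels,
                      int num_labels, const CvSVMParams& params)
{
    for (int c = 0; c < num_labels; ++c)
    {
        // Relabel: +1 for the current class, -1 for all the others.
        cv::Mat binary_labels(training_labels.rows, 1, CV_32FC1);
        for (int i = 0; i < training_labels.rows; ++i)
            binary_labels.at<float>(i) =
                (training_labels.at<float>(i) == (float)c) ? 1.0f : -1.0f;

        CvSVM svm;
        svm.train(training_vectors, binary_labels, cv::Mat(), cv::Mat(), params);

        char filename[64];
        std::sprintf(filename, "svm_class_%d.xml", c);
        svm.save(filename); // svm goes out of scope here, freeing its memory
    }
}

// Predict by loading each model in turn and keeping the class with the
// largest decision value. predict(sample, true) returns the raw decision
// function value for a 2-class SVM; verify its sign convention on a few
// known samples, since it depends on the label ordering seen in training.
int predict_one_vs_all(const cv::Mat& sample, int num_labels)
{
    int best_class = -1;
    float best_score = -FLT_MAX;
    for (int c = 0; c < num_labels; ++c)
    {
        char filename[64];
        std::sprintf(filename, "svm_class_%d.xml", c);
        CvSVM svm;
        svm.load(filename);
        float score = svm.predict(sample, true);
        if (score > best_score) { best_score = score; best_class = c; }
    }
    return best_class;
}

Loading every model for every prediction is of course slow; in practice you would cache as many models as you can afford or batch predictions per model, but the point is that peak memory never exceeds one binary SVM.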
answered on Stack Overflow Sep 18, 2013 by lejlot

User contributions licensed under CC BY-SA 3.0