I am getting a runtime error, "double free or corruption", in a C++ program that calls a reliable library, ANN, and uses OpenMP to parallelize a for loop.
*** glibc detected *** /home/tim/test/debug/test: double free or corruption (!prev): 0x0000000002527260 ***
Does it mean that the memory at address 0x0000000002527260 is freed more than once?
The error happens at "_search_struct->annkSearch(queryPt, k_max, nnIdx, dists, _eps);" inside the function classify_various_k(), which is in turn called from the OpenMP for loop inside the function tune_complexity().
Note that the error only happens when OpenMP uses more than one thread; it does not happen in the single-thread case. I am not sure why.
My code follows. If it is not enough to diagnose the problem, just let me know. Thanks for your help!
    void KNNClassifier::train(int nb_examples, int dim, double **features, int *labels) {
        _nPts = nb_examples;
        _labels = labels;
        _dataPts = features;
        setting_ANN(_dist_type, 1);

        delete _search_struct;
        if (strcmp(_search_neighbors, "brutal") == 0) {
            _search_struct = new ANNbruteForce(_dataPts, _nPts, dim);
        } else if (strcmp(_search_neighbors, "kdtree") == 0) {
            _search_struct = new ANNkd_tree(_dataPts, _nPts, dim);
        }
    }

    void KNNClassifier::classify_various_k(int dim, double *feature, int label, int *ks, double *errors, int nb_ks, int k_max) {
        ANNpoint queryPt = 0;
        ANNidxArray nnIdx = 0;
        ANNdistArray dists = 0;

        queryPt = feature;
        nnIdx = new ANNidx[k_max];
        dists = new ANNdist[k_max];

        if (strcmp(_search_neighbors, "brutal") == 0) {
            _search_struct->annkSearch(queryPt, k_max, nnIdx, dists, _eps);
        } else if (strcmp(_search_neighbors, "kdtree") == 0) {
            _search_struct->annkSearch(queryPt, k_max, nnIdx, dists, _eps); // where error occurs
        }

        for (int j = 0; j < nb_ks; j++)
        {
            scalar_t result = 0.0;
            for (int i = 0; i < ks[j]; i++) {
                result += _labels[ nnIdx[i] ];
            }
            if (result * label < 0) errors[j]++;
        }

        delete [] nnIdx;
        delete [] dists;
    }

    void KNNClassifier::tune_complexity(int nb_examples, int dim, double **features, int *labels, int fold, char *method, int nb_examples_test, double **features_test, int *labels_test) {
        int nb_try = (_k_max - _k_min) / scalar_t(_k_step);
        scalar_t *error_validation = new scalar_t[nb_try];
        int *ks = new int[nb_try];
        for (int i = 0; i < nb_try; i++) {
            ks[i] = _k_min + _k_step * i;
        }

        if (strcmp(method, "ct") == 0)
        {
            train(nb_examples, dim, features, labels); // train once for all nb of nbs in ks
            for (int i = 0; i < nb_try; i++) {
                if (ks[i] > nb_examples) { nb_try = i; break; }
                error_validation[i] = 0;
            }

            int i = 0;
            #pragma omp parallel shared(nb_examples_test, error_validation, features_test, labels_test, nb_try, ks) private(i)
            {
                #pragma omp for schedule(dynamic) nowait
                for (i = 0; i < nb_examples_test; i++)
                {
                    classify_various_k(dim, features_test[i], labels_test[i], ks, error_validation, nb_try, ks[nb_try - 1]); // where error occurs
                }
            }

            for (i = 0; i < nb_try; i++)
            {
                error_validation[i] /= nb_examples_test;
            }
        }
        ......
    }
UPDATE:
Thanks! I am now trying to fix the problem of concurrent writes to the same memory in classify_various_k() by using "#pragma omp critical":
    void KNNClassifier::classify_various_k(int dim, double *feature, int label, int *ks, double *errors, int nb_ks, int k_max) {
        ANNpoint queryPt = 0;
        ANNidxArray nnIdx = 0;
        ANNdistArray dists = 0;

        queryPt = feature; //for (int i = 0; i < Vignette::size; i++){ queryPt[i] = vignette->content[i];}
        nnIdx = new ANNidx[k_max];
        dists = new ANNdist[k_max];

        if (strcmp(_search_neighbors, "brutal") == 0) { // search
            _search_struct->annkSearch(queryPt, k_max, nnIdx, dists, _eps);
        } else if (strcmp(_search_neighbors, "kdtree") == 0) {
            _search_struct->annkSearch(queryPt, k_max, nnIdx, dists, _eps);
        }

        for (int j = 0; j < nb_ks; j++)
        {
            scalar_t result = 0.0;
            for (int i = 0; i < ks[j]; i++) {
                result += _labels[ nnIdx[i] ]; // Program received signal SIGSEGV, Segmentation fault
            }
            if (result * label < 0)
            {
                #pragma omp critical
                {
                    errors[j]++;
                }
            }
        }

        delete [] nnIdx;
        delete [] dists;
    }
However, now there is a new segmentation fault at "result += _labels[ nnIdx[i] ];". Any ideas? Thanks!
Okay, since you've stated that it works correctly in the single-threaded case, "normal" debugging methods won't get you far. You need to check the following.
This is the list of candidates for the double delete, i.e. everything you share across the threads:

    shared(nb_examples_test, error_validation, features_test, labels_test, nb_try, ks)
Also, this code might not be thread-safe:

    for (int i = 0; i < ks[j]; i++) {
        result += _labels[ nnIdx[i] ];
    }
    if (result * label < 0) errors[j]++;

because two or more threads may try to write to the errors array at the same time.
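If your compiler supports OpenMP 4.5 array reductions, one way around that is to let each thread accumulate into its own private copy of the error array and have OpenMP sum the copies when the loop finishes, instead of serializing every update with a critical section. The following is only a sketch of the pattern, not your code: the array name, sizes, and the fake "misclassified" test are stand-ins for the demo.

    #include <cstdio>

    int main()
    {
        const int nb_ks = 4;               // stands in for your nb_try
        const int nb_examples_test = 1000;
        double errors[nb_ks] = {0.0, 0.0, 0.0, 0.0};

        // Each thread gets a private copy of errors[0..nb_ks); OpenMP adds the
        // copies together at the end of the loop, so no element is ever written
        // by two threads at once.
        #pragma omp parallel for schedule(dynamic) reduction(+ : errors[0:nb_ks])
        for (int i = 0; i < nb_examples_test; i++) {
            for (int j = 0; j < nb_ks; j++) {
                if ((i + j) % 2 == 0)      // stand-in for "classification was wrong for ks[j]"
                    errors[j]++;
            }
        }

        for (int j = 0; j < nb_ks; j++)
            std::printf("errors[%d] = %g\n", j, errors[j]);
        return 0;
    }

In your code the array in the reduction clause would be error_validation with length nb_try.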
And a big piece of advice: while inside the threaded region, try not to access (and especially not to modify!) anything that is not a parameter of the function!
I don't know if this is your problem, but:
    void KNNClassifier::train(int nb_examples, int dim, double **features, int *labels) {
        ...
        delete _search_struct;
        if (strcmp(_search_neighbors, "brutal") == 0) {
            _search_struct = new ANNbruteForce(_dataPts, _nPts, dim);
        } else if (strcmp(_search_neighbors, "kdtree") == 0) {
            _search_struct = new ANNkd_tree(_dataPts, _nPts, dim);
        }
    }
What happens if you don't fall into either the if or the else if clause? You've deleted _search_struct and left it pointing to garbage. You should set it to NULL afterward.
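A minimal sketch of that change, keeping the structure of your train() (the "..." stands for the setup code omitted above):

    void KNNClassifier::train(int nb_examples, int dim, double **features, int *labels) {
        ...
        delete _search_struct;
        _search_struct = NULL;   // never leave the member dangling
        if (strcmp(_search_neighbors, "brutal") == 0) {
            _search_struct = new ANNbruteForce(_dataPts, _nPts, dim);
        } else if (strcmp(_search_neighbors, "kdtree") == 0) {
            _search_struct = new ANNkd_tree(_dataPts, _nPts, dim);
        }
        // If neither string matched, _search_struct is now NULL instead of a
        // dangling pointer, so a later delete is a no-op rather than heap corruption.
    }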
If this isn't the problem, you could try replacing:

    delete p;

with:

    assert(p != NULL);
    delete p;
    p = NULL;

(or similarly for delete[] sites). (This probably would pose a problem for the first invocation of KNNClassifier::train, however.)
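Relatedly, the cleanest way to make that first invocation well-defined is to make sure the member starts out as NULL, since deleting a null pointer is guaranteed to be a no-op. This is only a sketch; I don't know what your constructor actually looks like:

    // Hypothetical constructor skeleton: the real parameter list and the rest
    // of the initialization are whatever you already have.
    KNNClassifier::KNNClassifier(/* ... */)
    {
        _search_struct = NULL;   // so the first "delete _search_struct" in train() is harmless
        // ... your existing initialization ...
    }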
Also, obligatory: do you really need to do all of these manual allocations and deallocations? Why aren't you at least using std::vector instead of new[]/delete[] (which are almost always bad)?
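For example, here is a sketch of classify_various_k() with the two new[]/delete[] pairs replaced by std::vector. I've collapsed the two search branches because they call annkSearch identically; &nnIdx[0] hands ANN the raw pointer it expects (or use .data() with C++11):

    #include <vector>

    void KNNClassifier::classify_various_k(int dim, double *feature, int label, int *ks, double *errors, int nb_ks, int k_max) {
        std::vector<ANNidx>  nnIdx(k_max);   // freed automatically on every exit path
        std::vector<ANNdist> dists(k_max);

        _search_struct->annkSearch(feature, k_max, &nnIdx[0], &dists[0], _eps);

        for (int j = 0; j < nb_ks; j++) {
            scalar_t result = 0.0;
            for (int i = 0; i < ks[j]; i++)
                result += _labels[ nnIdx[i] ];
            if (result * label < 0)
                errors[j]++;
        }
        // No delete[] needed, and no chance of a double delete (or a leak if an
        // early return or exception skips the end of the function).
    }

The write to errors[j] still needs whatever protection you settle on for the multi-threaded case (critical, atomic, or a reduction), as discussed above.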