
C++ OpenCV Max storage capacity of cv::Mat

In my program I load some images, extract some features from them, and use a cv::Mat to store these features. Based on the number of images, I know the cv::Mat will be 700,000 x 256 in size (rows x cols), which is about 720 MB. But when I run my program, once it reaches about 400,000 x 256 (400 MB) and tries to add more, it simply crashes with a Fatal Error. Can anyone confirm that 400 MB is indeed the limit of cv::Mat's storage capacity? Should I check for other issues? Are there ways to overcome this problem?

DimChtz asked Nov 10 '16 00:11



2 Answers

Digging into the source code: when you use push_back, it checks whether there is enough space for the new element. If not, it reallocates the matrix with space for (current_size * 3 + 1) / 2 (see here). In your example, at around 400,000 * 256 (a total of 102,400,000 elements) it tries another allocation, so it tries to allocate space for 307,200,001 / 2 = 153,600,000 elements. But in order to move the data, it needs to allocate a new block and then copy the existing data into it.

From matrix.cpp:

Mat m(dims, size.p, type());
size.p[0] = r;
if( r > 0 )
{
    Mat mpart = m.rowRange(0, r);
    copyTo(mpart);
}

*this = m;

So it essentially:

  1. Allocates a new matrix with the larger capacity
  2. Creates a header (mpart) over the first r rows of the new matrix
  3. Copies the existing data into those rows
  4. Points this matrix at the newly allocated data (freeing the old allocated memory)

Meaning that, in your case, it needs enough space for (600,000 + 400,000) * 256 elements, which is about 1 GB of data using 4-byte integers. In addition, it creates an auxiliary matrix of one row and, in this case, 600,000 columns, which accounts for 2,400,000 extra bytes.
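The arithmetic above can be checked with a small sketch. It assumes 4-byte elements and that the old and new buffers are both alive until the copy finishes. `peak_bytes_during_growth` is a hypothetical helper written for illustration, not part of OpenCV.

```cpp
#include <cstdint>

// Hypothetical helper (not an OpenCV function): bytes held while a matrix
// grows from old_rows to new_rows, assuming the old and new buffers
// coexist until the copy has finished.
std::uint64_t peak_bytes_during_growth(std::uint64_t old_rows,
                                       std::uint64_t new_rows,
                                       std::uint64_t cols,
                                       std::uint64_t elem_size) {
    const std::uint64_t old_bytes = old_rows * cols * elem_size;
    const std::uint64_t new_bytes = new_rows * cols * elem_size;
    return old_bytes + new_bytes;
}
```

For the case described here, `peak_bytes_during_growth(400000, 600000, 256, 4)` gives 1,024,000,000 bytes, which matches the roughly 1 GB figure.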

So, at the next reallocation, when it reaches 600,000 rows, it tries to allocate 900,000 x 256 elements (900 MB) + the 600,000 x 256 elements (600 MB) + 600,000 (~3.4 MB). So, just by allocating this way (using push_back), you are doing several reallocations.
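To get a feel for how often this happens, here is a simplified simulation of the growth rule described in this answer. It models only the row count, starts from an empty matrix, and ignores any minimum-size details in the real implementation. `next_capacity`, `GrowthStats`, and `simulate_push_back` are hypothetical names, not OpenCV code.

```cpp
#include <algorithm>
#include <cstddef>

// Growth rule as described in this answer: at least the rows requested,
// otherwise about 1.5x the current row count. Hypothetical helper.
std::size_t next_capacity(std::size_t rows, std::size_t delta) {
    return std::max(rows + delta, (rows * 3 + 1) / 2);
}

struct GrowthStats {
    std::size_t reallocations;   // how many times the buffer had to grow
    std::size_t final_capacity;  // rows of capacity after the last growth
};

// Simulates appending one row at a time until target_rows is reached,
// starting from an empty matrix with no reserved capacity.
GrowthStats simulate_push_back(std::size_t target_rows) {
    GrowthStats stats{0, 0};
    for (std::size_t rows = 0; rows < target_rows; ++rows) {
        if (rows + 1 > stats.final_capacity) {
            stats.final_capacity = next_capacity(rows, 1);
            ++stats.reallocations;
        }
    }
    return stats;
}
```

With this rule, `next_capacity(400000, 1)` is 600,000 and `next_capacity(600000, 1)` is 900,000, the two steps discussed here. Calling `simulate_push_back(700000)` reports how many reallocations a plain push_back loop would go through under these assumptions.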

In other words: since you already know the approximate size of the matrix, using reserve is a must. It is several times faster (will avoid reallocations and copies).

Also, as a workaround, you could try inserting into the transposed matrix and then, once the process is done, transposing it back.

Side question: shouldn't this implementation use realloc instead of malloc/memcpy?

Bruno Ferreira answered Sep 30 '22 01:09


There is no strict limit on the size of a cv::Mat. You should be able to allocate memory as long as it is available.

Here is a small program that shows what can happen to the data pointer when running cv::Mat::push_back a number of times. Playing around with the values for rows and cols can result in one or many values printed for a.data before eventually an out-of-memory exception is thrown.

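#include <cstddef>
#include <iostream>
#include <opencv2/opencv.hpp>

int main()
{
  int rows = 1, cols = 10000;
  cv::Mat a(rows, cols, CV_8UC1);
  // Append rows forever; the loop ends only when an allocation fails.
  while (true) {
    a.push_back(cv::Mat(1, cols, CV_8UC1));
    // Print the buffer address; a new value means the data was reallocated.
    std::cout << reinterpret_cast<std::size_t>(a.data) << std::endl;
  }
}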

What the above code does for various values of rows and cols really depends on the allocator. So, it is worth trying both small and large initial sizes for a.

Remember that, like the C++11 std::vector, the elements in a cv::Mat are contiguous. Access to the underlying data can be obtained through the cv::Mat::data member. Calling std::vector::push_back or cv::Mat::push_back repeatedly may result in a reallocation of the underlying memory. In this case, the data has to be moved to a new address, and roughly twice the amount of memory may be necessary to move old to new (barring any tricky algorithm that uses less memory).
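The same effect is easy to observe with std::vector alone. The sketch below counts how often the buffer address changes while elements are appended, with and without an up-front reserve. `count_buffer_moves` is a hypothetical helper written for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Counts how many times the vector's buffer address changes while n ints
// are appended, optionally reserving space for all n elements first.
std::size_t count_buffer_moves(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) {
        v.reserve(n);
    }
    const int* last = v.data();
    std::size_t moves = 0;
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.data() != last) {  // buffer was reallocated to a new address
            ++moves;
            last = v.data();
        }
    }
    return moves;
}
```

With `reserve_first` set to true the count stays at zero, because push_back never exceeds the reserved capacity. Without it, the count depends on the standard library's growth factor.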

Robert Prévost answered Sep 30 '22 01:09