
Why does the first call to a function take much longer than the second, third, and later calls?

Here is my code, based on OpenCV:

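#include <ctime>
#include <iostream>
#include <opencv2/opencv.hpp>

using namespace std;
using namespace cv;

int main()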
{
    clock_t start, stop;
    Mat img = imread("lena.jpg", IMREAD_GRAYSCALE);
    img.convertTo(img, CV_32F, 1.0);
    float *imgInP = (float *)img.data;    // pointer to the input data
    Mat imgOut = Mat::zeros(img.size(), CV_32F);   // output mat, same size as the input
    float *imgOutP = (float *)imgOut.data;  // pointer to the output data

    // time several consecutive calls to OpenCV's boxFilter
    start = clock();
    //blur(img, imgOut, Size(31, 31));
    boxFilter(img, imgOut, CV_32F, Size(31, 31));
    stop = clock();
    cout << "BoxFilter on OpenCV 1 : " << 1000.0 * (stop - start) / CLOCKS_PER_SEC << " ms" << endl;
    start = clock();
    //blur(img, imgOut, Size(31, 31));
    boxFilter(img, imgOut, CV_32F, Size(31, 31));
    stop = clock();
    cout << "BoxFilter on OpenCV 2 : " << 1000.0 * (stop - start) / CLOCKS_PER_SEC << " ms" << endl;
    start = clock();
    //blur(img, imgOut, Size(31, 31));
    boxFilter(img, imgOut, CV_32F, Size(31, 31));
    stop = clock();
    cout << "BoxFilter on OpenCV 3 : " << 1000.0 * (stop - start) / CLOCKS_PER_SEC << " ms" << endl;

    return 0;
}

Here is the output of the above program:

BoxFilter on OpenCV 1 : 72.368 ms

BoxFilter on OpenCV 2 : 0.495 ms

BoxFilter on OpenCV 3 : 0.403 ms

Why does the first call to boxFilter (72.368 ms) take so much longer than the second (0.495 ms) and the third (0.403 ms)?

What's more, if I change the input image for the third call to boxFilter, the reported times do not change either. So it may not be a matter of the image data being cached...

Thanks for any advice.

My system: Ubuntu 14.04, i5-4460, 12 GB RAM, OpenCV 3.1, CMake 3.2, g++ 4.8.4.

Below is my CMake file:

cmake_minimum_required(VERSION 3.7)
project(boxfilterTest)

set(CMAKE_CXX_STANDARD 11)

find_package(OpenCV REQUIRED)
include_directories(${OpenCV_INCLUDE_DIRS})

set(SOURCE_FILES main.cpp)
add_executable(boxfilterTest ${SOURCE_FILES})

target_link_libraries(boxfilterTest ${OpenCV_LIBS})

The IDE is CLion.

Asked by smh, Oct 09 '17 01:10


2 Answers

The difference in timing is due to both the instruction cache and the data cache. The data-cache effect can be verified by forcing the matrix to be reallocated at a different size (e.g. by resizing the image). If the image is resized between calls to boxFilter, the execution times of the calls become very close to each other. Here is example code demonstrating this.

#include <ctime>
#include <iostream>
#include <opencv2/opencv.hpp>

using namespace std;
using namespace cv;

int main()
{
    clock_t start, stop;
    Mat img = imread("lena.jpg", IMREAD_GRAYSCALE);
    img.convertTo(img, CV_32F, 1.0);
    float *imgInP = (float *)img.data;    // pointer to the input data
    Mat imgOut = Mat::zeros(img.size(), CV_32F);   // output mat, same size as the input
    float *imgOutP = (float *)imgOut.data;  // pointer to the output data

    // time several consecutive calls to OpenCV's boxFilter
    start = clock();
    //blur(img, imgOut, Size(31, 31));
    boxFilter(img, imgOut, CV_32F, Size(31, 31));
    stop = clock();

    cv::resize(img, img, cv::Size(), 1.1, 1.1); //Force data re-allocation

    cout << "BoxFilter on OpenCV 1 : " << 1000.0 * (stop - start) / CLOCKS_PER_SEC << " ms" << endl;
    start = clock();
    //blur(img, imgOut, Size(31, 31));
    //GaussianBlur(img, imgOut, Size(31, 31), 0.5);
    boxFilter(img, imgOut, CV_32F, Size(31, 31));
    stop = clock();

    cv::resize(img, img, cv::Size(), 0.909, 0.909);  //Force data re-allocation

    cout << "BoxFilter on OpenCV 2 : " << 1000.0 * (stop - start) / CLOCKS_PER_SEC << " ms" << endl;
    start = clock();
    //blur(img, imgOut, Size(31, 31));
    boxFilter(img, imgOut, CV_32F, Size(31, 31));
    stop = clock();
    cout << "BoxFilter on OpenCV 3 : " << 1000.0 * (stop - start) / CLOCKS_PER_SEC << " ms" << endl;

    return 0;
}

Program Output:

Without data re-allocation:

BoxFilter on OpenCV 1 : 2.459 ms

BoxFilter on OpenCV 2 : 1.599 ms

BoxFilter on OpenCV 3 : 1.568 ms

With data re-allocation:

BoxFilter on OpenCV 1 : 2.225 ms

BoxFilter on OpenCV 2 : 2.368 ms

BoxFilter on OpenCV 3 : 2.091 ms
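
For anyone who wants to look at the per-call pattern without OpenCV, here is a minimal sketch, not the code used for the numbers above. Note that clock() reports processor time used by the program rather than elapsed wall-clock time, so this sketch uses std::chrono::steady_clock instead. Everything named here is an assumption for illustration: runningMean is a hypothetical stand-in workload (it is not boxFilter), and timeCalls is a hypothetical helper. The actual numbers will depend on the machine, the compiler flags, and the build type.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in workload (NOT OpenCV's boxFilter):
// a 1-D running mean with the given radius, clamped at the borders.
std::vector<float> runningMean(const std::vector<float>& in, std::size_t radius)
{
    std::vector<float> out(in.size(), 0.0f);
    for (std::size_t i = 0; i < in.size(); ++i) {
        const std::size_t lo = (i >= radius) ? i - radius : 0;
        const std::size_t hi = std::min(in.size() - 1, i + radius);
        float sum = 0.0f;
        for (std::size_t j = lo; j <= hi; ++j) {
            sum += in[j];
        }
        out[i] = sum / static_cast<float>(hi - lo + 1);
    }
    return out;
}

// Hypothetical helper: runs the workload `runs` times in a row and returns
// the wall-clock duration of each individual call in milliseconds.
std::vector<double> timeCalls(const std::vector<float>& in, std::size_t radius, int runs)
{
    assert(runs > 0);
    using Clock = std::chrono::steady_clock;
    std::vector<double> ms;
    ms.reserve(static_cast<std::size_t>(runs));
    volatile float sink = 0.0f;  // keeps the compiler from discarding the result
    for (int r = 0; r < runs; ++r) {
        const auto t0 = Clock::now();
        const std::vector<float> out = runningMean(in, radius);
        const auto t1 = Clock::now();
        if (!out.empty()) {
            sink = sink + out[out.size() / 2];
        }
        ms.push_back(std::chrono::duration<double, std::milli>(t1 - t0).count());
    }
    return ms;
}
```

Calling timeCalls on a buffer such as std::vector<float>(512 * 512, 1.0f) with radius 15 and runs 3, then printing the three returned values, gives one timing per call that can be compared in the same way as the three lines in the question.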

Answered by sgarizvi, Sep 27 '22 18:09


Well, I think it may be caused by the instruction cache (after all, there is * MB of L2 cache in the CPU). But I cannot figure out how to verify this or how to improve it.
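
One common way to keep a slow first call out of a benchmark is to run the code a few times untimed before measuring. The following is only a sketch of that idea, under stated assumptions: medianMsAfterWarmup is a hypothetical helper name, the warm-up and repetition counts are arbitrary, and it does not tell you which cache (if any) is responsible for the slow first call.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: calls `work` `warmups` times without timing it,
// then times `reps` further calls and returns the median duration in ms.
template <typename F>
double medianMsAfterWarmup(F&& work, int warmups, int reps)
{
    assert(warmups >= 0 && reps > 0);
    using Clock = std::chrono::steady_clock;

    // Untimed warm-up calls: any one-time cost is paid here.
    for (int i = 0; i < warmups; ++i) {
        work();
    }

    // Timed calls.
    std::vector<double> ms;
    ms.reserve(static_cast<std::size_t>(reps));
    for (int i = 0; i < reps; ++i) {
        const auto t0 = Clock::now();
        work();
        const auto t1 = Clock::now();
        ms.push_back(std::chrono::duration<double, std::milli>(t1 - t0).count());
    }

    // Median (upper median for an even count) is less sensitive to outliers than the mean.
    const std::size_t mid = ms.size() / 2;
    std::nth_element(ms.begin(), ms.begin() + static_cast<std::ptrdiff_t>(mid), ms.end());
    return ms[mid];
}
```

With the code from the question, `work` could be a lambda such as [&] { boxFilter(img, imgOut, CV_32F, Size(31, 31)); }, called with for example 1 warm-up and 10 timed repetitions.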

Answered by smh, Sep 27 '22 20:09