
C++ CPU not fully utilized

I've run into a strange performance drop in C++. Can someone help me identify the problem?

I am working on an out-of-sample forecasting problem using the Eigen library. I use an expanding-window method to incorporate each new vector of observations, and the L-BFGS method to maximize the likelihood function.

Specifically, in the for loop, I take a subset block Y of my data mY (Y = mY.block(0,0,24,756+i)) and use it to estimate the parameters with the callLBFGS function, which stores the estimates in the vector vP (I pass vP by reference). Next, I copy the values of vP into a result matrix mRet, and finally I write mRet to a CSV file.

The reason I put WriteCSV in the loop is that callLBFGS is very time-consuming. If the optimization fails, I want to abort the program and debug it without losing the estimates that have already been computed, so I write them to the CSV file at the end of every iteration. I don't see why putting WriteCSV inside the loop should be a problem at all: callLBFGS usually takes up to 10 minutes, while writing a 102 x 40 matrix to CSV usually takes milliseconds.

Here is my problem: the callLBFGS function I wrote is multi-threaded using OpenMP. Without WriteCSV inside the loop, CPU utilization is 95%-100%; with WriteCSV inside the loop, it drops to 50%-60%.

This is very strange and beyond my knowledge, considering that callLBFGS takes thousands of times longer than WriteCSV. The CPU shouldn't be spending much time scheduling and forking new threads. Can someone help me identify the problem? Many thanks!

MatrixXd mY = mData.transpose();

double adFunc;
char Result[] = "BFGSEst.csv";  // array, not char*: a string literal cannot bind to a non-const char* in C++11

VectorXd vP(102);
MatrixXd Y;
Matrix<double,102,40> mRet;
vP = IniPar.col(0);

for (int i = 0; i < 40; ++i) {

    Y = mY.block(0,0,24,756+i);
    callLBFGS(vP,&adFunc,Y,10000);
    mRet.col(i) = vP;
    WriteCSV(Result,mRet);  /*this is the killer*/
}

My functions callLBFGS and WriteCSV are declared like this:

void callLBFGS(VectorXd &vP, double *adFunc, MatrixXd &Y, int MaxIter);
void WriteCSV(char *filename, MatrixXd X);

EDIT: Many thanks for all your replies. Below is my implementation of WriteCSV, which is quite naive. To clarify, when I said callLBFGS takes up to 10 minutes, I meant that each call takes up to 10 minutes (it is a multi-dimensional minimization), so the entire loop takes hours to finish. I know that if callLBFGS and WriteCSV took similar amounts of time, it would be no surprise that the CPU is not fully utilized. But here WriteCSV takes far less time than callLBFGS, so I would not expect CPU usage to drop that much while callLBFGS is running.

void WriteCSV(char *filename, MatrixXd X){
  ofstream myfile;
  myfile.open (filename);

  for (int i = 0; i < X.rows(); ++i)
  {
      for (int j = 0; j < X.cols(); ++j)
      {
          myfile<<X(i,j);
          if (j!=X.cols()-1)
          {
             myfile<<",";
          }else{
             myfile<<"\n";
          }
      } 
  }
  myfile.close();
}
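
One way to check that the write really is negligible compared with the optimization is to time the two calls separately. A minimal sketch, assuming a hypothetical helper elapsedMs that is not part of the original code:

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper: runs `work` once and returns the elapsed
// wall-clock time in milliseconds, measured with a monotonic clock.
double elapsedMs(const std::function<void()>& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Inside the loop this could be used as `double ms = elapsedMs([&] { WriteCSV(Result, mRet); });`, and likewise around the callLBFGS call, logging both numbers per iteration.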

Michael Gong asked May 17 '26 05:05


1 Answer

Sorry, this should be just a comment, but I can't comment yet.

There might be lazy evaluation happening invisibly within Eigen's MatrixXd.

The lazy evaluation possibly happens within WriteCSV(...), and this, coupled with necessarily single-threaded I/O, drops your total CPU utilization.

The deferred or lazy evaluation could be anywhere, but it is probably hidden in a simple retrieval or copy statement like:

mRet.col(i) = vP;

or

myfile<<X(i,j);

Other posters are all likely more correct, and I endorse the suggestion to move the I/O onto its own thread.
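
A minimal sketch of what moving the I/O onto its own thread could look like, using std::async from the standard library. To keep it self-contained it does not use Eigen: Snapshot, writeSnapshotCsv and writeSnapshotCsvAsync are hypothetical names, and the matrix is assumed to have been copied into a row-major std::vector<double> first. Whether this restores full CPU utilization is something to measure, not something this sketch guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Eigen matrix: dimensions plus row-major values.
struct Snapshot {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> values;  // expected size: rows * cols
};

// Writes the snapshot as CSV. Returns true if the file was opened
// and every write succeeded.
bool writeSnapshotCsv(const std::string& filename, const Snapshot& s) {
    assert(s.values.size() == s.rows * s.cols);
    std::ofstream out(filename);
    if (!out) return false;
    for (std::size_t i = 0; i < s.rows; ++i) {
        for (std::size_t j = 0; j < s.cols; ++j) {
            out << s.values[i * s.cols + j];
            out << (j + 1 == s.cols ? '\n' : ',');
        }
    }
    return static_cast<bool>(out);
}

// Launches the write on a separate thread. The snapshot is taken by value,
// so the caller is free to keep modifying its own matrix afterwards.
std::future<bool> writeSnapshotCsvAsync(std::string filename, Snapshot s) {
    return std::async(std::launch::async,
                      [filename = std::move(filename), s = std::move(s)] {
                          return writeSnapshotCsv(filename, s);
                      });
}
```

In the loop, the returned future would be kept and its get() called before the next write is launched, so that two writes never target the same file at once. Note that a future returned by std::async blocks in its destructor, so discarding it immediately makes the call effectively synchronous.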

JimmyNJ answered May 18 '26 19:05


