Recently I have been tasked with doing a few speed checks so I can tell whether it is faster to use php/php-cli or c++ to insert a certain number of rows into a database.
Before we start, let me tell you a few details so everything is clear:
So, this is the process:
Both codes work exactly as expected. Here are the resulting numbers:
php:
c++:
php outperforms c++ as the number of lines in the file increases... At first, I suspected the line splitting function: the splitting in php is done with "explode". The algorithm is as naive as it comes for c++... The container is passed by reference and its contents are changed on the fly. The container is traversed only once. I made sure the container "reserves()" all the necessary space (remember, I finally chose vectors), which is fixed. The container is created in the main function and then passed by reference through the code. It is never emptied or resized: only its contents change.
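// Splits p_string on p_delimiter and writes each field into p_result.
// p_result must already contain one element per field: the function
// overwrites existing elements and never grows the container.
template<typename container> void explode(const std::string& p_string, const char p_delimiter, container& p_result)
{
    auto it=p_result.begin();
    std::string::const_iterator beg=p_string.begin(), end=p_string.end();
    std::string temp;
    while(beg < end)
    {
        if( (*beg)==p_delimiter)
        {
            // End of a field: store it and start collecting the next one.
            *(it)=temp;
            ++it;
            temp="";
        }
        else
        {
            temp+=*beg;
        }
        ++beg;
    }
    // Store the last field, which has no trailing delimiter.
    *(it)=temp;
}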
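One detail worth double-checking in the description above: explode() writes through p_result.begin(), so the vector needs real elements, not just capacity. reserve() alone leaves size() at 0, and writing through begin() in that state is undefined behaviour. A minimal sketch of the difference, assuming the container is a std::vector<std::string> (make_fields is a hypothetical helper, not part of the original code):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): builds a vector that
// explode() can safely write into through begin().
std::vector<std::string> make_fields(std::size_t p_columns)
{
    std::vector<std::string> fields;
    fields.reserve(p_columns); // capacity only: size() is still 0
    assert(fields.empty());
    fields.resize(p_columns);  // now holds p_columns empty strings
    return fields;
}
```

If the real code already resizes the vector (or constructs it with a size), this changes nothing; it only matters if reserve() is the sole call.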
As said before, the task performed is equivalent, but the code generating it is not. C++ code has the usual try-catch blocks for controlling the mysql interactions. As for the rest, the main loop runs until EOF is reached and every iteration checks if the insertion failed (both in c++ and php).
I have seen c++ greatly outperform php when working with files and their contents, so I expected the same to apply here. I suspect the splitting algorithm, but maybe it is just that the database connector is slower (still, when I disabled database interaction, php still processed faster) or my code is subpar...
As far as profiling goes, gprof spat this out about the c++ code:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ns/call  ns/call  name
 60.00      0.03     0.03    50000   600.00   600.00  void anc_str::explotar_cadena<std::vector<std::string, std::allocator<std::string> > >(std::string const&, char, std::vector<std::string, std::allocator<std::string> >&)
 40.00      0.05     0.02                             insertar(sql::PreparedStatement*, std::string const&, std::vector<std::string, std::allocator<std::string> >&)
  0.00      0.05     0.00        1     0.00     0.00  _GLOBAL__sub_I__ZN7anc_str21obtener_linea_archivoERSt14basic_ifstreamIcSt11char_traitsIcEE
Where "explotar_cadena" is "explode" and "insertar" is "split this line and set the prepared statement up". As you can see, 60% of the time is spent in the former (not surprising... it runs 50000 times and does this crazy splitting thing). "obtener_linea_archivo" is just "please, dump the next line into the string".
Without mysql interaction (just load the file, read the lines and split them) I get these measurements:
php
c++
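For reference, the "no database" measurement on the c++ side can be taken with std::chrono around the read loop. This is only a sketch under assumptions: time_lines and timing_result are hypothetical names, the splitting call is left as a comment, and any std::istream (a std::ifstream on the input file, for instance) can be passed in.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical result type: number of lines read and elapsed wall time.
struct timing_result
{
    std::size_t lines;
    double milliseconds;
};

// Hypothetical helper: reads p_input line by line and times the whole loop
// with a monotonic clock.
timing_result time_lines(std::istream& p_input)
{
    const auto start=std::chrono::steady_clock::now();
    std::size_t lines=0;
    std::string line;
    while(std::getline(p_input, line))
    {
        // The real splitting call (explode in the question) would go here.
        ++lines;
    }
    const std::chrono::duration<double, std::milli> elapsed=std::chrono::steady_clock::now()-start;
    return {lines, elapsed.count()};
}
```

When comparing against php it is also worth confirming that the c++ binary was built with optimizations enabled, since unoptimized std::string code can be much slower.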
Okay, both times are good and hardly noticeable in real-life terms. Still, I am surprised... So the question here is: am I supposed to expect this? Anyone with prior experience willing to lend a hand?
Thanks in advance.
Edit: Here is a quick link to a stripped-down version containing the input files, C++ code and php code [ http://www.datafilehost.com/d/d31034d6 ]. Notice that there is no sql interaction: only file opening, string splitting and time measuring. Please forgive the butchered code and the half-Spanish comments and variable names, as this was done in a hurry. Also, note the gprof results above: I am no expert, but I think what we are looking for is a better way of splitting the string.
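One possible direction for that, offered as a sketch rather than a measured fix: instead of appending one character at a time to a temporary string, locate each delimiter with std::string::find and copy the whole field with a single assign(). explode_find is a hypothetical name; it keeps the same "pre-sized container passed by reference" contract as explode, but stops when the container is full and returns the number of fields written.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical alternative to explode(): finds each delimiter with
// std::string::find and copies every field in one assign() call.
// Returns the number of fields written; stops once p_result is full.
template<typename container>
std::size_t explode_find(const std::string& p_string, const char p_delimiter, container& p_result)
{
    std::size_t count=0, start=0;
    auto it=p_result.begin();
    while(it != p_result.end())
    {
        const std::size_t pos=p_string.find(p_delimiter, start);
        const std::size_t len=(pos==std::string::npos) ? p_string.size()-start : pos-start;
        it->assign(p_string, start, len); // reuses the element's existing buffer when it is large enough
        ++it;
        ++count;
        if(pos==std::string::npos) break; // last field has been stored
        start=pos+1;
    }
    return count;
}
```

Whether this actually closes the gap with php would have to be measured on the real input files.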
Some part of it might have to do with the driver/interface used in each language. For example, with PHP/MySQL, you will probably find that mysqli is faster than mysql, which is faster than PDO. That is because the libraries are progressively more abstract (or less maintained). You might try profiling the queries themselves on the database server to see if there is any difference in execution time. Then again, there may be more going on, as other commenters have noted.