I'm using PHPExcel to read .xls files. After quite a short time I hit this:
Fatal error: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 730624 bytes) in Excel\PHPExcel\Shared\OLERead.php on line 93
After some googling, I tried a chunk reader to prevent this (it is even mentioned on the PHPExcel home site), but I'm still stuck with this error.
My understanding is that with a chunk reader the file is read part by part, so memory should not overflow. So is there some serious memory leak, or am I freeing memory incorrectly? I even tried raising the server's memory to 1GB. The file I'm trying to read is about 700k, which is not much (I also read ~20MB pdf, xlsx, docx, doc, etc. files without issue). So I assume there is just some minor trick I overlooked.
The code looks like this:
function parseXLS($fileName){
    require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel/IOFactory.php';
    require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel/ChunkReadFilter.php';

    $inputFileType = 'Excel5';
    /** Create a new Reader of the type defined in $inputFileType **/
    $objReader = PHPExcel_IOFactory::createReader($inputFileType);
    /** Define how many rows we want to read for each "chunk" **/
    $chunkSize = 20;
    /** Create a new Instance of our Read Filter **/
    $chunkFilter = new chunkReadFilter();
    /** Tell the Reader that we want to use the Read Filter that we've Instantiated **/
    $objReader->setReadFilter($chunkFilter);

    /** Loop to read our worksheet in "chunk size" blocks **/
    /** $startRow is set to 2 initially because we always read the headings in row #1 **/
    for ($startRow = 2; $startRow <= 65536; $startRow += $chunkSize) {
        /** Tell the Read Filter, the limits on which rows we want to read this iteration **/
        $chunkFilter->setRows($startRow, $chunkSize);
        /** Load only the rows that match our filter from $fileName to a PHPExcel Object **/
        $objPHPExcel = $objReader->load($fileName);
        // Do some processing here

        // Free up some of the memory
        $objPHPExcel->disconnectWorksheets();
        unset($objPHPExcel);
    }
}
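To check whether memory is really being retained between iterations of a loop like the one above, the built-in memory_get_usage() and memory_get_peak_usage() functions can be wrapped around a single step. This is only a minimal sketch: measureMemory() is a hypothetical helper, not part of PHPExcel, and the workload is a stand-in for one load() call plus cleanup.

```php
<?php
// Hypothetical helper (not part of PHPExcel): runs $step and reports memory figures.
function measureMemory($step)
{
    $before = memory_get_usage();
    $step();
    return array(
        'delta_bytes' => memory_get_usage() - $before, // memory still held after the step
        'peak_bytes'  => memory_get_peak_usage(),      // highest usage seen so far in this script
    );
}

// Stand-in workload; in the real script this would be one $objReader->load() plus cleanup.
$report = measureMemory(function () {
    $buffer = str_repeat('x', 1024 * 1024); // allocate about 1 MB
    unset($buffer);                          // release it again
});

printf("retained: %d bytes, peak: %d bytes\n", $report['delta_bytes'], $report['peak_bytes']);
```

If the retained figure keeps growing from one chunk to the next, something is still holding references; if only the peak is high, a single load is already too expensive on its own.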
And here is the code for the chunk reader:
class chunkReadFilter implements PHPExcel_Reader_IReadFilter
{
    private $_startRow = 0;
    private $_endRow = 0;

    /** Set the list of rows that we want to read */
    public function setRows($startRow, $chunkSize) {
        $this->_startRow = $startRow;
        $this->_endRow = $startRow + $chunkSize;
    }

    public function readCell($column, $row, $worksheetName = '') {
        // Only read the heading row, and the rows that are configured in $this->_startRow and $this->_endRow
        if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) {
            return true;
        }
        return false;
    }
}
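To see exactly which rows a given chunk accepts, the row test from readCell() can be copied into a standalone function and run without PHPExcel. This is a sketch for illustration; rowIsInChunk() is a hypothetical name, and the values 2 and 20 are the first iteration of the loop above.

```php
<?php
// Hypothetical standalone copy of the readCell() row test, for illustration only.
function rowIsInChunk($row, $startRow, $chunkSize)
{
    $endRow = $startRow + $chunkSize; // exclusive upper bound, as in setRows()
    return $row == 1 || ($row >= $startRow && $row < $endRow);
}

// Collect the rows accepted by the first chunk (startRow = 2, chunkSize = 20).
$accepted = array();
for ($row = 1; $row <= 30; $row++) {
    if (rowIsInChunk($row, 2, 20)) {
        $accepted[] = $row;
    }
}
echo implode(',', $accepted), "\n"; // rows 1 through 21
```

The heading row is accepted in every chunk, and because the upper bound is exclusive, consecutive chunks do not overlap.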
Then I found an interesting solution here: How to read large worksheets from large Excel files (27MB+) with PHPExcel? (see Addendum 3 in that question).
edit1: Even with that solution I hit a choke point with my favourite error message, but I found something about caching, so I implemented this:
$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
$cacheSettings = array('memoryCacheSize' => '8MB');
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);
So far I have tested it only on xls files smaller than 10MB, but it seems to work (I also set $objReader->setReadDataOnly(true);) and it seems to strike a good balance between speed and memory consumption. (I will keep following this thorny path, if possible.)
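As far as I understand it, cache_to_phpTemp keeps cell data in PHP's php://temp stream, which holds data in memory up to a threshold and spills to a temporary file beyond it. The sketch below shows only that underlying stream behaviour in plain PHP, without PHPExcel; the 8 MB threshold is an assumption chosen to mirror the setting above.

```php
<?php
// Assumption: 8 MB threshold, mirroring the memoryCacheSize value used in the post.
$maxMemoryBytes = 8 * 1024 * 1024;
$stream = fopen('php://temp/maxmemory:' . $maxMemoryBytes, 'r+');

// Write about 10 MB in 1 MB pieces; past the threshold PHP moves the data to a temp file.
$piece = str_repeat('x', 1024 * 1024);
for ($i = 0; $i < 10; $i++) {
    fwrite($stream, $piece);
}
$written = ftell($stream); // total bytes written so far

// The data stays readable through the same handle, wherever it is stored.
rewind($stream);
$firstBytes = fread($stream, 5);
fclose($stream);

echo $firstBytes, ' ', $written, "\n";
```

The trade-off is the usual one: a larger threshold is faster but uses more RAM, a smaller one pushes more work to disk.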
edit2:
So I did some further research and found the chunk reader unnecessary in my case (it seems to me the memory issue is the same with the chunk reader and without it). So my final answer to my own question is something like the code below, which reads an .xls file (only data from cells, without formatting, and even filtering out formulas). When I use cache_to_phpTemp
I'm able to read xls files (tested up to 10MB, about 10k rows and multiple columns) in a matter of seconds and without memory issues:
function parseXLS($fileName){
    /** PHPExcel_IOFactory */
    require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel/IOFactory.php';
    require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel/ChunkReadFilter.php';
    require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel.php';

    $inputFileName = $fileName;
    $fileContent = "";

    //get inputFileType (most of the time Excel5)
    $inputFileType = PHPExcel_IOFactory::identify($inputFileName);

    //initialize cache, so PHPExcel will not run out of memory
    $cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
    $cacheSettings = array('memoryCacheSize' => '8MB');
    PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);

    //initialize object reader by file type
    $objReader = PHPExcel_IOFactory::createReader($inputFileType);

    //read only data (without formatting) for memory and time performance
    $objReader->setReadDataOnly(true);

    //load file into PHPExcel object
    $objPHPExcel = $objReader->load($inputFileName);

    //get worksheetIterator, so we can loop over the sheets in the workbook
    $worksheetIterator = $objPHPExcel->getWorksheetIterator();

    //loop over all sheets
    foreach ($worksheetIterator as $worksheet) {
        //use worksheet rowIterator to get the content of each row
        foreach ($worksheet->getRowIterator() as $row) {
            //use cell iterator to get the content of each cell in the row
            $cellIterator = $row->getCellIterator();
            //visit every cell in the row, including empty ones
            $cellIterator->setIterateOnlyExistingCells(false);
            //iterate over each cell
            foreach ($cellIterator as $cell) {
                //check if cell exists
                if (!is_null($cell)) {
                    //get raw value (without formatting and other unnecessary data)
                    $rawValue = $cell->getValue();
                    //if the cell isn't empty and isn't a formula, append its value
                    if ((trim($rawValue) <> "") and (substr(trim($rawValue), 0, 1) <> "=")) {
                        $fileContent .= $rawValue . " ";
                    }
                }
            }
        }
    }

    return $fileContent;
}
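The filtering rule in the inner loop (skip empty cells and anything starting with "=") can be pulled out into a small pure function, which makes it easy to test without PHPExcel or a real spreadsheet. This is only a sketch: collectCellText() and the sample data are hypothetical, with plain arrays standing in for rows of raw cell values.

```php
<?php
// Hypothetical helper: $rows is a plain array of rows, each an array of raw cell values.
function collectCellText(array $rows)
{
    $text = '';
    foreach ($rows as $cells) {
        foreach ($cells as $rawValue) {
            $value = trim((string) $rawValue);
            // Skip empty cells and anything that looks like a formula.
            if ($value !== '' && substr($value, 0, 1) !== '=') {
                $text .= $rawValue . ' ';
            }
        }
    }
    return $text;
}

// Made-up sample data: one null cell and one formula cell should be dropped.
$sample = array(
    array('Name', 'Total', null),
    array('Alice', '=SUM(B2:B9)', 42),
);
echo collectCellText($sample), "\n";
```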
Hope the following links will help:
PHPExcel runs out of 256, 512 and also 1024MB of RAM
http://phpexcel.codeplex.com/discussions/242712?ProjectName=phpexcel