
SXSSF: to where does it flush rows not in the window prior to output to file?

According to the SXSSF (Streaming Usermodel API) documentation:

SXSSF (package: org.apache.poi.xssf.streaming) is an API-compatible streaming extension of XSSF to be used when very large spreadsheets have to be produced, and heap space is limited. SXSSF achieves its low memory footprint by limiting access to the rows that are within a sliding window, while XSSF gives access to all rows in the document. Older rows that are no longer in the window become inaccessible, as they are written to the disk.

However, in the provided example the flush happens before the workbook is given the file location at which to write the file.

public static void main(String[] args) throws Throwable {
    Workbook wb = new SXSSFWorkbook(100); // keep 100 rows in memory, exceeding rows will be flushed to disk
    Sheet sh = wb.createSheet();
    for(int rownum = 0; rownum < 1000; rownum++){
        Row row = sh.createRow(rownum);
        for(int cellnum = 0; cellnum < 10; cellnum++){
            Cell cell = row.createCell(cellnum);
            String address = new CellReference(cell).formatAsString();
            cell.setCellValue(address);
        }

    }

    // Rows with rownum < 900 are flushed and not accessible
    for(int rownum = 0; rownum < 900; rownum++){
      Assert.assertNull(sh.getRow(rownum));
    }

    // the last 100 rows are still in memory
    for(int rownum = 900; rownum < 1000; rownum++){
        Assert.assertNotNull(sh.getRow(rownum));
    }

    FileOutputStream out = new FileOutputStream("/temp/sxssf.xlsx");
    wb.write(out);
    out.close();
}

So this raises the following questions:

  • Where on the file system is it storing the data?
  • Is it just creating a temp file in the default temp directory?
  • Is this safe for all / most implementations?
John B asked Sep 14 '11 13:09



1 Answer

The class that does the buffering is SheetDataWriter, in org.apache.poi.xssf.streaming.SXSSFSheet.

The magic line you're probably interested in is:

_fd = File.createTempFile("poi-sxxsf-sheet", ".xml");

As for whether that is safe: probably, but not certainly. It's likely worth opening a bug in the POI Bugzilla requesting that it be switched to org.apache.poi.util.TempFile, which allows a bit more control. In general, though, as long as java.io.tmpdir is set to a valid directory (or the default is sensible for you), you should be fine.
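
To check where such a file would land on a given machine, a small stand-alone sketch like the following may help. It does not use POI at all: it only repeats the File.createTempFile call quoted above (same prefix and suffix) and prints the result next to the java.io.tmpdir property. The class name TempFileLocationDemo and the helper createSheetTempFile are made up for this illustration.

```java
import java.io.File;
import java.io.IOException;

public class TempFileLocationDemo {

    // Hypothetical helper: creates a temp file with the same prefix/suffix
    // as the line quoted from SheetDataWriter. Not part of POI.
    static File createSheetTempFile() throws IOException {
        File f = File.createTempFile("poi-sxxsf-sheet", ".xml");
        f.deleteOnExit(); // clean up the demo file when the JVM exits
        return f;
    }

    public static void main(String[] args) throws IOException {
        // Directory the JVM uses for temp files unless told otherwise
        System.out.println("java.io.tmpdir = " + System.getProperty("java.io.tmpdir"));

        File f = createSheetTempFile();
        System.out.println("temp file      = " + f.getAbsolutePath());
    }
}
```

To point the temp files somewhere else, pass the property when starting the JVM, for example java -Djava.io.tmpdir=/path/to/dir TempFileLocationDemo. Setting it on the command line is the safer route, since the JDK may read the property only once, so changing it later with System.setProperty is not guaranteed to take effect.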

Gagravarr answered Sep 22 '22 14:09