
Reading first n lines of sheet using xlsx module

I am trying to read the first five rows of data from an Excel sheet using the xlsx module. Initially, I tried the sheet_to_json method, which converts the whole sheet's data into an array of arrays.

let sheetData = xlsx.utils.sheet_to_json(workbook.Sheets[sheetsList[i]], {
  header: 1,
  defval: '',
  blankrows: true
});

However, this runs out of memory when the file is large (more than 10K records in a sheet).

Next, I tried the approach described in this comment: https://github.com/SheetJS/js-xlsx/issues/214#issuecomment-96843418 but I get the following error:

    f:\xxx\node_modules\xlsx\xlsx.js:2774
    function decode_range(range) { var x =range.split(":").map(decode_cell); return {s:x[0],e:x[x.length-1]}; }
                                                ^

    TypeError: Cannot read property 'split' of undefined

How can I resolve this? Or are there any other methods or modules I can use to read data from CSV, XLSX, or XLS files?

Thanks!

sreepurna asked Aug 21 '18 05:08



1 Answer

You can read just the first n rows of a sheet by passing the sheetRows option when reading the workbook.

So, the code looks as follows:

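const xlsx = require('xlsx');

// sheetRows: 5 limits parsing to the first 5 rows of each sheet.
// `path` is the path to the file to read.
let workbook = xlsx.readFile(path, {sheetRows: 5});
let sheetsList = workbook.SheetNames;
// `i` is the index of the sheet to convert (0 for the first sheet).
let sheetData = xlsx.utils.sheet_to_json(workbook.Sheets[sheetsList[i]], {
  header: 1,
  defval: '',
  blankrows: true
});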

Here I have limited the read to the first 5 rows.
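
A related sketch, for the case where the workbook is already in memory: sheet_to_json also accepts a range option as an A1-style string, so a sheet's !ref could be clamped to its first n rows before conversion. Treat this as an illustration under assumptions rather than the library's recommended approach: firstRowsRange is a hypothetical helper (not part of xlsx), it only handles simple bounded refs such as A1:D10000, and unlike sheetRows it does not reduce the memory used while parsing the file. The guard for a missing !ref reflects one plausible cause of the "split of undefined" error in the question (a sheet without a !ref, for example an empty one); the linked issue does not confirm that this is what happened.

```javascript
// Hypothetical helper (not part of xlsx): clamp an A1-style range such as
// "A1:D10000" to its first n rows. Returns null when ref is missing or
// is not a simple bounded range.
function firstRowsRange(ref, n) {
  if (typeof ref !== 'string' || n < 1) return null;
  // Split "A1:D10000" into start and end cells; a single cell is both.
  const parts = ref.split(':');
  const cellPattern = /^\$?([A-Z]+)\$?(\d+)$/;
  const start = cellPattern.exec(parts[0]);
  const end = cellPattern.exec(parts[parts.length - 1]);
  if (!start || !end) return null;
  const firstRow = Number(start[2]);
  // Keep n rows from the start, but never go past the original last row.
  const lastRow = Math.min(Number(end[2]), firstRow + n - 1);
  return `${start[1]}${firstRow}:${end[1]}${lastRow}`;
}

// Usage sketch (assumes `sheet` is a worksheet object from an xlsx workbook):
//   const range = firstRowsRange(sheet['!ref'], 5);
//   const rows = range
//     ? xlsx.utils.sheet_to_json(sheet, { header: 1, range })
//     : []; // treat a sheet without a usable !ref as empty
```

For large files, sheetRows remains the option that keeps memory down, because it limits what is parsed in the first place.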

Thanks to everyone who tried to solve this problem, and special thanks to the xlsx community member who helped. Here is the link: https://github.com/SheetJS/js-xlsx/issues/1225

sreepurna answered Oct 23 '22 09:10