
How to check if .xls and .csv files are empty

Question 1: How can I check whether an entire .xls or .csv file is empty? This is the code I am using:

import os

try:
    if os.stat(fullpath).st_size > 0:
        readfile(fullpath)
    else:
        print("empty file")
except OSError:
    print("No file")

An empty .xls file is still larger than 5.6 KB, so its size does not reveal whether it has any contents. How can I check if an .xls or .csv file is empty?

Question 2: I need to check the header of the file. How can I tell Python to treat files that contain only a single header row as empty?

import xlrd

def readfile(fullpath):
    xls = xlrd.open_workbook(fullpath)
    for sheet in xls.sheets():
        number_of_rows = sheet.nrows
        number_of_columns = sheet.ncols
        sheetname = sheet.name
        header = sheet.row_values(0)  # Then, if it contains only headers, treat it as empty.

This is my attempt. How do I continue with this code?

Please provide a solution for both questions. Thanks in advance.

asked Mar 01 '17 16:03 by bob marti



2 Answers

This is simple in pandas with the DataFrame.empty attribute:

import pandas as pd

df = pd.read_csv(filename) # or pd.read_excel(filename) for xls file
df.empty # will return True if the dataframe is empty or False if not.

This also returns True for a file that contains only headers, as in:

>>> df = pd.DataFrame(columns=['A', 'B'])
>>> df.empty
True
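
One caveat with the read_csv snippet above: for a CSV file that contains nothing at all (zero bytes), pd.read_csv raises pandas.errors.EmptyDataError instead of returning an empty frame. A minimal sketch that folds that case in, assuming a hypothetical helper name csv_has_no_data (it is not part of the original answer):

```python
import pandas as pd
from pandas.errors import EmptyDataError


def csv_has_no_data(path):
    """Return True if the CSV at `path` is blank or holds only a header row.

    `csv_has_no_data` is a hypothetical helper name used for illustration.
    """
    try:
        df = pd.read_csv(path)
    except EmptyDataError:
        # Raised when the file has no columns to parse (e.g. a 0-byte file).
        return True
    # A header-only file parses to a frame with columns but zero rows.
    return df.empty
```

Calling csv_has_no_data(filename) before further processing avoids having to handle the exception at every call site.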

answered Oct 17 '22 09:10 by Некто


Question 1: How do I check whether an entire .xls file is empty?

import xlrd

def readfile(fullpath):
    xls = xlrd.open_workbook(fullpath)

    # Assume the workbook is empty until a sheet with data below the
    # header row is found.
    is_empty = True

    for sheet in xls.sheets():
        # nrows == 0 is a blank sheet and nrows == 1 is a header row only;
        # both are treated as empty.
        if sheet.nrows > 1:
            is_empty = False
            break

    if is_empty:
        print('xls file is empty')

    return is_empty

Question 2: How do I check the header of the file? If the file has only a header (a single row), I need to treat the file as empty. How can I do that?

import csv

# DictReader consumes the first row as the header, so a header-only
# file yields no rows at all.
with open('test/empty.csv', 'r', newline='') as csvfile:
    rows = list(csv.DictReader(csvfile))
    if len(rows) == 0:
        print('csv file is empty')

Tested with Python 3.4.2.
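
The DictReader approach above reads every row into memory just to count them. A lighter sketch, assuming a hypothetical helper named csv_is_empty (not part of the original answer), stops at the first data row and also covers a file with no header at all:

```python
import csv


def csv_is_empty(path):
    """Return True if the CSV has no rows, or nothing but a header row.

    `csv_is_empty` is a hypothetical helper name used for illustration.
    """
    with open(path, newline='') as csvfile:
        reader = csv.reader(csvfile)
        # Consume the header row; if there is none, the file has no rows.
        if next(reader, None) is None:
            return True
        # Any remaining row with at least one non-blank cell counts as data.
        return not any(cell.strip() for row in reader for cell in row)
```

For example, csv_is_empty('test/empty.csv') could replace the length check above. Rows that consist only of blank cells are deliberately not counted as data here; that is a design assumption, not something the question specifies.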

answered Oct 17 '22 09:10 by stovfl