Question 1: How can I check whether an entire .xls or .csv file is empty? This is the code I am using:
import os

try:
    if os.stat(fullpath).st_size > 0:
        readfile(fullpath)
    else:
        print("empty file")
except OSError:
    print("No file")
An empty .xls file is still larger than 5.6 KB on disk, so the file size alone does not show whether it has any contents. How can I check whether an .xls or .csv file is empty?
Question 2: I need to check the header of the file. How can I tell Python to treat files that contain only a single header row as empty?
import xlrd

def readfile(fullpath):
    xls = xlrd.open_workbook(fullpath)
    for sheet in xls.sheets():
        number_of_rows = sheet.nrows
        number_of_columns = sheet.ncols
        sheetname = sheet.name
        header = sheet.row_values(0)  # Then if it contains only headers, treat it as empty.
This is my attempt. How do I continue with this code?
Please provide a solution for both questions. Thanks in advance.
The difference between the CSV and XLS file formats is that CSV is a plain-text format in which values are separated by commas (Comma-Separated Values), while XLS is Excel's binary workbook format, which holds information about all the worksheets in a file, including both content and formatting.
I just checked: Python's CSV parser ignores empty lines. I guess that's reasonable. Yes, I agree an empty line within a quoted field means a literal empty line.
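To make the comment above concrete, here is a small check using only the standard library. The sample string is made up for illustration. As far as I can tell, csv.reader hands back a blank line as an empty list, while csv.DictReader skips such rows after the header, so "ignores empty lines" depends on which reader you use:

```python
import csv
import io

# Hypothetical sample: a header, one blank line, one data row.
sample = "A,B\n\n1,2\n"

# csv.reader yields the blank line as an empty list.
reader_rows = list(csv.reader(io.StringIO(sample)))

# csv.DictReader drops blank rows that follow the header.
dict_rows = list(csv.DictReader(io.StringIO(sample)))

print(reader_rows)     # [['A', 'B'], [], ['1', '2']]
print(len(dict_rows))  # 1
```

So when counting rows with csv.reader, filter out the empty lists before deciding that a file has data.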
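This is simple in pandas with the .empty attribute. One exception: for a completely empty (zero-byte) CSV file, pd.read_csv raises pandas.errors.EmptyDataError instead of returning an empty DataFrame, so catch that case separately. Do this: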
import pandas as pd
df = pd.read_csv(filename) # or pd.read_excel(filename) for xls file
df.empty # will return True if the dataframe is empty or False if not.
This will also return True for a file with only headers, as in:
>>> df = pd.DataFrame(columns=['A', 'B'])
>>> df.empty
True
Question 1: How do I check whether the entire .xls file is empty?
import xlrd

def readfile(fullpath):
    xls = xlrd.open_workbook(fullpath)
    # Assume the workbook is empty until a sheet with data rows is found.
    is_empty = True
    for sheet in xls.sheets():
        # A sheet with no rows, or with only a header row, counts as empty.
        if sheet.nrows > 1:
            is_empty = False
            break
    if is_empty:
        print('xls file is empty')
    return is_empty
Question 2: How do I check the header of the file? If the file has only a header (a single row), I need to treat the file as empty. How can I do that?
import csv

with open('test/empty.csv', 'r') as csvfile:
    csv_dict = [row for row in csv.DictReader(csvfile)]
    if len(csv_dict) == 0:
        print('csv file is empty')
Tested with Python 3.4.2.
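If you would rather not build the whole list of rows, the same idea can be wrapped in a helper that stops at the first data row. This is only a sketch: the name csv_is_effectively_empty and the has_header flag are my own, not part of the csv module, and it assumes that rows made up entirely of blank cells should not count as data.

```python
import csv

def csv_is_effectively_empty(path, has_header=True):
    """Return True if the CSV at `path` has no data rows.

    Hypothetical helper: blank lines, rows of only blank cells and
    (optionally) the header row are not counted as data.
    """
    with open(path, newline='') as csvfile:
        # Lazily keep only rows that contain at least one non-blank cell.
        rows = (row for row in csv.reader(csvfile)
                if any(cell.strip() for cell in row))
        if has_header:
            next(rows, None)  # discard the header row, if there is one
        # The file is empty when nothing is left after the header.
        return next(rows, None) is None
```

This covers both questions for CSV files: a zero-byte file and a header-only file both come back as True, and reading stops as soon as one data row is found.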