Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Counting number of "NaN" (not zeros or blanks) in csv

Is it possible to have Python count the number of 'NaN' (as string/text) in a csv file? Tried using pandas' read_csv, but some columns which have blanks are read as NaN also. The only working method I know is to use excel find 'NaN' as values.

Anyone knows of other methods? Thanks in advance!

like image 955
WLC Avatar asked Mar 07 '23 21:03

WLC


1 Answers

You can use pd.read_csv but you will need two parameters: na_values and keep_default_na.

  1. na_values:

Additional strings to recognize as NA/NaN. If dict passed, specific per-column NA values. By default the following values are interpreted as NaN: ‘’, ‘#N/A’, ‘#N/A N/A’, ‘#NA’, ‘-1.#IND’, ‘-1.#QNAN’, ‘-NaN’, ‘-nan’, ‘1.#IND’, ‘1.#QNAN’, ‘N/A’, ‘NA’, ‘NULL’, ‘NaN’, ‘nan’`.

  1. keep_default_na:

If na_values are specified and keep_default_na is False the default NaN values are overridden, otherwise they’re appended to.

So in your case:

pd.read_csv('path/to/file.csv', na_values='NaN', keep_default_na=False)

If you want to be a bit more "liberal" then you might want something like na_values=['nan', 'NaN']--the point is these will be interpreted very strictly.

An example--say you have the following CSV file with 1 literal NaN and two blanks:

enter image description here

import pandas as pd
import numpy as np
df = pd.read_csv('input/sample.csv', na_values='NaN', keep_default_na=False)
print(np.count_nonzero(df.isnull().values))
# 1
like image 136
Brad Solomon Avatar answered Mar 19 '23 15:03

Brad Solomon