Proper way to test for NA in Julia DataFrames

What is the proper way to test if a value in a DataFrame is NA in the Julia DataFrames package?

So far I have found that typeof(var) == NAtype works, but is there a more elegant way of doing it?

Skeppet asked Jan 26 '15 15:01


1 Answer

Using typeof(var) == NAtype for this is awkward, in particular because it is not vectorized.

The canonical way of testing for NA values is to use the (vectorized) function called isna.


Let's generate a toy DataFrame with some NA values in the B column:

julia> using DataFrames

julia> df = DataFrame(A = 1:10, B = 2:2:20)
10x2 DataFrame
| Row | A  | B  |
| 1   | 1  | 2  |
| 2   | 2  | 4  |
| 3   | 3  | 6  |
| 4   | 4  | 8  |
| 5   | 5  | 10 |
| 6   | 6  | 12 |
| 7   | 7  | 14 |
| 8   | 8  | 16 |
| 9   | 9  | 18 |
| 10  | 10 | 20 |

julia> df[[1,4,8],symbol("B")] = NA

julia> df
10x2 DataFrame
| Row | A  | B  |
| 1   | 1  | NA |
| 2   | 2  | 4  |
| 3   | 3  | 6  |
| 4   | 4  | NA |
| 5   | 5  | 10 |
| 6   | 6  | 12 |
| 7   | 7  | 14 |
| 8   | 8  | NA |
| 9   | 9  | 18 |
| 10  | 10 | 20 |

Now let's pretend we don't know the contents of our DataFrame and ask, for example, the following question:

Does column B contain any NA values?

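The typeof approach won't work here, because it compares the type of the whole column (an array type) against NAtype instead of inspecting the individual elements: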

julia> typeof(df[:,symbol("B")]) == NAtype

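The isna function is better suited. Applied to a column, it returns an element-wise Boolean mask, which any then reduces to a single answer: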

julia> any(isna(df[:,symbol("B")]))
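
For readers on a current setup: later releases of DataFrames.jl replaced NA with Julia's built-in missing value, and isna with ismissing. The following is only a minimal sketch of the same check under that assumption (Julia 1.x with a recent DataFrames.jl installed); the variable names has_missing, n_missing and missing_rows are made up for illustration.

```julia
using DataFrames

# Same toy data as above, with `missing` in rows 1, 4 and 8 of column B.
df = DataFrame(A = 1:10,
               B = [missing, 4, 6, missing, 10, 12, 14, missing, 18, 20])

# Does column B contain any missing values?
has_missing = any(ismissing, df.B)        # true

# How many are there, and in which rows?
n_missing = count(ismissing, df.B)        # 3
missing_rows = findall(ismissing, df.B)   # [1, 4, 8]
```

Passing ismissing as the first argument to any, count or findall applies the predicate element by element, so no intermediate Boolean array has to be built first.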
jub0bs answered Oct 09 '22 19:10
