What is the proper way to test if a value in a DataFrame is NA in the Julia DataFrames package?
So far I have found that typeof(var) == NAtype works, but is there a more elegant way of doing it?
Using typeof(var) == NAtype for this is awkward, in particular because it is not vectorized. The canonical way of testing for NA values is to use the (vectorized) function called isna.
Let's generate a toy DataFrame with some NA values in the B column:
julia> using DataFrames
julia> df = DataFrame(A = 1:10, B = 2:2:20)
10x2 DataFrame
| Row | A | B |
|-----|----|----|
| 1 | 1 | 2 |
| 2 | 2 | 4 |
| 3 | 3 | 6 |
| 4 | 4 | 8 |
| 5 | 5 | 10 |
| 6 | 6 | 12 |
| 7 | 7 | 14 |
| 8 | 8 | 16 |
| 9 | 9 | 18 |
| 10 | 10 | 20 |
julia> df[[1,4,8],symbol("B")] = NA
NA
julia> df
10x2 DataFrame
| Row | A | B |
|-----|----|----|
| 1 | 1 | NA |
| 2 | 2 | 4 |
| 3 | 3 | 6 |
| 4 | 4 | NA |
| 5 | 5 | 10 |
| 6 | 6 | 12 |
| 7 | 7 | 14 |
| 8 | 8 | NA |
| 9 | 9 | 18 |
| 10 | 10 | 20 |
Now let's pretend we don't know the contents of our DataFrame and ask, for example, the following question: does column B contain any NA values?
The typeof approach won't work here, because typeof is applied to the column as a whole rather than to each of its elements:
julia> typeof(df[:,symbol("B")]) == NAtype
false
The isna function is better suited to the task, since it tests every element of the column:
julia> any(isna(df[:,symbol("B")]))
true
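The session above uses the NA and isna names from the DataFrames release it was written against. Later releases of the package, together with Julia 1.0 and newer, represent missing data with the built-in missing value and test for it with ismissing. The following is only a minimal sketch under that assumption: it uses a plain vector B (a hypothetical stand-in for the B column, not the DataFrame from the answer) and functions from Base only.

```julia
# Assumption: Julia 1.0 or later, where `missing` and `ismissing` live in Base.
# `B` is a stand-in for the column, with missing values in rows 1, 4 and 8.
B = [missing, 4, 6, missing, 10, 12, 14, missing, 18, 20]

# Does the column contain any missing values?
has_missing = any(ismissing, B)

# Element-wise test (the vectorized form), one Bool per row.
mask = ismissing.(B)

# Which rows are missing, and how many are there?
missing_rows = findall(ismissing, B)
n_missing = count(ismissing, B)

# The remaining values with the missing ones skipped.
present = collect(skipmissing(B))
```

Passing ismissing as a predicate to any, findall and count avoids building an intermediate Boolean array; the broadcast form ismissing.(B) is there for when the per-row mask itself is needed.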