
Select numerical columns of Julia DataFrame with missing values

I want to select all columns of a DataFrame in which the datatype is a subtype of Number. However, since there are columns with missing values, the numerical column datatypes can be something like Union{Missing, Int64}.

So far, I came up with:

using DataFrames

df = DataFrame([["a", "b"], [1, missing] ,[2, 5]])

df_numerical = df[typeintersect.(colwise(eltype, df), Number) .!= Union{}]

This yields the expected result.
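As a small sketch of why the type intersection is needed here, the snippet below uses only Base Julia and no DataFrame. The names T and S are illustrative stand-ins for the element types of a column with missing values and of a string column.

```julia
# Element type of a numeric column that contains missing values (illustrative).
T = Union{Missing, Int64}
# Element type of a string column (illustrative).
S = String

# A plain subtype check fails because Missing is not a Number.
plain_check = T <: Number

# Intersecting with Number keeps the numeric part of the union.
numeric_part = typeintersect(T, Number)

# A non-numeric type has an empty intersection with Number.
string_part = typeintersect(S, Number)

println(plain_check)
println(numeric_part)
println(string_part)
```

Comparing the intersection against Union{} is what separates the numeric columns from the others in the expression above.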

Question

Is there a simpler, more idiomatic way of doing this? Possibly similar to:

df.select_dtypes(include=[np.number])

in pandas as taken from an answer to this question?

TimD asked Jan 27 '23 11:01


1 Answer

julia> df[(<:).(eltypes(df),Union{Number,Missing})]
2×2 DataFrame
│ Row │ x2      │ x3 │
├─────┼─────────┼────┤
│ 1   │ 1       │ 2  │
│ 2   │ missing │ 5  │

Note that the . is the broadcasting operator, so the <: operator has to be written in its functional form, (<:), in order to be applied to each column type in turn.
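The broadcast subtype check can be tried without a DataFrame. This minimal sketch assumes a hand-written vector, column_types, standing in for what eltypes(df) would return for the three columns in the question.

```julia
# Stand-in for the element types of the three columns (an assumption for illustration).
column_types = [String, Union{Missing, Int64}, Int64]

# (<:) is the subtype operator written as a function; the dot broadcasts it
# over column_types while the Union type on the right is treated as a scalar.
is_numeric = (<:).(column_types, Union{Number, Missing})

# The same check spelled with map and an anonymous function.
is_numeric_map = map(T -> T <: Union{Number, Missing}, column_types)

println(is_numeric)
println(is_numeric_map)
```

The resulting Boolean mask is what gets passed as the column index. In more recent DataFrames.jl releases eltypes is deprecated, and eltype.(eachcol(df)) is the likely replacement for obtaining the column types. Note that a column containing only missing values would also pass this check, since Missing is itself a subtype of Union{Number, Missing}.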

Przemyslaw Szufel answered Jan 31 '23 09:01