Multiple column selection on a Julia DataFrame

Question

Imagine I have the following DataFrame:

10 rows x 26 columns named A to Z

I would like to select several groups of columns by name (not by index). For instance, suppose I want columns A to D and P to Z in a new DataFrame named df2.

I tried something like this, but it doesn't work:

df2=df[:,[:A,:D ; :P,:Z]]

syntax: unexpected semicolon in array expression
top-level scope at Slicing.jl:1

Any idea how to do this? Thanks for any help.

Bogumił Kamiński · Accepted Answer

df2 = select(df, Between(:A,:D), Between(:P,:Z))

or

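df2 = df[:, Cols(Between(:A,:D), Between(:P,:Z))]

(Older DataFrames.jl releases wrote this as All(...) with arguments; in current releases Cols is the selector that combines several column selections.)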

If you are sure your columns run only from :A to :Z, you can also write:

df2 = select(df, Not(Between(:E, :O)))

or

df2 = df[:, Not(Between(:E, :O))]
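To try these selectors end to end, here is a minimal sketch. The test DataFrame is an assumption built to match the question (10 rows of random numbers, 26 columns named A to Z), and df3 is a hypothetical name used only for the comparison.

```julia
using DataFrames

# Assumed test data matching the question: 10 rows x 26 columns named A to Z.
df = DataFrame(rand(10, 26), Symbol.('A':'Z'))

# Keep columns A to D and P to Z, selected by name.
df2 = select(df, Between(:A, :D), Between(:P, :Z))

# Same result by dropping E to O instead (valid only because
# the columns are exactly A to Z, in that order).
df3 = select(df, Not(Between(:E, :O)))

# Both approaches keep the same 15 columns.
println(names(df2) == names(df3))
```

Note that Between depends on the order of the columns in the DataFrame, not on alphabetical order of their names.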

Finally, you can look up the index of a column with the columnindex function, e.g.:

columnindex(df, :A)

and then select by column number, if you prefer that.
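A minimal sketch of the column-number route, assuming the same 10 x 26 test DataFrame as in the question. The variable names a, d, p and z are hypothetical and only hold the looked-up positions.

```julia
using DataFrames

# Assumed test data matching the question: 10 rows x 26 columns named A to Z.
df = DataFrame(rand(10, 26), Symbol.('A':'Z'))

# Look up the positions of the boundary columns by name.
a, d = columnindex(df, :A), columnindex(df, :D)
p, z = columnindex(df, :P), columnindex(df, :Z)

# Concatenate the two integer ranges and index with the result.
df2 = df[:, vcat(a:d, p:z)]
```

This keeps the lookup by name while doing the actual slicing with plain integers.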

Przemyslaw Szufel · Answer

In Julia you can also build ranges of Chars, so when your columns are named with single letters, yet another option is:

df[:, Symbol.(vcat('A':'D', 'P':'Z'))]
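Spelled out as a runnable sketch, again assuming a test DataFrame with 10 rows and columns A to Z (the name cols is hypothetical):

```julia
using DataFrames

# Assumed test data matching the question: 10 rows x 26 columns named A to Z.
df = DataFrame(rand(10, 26), Symbol.('A':'Z'))

# 'A':'D' and 'P':'Z' are ranges of Char; vcat joins them into one vector,
# and Symbol.(...) converts each character into a column name.
cols = Symbol.(vcat('A':'D', 'P':'Z'))

df2 = df[:, cols]
```

Unlike Between, this builds the list of names explicitly, so it does not depend on the column order in df.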

Tags: select, dataframe, julia

Bebio
