
Julia DataFrames Unique Rows

In a DataFrame, I have two columns (let's call them A and B), both categorical, and A contains repeated values. I want to show only the unique values of A together with their corresponding B values. How can I do that?

I was able to do it when B was a continuous variable by taking the mean per group:

by(ptable, [:A], df -> mean(df[:B]))
Kevin asked Feb 28 '26 04:02



2 Answers

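This worked for me. It keeps the first row for each value of A (on Julia 1.0 and later the negation has to be broadcast as .!):

df[.!nonunique(df[:, [:A]]), [:A, :B]]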
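
As a minimal sketch on made-up data (the column values below are assumptions for illustration only), this selection can be written on current DataFrames.jl with a broadcast negation (.!). As far as I can tell, unique(df, :A) returns the same rows:

```julia
using DataFrames

# Hypothetical toy data: A has repeats, both columns hold labels.
df = DataFrame(A = ["x", "x", "y", "z", "z"],
               B = ["a", "b", "c", "d", "e"])

# nonunique flags every row whose A value has already been seen;
# .! inverts that Boolean mask element by element.
first_rows = df[.!nonunique(df[:, [:A]]), [:A, :B]]

# unique with a column selector keeps the first row for each distinct A.
same_rows = unique(df, :A)
```

Note that both forms keep only the first B seen for each A and drop the other B values.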
Kevin answered Mar 02 '26 14:03



You can get the desired result like this:

by(df, :A, x -> [x.B])

The resulting DataFrame has two columns, :A and :x1. For each unique value of :A, column :x1 holds all the corresponding values of :B, so :x1 is a vector of vectors.

EDIT: as of DataFrames.jl 0.22 use the following syntax:

combine(groupby(df, :A), :B => Ref => :B)
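
A small sketch of the combine form on made-up data (the values are assumptions for illustration). Ref wraps each group's :B values so that combine stores them as a single cell rather than expanding them into separate rows:

```julia
using DataFrames

# Hypothetical toy data: A has repeats.
df = DataFrame(A = ["x", "x", "y", "z", "z"],
               B = ["a", "b", "c", "d", "e"])

# One row per distinct A; column B holds a vector of that group's B values.
grouped = combine(groupby(df, :A), :B => Ref => :B)

# Look up the collected B values for one key without relying on group order.
b_for_x = only(grouped[grouped.A .== "x", :B])
```

Here b_for_x should compare equal to ["a", "b"], and grouped should have one row per distinct value of :A.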
Bogumił Kamiński answered Mar 02 '26 14:03



