
Is there a diff() function in Julia DataFrames like pandas?

I have a DataFrame in Julia and I want to create a new column that holds the difference between consecutive rows of a specific column. In Python pandas, I would simply use df.series.diff(). Is there a Julia equivalent?

For example:

data
1
2
4
6
7

# in pandas

df['diff_data'] = df.data.diff()

data   diff_data
1        NaN 
2          1
4          2
6          2
7          1
connor449 asked Jun 09 '21 12:06


1 Answer

You can use the lag function from ShiftedArrays.jl like this.

Declarative style:

julia> using DataFrames, ShiftedArrays

julia> df = DataFrame(data=[1, 2, 4, 6, 7])
5×1 DataFrame
 Row │ data
     │ Int64
─────┼───────
   1 │     1
   2 │     2
   3 │     4
   4 │     6
   5 │     7

julia> transform(df, :data => (x -> x - lag(x)) => :data_diff)
5×2 DataFrame
 Row │ data   data_diff
     │ Int64  Int64?
─────┼──────────────────
   1 │     1    missing
   2 │     2          1
   3 │     4          2
   4 │     6          2
   5 │     7          1

Imperative style (in place):

julia> df = DataFrame(data=[1, 2, 4, 6, 7])
5×1 DataFrame
 Row │ data
     │ Int64
─────┼───────
   1 │     1
   2 │     2
   3 │     4
   4 │     6
   5 │     7

julia> df.data_diff = df.data - lag(df.data)
5-element Vector{Union{Missing, Int64}}:
  missing
 1
 2
 2
 1

julia> df
5×2 DataFrame
 Row │ data   data_diff
     │ Int64  Int64?
─────┼──────────────────
   1 │     1    missing
   2 │     2          1
   3 │     4          2
   4 │     6          2
   5 │     7          1

With diff from Base you do not need any extra packages; you can do the following instead:

julia> df.data_diff = [missing; diff(df.data)]
5-element Vector{Union{Missing, Int64}}:
  missing
 1
 2
 2
 1

(The catch is that diff is a general-purpose function that shortens a vector from n to n-1 elements, so you have to prepend missing manually.)
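To make the length mismatch concrete, here is a minimal sketch using only Base. The name padded_diff is a hypothetical helper introduced for illustration; it is not part of Base or DataFrames.jl, and it assumes a non-empty input vector.

```julia
# Hypothetical helper (not in Base or DataFrames.jl): prepend `missing`
# to the output of Base.diff so the result is as long as the input.
# Assumes `v` is non-empty.
padded_diff(v::AbstractVector) = [missing; diff(v)]

data = [1, 2, 4, 6, 7]

raw = diff(data)            # consecutive differences, one element shorter
result = padded_diff(data)  # same length as `data`, first entry is missing

println(length(data), " elements in, ", length(raw), " from diff, ",
        length(result), " after padding")
```

The same expression should also work in the declarative style, for example transform(df, :data => (x -> [missing; diff(x)]) => :data_diff), since the function returns a vector with one entry per row.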

Bogumił Kamiński answered Sep 21 '22 14:09