
Plot weekly data from DataFrame of daily data

I have a Julia DataFrame like so:

│ Row  │ date               │ users │ posts │ topics │ likes │ pageviews │
│      │ Date               │ Int64 │ Int64 │ Int64  │ Int64 │ Int64     │
├──────┼────────────────────┼───────┼───────┼────────┼───────┼───────────┤
│ 1    │ Date("2020-06-16") │  1    │  3    │ 4      │ 7     │ 10000     │
│ 2    │ Date("2020-06-15") │  2    │  2    │ 5      │ 8     │ 20000     │
│ 3    │ Date("2020-06-14") │  3    │  3    │ 6      │ 9     │ 30000     │

I would like a plot of posts vs. date, but the daily data is too noisy, so I'd like to sum the posts for every week and plot that instead. What's the easiest way to achieve that?

Keno Fischer asked Jun 16 '20 22:06


1 Answer

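The TimeSeries package provides various utilities for working with time series data. In this case, you can use its collapse function to convert daily data to weekly data. In the call below, week is the period that groups the timestamps, last picks the timestamp that labels each group, and sum aggregates the values within each group: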

julia> using TimeSeries, DataFrames

julia> ta = TimeArray(df.date, df.posts)
1311×1 TimeArray{Int64,1,Date,Array{Int64,1}} 2016-10-19 to 2020-06-16
│            │ A     │
├────────────┼───────┤
│ 2016-10-19 │ 1     │
│ 2016-10-20 │ 2     │
│ 2016-10-21 │ 3     │
│ 2016-10-23 │ 4     │
...

julia> weekly = collapse(ta, week, last, sum)
192×1 TimeArray{Int64,1,Date,Array{Int64,1}} 2016-10-23 to 2020-06-16
│            │ A     │
├────────────┼───────┤
│ 2016-10-23 │ 10    │
│ 2016-10-28 │ 22    │
│ 2016-11-06 │ 34    │
...

julia> using Gadfly
julia> plot(DataFrame(weekly)[1:end-1,:], x=:timestamp, y=:A, Geom.line(), Guide.ylabel("Weekly sum of Posts"), Guide.xlabel("Week"))
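
As an alternative that avoids the TimeArray conversion, the same weekly sum can be sketched with DataFrames and the Dates standard library alone. This is a minimal sketch, not the method used above: the small df built here is hypothetical stand-in data, and the week column name is an assumption made for illustration. Weeks are labelled by their Monday via firstdayofweek, so the labels differ from the last-day-of-week timestamps that collapse produces.

```julia
using DataFrames, Dates

# Hypothetical daily data standing in for the question's `df`:
# three full weeks, Monday 2020-06-01 through Sunday 2020-06-21.
df = DataFrame(date = collect(Date(2020, 6, 1):Day(1):Date(2020, 6, 21)),
               posts = collect(1:21))

# Label every row with the Monday that starts its week
# (`week` is an illustrative column name, not part of the original data).
df.week = firstdayofweek.(df.date)

# Sum the posts within each week; sort=true keeps the weeks in date order.
weekly = combine(groupby(df, :week; sort = true), :posts => sum => :posts)
```

The resulting weekly DataFrame can be passed to plot in the same way as above, with x=:week and y=:posts.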
Keno Fischer answered Oct 29 '22 19:10
