I've just come back to R from a long hiatus writing and I'm having some real problems remembering how to reshape data. I know that what I want to do is easy, but for some reason I'm being dumb tonight and have confused myself with melt and reshape. If anyone could quickly point me in the right direction it would be hugely appreciated.
I have a dataframe as such:
person week year
personA 6 1
personA 22 1
personA 41 1
personA 42 1
personA 1 2
personA 23 2
personB 8 2
personB 9 2
....
personN x y
I want to end up with a count of events by year and by person, so that I can plot a quick line graph for each person over the years:
e.g.
person year1 year2
personA 4 2
personB 0 2
Many thanks for reading.
I would probably use the reshape2 package and its dcast function, since it handles both the reshaping and the aggregation in one step:
> library(reshape2)
> dcast(person ~ year, value.var = "year", data = dat)
Aggregation function missing: defaulting to length
person 1 2
1 personA 4 2
2 personB 0 2
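If installing reshape2 is not an option, the same counts can probably be had from base R alone. The following is only a sketch, not the method used above: it assumes the data frame is called dat (as in the dcast call) and contains just the rows shown in the question, with the elided rows left out. The names counts and wide are made up for illustration.

```r
# Rebuild the rows shown in the question (the elided rows are omitted).
dat <- data.frame(
  person = c(rep("personA", 6), rep("personB", 2)),
  week   = c(6, 22, 41, 42, 1, 23, 8, 9),
  year   = c(1, 1, 1, 1, 2, 2, 2, 2)
)

# Cross-tabulate person against year; combinations with no rows count as 0.
counts <- table(dat$person, dat$year)

# Turn the table into a plain data frame with year1, year2, ... columns.
wide <- as.data.frame.matrix(counts)
names(wide) <- paste0("year", names(wide))
wide <- cbind(person = rownames(wide), wide)
rownames(wide) <- NULL

print(wide)
```

On the sample rows this should give 4 and 2 for personA and 0 and 2 for personB, matching the dcast output. The table() route only counts rows, so if a different aggregation is needed later, dcast with an explicit fun.aggregate is likely the more flexible choice.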