How can I reshape a data.table when the order of the rows determines the category?

Let's say I have the following data table:

library(data.table)

dt <- data.table(type = c('big', 'medium', 'small', 'small',
                          'medium', 'small', 'small',
                          'big', 'medium', 'small', 'small'),
                 category = letters[1:11])

      type category
 1:    big        a
 2: medium        b
 3:  small        c
 4:  small        d
 5: medium        e
 6:  small        f
 7:  small        g
 8:    big        h
 9: medium        i
10:  small        j
11:  small        k

Here the types form a hierarchy: a 'big' category applies to every following row until the next 'big' row is seen, and the same rule holds for every type.

The reshape I want should give me the following:

   big medium small
1:   a      b     c
2:   a      b     d
3:   a      e     f
4:   a      e     g
5:   h      i     j
6:   h      i     k

As you can see, the value for each type only changes when another row of the same type is found, so the row order determines the categories.

Is there a way to do this without using a for loop?

Here's an approach that you can use. You'll need na.locf from "zoo":

library(data.table)
library(zoo)

First, we need to work out which row of the result each input row belongs to. To do this, we have to state the order of the types explicitly (that's what the match part does), because the same dt gives a different result if the order is changed. Once each type has a numeric rank, a diff that is less than or equal to zero means the hierarchy did not go one level deeper, so that record starts a new row in the new table:

dt[, rid := match(type, c('big', 'medium', 'small'))][, row := cumsum(diff(c(0, rid)) <= 0)]

This is what the data looks like now:

dt
#      type category rid row
# 1:    big        a   1   0
# 2: medium        b   2   0
# 3:  small        c   3   0
# 4:  small        d   3   1
# 5: medium        e   2   2
# 6:  small        f   3   2
# 7:  small        g   3   3
# 8:    big        h   1   4
# 9: medium        i   2   4
#10:  small        j   3   4
#11:  small        k   3   5
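
As a sketch of how the one-liner arrives at those two columns, the same steps can be run on a plain vector. The intermediate names step and new_row are introduced here for illustration only and are not part of the code above.

```r
type <- c('big', 'medium', 'small', 'small',
          'medium', 'small', 'small',
          'big', 'medium', 'small', 'small')

# Rank of each type in the hierarchy: big = 1, medium = 2, small = 3
rid <- match(type, c('big', 'medium', 'small'))

# Change in rank relative to the previous record (0 is prepended for the first one)
step <- diff(c(0, rid))
# 1 1 1 0 -1 1 0 -2 1 1 0

# A same-or-shallower rank means the record opens a new output row
new_row <- step <= 0

# Running count of new rows gives the output row id
row <- cumsum(new_row)
# 0 0 0 1 2 2 3 4 4 4 5
```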

Here it is in the form you've requested:

na.locf(dcast(dt, row ~ type, value.var = "category"))
#    row big medium small
# 1:   0   a      b     c
# 2:   1   a      b     d
# 3:   2   a      e     f
# 4:   3   a      e     g
# 5:   4   h      i     j
# 6:   5   h      i     k
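
If you'd rather not depend on zoo, a minimal sketch of the same fill in base R follows. The locf helper is hypothetical (written for this example, not a zoo or data.table function), and it assumes the empty cells are NA, as dcast leaves them. It also drops the helper row column to match the requested layout.

```r
library(data.table)

# Hypothetical helper: carry the last non-NA value forward
locf <- function(x) {
  idx <- cummax(seq_along(x) * !is.na(x))  # position of the latest non-NA value so far
  idx[idx == 0] <- NA                      # leading NAs have nothing to carry, so they stay NA
  x[idx]
}

dt <- data.table(type = c('big', 'medium', 'small', 'small',
                          'medium', 'small', 'small',
                          'big', 'medium', 'small', 'small'),
                 category = letters[1:11])

# Same row assignment as above
dt[, rid := match(type, c('big', 'medium', 'small'))][, row := cumsum(diff(c(0, rid)) <= 0)]

wide <- dcast(dt, row ~ type, value.var = "category")

# Fill each type column downwards, then drop the helper column
cols <- c('big', 'medium', 'small')
wide[, (cols) := lapply(.SD, locf), .SDcols = cols]
wide[, row := NULL]
```

Keeping leading NAs as NA is a deliberate choice here: if the data did not start with a 'big' record, there would be no earlier value to carry forward.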

Tags: r, data.table, reshape, categories

Aldo Pareja
