How to convert the result of xtabs() into dataframe in R? [duplicate]

Tags: dataframe, r

I have data like the data frame df_a and want to convert it to the format of the data frame df_b.

xtabs() gives a similar result, but I could not find a way to access its elements as in the example code below. Accessing them by position, e.g. xa[1,1], does not help because numeric indices ("1") do not reliably correspond to names ("A"). As you can see, xtabs() sorts the rows differently, so xa[2,2] is 2 and not 0 as in the df_b listing.

    > df_a
      ItemName Feature Amount
    1    First       A      2
    2    First       B      3
    3    First       A      4
    4   Second       C      3
    5   Second       C      2
    6    Third       D      1
    7   Fourth       B      2
    8   Fourth       D      3
    9   Fourth       D      2
    > df_b
      ItemName A B C D
    1    First 6 3 0 0
    2   Second 0 0 5 0
    3    Third 0 0 0 1
    4   Fourth 0 2 0 5
    > df_b$A
    [1] 6 0 0 0

    > xa<-xtabs(df_a$Amount~df_a$ItemName+df_a$Feature)
    > xa
                 df_a$Feature
    df_a$ItemName A B C D
           First  6 3 0 0
           Fourth 0 2 0 5
           Second 0 0 5 0
           Third  0 0 0 1
    > xa$A
    Error in xa$A : $ operator is invalid for atomic vectors

This could be done iteratively with for() loops, but that is far too inefficient in my case because my data has millions of records.

For further processing I need the output as a data frame. If anyone has solved a similar problem, please share.

asked Jan 23 '19 09:01

cineS.

2 Answers

You can just use as.data.frame.matrix(xa):

    # output
           A B C D
    First  6 3 0 0
    Fourth 0 2 0 5
    Second 0 0 5 0
    Third  0 0 0 1

    ## or, to restore the original row order and get ItemName as a column
    df_b <- as.data.frame.matrix(xa)[unique(df_a$ItemName), ]
    data.frame(ItemName = row.names(df_b), df_b, row.names = NULL)
    # output
      ItemName A B C D
    1    First 6 3 0 0
    2   Second 0 0 5 0
    3    Third 0 0 0 1
    4   Fourth 0 2 0 5
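
For anyone who wants to reproduce this end to end, here is a minimal base R sketch. It rebuilds df_a from the listing in the question and, as an assumption of mine rather than part of the answer above, fixes the ItemName factor levels to their order of first appearance so that xtabs() keeps that order and no reordering step is needed. It also shows that the table itself can be indexed by name. The intermediate name wide is only illustrative.

```r
# Rebuild the question's df_a (values copied from the listing in the question).
df_a <- data.frame(
  ItemName = c("First", "First", "First", "Second", "Second",
               "Third", "Fourth", "Fourth", "Fourth"),
  Feature  = c("A", "B", "A", "C", "C", "D", "B", "D", "D"),
  Amount   = c(2, 3, 4, 3, 2, 1, 2, 3, 2),
  stringsAsFactors = FALSE
)

# Assumption: we want rows in order of first appearance, so set the factor
# levels explicitly; otherwise xtabs() sorts the names alphabetically.
df_a$ItemName <- factor(df_a$ItemName, levels = unique(df_a$ItemName))

# Passing data = gives plain dimnames ("ItemName", "Feature").
xa <- xtabs(Amount ~ ItemName + Feature, data = df_a)

# A table can be indexed by name, independent of the row order.
xa["Fourth", "B"]   # 2

# Convert the 2-D table to a wide data frame, then move the row names
# into an ordinary ItemName column.
wide <- as.data.frame.matrix(xa)
df_b <- data.frame(ItemName = row.names(wide), wide, row.names = NULL)

df_b$A              # 6 0 0 0
```

The aggregation happens inside xtabs(), so no explicit loop over the records is involved.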

answered Nov 15 '22 00:11

nghauran

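Without using xtabs you can do something like this:

    library(dplyr)  # attaches the %>% pipe

    df_a %>%
      dplyr::group_by(ItemName, Feature) %>%
      dplyr::summarise(Sum = sum(Amount, na.rm = TRUE)) %>%
      tidyr::spread(Feature, Sum, fill = 0) %>%
      as.data.frame()

This produces the wide layout you asked for, and the result is a data.frame.

Or you can call as.data.frame(your_xtabs_result). Note that this returns the table in long format (one row per ItemName/Feature combination, with the sums in a Freq column), so it still needs reshaping to match df_b.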
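
To make the difference between the two conversions concrete, here is a small base R sketch using the question's data. The names long and wide are only illustrative, and the data = form of the xtabs() call is my choice for cleaner column names, not something taken from the question.

```r
# Rebuild the question's df_a (values copied from the listing in the question).
df_a <- data.frame(
  ItemName = c("First", "First", "First", "Second", "Second",
               "Third", "Fourth", "Fourth", "Fourth"),
  Feature  = c("A", "B", "A", "C", "C", "D", "B", "D", "D"),
  Amount   = c(2, 3, 4, 3, 2, 1, 2, 3, 2)
)

xa <- xtabs(Amount ~ ItemName + Feature, data = df_a)

# as.data.frame() on a table dispatches to as.data.frame.table(), which
# returns one row per ItemName/Feature cell with the sum in a Freq column.
long <- as.data.frame(xa)
names(long)   # "ItemName" "Feature" "Freq"
nrow(long)    # 16 (4 items x 4 features)

# as.data.frame.matrix() keeps the wide layout: one row per ItemName and
# one column per Feature.
wide <- as.data.frame.matrix(xa)
dim(wide)     # 4 4
```

If you stay with the tidyverse pipeline instead, current tidyr marks spread() as superseded; pivot_wider(names_from = Feature, values_from = Sum, values_fill = 0) should be the equivalent call.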

answered Nov 14 '22 23:11

morgan121
