How to randomize (or permute) a dataframe rowwise and columnwise?

Tags: random, r, permutation

I have a dataframe (df1) like this.

     f1   f2   f3   f4   f5
d1   1    0    1    1    1
d2   1    0    0    1    0
d3   0    0    0    1    1
d4   0    1    0    0    1

Here d1...d4 are the row names and f1...f5 are the column names.

When I run sample(df1), I get a new dataframe with the same count of 1s as df1. So the count of 1s is conserved for the whole dataframe, but not for each row or each column.

Is it possible to do the randomization row-wise or column-wise?

I want to randomize df1 column-wise, i.e. the number of 1s in each column remains the same, and each column needs to be changed at least once. For example, I may have a randomized df2 like this (note that the count of 1s in each column remains the same, but the count of 1s in each row is different):

     f1   f2   f3   f4   f5
d1   1    0    0    0    1
d2   0    1    0    1    1
d3   1    0    0    1    1
d4   0    0    1    1    0

Likewise, I also want to randomize df1 row-wise, i.e. the number of 1s in each row remains the same, and each row needs to be changed (but the number of changed entries could differ between rows). For example, a randomized df3 could be something like this:

     f1   f2   f3   f4   f5
d1   0    1    1    1    1  <- two entries are different
d2   0    0    1    0    1  <- four entries are different
d3   1    0    0    0    1  <- two entries are different
d4   0    0    1    0    1  <- two entries are different

PS. Many thanks for the help from Gavin Simpson, Joris Meys and Chase for the previous answers to my previous question on randomizing two columns.

697

asked Jun 21 '11 08:06

a83

2 Answers

Given the R data.frame:

> df1
  a b c
1 1 1 0
2 1 0 0
3 0 1 0
4 0 0 0

Shuffle row-wise:

> df2 <- df1[sample(nrow(df1)),]
> df2
  a b c
3 0 1 0
4 0 0 0
2 1 0 0
1 1 1 0

By default, sample() returns a random permutation of its first argument. Given a single positive integer n, as in sample(nrow(df1)), it permutes 1:n, so the default size is the number of rows. Because replace=FALSE is the default, sampling is done without replacement: every row index appears exactly once, which accomplishes a row-wise shuffle.
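As a rough check of that explanation, the sketch below rebuilds the example df1 from this answer and confirms that the index vector is a permutation and that per-column counts of 1s survive the shuffle. The seed value and the name idx are illustrative assumptions, not part of the original answer.

```r
# Rebuild the answer's example data frame (values copied from the answer)
df1 <- data.frame(a = c(1, 1, 0, 0), b = c(1, 0, 1, 0), c = c(0, 0, 0, 0))

set.seed(123)  # arbitrary seed, used only so the run is repeatable

# A single positive integer n makes sample() permute 1:n
idx <- sample(nrow(df1))

# Index rows by the permutation: rows move as whole units
df2 <- df1[idx, ]

# Every row index appears exactly once, so nothing is duplicated or lost
stopifnot(identical(sort(idx), seq_len(nrow(df1))))

# Moving whole rows leaves the count of 1s in each column unchanged
stopifnot(all(colSums(df2) == colSums(df1)))
```

Since the rows move as whole units, the contents of each individual row are unchanged; only their order differs.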

Shuffle column-wise:

> df3 <- df1[,sample(ncol(df1))]
> df3
  c a b
1 0 1 1
2 0 1 0
3 0 0 1
4 0 0 0
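The two shuffles above move whole rows or whole columns. If instead each column (or each row) should be permuted independently, as the question describes, one possible base R sketch is shown here. It uses the question's df1; the seed is an arbitrary assumption, and this is only one way to do it, not necessarily what the asker ended up using.

```r
# The question's df1 (values copied from the question)
df1 <- data.frame(
  f1 = c(1, 1, 0, 0),
  f2 = c(0, 0, 0, 1),
  f3 = c(1, 0, 0, 0),
  f4 = c(1, 1, 1, 0),
  f5 = c(1, 0, 1, 1),
  row.names = c("d1", "d2", "d3", "d4")
)

set.seed(1)  # arbitrary seed, used only so the run is repeatable

# Column-wise: permute the entries inside each column independently.
# df2[] <- ... keeps the row and column names of df1.
df2 <- df1
df2[] <- lapply(df1, sample)

# Row-wise: permute the entries inside each row independently.
# apply() returns one permuted row per column of its result, hence t().
df3 <- df1
df3[] <- t(apply(df1, 1, sample))

# Column counts of 1s are preserved in df2, row counts in df3
stopifnot(all(colSums(df2) == colSums(df1)))
stopifnot(all(rowSums(df3) == rowSums(df1)))
```

Note that an independent permutation can, by chance, leave a given column or row identical to the original, so this sketch does not by itself enforce the "each column/row must change" requirement; re-drawing until a change occurs would be one way to add that.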

114

answered Sep 27 '22 19:09

pms

This is another way to shuffle the data.frame using package dplyr:

row-wise:

library(dplyr)

df2 <- slice(df1, sample(1:n()))

or

df2 <- sample_frac(df1, 1L)

column-wise:

df2 <- select(df1, one_of(sample(names(df1))))
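In recent releases, sample_frac() has been superseded by slice_sample() in dplyr, and one_of() by all_of()/any_of() in tidyselect. A sketch of the same two shuffles with the newer verbs follows; it assumes dplyr 1.0.0 or later is installed and reuses the small a/b/c example data frame from the other answer.

```r
library(dplyr)

# Example data frame (values copied from the other answer on this page)
df1 <- data.frame(a = c(1, 1, 0, 0), b = c(1, 0, 1, 0), c = c(0, 0, 0, 0))

set.seed(2022)  # arbitrary seed, used only so the run is repeatable

# Row-wise: draw all rows in random order, without replacement
df2 <- slice_sample(df1, prop = 1)

# Column-wise: select the columns in a random order
df3 <- select(df1, all_of(sample(names(df1))))
```

As with the base R version, rows and columns move as whole units here, so column sums are unchanged in df2 and row sums are unchanged in df3.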

answered Sep 27 '22 20:09

Enrique Pérez Herrero
