Random Sampling of Pandas data frame (both rows and columns)

Tags: python, pandas, numpy

I know how to randomly sample a few rows from a pandas DataFrame. Say I have a DataFrame df; to get a fraction of its rows, I can do:

df_sample = df.sample(frac=0.007)

However, what I need is random rows as above AND also random columns from the same DataFrame.

df is currently 56K x 8.5K. If I want, say, a 500 x 1000 sample, where both the 500 rows and the 1000 columns are chosen at random, how do I do this?

I think one approach would be to use df.columns to get the list of column names, then randomly sample indices into that list and use those random indices to select the columns to keep (dropping the rest)?
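For illustration, the index-based idea described above could be sketched roughly as follows. This is only a minimal sketch on a small made-up frame (200 x 50 instead of 56K x 8.5K); the names rng, row_pos and col_pos are hypothetical and not part of the original question.

```python
import numpy as np
import pandas as pd

# Small stand-in for the real frame (hypothetical data, 200 rows x 50 columns).
df = pd.DataFrame(
    np.arange(200 * 50).reshape(200, 50),
    columns=[f"c{i}" for i in range(50)],
)

# Seeded generator so the draw is reproducible (the seed is an arbitrary choice).
rng = np.random.default_rng(0)

# Draw distinct row and column positions (replace=False avoids duplicates).
row_pos = rng.choice(df.shape[0], size=20, replace=False)
col_pos = rng.choice(df.shape[1], size=10, replace=False)

# .iloc with two position arrays keeps every chosen row crossed with every chosen column.
df_sample = df.iloc[row_pos, col_pos]

print(df_sample.shape)  # (20, 10)
```

For the sizes in the question, size=20 and size=10 would become 500 and 1000.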

asked Jun 28 '16 22:06

Baktaawar

1 Answer

Just call sample twice, with corresponding axis parameters:

df.sample(n=500).sample(n=1000, axis=1)

The first call uses the default axis=0, so it samples rows; the second call, with axis=1, samples columns from the result.
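As a quick check of the chained call, here is a minimal sketch on a small made-up frame (200 x 50 rather than 56K x 8.5K). The data and the random_state values are assumptions added for reproducibility; they are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Small stand-in for the real frame (hypothetical data, 200 rows x 50 columns).
df = pd.DataFrame(
    np.arange(200 * 50).reshape(200, 50),
    columns=[f"c{i}" for i in range(50)],
)

# First call: 20 random rows (axis=0 is the default).
# Second call: 10 random columns from those rows (axis=1).
# random_state is optional and only makes the draw repeatable.
df_sample = df.sample(n=20, random_state=0).sample(n=10, axis=1, random_state=1)

print(df_sample.shape)  # (20, 10)
```

Because sample draws without replacement by default, the selected rows and columns are all distinct. With n=500 and n=1000 the same pattern should give the 500 x 1000 sample asked for.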

answered Oct 08 '22 10:10

ayhan
