How to limit each group of duplicates to 5 items in a pandas DataFrame?

Tags:

python

pandas

import pandas as pd

col1 = ['A', 'B', 'A', 'C', 'A', 'B', 'A', 'C', 'A', 'C', 'A', 'A', 'A']
col2 = [1, 1, 4, 2, 4, 5, 6, 3, 1, 5, 2, 1, 1]

df = pd.DataFrame({'col1': col1, 'col2': col2})

For A we have [1,4,4,6,1,2,1,1], which is 8 items, but I want to limit each list to at most 5 items when converting the DataFrame to a dict/list.

Expected output:

Dict = {'A':[1,4,4,6,1],'B':[1,5],'C':[2,3,5]}


asked Aug 20 '19 06:08

Sunil

1 Answer

Use pandas.DataFrame.groupby with apply, keeping only the first five values of each group with head(5):

df.groupby('col1')['col2'].apply(lambda x: list(x.head(5))).to_dict()

Output:

{'A': [1, 4, 4, 6, 1], 'B': [1, 5], 'C': [2, 3, 5]}
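
As a possible alternative, here is a minimal sketch that avoids the lambda by truncating each group first with GroupBy.head and then collecting the values with agg(list). It is a swapped-in variant, not the method from the answer above; the names limited and result are only illustrative.

```python
import pandas as pd

col1 = ['A', 'B', 'A', 'C', 'A', 'B', 'A', 'C', 'A', 'C', 'A', 'A', 'A']
col2 = [1, 1, 4, 2, 4, 5, 6, 3, 1, 5, 2, 1, 1]
df = pd.DataFrame({'col1': col1, 'col2': col2})

# Keep at most the first 5 rows of each col1 group, in original row order.
limited = df.groupby('col1').head(5)

# Collect the remaining col2 values of each group into a list, then build a dict.
result = limited.groupby('col1')['col2'].agg(list).to_dict()
print(result)
```

For the sample data, result should compare equal to the dictionary shown in the output above. Groups with fewer than five rows (B and C here) are left unchanged.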


answered Oct 20 '22 00:10

Chris
