Pandas Concat increases number of rows

Tags: python, concat, python-3.x, pandas

I'm concatenating two dataframes column-wise, so that the columns of one are placed next to the columns of the other. First, I applied a transformation to the initial dataframe:

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
real_data = pd.DataFrame(scaler.fit_transform(df[real_columns]), columns=real_columns)

Then I concatenated the two:

categorial_data = pd.get_dummies(df[categor_columns], prefix_sep='__')
train = pd.concat([real_data, categorial_data], axis=1, ignore_index=True)

I don't know why, but the number of rows increased:

print(df.shape, real_data.shape, categorial_data.shape, train.shape)
(1700645, 23) (1700645, 16) (1700645, 130) (1703915, 146)

What happened, and how can I fix the problem?

As you can see, the number of columns in train equals the sum of the column counts of real_data and categorial_data.


asked May 16 '18 10:05

Rocketq

1 Answer

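The problem is that the two frames no longer share the same index. scaler.fit_transform returns a plain NumPy array, so real_data gets a fresh index 0..n-1, while categorial_data keeps the original index of df, which presumably has gaps (for example, after rows were filtered out). pd.concat with axis=1 aligns rows by index label and keeps the union of both indexes, so every label that appears in only one frame becomes an extra row filled with NaN. Note that ignore_index=True with axis=1 renumbers the columns, not the rows. Calling df = df.reset_index(drop=True) before the transformations, or passing index=df.index when building real_data, will solve your problem.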
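The index mismatch can be reproduced on a small scale. The following is a minimal sketch, assuming a toy dataframe with a gapped index; the column names `x` and `color`, the variables `scaled`, `real_fixed`, `broken`, `fixed_a` and `fixed_b`, and the manual min-max scaling (a stand-in for MinMaxScaler that also returns a bare NumPy array) are all illustrative and not taken from the question.

```python
import pandas as pd

# Toy frame whose index is NOT 0..n-1, as happens after rows are filtered out.
df = pd.DataFrame(
    {"x": [1.0, 2.0, 3.0], "color": ["red", "blue", "red"]},
    index=[0, 2, 5],
)

# Stand-in for scaler.fit_transform(...): the result is a plain NumPy array,
# so the original index is lost.
values = df[["x"]].to_numpy()
scaled = (values - values.min(axis=0)) / (values.max(axis=0) - values.min(axis=0))

real_data = pd.DataFrame(scaled, columns=["x"])                   # index 0, 1, 2
categorial_data = pd.get_dummies(df[["color"]], prefix_sep="__")  # index 0, 2, 5

# Misaligned indexes: concat keeps the union of labels {0, 1, 2, 5}.
broken = pd.concat([real_data, categorial_data], axis=1)

# Fix 1: give the scaled frame the original index.
real_fixed = pd.DataFrame(scaled, columns=["x"], index=df.index)
fixed_a = pd.concat([real_fixed, categorial_data], axis=1)

# Fix 2: reset the index of the frame that still carries the old one.
fixed_b = pd.concat([real_data, categorial_data.reset_index(drop=True)], axis=1)

print(broken.shape, fixed_a.shape, fixed_b.shape)  # (4, 3) (3, 3) (3, 3)
```

Comparing real_data.index with categorial_data.index (for example with real_data.index.equals(categorial_data.index)) before concatenating is a quick way to check whether this is what happened in your data.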

answered Oct 12 '22 23:10

saket ram

