How to use KBinsDiscretizer to make continuous data into bins in Sklearn?

Tags:

python-3.x

machine-learning

numpy

scikit-learn

sklearn-pandas

I am working on an ML problem in which I want to convert continuous target values into a small number of bins, both to understand the problem better and to make better predictions. The original problem is a regression, but I am turning it into a classification by binning the target and labelling the bins.

I did the following:

from sklearn.preprocessing import KBinsDiscretizer  
est = KBinsDiscretizer(n_bins=3, encode='ordinal', strategy='uniform')
s = est.fit(target) 
Xt = est.transform(s)

It raises the ValueError shown below. I then reshaped my data to 2D, but I still could not get it to work.

ValueError: Expected 2D array, got 1D array instead:

import pandas as pd
from sklearn.preprocessing import KBinsDiscretizer

myData = pd.read_csv("train.csv", delimiter=",")
target = myData.iloc[:,-5]  # this is continuous data that must be
                        # converted into bins and stored in a new column.

xx = target.values.reshape(21263,1)

est = KBinsDiscretizer(n_bins=3, encode='ordinal', strategy='uniform')
s = est.fit(xx) 
Xt = est.transform(s)

As you can see, my target has 21263 rows. I need to divide them into 10 equal bins and write the result into a new column in my dataframe. Thanks for the guidance.

P.S.: Max target value: 185.0
Min target value: 0.00021

asked Dec 28 '18 19:12

Mass17

1 Answer

Okay, I was able to solve it. I am posting the answer in case anyone else needs it in the future. I used pandas.qcut:

target['Temp_class'] = pd.qcut(target['Temeratue'], 10, labels=False)

This has solved my problem.
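
For anyone who wants to stay with KBinsDiscretizer as in the question, here is a minimal sketch. It is not the original poster's code: the DataFrame, the column names target and target_class, and the synthetic data are made up for illustration. Two points seem to matter: the input has to be 2D, and transform (or fit_transform) has to receive the data, not the fitted estimator that fit returns. Also note that pd.qcut builds quantile bins (roughly equal counts per bin), while strategy='uniform' builds equal-width bins, so the two do not generally produce the same labels.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import KBinsDiscretizer

# Synthetic stand-in for the real target column (hypothetical data and names).
rng = np.random.default_rng(0)
df = pd.DataFrame({"target": rng.uniform(0.00021, 185.0, size=1000)})

# KBinsDiscretizer expects a 2D array of shape (n_samples, n_features).
# Selecting with a list of columns keeps the data 2D.
X = df[["target"]].to_numpy()

# strategy="uniform" gives 10 equal-width bins, as in the question;
# strategy="quantile" would give roughly equal-count bins instead.
est = KBinsDiscretizer(n_bins=10, encode="ordinal", strategy="uniform")

# fit_transform takes the data itself. fit() returns the estimator,
# so passing its return value to transform() does not work.
binned = est.fit_transform(X)

# Flatten back to 1D and store the bin index (0..9) as a new column.
df["target_class"] = binned.ravel().astype(int)

# The bin edges that were learned can be inspected like this.
edges = est.bin_edges_[0]
```

With encode='ordinal' each row gets the integer index of its bin, which is the same kind of output that pd.qcut(..., labels=False) returns.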

answered Sep 19 '22 01:09

Mass17
