
How to split training and test sets?

Tags:

python

split

Where should we use

X_train, X_test, y_train, y_test = train_test_split(data, test_size=0.3, random_state=42)

and where should we use

train, test = train_test_split(data, test_size=0.3, random_state=0)

The former raises this error:

ValueError: not enough values to unpack (expected 4, got 2)


asked May 30 '18 09:05

MSG

2 Answers

Use the first form when you want to split both the features (X) and the labels (y). Use the second form when you have only a single array to split, for example the features (X) alone.

X_train, X_test, y_train, y_test = train_test_split(data, y, test_size=0.3, random_state=42)

It didn't work for you because you didn't provide the label data to train_test_split(). With one input array the function returns two values, so unpacking into four names fails. The call above should work. Just replace y with your label/target data.
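A minimal sketch of both forms side by side, assuming scikit-learn and NumPy are installed. The toy arrays X and y are made up for illustration and stand in for your own data and labels.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 samples with 2 features each, one label per sample.
# Row i of X is [2*i, 2*i + 1] and its label is i, so alignment is easy to check.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# One input array -> two outputs (train part, test part).
train, test = train_test_split(X, test_size=0.3, random_state=0)

# Two input arrays -> four outputs, split with the same shuffled indices.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

print(train.shape, test.shape)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)

# Each feature row still sits next to its own label after the split.
print(np.array_equal(X_train[:, 0] // 2, y_train))
```

Passing X and y in a single call, rather than splitting them separately, is what keeps every row paired with its label.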

answered Oct 20 '22 19:10

MrLeeh

If you pass one data list, it is split into two:

                             |---data_train
data ----train_test_split()--|
                             |---data_test

If you pass two data lists, EACH of them is split into two, which gives four in total:

                                       |---data_train_x
                                       |---data_test_x
data_x, data_y ----train_test_split()--|
                                       |---data_train_y
                                       |---data_test_y

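The same holds for n data lists: you get 2n outputs, returned as a train/test pair for each input, in the order the inputs were passed.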
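As a rough illustration of the n-input case, here is a sketch with three lists, assuming scikit-learn is installed. The names features, labels, and ids are hypothetical and chosen only for this example.

```python
from sklearn.model_selection import train_test_split

# Three hypothetical parallel lists of equal length.
features = [[i, i * i] for i in range(10)]
labels = [i % 2 for i in range(10)]
ids = [f"row-{i}" for i in range(10)]

# Three inputs -> six outputs: a (train, test) pair per input, in input order.
splits = train_test_split(features, labels, ids, test_size=0.3, random_state=0)
print(len(splits))

f_train, f_test, l_train, l_test, id_train, id_test = splits

# The same rows land in the training part of every list.
for feat, label, row_id in zip(f_train, l_train, id_train):
    print(row_id, feat, label)
```

All inputs must have the same number of samples, since one set of shuffled indices is applied to every list.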

answered Oct 20 '22 19:10

Leoli
