Pandas DataFrame and Keras

Tags:

python

pandas

keras

I'm trying to perform a sentiment analysis in Python using Keras. To do so, I need to do a word embedding of my texts. The problem appears when I try to fit the data to my model:

model_1 = Sequential()
model_1.add(Embedding(1000,32, input_length = X_train.shape[0]))
model_1.add(Flatten())
model_1.add(Dense(250, activation='relu'))
model_1.add(Dense(1, activation='sigmoid'))
model_1.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

The shape of my train data is

(4834,)

And is a Pandas series object. When I try to fit my model and validate it with some other data I get this error:

model_1.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=2, batch_size=64, verbose=2)

ValueError: Error when checking model input: expected embedding_1_input to have shape (None, 4834) but got array with shape (4834, 1)

How can I reshape my data to make it suited for Keras? I've been trying with np.reshape but I cannot place None elements with that function.

Thanks in advance

457

asked May 09 '17 17:05

Gonzalo Donoso

4 Answers

None stands for the number of rows that go into training (the batch dimension), so you can't define it yourself. Also, Keras needs a NumPy array as input, not a pandas DataFrame. First convert the df to a NumPy array with df.values, then call .reshape((-1, 4834)) on that array. Note that you should use np.float32; this is important if you train on a GPU.
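As a rough sketch of the conversion described above, the snippet below uses a tiny made-up Series and a row width of 3 instead of 4834 so the result is easy to inspect. The variable names and the toy values are assumptions for illustration only, not the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data: six values in a pandas Series.
series = pd.Series([3, 1, 4, 1, 5, 9])

# .values returns the underlying NumPy array; astype sets the dtype.
arr = series.values.astype(np.float32)

# -1 lets NumPy infer the number of rows; 3 is the width of each row.
reshaped = arr.reshape((-1, 3))

print(reshaped.shape)
print(reshaped.dtype)
```

The -1 plays the role that None plays in the Keras error message: it is the dimension you leave open, and NumPy fills it in from the total number of elements.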

149

answered Oct 24 '22 00:10

Dat Tran

https://pypi.org/project/keras-pandas/

The easiest way is to use the keras_pandas package to fit a pandas DataFrame to Keras. The code shown below is a general example from the package docs.

from keras import Model
from keras.layers import Dense

from keras_pandas.Automater import Automater
from keras_pandas.lib import load_titanic

observations = load_titanic()

# Transform the data set, using keras_pandas
categorical_vars = ['pclass', 'sex', 'survived']
numerical_vars = ['age', 'siblings_spouses_aboard', 'parents_children_aboard', 'fare']
text_vars = ['name']

auto = Automater(categorical_vars=categorical_vars, numerical_vars=numerical_vars, text_vars=text_vars,
 response_var='survived')
X, y = auto.fit_transform(observations)

# Start model with provided input nub
x = auto.input_nub

# Fill in your own hidden layers
x = Dense(32)(x)
x = Dense(32, activation='relu')(x)
x = Dense(32)(x)

# End model with provided output nub
x = auto.output_nub(x)

model = Model(inputs=auto.input_layers, outputs=x)
model.compile(optimizer='Adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train model
model.fit(X, y, epochs=4, validation_split=.2)

answered Oct 24 '22 00:10

Pardhu

You need a specific version of Pandas for this to work. If you use the current version (as of 20th Aug 2018) this will fail.

Roll back your Pandas and Keras (pip uninstall ....) and then install a specific version like this:

python -m pip install pandas==0.19.2

answered Oct 24 '22 00:10

Tim Seed

Use tf.data.Dataset.from_tensor_slices to read the values from a pandas DataFrame.

See https://www.tensorflow.org/tutorials/load_data/pandas_dataframe for a reference on how to do this properly in TF 2.x.
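A minimal sketch of that approach, assuming TensorFlow 2.x with eager execution. The DataFrame, its column names, and the batch size are made up for illustration and are not taken from the question.

```python
import pandas as pd
import tensorflow as tf

# Hypothetical toy frame: two numeric feature columns and a binary label.
df = pd.DataFrame({
    "f1": [0.1, 0.2, 0.3, 0.4],
    "f2": [1.0, 2.0, 3.0, 4.0],
    "label": [0, 1, 0, 1],
})
labels = df.pop("label")

# Each dataset element is one (feature_row, label) pair.
dataset = tf.data.Dataset.from_tensor_slices((df.values, labels.values))

# Batching adds the leading dimension that Keras reports as None.
batched = dataset.batch(2)

features, targets = next(iter(batched))
print(features.shape, targets.shape)
```

A batched dataset like this can be passed directly to model.fit in place of separate X and y arrays.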

answered Oct 23 '22 22:10

Aleksey Vlasenko
