'DataFrame' object has no attribute 'ravel' when transforming target variable?

I was fitting a logistic regression on a subset of a dataset. After splitting the data and fitting the model, I got the following warning:

    /Users/Eddie/anaconda/lib/python3.4/site-packages/sklearn/utils/validation.py:526: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
      y = column_or_1d(y, warn=True)

So I used target_newrdn = target_newrdn.ravel() to reshape my target variable, but it gave me this:

AttributeError: 'DataFrame' object has no attribute 'ravel'

What is the problem, and how can I fix it? Can anyone help, please?

My code:

    from sklearn.datasets import fetch_covtype
    import numpy as np
    import pandas as pd

    from sklearn.utils import shuffle
    from sklearn.model_selection import train_test_split, GridSearchCV
    from sklearn.linear_model import LogisticRegression

    cov = fetch_covtype()
    cov_data = pd.DataFrame(cov.data)
    cov_target = pd.DataFrame(cov.target)

    data_newrdn = cov_data.head(n=10000)
    target_newrdn = cov_target.head(n=10000)

    target_newrdn = target_newrdn.ravel() ## I thought this could fix it??

    X_train2, X_test2, y_train2, y_test2 = train_test_split(
        data_newrdn, target_newrdn, random_state=42)

    # scaler and kfold are defined elsewhere (not shown in this post)
    scaler.fit(X_train2)
    X_train_scaled2 = scaler.transform(X_train2)

    # Logistic Regression
    param_grid = {'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000]}
    print(param_grid)
    grid = GridSearchCV(LogisticRegression(), param_grid, cv=kfold)
    grid.fit(X_train_scaled2, y_train2)
    print("Best cross-validation score w/ kfold: {:.2f}".format(grid.best_score_))
    print("Best parameters: ", grid.best_params_)

asked Feb 17 '18 13:02

Edward Lin

1 Answer

A DataFrame does not have a ravel method. Try:

    target_newrdn.values.ravel()

target_newrdn.values returns a NumPy ndarray, and ravel is then called on that array. Note that the result is a flattened NumPy array, not a DataFrame, so you may need to convert it back if later code expects a DataFrame.
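A minimal sketch of that conversion, using a small made-up single-column DataFrame as a stand-in for target_newrdn (the name target_df and its values are assumptions for illustration only):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for target_newrdn: a DataFrame with one column.
target_df = pd.DataFrame(np.array([1, 2, 2, 5, 7]))

print(target_df.shape)            # 2-D: (5, 1)

# .values gives the underlying ndarray; ravel() flattens it to 1-D.
y = target_df.values.ravel()
print(type(y).__name__, y.shape)  # ndarray (5,)

# Selecting the single column as a Series also yields a 1-D target.
y_series = target_df.iloc[:, 0]
print(y_series.shape)             # (5,)
```

Either 1-D form has the (n_samples,) shape that the warning asks for, and either can be passed to train_test_split in place of the single-column DataFrame.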

But I think you may want flatten() instead. It always returns a copy, so modifying the returned array does not modify the entries in the original array. ravel() returns a view when it can, so changes to its result may show up in the original.
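The difference between the two can be seen on a plain NumPy column vector. This is only a sketch with made-up values. Whether the array behind a DataFrame's .values is itself a view of the DataFrame depends on pandas, so the example stays with NumPy:

```python
import numpy as np

# Made-up column vector of shape (3, 1), like a single-column target.
col = np.array([[1], [2], [3]])

r = col.ravel()    # a view when possible (it is here: col is contiguous)
f = col.flatten()  # always a copy

print(np.shares_memory(col, r))  # True
print(np.shares_memory(col, f))  # False

r[0] = 99          # writes through to col
print(col[0, 0])   # 99

f[1] = -1          # only changes the copy
print(col[1, 0])   # 2
```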

answered Sep 23 '22 13:09

Austin


Tags: python, numpy, logistic-regression, sklearn-pandas
