
'DataFrame' object has no attribute 'ravel' when transforming target variable?

I was fitting a logistic regression on a subset of a dataset. After splitting the data and fitting the model, I got the following warning:

    /Users/Eddie/anaconda/lib/python3.4/site-packages/sklearn/utils/validation.py:526: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
      y = column_or_1d(y, warn=True)

So I used target_newrdn = target_newrdn.ravel() to reshape my target variable, but it gave me this:

    AttributeError: 'DataFrame' object has no attribute 'ravel'

What is the problem, and how can I fix it? Can anyone help, please?

My code:

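    from sklearn.datasets import fetch_covtype
    import numpy as np
    import pandas as pd

    from sklearn.utils import shuffle
    from sklearn.model_selection import train_test_split, GridSearchCV, KFold
    from sklearn.linear_model import LogisticRegression
    from sklearn.preprocessing import StandardScaler

    # scaler and kfold are used below but were not defined in the posted
    # snippet; these two definitions are assumptions
    scaler = StandardScaler()
    kfold = KFold(n_splits=5, shuffle=True, random_state=0)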

    cov = fetch_covtype()
    cov_data = pd.DataFrame(cov.data)
    cov_target = pd.DataFrame(cov.target)

    data_newrdn = cov_data.head(n=10000)
    target_newrdn = cov_target.head(n=10000)


    target_newrdn = target_newrdn.ravel() ## I thought this could fix it??


    X_train2, X_test2, y_train2, y_test2 = train_test_split(
        data_newrdn, target_newrdn, random_state=42)

    scaler.fit(X_train2)
    X_train_scaled2 = scaler.transform(X_train2)

    # Logistic Regression
    param_grid = {'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000]}
    print(param_grid)
    grid = GridSearchCV(LogisticRegression(), param_grid, cv=kfold) 
    grid.fit(X_train_scaled2, y_train2)
    print("Best cross-validation score w/ kfold: {:.2f}".format(grid.best_score_))
    print("Best parameters: ", grid.best_params_)
Edward Lin asked Feb 17 '18 13:02



1 Answer

A pandas DataFrame does not have a ravel method; ravel belongs to NumPy arrays. Try:

    target_newrdn = target_newrdn.values.ravel()

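target_newrdn.values returns a NumPy ndarray, and ravel() flattens it to one dimension with shape (n_samples,), which is what scikit-learn expects for y. Note that the result is a NumPy array, not a DataFrame, so you may need to convert it back if later code relies on pandas objects.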
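As a minimal sketch of the shape change, the snippet below uses a small made-up single-column DataFrame (the name target and its values are hypothetical stand-ins for target_newrdn). It also shows selecting the column as a Series, which is an alternative not mentioned in the question and is 1-D as well.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the single-column target DataFrame in the question
target = pd.DataFrame(np.array([1, 2, 2, 5, 1]))

# A one-column DataFrame is 2-D: this is the "column-vector y" in the warning
print(target.shape)

# .values gives the underlying ndarray; ravel() flattens it to 1-D
y = target.values.ravel()
print(type(y).__name__, y.shape)

# Alternative: pick the single column as a Series, which is also 1-D
y_series = target.iloc[:, 0]
print(y_series.shape)
```

Either y or y_series should be accepted as the target by train_test_split and by a scikit-learn estimator's fit method without the DataConversionWarning.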

If you need an independent copy, use flatten() instead: it always returns a copy, whereas ravel() returns a view whenever possible, so modifying the array returned by ravel() may also modify the original data.
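The copy-versus-view behaviour of the two methods can be checked directly in NumPy. This is a small sketch on a made-up contiguous column array (col is a hypothetical name); whether a DataFrame's .values itself shares memory with the DataFrame depends on the pandas version and the column dtypes, so that part is not shown here.

```python
import numpy as np

# Hypothetical contiguous column vector of shape (3, 1)
col = np.array([[1], [2], [3]])

r = col.ravel()    # returns a view here, since no copy is needed
f = col.flatten()  # always returns a copy

print(np.shares_memory(col, r))  # True
print(np.shares_memory(col, f))  # False

r[0] = 99   # writes through to col
f[1] = -1   # affects only the copy
print(col[:, 0])  # [99  2  3]
```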

Austin answered Sep 23 '22 13:09