
Pandas AssertionError: 1 columns passed, passed data had 2 columns

I am working on an Azure ML implementation of text analytics with NLTK. Executing the following script throws:

AssertionError: 1 columns passed, passed data had 2 columns
Process returned with non-zero exit code 1

Below is the code:

# The script MUST include the following function,
# which is the entry point for this module:
# Param<dataframe1>: a pandas.DataFrame
# Param<dataframe2>: a pandas.DataFrame
def azureml_main(dataframe1 = None, dataframe2 = None):
    # import required packages
    import pandas as pd
    import nltk
    import numpy as np
    # tokenize the review text and store the word corpus
    word_dict = {}
    token_list = []
    nltk.download(info_or_id='punkt', download_dir='C:/users/client/nltk_data')
    nltk.download(info_or_id='maxent_treebank_pos_tagger', download_dir='C:/users/client/nltk_data')
    for text in dataframe1["tweet_text"]:
        tokens = nltk.word_tokenize(text.decode('utf8'))
        tagged = nltk.pos_tag(tokens)

    # convert feature vector to dataframe object
    dataframe_output = pd.DataFrame(tagged, columns=['Output'])
    return [dataframe_output]

The error is thrown here:

 dataframe_output = pd.DataFrame(tagged, columns=['Output'])

I suspect the cause is the data type of tagged when it is passed to the DataFrame constructor. Can someone tell me the right way to put this data into a DataFrame?

Sudheej asked Aug 12 '16 22:08


1 Answer

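nltk.pos_tag returns a list of (word, tag) tuples, so every row has two fields and the DataFrame needs two column names. Try this: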

dataframe_output = pd.DataFrame(tagged, columns=['Output', 'temp'])
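
To see why the second column name is needed, here is a minimal sketch that reproduces the mismatch without NLTK. The `tagged` list is hand-written sample data in the shape that `nltk.pos_tag` returns, and the column names `word` and `tag` are illustrative choices, not names from the question. The exception type raised for the mismatch differs between pandas versions (AssertionError in older releases, ValueError in newer ones), so the sketch catches both.

```python
import pandas as pd

# Hand-written sample in the shape nltk.pos_tag returns: (word, tag) tuples.
tagged = [("pandas", "NNS"), ("are", "VBP"), ("cute", "JJ")]

# One column name for rows that each hold two fields is rejected.
try:
    pd.DataFrame(tagged, columns=["Output"])
    mismatch_raised = False
except (AssertionError, ValueError):
    mismatch_raised = True

# One name per tuple field works.
dataframe_output = pd.DataFrame(tagged, columns=["word", "tag"])

# If a single 'Output' column is really wanted, join each pair into one string.
single_column = pd.DataFrame(
    {"Output": ["{}/{}".format(word, tag) for word, tag in tagged]}
)
```

The two-column form keeps the word and its tag separately addressable; the joined form is only a convenience when downstream code expects exactly one column.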
ragesz answered Sep 22 '22 10:09
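
A separate point about the loop in the question: `tagged` is reassigned on every iteration, so only the last tweet's tags reach the DataFrame. If tags for every tweet are wanted, one possible approach is to accumulate them in a list first. The sketch below assumes that intent; `fake_pos_tag` and `str.split` are hypothetical stand-ins for `nltk.pos_tag` and `nltk.word_tokenize` so the example runs without downloading NLTK data, and the sample tweets are made up.

```python
import pandas as pd


def fake_pos_tag(tokens):
    # Hypothetical stand-in for nltk.pos_tag: labels every token 'NN'.
    return [(token, "NN") for token in tokens]


# Made-up input in the shape the question uses.
dataframe1 = pd.DataFrame({"tweet_text": ["good morning", "hello world again"]})

all_tagged = []
for text in dataframe1["tweet_text"]:
    tokens = text.split()  # stand-in for nltk.word_tokenize
    # extend (not assignment) keeps the pairs from earlier tweets
    all_tagged.extend(fake_pos_tag(tokens))

dataframe_output = pd.DataFrame(all_tagged, columns=["word", "tag"])
```

With the real NLTK functions swapped back in, the same accumulate-then-build pattern applies.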
