Unpickling dictionary that holds pandas dataframes throws AttributeError: 'DataFrame' object has no attribute '_data'

I have a class that performs analyses and attaches the results, which are pandas dataframes, as object attributes:

>>> print(test.image.locate_DF)
              y          x       mass  ...    raw_mass        ep  frame
0     60.177142  59.788709  33.433414  ...  242.080256       NaN      0
1     60.651991  59.773904  33.724308  ...  242.355784       NaN      1
2     60.790437  60.190234  31.117164  ...  236.276671       NaN      2
3     60.771933  60.048123  33.558372  ...  240.981395       NaN      3
4     60.251282  59.775139  31.881009  ...  239.239022       NaN      4
...         ...        ...        ...  ...         ...       ...    ...
7212  68.186380  76.477449  18.122817  ...  176.523091       NaN   9410
7213  68.764444  76.574091  17.486454  ...  173.448306       NaN   9415
7214  68.191152  76.473477  17.402975  ...  172.848119  0.868326   9429
7215  67.034103  76.025885  17.010951  ...  170.928067 -0.600854   9431
7216  68.583276  75.309592  17.852992  ...  178.271558       NaN   9432

Subsequently, I save all the important object attributes in a dictionary, and pickle it for later use:

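import pickle

def save_parameters(self, filepath):
    # Collect the attributes worth keeping; skip any that were never computed.
    param_dict = {}

    try:
        self.image.locate_DF
    except AttributeError:
        pass
    else:
        param_dict['optical_locate_DF'] = self.image.locate_DF

    # Write the dictionary using pickle protocol 5 (Python 3.8+).
    with open(filepath, 'wb') as handle:
        pickle.dump(param_dict, handle, protocol=5)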

When trying to load that pickled file, I have no problem at all, the dataframe loads perfectly:

>>> test.save_parameters('test.pickle')
>>> with open('test.pickle', 'rb') as handle:
...     result = pickle.load(handle)
...
>>> print(result.keys())
dict_keys(['optical_path', 'optical_feature_diameter', 'optical_feature_minmass', 'optical_locate_DF', 'electrical_path', 'electrical_raw_data', 'electrical_processed_data', 'electrical_mean_voltage'])
>>> print(result['optical_locate_DF'])
              y          x       mass  ...    raw_mass        ep  frame
0     60.177142  59.788709  33.433414  ...  242.080256       NaN      0
1     60.651991  59.773904  33.724308  ...  242.355784       NaN      1
2     60.790437  60.190234  31.117164  ...  236.276671       NaN      2
3     60.771933  60.048123  33.558372  ...  240.981395       NaN      3
4     60.251282  59.775139  31.881009  ...  239.239022       NaN      4
...         ...        ...        ...  ...         ...       ...    ...
7212  68.186380  76.477449  18.122817  ...  176.523091       NaN   9410
7213  68.764444  76.574091  17.486454  ...  173.448306       NaN   9415
7214  68.191152  76.473477  17.402975  ...  172.848119  0.868326   9429
7215  67.034103  76.025885  17.010951  ...  170.928067 -0.600854   9431
7216  68.583276  75.309592  17.852992  ...  178.271558       NaN   9432

[7217 rows x 9 columns]

However, after running my analysis on a batch of these files on an HPC cluster and then trying to open the resulting pickled file, pandas raises an AttributeError stating that the DataFrame has no '_data' attribute. (The file is named differently now, but it holds the same data as shown above, with the same analysis performed on it.) The dictionary loads with its keys intact, and the values that are not DataFrames print without any issues:

>>> resultfile = '../results/diam_15_minmass_17_dist_50_mem_5000_tracklength_500/R9_DNA_50mV_001.pickle'
>>> with open(resultfile, 'rb') as handle:
...     result = pickle.load(handle)
...
>>> print(result.keys())
dict_keys(['optical_path', 'optical_feature_diameter', 'optical_feature_minmass', 'optical_locate_DF', 'optical_tracking_distance', 'optical_tracking_memory', 'optical_tracking_DF', 'optical_kinetics_DF', 'electrical_path', 'electrical_raw_data', 'electrical_processed_data', 'electrical_mean_voltage'])
>>> print(result['optical_locate_DF'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/stevenvanuytsel/miniconda3/envs/simultaneous_measurements/lib/python3.8/site-packages/pandas/core/frame.py", line 680, in __repr__
    self.to_string(
  File "/Users/stevenvanuytsel/miniconda3/envs/simultaneous_measurements/lib/python3.8/site-packages/pandas/core/frame.py", line 801, in to_string
    formatter = fmt.DataFrameFormatter(
  File "/Users/stevenvanuytsel/miniconda3/envs/simultaneous_measurements/lib/python3.8/site-packages/pandas/io/formats/format.py", line 593, in __init__
    self.max_rows_displayed = min(max_rows or len(self.frame), len(self.frame))
  File "/Users/stevenvanuytsel/miniconda3/envs/simultaneous_measurements/lib/python3.8/site-packages/pandas/core/frame.py", line 1041, in __len__
    return len(self.index)
  File "/Users/stevenvanuytsel/miniconda3/envs/simultaneous_measurements/lib/python3.8/site-packages/pandas/core/generic.py", line 5270, in __getattr__
    return object.__getattribute__(self, name)
  File "pandas/_libs/properties.pyx", line 63, in pandas._libs.properties.AxisProperty.__get__
  File "/Users/stevenvanuytsel/miniconda3/envs/simultaneous_measurements/lib/python3.8/site-packages/pandas/core/generic.py", line 5270, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute '_data'

I've looked into the pickle manual, and through a bunch of SO questions, but I can't seem to find out what is going wrong here. Does anyone have an idea how to fix this, and also whether I can still access that data?

asked Aug 24 '20 10:08

Steven

3 Answers

I had the same problem. I generated a Pandas dataframe in an environment with Pandas 1.1.1 and saved it to a pickle file.

with open('file.pkl', 'wb') as f:
    pickle.dump(data_frame_object, f)

After unpickling it in another session and printing the dataframe I got the same error. Some testing in different environments showed the following pattern:

environment with Pandas >= 1.1.0: works
environment with Pandas == 1.0.5: error message as above
environment with Pandas == 1.0.3: Kernel crashes

I got the same error using the HDF5 format, so this seems to be a compatibility issue between DataFrames written and read with different Pandas versions.

Updating Pandas to 1.1.1 in the affected environments solved the issue for me.
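
One way to catch such a mismatch before it surfaces as an obscure AttributeError is to record the writer's pandas version next to the data. The following is only a sketch: version_tuple, dump_with_version and load_with_version are hypothetical helper names, not part of pandas or pickle, and the version strings are supplied by the caller (for example pd.__version__). The version is written as a separate first pickle in the file, so the reader can compare versions before touching any DataFrame.

```python
import os
import pickle
import tempfile


def version_tuple(version):
    """Turn '1.0.5' into (1, 0, 5); suffixes such as 'rc0' are ignored."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        parts.append(int(digits or 0))
    while len(parts) < 3:  # pad '1.1' to (1, 1, 0)
        parts.append(0)
    return tuple(parts)


def dump_with_version(obj, filepath, writer_version):
    """Write the version string first, then the actual object."""
    with open(filepath, "wb") as handle:
        pickle.dump(writer_version, handle)
        pickle.dump(obj, handle)


def load_with_version(filepath, reader_version):
    """Refuse to load data written by a newer version than the reader's."""
    with open(filepath, "rb") as handle:
        writer_version = pickle.load(handle)  # a plain string, safe to read
        if version_tuple(reader_version) < version_tuple(writer_version):
            raise RuntimeError(
                f"file written with pandas {writer_version}, "
                f"but this environment has {reader_version}"
            )
        return pickle.load(handle)


# Demo with a stand-in payload and made-up file name; real code would
# pass pd.__version__ instead of literal version strings.
path = os.path.join(tempfile.mkdtemp(), "demo.pickle")
dump_with_version({"optical_path": "example.tif"}, path, "1.1.0")
print(load_with_version(path, "1.1.1"))
```

With this layout, a reader on 1.0.5 would get a clear error naming both versions instead of a failure deep inside pandas.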

answered Oct 11 '22 06:10

BodoB

After a long and painful process of cross-checking module versions, I found that this error was caused by a pandas version mismatch. My Mac still ran pandas 1.0.5, whereas the HPC runs pandas 1.1.0. Apparently, a DataFrame pickled under 1.1.0 cannot be read properly under 1.0.5 (I am unsure whether this only affects pickling or also other file formats used for saving).
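
Cross-checking versions by hand can be shortened by running a small report script in both environments and comparing the output. A minimal sketch, assuming Python 3.8 or newer (for importlib.metadata); report_versions is a hypothetical helper name and the default package list is only an example:

```python
import sys
from importlib import metadata


def report_versions(packages=("pandas", "numpy")):
    """Return the Python version and the installed version of each package."""
    versions = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed here
    return versions


for name, version in report_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Running this on both the Mac and the HPC and diffing the two outputs should make a pandas mismatch like the one above visible at a glance.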

answered Oct 11 '22 05:10

Steven

This may already be solved, but I still want to add a comment.

I saved the pkl file on a server, and loading it on my Mac failed with: 'DataFrame' object has no attribute '_data'

It turned out that pandas was at 1.0.5 on my Mac but at 1.1.5 on the server. After updating pandas on the Mac to the latest version, it just worked.
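
If upgrading every reading environment is not possible, the existing files can be re-exported once from an environment with the newer pandas. The sketch below is an assumption-laden illustration, not a method from this thread: to_portable and from_portable are hypothetical helpers and the "__dataframe__" marker key is made up. Each DataFrame is replaced by the plain dict returned by DataFrame.to_dict(orient="split"), so the resulting pickle no longer contains pandas objects. Column dtypes may not survive the round trip exactly, and other values (such as NumPy arrays) are passed through unchanged.

```python
import pickle


def to_portable(result):
    """Replace each DataFrame in `result` with a plain dict of lists."""
    import pandas as pd  # imported lazily; only needed when converting

    portable = {}
    for key, value in result.items():
        if isinstance(value, pd.DataFrame):
            # 'split' gives {'index': [...], 'columns': [...], 'data': [...]}
            portable[key] = {"__dataframe__": value.to_dict(orient="split")}
        else:
            portable[key] = value
    return portable


def from_portable(portable):
    """Rebuild DataFrames from the dicts produced by to_portable."""
    import pandas as pd

    result = {}
    for key, value in portable.items():
        if isinstance(value, dict) and "__dataframe__" in value:
            split = value["__dataframe__"]
            result[key] = pd.DataFrame(
                data=split["data"],
                index=split["index"],
                columns=split["columns"],
            )
        else:
            result[key] = value
    return result
```

In the newer environment one would load the original pickle, call to_portable on the dictionary, and pickle the result; the older environment then loads that file and calls from_portable.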

answered Oct 11 '22 07:10

LimingFang

Tags: python-3.x, pandas, dataframe, pickle
