Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can't pickle int object error when object comes from SQLAlchemy?

I am using YAML and SQLAlchemy. I defined my object, and I am able to use YAML to print that just fine. However, when I try to use YAML on the object returned from a SQLAlchemy query, it is failing with the error can't pickle int objects. I printed out the instance returned from SQLAlchemy, and it is showing the correct type. I'll let the code do the talking:

class HashPointer(Base):
    __tablename__ = 'hash_pointers'

    id = Column(Integer, primary_key=True)
    hash_code = Column(VARBINARY(64), unique=True)
    file_pointer = Column(Text)

    def __init__(self, hash_code, file_pointer):
        self.hash_code = hash_code
        self.file_pointer = file_pointer

    def __repr__(self):
        return "<HashPointer('%s', '%s')>" % (self.hash_code, self.file_pointer)

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
Engine = create_engine("mysql://user:pass@localhost/db", echo=True)
Session = sessionmaker(bind=Engine)
session = Session()
fhash = HashPointer(0x661623708235, "c:\\test\\001.txt")

# PRINTS FINE
print(yaml.dump(fhash))

for instance in session.query(HashPointer).all():
    # PRINTS FINE AS __repr__
    print instance

    # THROWS ERROR, 'CAN'T PICKLE INT OBJECTS'
    print(yaml.dump(instance))
like image 550
esac Avatar asked Oct 09 '22 17:10

esac


2 Answers

Try adding the following to your class:

def __reduce__(self):
    'Return state information for pickling'
    return self.__class__, (int(self.hash_code), str(self.file_pointer))
like image 150
Raymond Hettinger Avatar answered Oct 12 '22 08:10

Raymond Hettinger


It turns out that the default reduce_ex method (im pretty sure this is the one in object(), but it doesn't have to be.) that comes down the line when you have sqlalchemy active, adds a _sa_instance_state member to the 'state' returned in the reduce_ex API which PyYAML uses to perform serialization.

When serializing an object coming from a SqlAlchemy query, this is essentially a hidden part of the object's metadata, which is accessible to further operations.

It is this object in which the PyYAML serializer is failing. You can verify this by running your serialization in PDB, and seeing two calls to represent_object in your call stack, even for relatively simple SQLAlchemy query object results.

This query instance link is used, as I understand it, to power methods with let you poke back at the query which generates a given object from within the same python interpreter's lifetime.

If you care about that functionality (stuff like session.new & session.dirty), you will need to implement support for that in PyYAML's serializer.

If you don't care, and just want your declared members, you can use a base class which 'hides' that linkage from calls to reduce* -- note that this will also break the SQLAlchemy serializer extension as well though, so consider your plans carefully.

An example of a base class to implement that change is:

DeclBase = declarative_base()

class Base(DeclBase):
    __abstract__ = True

    def __reduce_ex__(self, proto):
        ret = super(Base, self).__reduce_ex__(proto)
        ret = ( ret[0], ret[1], dict(ret[2]) ) + ret[3:]
        ret[2].pop('_sa_instance_state', None) # remove bad yamly from reduce state
        return ret

This will then allow you to roundtrip your objects in/out of yaml, though a round trip will disassociate them from any pending transactions or queries. This could also have interactions if you are using lazy-loaded members, for one example. Make sure you are serializing everything you expect.

NOTE/EDIT: I chose to use reduce_ex here, to be compatible with possible other base classes or mixins. According to https://docs.python.org/2/library/pickle.html#object.reduce_ex, this will produce the correct behavior for any base classes, also detecting if only reduce() was declared.

Redux... reduce will return the actual dict of the instance object -- we don't want to delete in from there, so for __reduce*, we must actually shallow copy that dict.

like image 32
Bryon Roché Avatar answered Oct 12 '22 10:10

Bryon Roché