In brief, my code looks like this:
class Word(Base):
    __tablename__ = 'word'
    eng = Column(String(32), primary_key=True)
    chinese = Column(String(128))

word = Word(eng='art', chinese=[u'艺术', u'美术'])
session.add(word)
session.commit()
I'm trying to store word.chinese as a string, but in Python it's a list. If I were writing the SQL myself, I could call str(word.chinese) before inserting it into the database, and eval(result) when reading it back to get the original Python object. But since I'm using SQLAlchemy to store my objects, I wonder what I need to change to achieve this.
Interesting to note that querying using bare sqlite3 is still about 3 times faster than using SQLAlchemy Core. I guess that's the price you pay for having a ResultProxy returned instead of a bare sqlite3 row. SQLAlchemy Core is about 8 times faster than using ORM. So querying using ORM is a lot slower no matter what.
To store a list in a db you could use a new table:
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import relationship

class Word(Base):
    __tablename__ = "words"
    id = Column(Integer, primary_key=True)
    eng = Column(String(32), unique=True)
    # one English word has many Chinese translations
    chinese = relationship("Chinese", backref="eng")

    def __init__(self, eng, chinese):
        self.eng = eng
        # build a real list; on Python 3, map() returns a lazy iterator
        self.chinese = [Chinese(c) for c in chinese]

class Chinese(Base):
    __tablename__ = "chinese_words"
    word = Column(String(128), primary_key=True)
    eng_id = Column(Integer, ForeignKey('words.id'), primary_key=True)

    def __init__(self, word):
        self.word = word
See full example.
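As a rough end-to-end sketch of the separate-table approach, the snippet below wires the two classes to an in-memory SQLite database and reads the translations back. The engine URL, the session setup, and the use of sqlalchemy.orm.declarative_base (which assumes SQLAlchemy 1.4 or later) are my assumptions, not part of the original answer.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()  # assumes SQLAlchemy 1.4+


class Word(Base):
    __tablename__ = "words"
    id = Column(Integer, primary_key=True)
    eng = Column(String(32), unique=True)
    # One English word maps to many Chinese translations.
    chinese = relationship("Chinese", backref="eng")

    def __init__(self, eng, chinese):
        self.eng = eng
        # Wrap each plain string in a Chinese row object.
        self.chinese = [Chinese(c) for c in chinese]


class Chinese(Base):
    __tablename__ = "chinese_words"
    word = Column(String(128), primary_key=True)
    eng_id = Column(Integer, ForeignKey("words.id"), primary_key=True)

    def __init__(self, word):
        self.word = word


# In-memory SQLite database, chosen only to keep the sketch self-contained.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add(Word(eng="art", chinese=["艺术", "美术"]))
session.commit()

# Read the word back and collect its translations as plain strings.
word = session.query(Word).filter(Word.eng == "art").one()
translations = [c.word for c in word.chinese]
print(translations)
```

Each translation ends up as its own row in chinese_words, so you can query or index translations individually instead of parsing a packed string.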
Don't use str()/eval(). If you want to store chinese as a blob, you could use json.dumps()/json.loads() instead, using the TypeDecorator suggested by @thebjorn:
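import json

from sqlalchemy import Column, String
from sqlalchemy.types import TypeDecorator

class Json(TypeDecorator):
    impl = String

    def process_bind_param(self, value, dialect):
        # Python object -> JSON text when writing to the database
        return json.dumps(value)

    def process_result_value(self, value, dialect):
        # JSON text -> Python object when reading from the database
        return json.loads(value)

class Word(Base):
    __tablename__ = "words"
    eng = Column(String(32), primary_key=True)
    chinese = Column(Json(128))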
See full example.
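A minimal round-trip sketch of the JSON approach, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database. The None guards, the cache_ok flag, and the raw SELECT used to peek at the stored text are my additions for illustration; they are not in the original answer.

```python
import json

from sqlalchemy import Column, String, create_engine, text
from sqlalchemy.orm import declarative_base, sessionmaker
from sqlalchemy.types import TypeDecorator

Base = declarative_base()  # assumes SQLAlchemy 1.4+


class Json(TypeDecorator):
    """Store any JSON-serializable value in a string column."""

    impl = String
    cache_ok = True  # the type has no extra state that affects the SQL

    def process_bind_param(self, value, dialect):
        # Python object -> JSON text on the way in; None stays NULL.
        return None if value is None else json.dumps(value)

    def process_result_value(self, value, dialect):
        # JSON text -> Python object on the way out; NULL stays None.
        return None if value is None else json.loads(value)


class Word(Base):
    __tablename__ = "words"
    eng = Column(String(32), primary_key=True)
    chinese = Column(Json(128))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add(Word(eng="art", chinese=["艺术", "美术"]))
session.commit()

# What the database actually holds: a plain JSON string.
raw = session.execute(text("SELECT chinese FROM words")).scalar()

# What the ORM hands back: the original Python list.
loaded = session.query(Word).filter(Word.eng == "art").one()
print(type(raw).__name__, loaded.chinese)
```

One caveat with this design: in-place changes such as loaded.chinese.append(...) are not detected by the session on a plain list. Reassigning the attribute works; the sqlalchemy.ext.mutable extension is the usual route if you need mutation tracking.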
You'll find the functionality you're asking for in TypeDecorator (http://docs.sqlalchemy.org/en/rel_0_7/core/types.html#sqlalchemy.types.TypeDecorator -- you'll have to create e.g. a subclass of list to get it to work).
However, what you're trying to do is store two different translations for the English word "art" (at least that's what Google Translate is telling me :-). Storing them as a comma-separated list in a text field violates first normal form. You should store two records:
('art', u'艺术')
('art', u'美术')
and change your database structure to allow for this.
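One possible shape for that structure, as a sketch only: a single table whose primary key is the (eng, chinese) pair, so each translation is its own record. The Translation class and its table name are hypothetical, and the in-memory SQLite engine and SQLAlchemy 1.4+ imports are assumptions.

```python
from sqlalchemy import Column, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()  # assumes SQLAlchemy 1.4+


class Translation(Base):
    # Hypothetical table: one row per (English word, Chinese word) pair.
    __tablename__ = "translation"
    eng = Column(String(32), primary_key=True)
    chinese = Column(String(128), primary_key=True)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Two records for the same English word, one per translation.
session.add_all([
    Translation(eng="art", chinese="艺术"),
    Translation(eng="art", chinese="美术"),
])
session.commit()

# Fetch every translation of "art" as a list of plain strings.
rows = session.query(Translation.chinese).filter(Translation.eng == "art").all()
translations = [chinese for (chinese,) in rows]
print(translations)
```

The composite primary key also stops the same pair from being inserted twice, which a packed string column cannot enforce.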