Is it possible to write custom collation functions with indexes in SQLAlchemy? SQLite for example allows specifying the sorting function at a C level as sqlite3_create_collation().
An implementation of some of the Unicode collation algorithm has been provided by James Tauber here, which for example sorts all the "a"'s close together whether they have accents on them or not.
Other examples of why this might be useful is for different alphabet orders (languages other than English) and sorting numeric values (sorting 10 after 9 rather than codepoint order.)
Is this possible in SQLAlchemy? If not, is it supported by the pysqlite3 or MySQLdb modules, or for any other SQL database modules supported by python for that matter?
Any information would be greatly appreciated.
Below is an example demonstrating unicode collation algorithm for sqlite:
from sqlalchemy import *
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from pyuca import Collator
metadata = MetaData()
Base = declarative_base(metadata=metadata)
class Item(Base):
__tablename__ = 'Item'
id = Column(Integer, primary_key=True)
value = Column(String, nullable=False)
collator = Collator('allkeys.txt')
def collate_unicode(value1, value2):
return cmp(collator.sort_key(value1), collator.sort_key(value2))
engine = create_engine('sqlite://')
engine.raw_connection().create_collation('unicode', collate_unicode)
metadata.create_all(engine)
session = sessionmaker(engine)()
for word in [u"ĉambr", u"ĉar", u"car'", u"carin'", u"ĉe", u"ĉef'",
u"centjar'", u"centr'", u"cerb'", u"cert'", u"ĉes'", u"ceter'"]:
item = Item(value=word)
session.add(item)
session.commit()
for item in session.query(Item).order_by(collate(Item.value, 'unicode')):
print item.value
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With