Hi, I am running a Flask app with a PostgreSQL database. I get LockErrors when using multiple workers. I learned that this is because the Whoosh search locks the database:
http://stackoverflow.com/questions/36632787/postgres-lockerror-how-to-investigate
As explained in that link, I have to use BufferedWriter. I googled around, but I really can't figure out how to implement it. Here is my database setup as far as Whoosh is concerned:
import sys
if sys.version_info >= (3, 0):
    enable_search = False
else:
    enable_search = True
    import flask.ext.whooshalchemy as whooshalchemy

class User(db.Model):
    __searchable__ = ['username','email','position','institute','id'] # these fields will be indexed by whoosh
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(100), index=True)
    ...
    def __repr__(self):
        return '<User %r>' % (self.username)

if enable_search:
    whooshalchemy.whoosh_index(app, User)
Help is much appreciated. Thanks, Carl
EDIT: If Flask-WhooshAlchemy has no support for parallel access, are there any alternatives you could suggest?
As you can read here:
http://whoosh.readthedocs.io/en/latest/threads.html
Only one writer can hold the lock at a time. A buffered writer keeps your data in memory for some time, but at some point your objects have to be stored, and that means taking the lock.
According to that document, the async writer is what you are looking for, but it has a catch: it tries to store your data, and if it cannot get the lock it starts an additional thread that keeps retrying. Suppose you are adding 1000 new items: potentially you end up with something like 1000 threads. It may be better to treat each insert as a task and send it to a separate thread. If there are many processes, you can queue those tasks up. For instance, insert 10 and wait. If those 10 are inserted as one batch in a short time, that will work, at least for a while.
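The "each insert is a task" idea can be sketched with nothing but the standard library. This is only a rough illustration, not something Whoosh or Flask-WhooshAlchemy provides: the class name `SingleWriterQueue`, the `write_batch` callable and the batch size of 10 are all made up for the example. In a real app `write_batch` would open a Whoosh writer, add the documents and commit.

```python
import queue
import threading


class SingleWriterQueue:
    """Funnel all index writes through one background thread (hypothetical helper)."""

    _STOP = object()  # sentinel that tells the worker thread to exit

    def __init__(self, write_batch, batch_size=10):
        # write_batch is supplied by the caller and receives a list of documents.
        self._write_batch = write_batch
        self._batch_size = batch_size
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, document):
        """Called from any request thread; never touches the index directly."""
        self._queue.put(document)

    def close(self):
        """Flush everything that is still queued and stop the worker."""
        self._queue.put(self._STOP)
        self._thread.join()

    def _run(self):
        stopping = False
        while not stopping:
            batch = []
            item = self._queue.get()  # block until there is work
            while True:
                if item is self._STOP:
                    stopping = True
                    break
                batch.append(item)
                if len(batch) >= self._batch_size:
                    break
                try:
                    item = self._queue.get_nowait()  # drain what is already queued
                except queue.Empty:
                    break
            if batch:
                self._write_batch(batch)  # only this thread ever writes


written = []  # stands in for the real index in this demo
writer_queue = SingleWriterQueue(written.append, batch_size=10)
for n in range(25):
    writer_queue.submit({"path": u"/doc/%d" % n})
writer_queue.close()
```

Because only the worker thread writes, there is never more than one writer competing for the lock inside that process. Note that this covers threads within one process only; with several worker processes the queue would have to live outside them (a task queue or one dedicated indexing process).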
Sample with the async writer. To make it buffered, simply change the import and the place where the writer is created.
import os, os.path
from whoosh import index
from whoosh.fields import SchemaClass, TEXT, KEYWORD, ID
if not os.path.exists("data"):
    os.mkdir("data")
# http://whoosh.readthedocs.io/en/latest/schema.html
class MySchema(SchemaClass):
    path = ID(stored=True)
    title = TEXT(stored=True)
    icon = TEXT
    content = TEXT(stored=True)
    tags = KEYWORD
# http://whoosh.readthedocs.io/en/latest/indexing.html
ix = index.create_in("data", MySchema, indexname="myindex")
writer = ix.writer()
writer.add_document(title=u"My document", content=u"This is my document!",
                    path=u"/a", tags=u"first short", icon=u"/icons/star.png")
writer.add_document(title=u"Second try", content=u"This is the second example.",
                    path=u"/b", tags=u"second short", icon=u"/icons/sheep.png")
writer.add_document(title=u"Third time's the charm", content=u"Examples are many.",
                    path=u"/c", tags=u"short", icon=u"/icons/book.png")
writer.commit()
# commit() above already released the write lock; close the index when done with it
ix.close()
#http://whoosh.readthedocs.io/en/latest/api/writing.html#whoosh.writing.AsyncWriter
from whoosh.writing import AsyncWriter
ix = index.open_dir("data", indexname="myindex")
writer = AsyncWriter(ix)
writer.add_document(title=u"My document no 4", content=u"This is my document!",
                    path=u"/a", tags=u"four short", icon=u"/icons/star.png")
writer.add_document(title=u"5th try", content=u"This is the second example.",
                    path=u"/b", tags=u"5 short", icon=u"/icons/sheep.png")
writer.add_document(title=u"Number six is coming", content=u"Examples are many.",
                    path=u"/c", tags=u"short", icon=u"/icons/book.png")
writer.commit()