I want to write a web application using Google App Engine (so the reference language would be Python). My application needs a simple search engine, so that users can find data by specifying keywords.
For example, if I have one table with these rows:
1 Office space
2 2001: A space odyssey
3 Brazil
and the user queries for "space", rows 1 and 2 would be returned. If the user queries for "office space", the result should be rows 1 and 2 too (row 1 first).
What are the technical guidelines/algorithms to do this in a simple way?
Can you give me good pointers to the theory behind this?
Thanks.
Edit: I'm not looking for anything complex here (say, indexing tons of data).
To create a search index, go to any cluster and select the “Search” tab. From there, click “Create Search Index” to launch the process. Once the index is created, you can use the $search operator to perform full-text searches.
A full-text index uses internal tables called full-text index fragments to store the inverted index data. This view can be used to query the metadata about these fragments. It contains a row for each full-text index fragment in every table that contains a full-text index.
Full-text search makes it easy to search the contents of a database. Users specify the search text criteria, such as keywords, and the system scans one or more indexes for matches.
LIKE matches wildcard patterns only and isn't all that powerful. Full-text search allows much more complex queries, including AND, OR, NOT, even similar-sounding results (SOUNDEX), and more.
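To make the AND / OR / NOT idea concrete, here is a minimal sketch in plain Python, assuming a tiny in-memory inverted index built over the three example rows from the question. The names `ROWS`, `tokenize`, `build_index` and `lookup` are made up for illustration; this is not how any particular database implements full-text search, and nothing here is App Engine specific.

```python
import re
from collections import defaultdict

# Example rows from the question (row id -> text).
ROWS = {1: "Office space", 2: "2001: A space odyssey", 3: "Brazil"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(rows):
    """Map each word to the set of row ids that contain it (an inverted index)."""
    index = defaultdict(set)
    for row_id, text in rows.items():
        for token in tokenize(text):
            index[token].add(row_id)
    return index


INDEX = build_index(ROWS)


def lookup(term):
    """Return the set of row ids containing the term (empty set if none)."""
    return INDEX.get(term.lower(), set())


# Boolean operators become plain set operations on the posting sets.
both = lookup("office") & lookup("space")              # office AND space
either = lookup("office") | lookup("brazil")           # office OR brazil
space_not_office = lookup("space") - lookup("office")  # space NOT office
```

With the three example rows, `both` contains only row 1, `either` contains rows 1 and 3, and `space_not_office` contains only row 2.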
Read Tim Bray's series of posts on the subject.
- Background
- Usage of search engines
- Basics
- Precision and recall
- Search engine intelligence
- Tricky search terms
- Stopwords
- Metadata
- Internationalization
- Ranking results
- XML
- Robots
- Requirements list
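For a small data set like the one in the question, the basics and ranking topics in that list can be sketched in a few lines of Python. The following is only an illustration under some assumptions: everything lives in memory, ranking is simply "rows matching more of the query words come first", and the class name `TinySearchIndex` is made up. It does not use any App Engine API; on App Engine the index would have to be stored somewhere persistent.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinySearchIndex:
    """Hypothetical in-memory inverted index with a naive ranking."""

    def __init__(self):
        # word -> set of ids of the rows that contain it
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index one row under each of its words."""
        for token in tokenize(text):
            self.postings[token].add(doc_id)

    def search(self, query):
        """Return ids of rows matching any query word, best match first."""
        scores = Counter()
        for token in set(tokenize(query)):
            for doc_id in self.postings.get(token, ()):
                scores[doc_id] += 1  # one point per distinct query word matched
        # Sort by descending score, then by id to keep the order stable.
        return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


index = TinySearchIndex()
index.add(1, "Office space")
index.add(2, "2001: A space odyssey")
index.add(3, "Brazil")

space_results = index.search("space")
office_space_results = index.search("office space")
```

Here `space_results` is `[1, 2]` and `office_space_results` is also `[1, 2]`, with row 1 first because it matches both query words. Stopwords, stemming and smarter scoring (the other topics in the list) are deliberately left out of this sketch.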