Storing Inverted Index

I know that an inverted index is a good way to index words, but what I'm confused about is how search engines actually store it. For example, if the word "google" appears in documents 2, 4, 6, and 8 with different frequencies, where should those be stored? Would a database table with a one-to-many relation be any good for storing them?

asked Sep 18 '14 06:09

user3036757

1 Answer

It is highly unlikely that full-fledged SQL-like databases are used for this purpose. It is called an inverted index because it is just an index: each entry is just a reference. Non-relational databases and key-value stores, which came up as a favourite topic in relation to web technology, are a closer fit for this kind of data.

You only ever have one way of accessing the data (by query word). That is why it's called an index.
Each entry is a list/array/vector of references to documents, so each element of that list is very small. The only other information you would store besides the document ID is a tf-idf score for each element.
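To make that layout concrete, here is a minimal in-memory sketch in Python. It is only an illustration, not how any particular search engine does it: the function name `build_inverted_index`, the sample documents, and the specific tf-idf formula (raw term count times `log(N / document frequency)`) are all assumptions chosen for the example.

```python
import math
from collections import Counter, defaultdict

def build_inverted_index(docs):
    """Map each word to a list of (doc_id, tf_idf) pairs.

    docs is a dict of doc_id -> text. The tf-idf variant used here
    (raw count * log(N / df)) is one common choice, assumed for illustration.
    """
    # Count how often each word occurs in each document.
    term_counts = {doc_id: Counter(text.lower().split())
                   for doc_id, text in docs.items()}

    # Count in how many documents each word occurs.
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())

    n_docs = len(docs)
    index = defaultdict(list)
    # Visit documents in ID order so every postings list stays sorted by doc ID.
    for doc_id in sorted(term_counts):
        for word, tf in term_counts[doc_id].items():
            idf = math.log(n_docs / doc_freq[word])
            index[word].append((doc_id, tf * idf))
    return dict(index)

# Hypothetical sample documents.
docs = {
    2: "google search engine",
    4: "google google maps",
    5: "altavista search",
}
index = build_inverted_index(docs)
print(index["google"])
```

Each entry stays small: one document ID and one number per document that contains the word.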

How to use it:

If you have a single query word ("google"), you look up in the inverted index which documents this word occurs in (2, 4, 6, 8 in your example). If you have tf-idf scores, you can sort the results to report the best matching document first. You then look up which documents the document IDs 2, 4, 6, 8 refer to, and report their URLs as well as a snippet etc. URLs, snippets, etc. are probably best stored in another table or key-value store.
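The single-word lookup can be sketched as follows. Everything here is hypothetical and only for illustration: the scores in `inverted_index`, the `doc_store` dictionary standing in for the separate table or key-value store, the `example.com` URLs, and the function name `search_single`.

```python
# Hypothetical postings list: word -> list of (doc_id, tf_idf score).
inverted_index = {
    "google": [(2, 0.10), (4, 0.45), (6, 0.30), (8, 0.05)],
}

# Hypothetical separate store for per-document data, keyed by document ID.
doc_store = {
    2: {"url": "http://example.com/doc2", "snippet": "..."},
    4: {"url": "http://example.com/doc4", "snippet": "..."},
    6: {"url": "http://example.com/doc6", "snippet": "..."},
    8: {"url": "http://example.com/doc8", "snippet": "..."},
}

def search_single(word, inverted_index, doc_store):
    """Return (doc_id, url) pairs for one query word, best score first."""
    postings = inverted_index.get(word, [])  # unknown word -> no results
    ranked = sorted(postings, key=lambda posting: posting[1], reverse=True)
    return [(doc_id, doc_store[doc_id]["url"]) for doc_id, _score in ranked]

results = search_single("google", inverted_index, doc_store)
for doc_id, url in results:
    print(doc_id, url)
```

Keeping URLs and snippets out of the index means the postings lists stay compact, and the larger per-document records are fetched only for the documents that are actually reported.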

If you have multiple query words ("google" and "altavista"), you look into the inverted index for both query words and get two lists of document IDs (2, 4, 6, 8 and 3, 7, 8, 11, 19). You take the intersection of both lists, which in this case is (8): the list of documents in which both query words occur.
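One way to compute that intersection, assuming each list of document IDs is kept sorted in ascending order (an assumption of this sketch, not something stated above), is to walk both lists in a single pass. The function name `intersect_sorted` is made up for the example.

```python
def intersect_sorted(a, b):
    """Intersect two ascending lists of document IDs in one linear pass."""
    i = j = 0
    result = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Document contains both query words.
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # a[i] cannot appear later in b, skip it
        else:
            j += 1  # b[j] cannot appear later in a, skip it
    return result

google_docs = [2, 4, 6, 8]
altavista_docs = [3, 7, 8, 11, 19]
both = intersect_sorted(google_docs, altavista_docs)
print(both)  # [8]
```

If the lists are not sorted, `sorted(set(google_docs) & set(altavista_docs))` gives the same result at the cost of building sets first.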

answered Sep 22 '22 05:09

Unapiedra


Tags: database, indexing, search-engine, inverted-index
