I have a table which stores about 10 million records (items); each record has an itemtype (a reference key to the types table). My site has a search feature based on full-text search. It worked fine, but a few days ago my customer asked to show on the site not only the matching items but their itemtypes as well.
I tried making two parallel requests to different servers (one to the replica server and one to the main one):
-- first request: group items by itemtypeid (using full-text search) and return the list of itemtypeids
-- second request: search the keywords in the database (full-text search)
On the web server I aggregate the results of these requests and push them to the browser. Problem: the first request does not run as fast as I would like :) (it's very slow), and in six to eight months there will be more than 11 million items, so the first request will get slower and slower.
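Roughly, the aggregation step looks like this (a Python sketch; the two query functions are placeholders that simulate the real database calls, and all names are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def query_replica(keywords):
    # real code: full-text search grouped by ItemTypeID on the replica server
    return [{"ItemTypeID": 10, "Cnt": 2}, {"ItemTypeID": 20, "Cnt": 1}]

def query_main(keywords):
    # real code: full-text search for the items themselves on the main server
    return [{"ItemID": 1, "ItemTypeID": 10}, {"ItemID": 3, "ItemTypeID": 20}]

# run both requests in parallel, then join the results in memory
with ThreadPoolExecutor(max_workers=2) as pool:
    types_future = pool.submit(query_replica, "hammer")
    items_future = pool.submit(query_main, "hammer")
    types, items = types_future.result(), items_future.result()

# aggregate: attach each item to its itemtype before sending to the browser
by_type = {t["ItemTypeID"]: {**t, "Items": []} for t in types}
for it in items:
    by_type[it["ItemTypeID"]]["Items"].append(it)
print(by_type)
```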
Please show me the right way to do this.
In addition to what @sonyc said, you should also take into account:
The order in which tables are joined. How does a database management system carry out a join between the Item and ItemType tables? It is useful to have some idea of what may happen, so you can make informed decisions about adding indexes.
One approach to joining tables is called nested loops. This means that you scan down the rows in one table, and for each row, you look through all the rows in the other table to find matches for the join condition.
Obviously, which table is in the outside loop will make a difference. If we start scanning the ItemType table, we need to be able to quickly find the row with the matching ItemID in the Item table. If we start by choosing rows in the Item table, we need to quickly find the matching ItemID in the ItemType table. Because there will always be an index on the primary key ItemID in the Item table, the first option will always be quite efficient.
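A nested-loop join can be sketched in a few lines. Here the tables are plain Python lists, and the join is on an assumed ItemTypeID foreign key (the data and column names are illustrative, not from any real schema):

```python
items = [
    {"ItemID": 1, "Name": "Hammer", "ItemTypeID": 10},
    {"ItemID": 2, "Name": "Saw",    "ItemTypeID": 10},
    {"ItemID": 3, "Name": "Nail",   "ItemTypeID": 20},
]
item_types = [
    {"ItemTypeID": 10, "TypeName": "Tools"},
    {"ItemTypeID": 20, "TypeName": "Fasteners"},
]

def nested_loop_join(outer, inner, key):
    """For each row of the outer table, scan the whole inner table
    for rows that satisfy the join condition."""
    result = []
    for o in outer:
        for i in inner:  # O(len(outer) * len(inner)) comparisons without an index
            if o[key] == i[key]:
                result.append({**o, **i})
    return result

joined = nested_loop_join(items, item_types, "ItemTypeID")
print(len(joined))  # prints 3 (each item found its type)
```

An index on the inner table's join field replaces the inner scan with a direct lookup, which is exactly why the choice of outer table matters.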
Which fields should be indexed.
Another approach to doing a join is to first sort both tables by the join field. It is then very easy to find matching rows. This is called a merge join. Sorting each table is an expensive operation. However, if the tables are already sorted (they both have a clustered index on the join field ItemID), then this merging operation is very efficient.
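A sort-merge join over the same kind of toy data can be sketched like this (names are illustrative; the explicit sort step stands in for the clustered index that would make it free):

```python
items = [(1, 10, "Hammer"), (2, 10, "Saw"), (3, 20, "Nail")]  # (ItemID, ItemTypeID, Name)
item_types = [(10, "Tools"), (20, "Fasteners")]               # (ItemTypeID, TypeName)

def merge_join(left, right, lkey, rkey):
    left = sorted(left, key=lkey)    # expensive, unless the table is already
    right = sorted(right, key=rkey)  # sorted (e.g. clustered index on the key)
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = lkey(left[i]), rkey(right[j])
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # emit every left row sharing this key; assumes the right-hand
            # keys are unique, as in a foreign-key-to-primary-key join
            k = i
            while k < len(left) and lkey(left[k]) == rk:
                out.append(left[k] + right[j][1:])
                k += 1
            j += 1
    return out

rows = merge_join(items, item_types, lambda r: r[1], lambda r: r[0])
print(rows)
```

Once both inputs are sorted, the merge itself is a single pass over each table, with no per-row rescanning.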
We looked at a couple of ways the database system could carry out a join: with nested loops, or with a sort and merge. Which one will occur? Fortunately, we don't have to worry about this, as good relational database products have a query optimizer to figure out the most efficient way.
So further query optimisation is the job of the query optimizer. It will take into account a number of things, such as which indexes are present, the number of rows in the tables, the length of the rows, and which fields are required in the output. The optimizer looks at all the possible steps for completing the task, assigns a time cost to each, and then comes up with the most efficient plan.
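You can ask the optimizer what it decided. A small illustration using SQLite (which ships with Python) purely as a stand-in engine; real servers have their own syntax, e.g. EXPLAIN in MySQL/PostgreSQL or the execution plan viewer in SQL Server, and the schema here is made up:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE ItemType (ItemTypeID INTEGER PRIMARY KEY, TypeName TEXT);
    CREATE TABLE Item (ItemID INTEGER PRIMARY KEY,
                       Name TEXT,
                       ItemTypeID INTEGER REFERENCES ItemType);
""")

# EXPLAIN QUERY PLAN shows which table the optimizer chose to scan
# and which one it probes via an index, without running the query
plan_rows = con.execute("""
    EXPLAIN QUERY PLAN
    SELECT t.TypeName, COUNT(*)
    FROM Item i JOIN ItemType t ON i.ItemTypeID = t.ItemTypeID
    GROUP BY t.TypeName
""").fetchall()
for row in plan_rows:
    print(row[3])  # the human-readable "detail" column of the plan
```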
You can also use query plan analysis tools to investigate the effect of adding indexes to your tables. An index can speed up your query significantly, especially as your tables get bigger. Indexes are usually added automatically for primary key fields. Indexes on fields that you order by or use in a select condition can also be useful, and it is always worth checking the usefulness of an index on foreign key fields, as these are often used in join conditions.

However, indexes come at a cost: they need to be updated every time a row in the table is added, deleted, or altered. This can slow some update operations while speeding up some retrieval operations. You need to decide how important the various efficiencies are for your particular situation.
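A quick way to check an index's usefulness is to compare the plan before and after creating it. A sketch, again with SQLite standing in for the real engine (table, column, and index names are made up; the exact plan wording varies by engine and version):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Item (ItemID INTEGER PRIMARY KEY, Name TEXT, ItemTypeID INTEGER)")

def plan(sql):
    """Return the human-readable plan lines for a statement."""
    return [r[3] for r in con.execute("EXPLAIN QUERY PLAN " + sql)]

q = "SELECT * FROM Item WHERE ItemTypeID = 10"

before = plan(q)   # typically a full SCAN of Item
con.execute("CREATE INDEX idx_item_itemtypeid ON Item(ItemTypeID)")
after = plan(q)    # typically a SEARCH using idx_item_itemtypeid

print(before)
print(after)
```

On a 10-million-row table, that difference between scanning every row and a direct index lookup is exactly the slowdown described in the question.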