
Architecture of a write-intensive feature

I use Ruby on Rails backed by an Oracle database and memcached for my current project.

There is a heavily used feature that relies on a single database view as its data source, and that view is in turn built from other database views and tables.

It is a regular (virtual) view rather than a materialized one; it exists so that everything can be accessed from one place.

Users who open this feature are usually there to update records, so having up-to-date data is important.

When obtaining data from this view, I inner join a security table to it (the security table is not part of the view itself). It contains fields that we use to control data access at a more granular level. For example, the security table has user_id, prop_1 and prop_2 columns, where prop_1 and prop_2 are columns also available on the view and user_id is the logged-in user. Some users share the same props in the security table, say prop_1 = 1 and prop_2 = 1, but users can also share one prop with another user and differ in the other, e.g. prop_1 = 2 and prop_2 = 1. There are many different combinations of prop_1 and prop_2; think of each as a foreign key to another table, so many entries are possible.

Retrieving the records in the app now takes almost 10 seconds, which is pretty slow, so I'm considering alternative approaches.

The first thing I thought of was a materialized view, but since users update frequently, it might not be the best choice, as refreshing the view might take time.

The second thing I thought of was caching: using the prop_1 and prop_2 combination as a composite key to the underlying data, since many users have the same combination, and whoever has the same combination can access the same data.

However, this approach might require more code rewrites and extra logic to save and retrieve data in fragments, rather than from one location with one query as with the database view.
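The caching idea above could be sketched roughly as follows. This is only an illustration under assumptions: a plain Hash stands in for memcached, and the class name `ComboCache`, the key format, and the loader block are hypothetical, not part of the actual application.

```ruby
# Minimal sketch: cache view rows per (prop_1, prop_2) combination.
# A Hash stands in for memcached; ComboCache and the loader are hypothetical.
class ComboCache
  def initialize(&loader)
    @store = {}       # stand-in for the memcached client
    @loader = loader  # called with (prop_1, prop_2) on a cache miss
  end

  # Composite key shared by every user who has the same combination.
  def key_for(prop_1, prop_2)
    "view_rows/#{prop_1}/#{prop_2}"
  end

  # Rows visible to a user: the rows of each of the user's combinations.
  def rows_for(combos)
    combos.flat_map do |prop_1, prop_2|
      @store[key_for(prop_1, prop_2)] ||= @loader.call(prop_1, prop_2)
    end
  end

  # Call after an update so the next read reloads fresh data.
  def invalidate(prop_1, prop_2)
    @store.delete(key_for(prop_1, prop_2))
  end
end

# Usage with fake data in place of a database query.
data = { [1, 1] => ["row a"], [2, 1] => ["row b"] }
cache = ComboCache.new { |p1, p2| data.fetch([p1, p2], []) }
puts cache.rows_for([[1, 1], [2, 1]]).inspect
```

Because updates are frequent here, the important part is calling `invalidate` for the affected combination on every write; otherwise readers would see stale rows.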

In your experience, how have you addressed the same or a similar issue? Or is there a better approach that I could try?

Gandalf StormCrow asked Jun 24 '16 11:06



2 Answers

It is hard to give a good answer without more information about your view, but I'll give it a try.

First of all I question the use of a single very complex view. That is hard to tune and can often cause performance problems, so if it is possible to split it up in the application that would be my first bet.

Second, have you looked at the execution plan (explain plan) for the query with the security filters included? Is it using sensible indexes? If not, create them. Perhaps the security properties are not indexed, for example?
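As a rough sketch of that check, the statements below could be run from a SQL client or through the application's database connection. The view name `feature_view`, the table name `user_security`, and the index name are hypothetical placeholders; `EXPLAIN PLAN FOR` and `DBMS_XPLAN.DISPLAY` are standard Oracle features.

```ruby
# Builds the statements needed to inspect the plan of the secured query.
# feature_view and user_security are hypothetical names for illustration.
def explain_plan_statements(view: "feature_view", security: "user_security")
  query = <<~SQL
    SELECT v.*
    FROM #{view} v
    INNER JOIN #{security} s
      ON s.prop_1 = v.prop_1 AND s.prop_2 = v.prop_2
    WHERE s.user_id = :user_id
  SQL
  [
    "EXPLAIN PLAN FOR\n#{query}",               # stores the plan in PLAN_TABLE
    "SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY)"   # prints the stored plan
  ]
end

# Candidate index if the plan shows a full scan of the security table.
def security_index_ddl(security: "user_security")
  "CREATE INDEX #{security}_user_props_ix " \
    "ON #{security} (user_id, prop_1, prop_2)"
end

explain_plan_statements.each { |sql| puts sql, "" }
puts security_index_ddl
```

Whether this index actually helps depends on the real plan; the columns joined inside the view may need indexes as well.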

A third option may be to use PL/SQL and call a stored procedure that acts like the view. That gives you more control in the database: you can tune the query and split it into multiple steps while still getting the same result as today.

Finally you may be able to rewrite the view for better performance. One often overlooked feature is the WITH clause, which makes it possible to run a query before the main query and use the result as a table. It has helped me improve performance for complex views dramatically.
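One way the WITH clause might be applied to the query from the question is to resolve the user's allowed combinations first and then join the view against that small result. This is a sketch only: `feature_view` and `user_security` are hypothetical names, and whether the optimizer produces a better plan for this form has to be checked against the real schema.

```ruby
# Returns a query that first collects the user's (prop_1, prop_2)
# combinations in a WITH clause, then joins the view to that result.
# feature_view and user_security are hypothetical names.
def secured_view_query(view: "feature_view", security: "user_security")
  <<~SQL
    WITH allowed AS (
      SELECT DISTINCT prop_1, prop_2
      FROM #{security}
      WHERE user_id = :user_id
    )
    SELECT v.*
    FROM #{view} v
    INNER JOIN allowed a
      ON a.prop_1 = v.prop_1 AND a.prop_2 = v.prop_2
  SQL
end

puts secured_view_query
```

The same pattern can be used inside the view definition itself to precompute an expensive intermediate result once and reuse it.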

DBMS_RLS is cool but can be expensive: it requires the Enterprise Edition, and it wouldn't surprise me if you need a separate license too. I would go for a programmatic solution first.

ewramner answered Oct 25 '22 10:10


If the latency is caused by your database, you could move some of your views to Redis (an in-memory data structure store), which is probably one of the most efficient options for read/write-intensive workloads.

As for the update problem, you could use a WebSocket to push specific updates directly to the users who need them.

Note that this requires some modifications on both the client and server sides, but I think it is the best approach for keeping the end user's view up to date with low latency.
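The push idea can be illustrated with a small in-memory publish/subscribe sketch, assuming updates are routed by the prop_1/prop_2 combination from the question. The class `UpdateBroadcaster` is hypothetical, and the handler blocks stand in for real WebSocket connections; in a Rails application this role could be filled by something like Action Cable.

```ruby
# In-memory stand-in for a WebSocket push layer: clients subscribe to a
# (prop_1, prop_2) combination and receive every change published to it.
# UpdateBroadcaster is a hypothetical name used for illustration only.
class UpdateBroadcaster
  def initialize
    @subscribers = Hash.new { |hash, key| hash[key] = [] }
  end

  # The handler block would write to the client's socket in a real app.
  def subscribe(prop_1, prop_2, &handler)
    @subscribers[[prop_1, prop_2]] << handler
  end

  # Pushes a change to every subscriber of the combination and
  # returns how many subscribers were notified.
  def publish(prop_1, prop_2, change)
    handlers = @subscribers.fetch([prop_1, prop_2], [])
    handlers.each { |handler| handler.call(change) }
    handlers.size
  end
end

broadcaster = UpdateBroadcaster.new
broadcaster.subscribe(1, 1) { |change| puts "client A got #{change.inspect}" }
broadcaster.publish(1, 1, { id: 42, status: "updated" })
```

Only users whose combination matches the changed record are notified, so clients can patch their local view instead of rerunning the slow query.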

Best regards

A STEFANI answered Oct 25 '22 10:10