Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is Data Reader better or Data Set for application where we may have Concurrency issue

I know the difference between Data Reader and Data Set.

The DataReader is a better choice for applications that require optimized read-only, fast and forward-only data access.

The Data set is better for Application wherein you can get all the data and update it according to your needs at application level and submit the change to the database.

Please clear if there is anything wrong in my understanding.

Now I had an interview there a person asked me. Is datareader or Connected architecture good for application like ticketing system. Basically she meant were many user might be trying to update the same table. Thus the concept of Concurrency comes.

We can Use Disconnected architecture to check for concurrency and let only one user update the table at a time. But dont know how it happens in terms of connected Architecture. Does the connection to the data base and particularly to the table concerned would that make only one user do the update while others who try to hit later wont be able to do that.

Wont it affect the performance if all the user have opened a connection as database will reach bottle neck.

I hope i will get the answer to understand them.

like image 903
user2432715 Avatar asked Oct 04 '22 21:10

user2432715


1 Answers

I think its not a matter of which one is better, since data is already old/invalid once it reaches the client. Showing a table of reservations can be useful to get a rough view of what reservations are made, but it might be totally different within the next second. You want to eliminate race conditions. A good architecture is necessary to start with.

One way to do this is to 'reserve' the ticket [1]. The application asks to get a ticket that is available given the matched criteria. At this point its a known fact on whether the ticket is available or not. If it was available, it was already reserved as well. This avoids multiple reservations for one ticket. The next reservation (same operation/action) will result into a different ticket being reserved. You can always add information to this ticket later (such as the owner of the ticket and his/her information) if required. Tickets that do not have information attached to it, will timeout after a certain amount of minutes and will return back to the pool. These tickets can be 'reserved' again [1].

[1] To avoid multiple assignments, use optimistic locking.

To answer the question, I would say DataReader. It keeps the database communication to a minimum (load and locks), so it can handle updates as fast as possible. Just keep in mind picking one over another doesn't solve concurrency problems. It's the total solution that matters.

Example

I don't know the requirements, but since it's an interview question I'll give an example. Don't take this as a golden rule, but off the tip of my head it would be something like this:

(if required) First the user is shown a screen that there are tickets left in the system that can be reserved. Open a connection, and a reader to read the amount of tickets available for reservation. Close the reader and connection. The user proceeds to the next screen.

SELECT COUNT(*)
FROM [Tickets]
WHERE ([LastReserved] IS NULL OR [LastReserved] <= DATEADD(MINUTE, GETDATE(), @ticketTimeout))
    AND [TickedAssignedToUserId] IS NULL;

The user requests an x-amount of tickets and proceeds to the next screen. At this moment the system checks with optimistic locking if there are enough tickets available. Simply open a connection (with transaction!) and execute the following query:

UPDATE TOP(@numberOfTicketsRequested) [Tickets]
SET [LastReserved]=GETDATE()
WHERE ([LastReserved] IS NULL OR [LastReserved] <= DATEADD(MINUTE, GETDATE(), @ticketTimeout))
    AND [TickedAssignedToUserId] IS NULL;

The number of rows affected should be the same as @numberOfTicketsRequested. If this is the case, commit the transaction and get it's ticket identifier. Otherwise rollback and tell the user that there are no tickets available anymore. At this point we need the record information, so you might want to get the identifier as well.

At this point, the user gets @ticketTimeout amount of minutes time to enter their user details. If done correctly, the following query can be executed:

UPDATE TOP(@numberOfTicketsRequested) [Tickets]
SET [TickedAssignedToUserId]=@userId
WHERE [TicketId]=@Id AND [LastReserved]=@lastReserved AND [TickedAssignedToUserId] IS NULL;

If the user took longer than, say 10 minutes, and somebody else requested the same ticket again, then the LastReserved timestamp has changed. When the first user tried to reserve the ticket with their details, the update does not match the original LastReserved timestamp anymore, and the update will show not enough rows affected (=rollback). If it matches the number of rows affected, the user successfully reserved the tickets (=commit).

Note that no ticket information except for ticket identifiers have reached the application. Nor have I included user registration. No full tables are being passed, and locks are just being used minimally (just for two short updates).

like image 75
Caramiriel Avatar answered Oct 07 '22 22:10

Caramiriel