
RESTful API and bulk operations

I have a middle tier which performs CRUD operations on a shared database. When I converted the product to .NET Core I thought I'd also look at using REST for the API as CRUD is supposed to be what it does well. It seems like REST is a great solution for single record operations, but what happens when I want to delete, say, 1,000 records?

Every professional multi-user application is going to have some concept of optimistic concurrency checking: you can't have one user wipe out the work of another user without some feedback. As I understand it, REST handles this with the HTTP ETag header. If the ETag sent by the client doesn't match the server's tag, then you issue a 412 Precondition Failed. So far, so good. But what do I use when I want to delete 1,000 records? The back-and-forth time for 1,000 individual calls is considerable, so how would REST handle a batch operation that involved optimistic concurrency?

Quarkly asked Aug 05 '17 19:08




1 Answer

REST's focus is on resources and on decoupling clients from servers; it is not a simple CRUD architecture or protocol. While CRUD and REST seem very similar, managing resources through REST principles can often have side effects as well. Describing REST as a simple CRUD thing is therefore an oversimplification.

With regard to batch processing of REST resources, the underlying protocol (most often HTTP) defines the capabilities that can be used. HTTP defines a couple of operations that can be used to modify multiple resources.

POST is the all-purpose Swiss Army knife of the protocol and can be used to manage resources pretty much to your liking. As the semantics are defined by the developer, you can use it to create, update or delete multiple resources at once.

PUT has the semantics of replacing the state of a resource obtainable at a given URI with the payload body of the request. If you send a PUT request to a "list"-resource and the payload defines a list of entries, you can achieve a batch operation as well.

The fundamental difference between the POST and PUT methods is highlighted by the different intent for the enclosed representation. The target resource in a POST request is intended to handle the enclosed representation according to the resource's own semantics, whereas the enclosed representation in a PUT request is defined as replacing the state of the target resource.

...

A PUT request applied to the target resource can have side effects on other resources. For example, an article might have a URI for identifying "the current version" (a resource) that is separate from the URIs identifying each particular version (different resources that at one point shared the same state as the current version resource). A successful PUT request on "the current version" URI might therefore create a new version resource in addition to changing the state of the target resource, and might also cause links to be added between the related resources. (Source: RFC 7231, Section 4.3.4)
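The idea of sending a PUT to a "list" resource can be sketched roughly as follows. This is only an in-memory illustration, not a real framework handler: the `put_collection` function, the `store` dictionary and the `id` field are all assumptions made for the example.

```python
def put_collection(store, entries):
    """Replace the whole collection with the entries from a PUT body.

    `store` maps record ids to records (hypothetical in-memory collection).
    Records missing from `entries` are removed, because PUT replaces the
    complete state of the target resource.
    """
    new_state = {entry["id"]: entry for entry in entries}
    removed = sorted(set(store) - set(new_state))
    store.clear()
    store.update(new_state)
    return removed


records = {
    1: {"id": 1, "name": "a"},
    2: {"id": 2, "name": "b"},
    3: {"id": 3, "name": "c"},
}
# Keeping only record 2 implicitly deletes records 1 and 3.
removed_ids = put_collection(records, [{"id": 2, "name": "b"}])
```

The client has to send every entry that should survive, which is why this approach gets unwieldy when only some items of a large collection are affected.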

PATCH (RFC 5789) is not part of the core HTTP/1.1 specification but is defined in its own RFC and supported by plenty of frameworks. It is primarily used to alter multiple resources at once or to perform partial updates on resources. PUT can also achieve the latter if the updated part is a sub-resource of some other resource; in that case it has the effect of a partial update on the outer resource.

It is important to know that a PATCH request contains the steps a server has to carry out to transform a resource into its intended state. A client therefore has to fetch the current state and calculate the steps needed for the transformation beforehand. A very informative blog post on this topic is "Don't patch like an idiot". JSON Patch (RFC 6902) is a JSON-based media type that illustrates the PATCH concept clearly. A patch request has to be applied either fully (every operation defined in the patch request) or not at all. It therefore requires transaction-scoped handling and a rollback in case any of the operations fails.
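The all-or-nothing requirement can be sketched like this. The operation format only loosely imitates JSON Patch (`op`, `path`, `value`) and is not a complete implementation of that media type; `apply_patch_atomically`, `PatchError` and the in-memory `store` are hypothetical names. A real server would more likely rely on a database transaction than on a copy.

```python
import copy


class PatchError(Exception):
    """Raised when any single operation of a patch cannot be applied."""


def apply_patch_atomically(store, operations):
    """Apply every operation or none of them.

    All changes are made on a working copy first; the real store is only
    touched once the whole patch has succeeded, which emulates a rollback.
    """
    working = copy.deepcopy(store)
    for op in operations:
        key = op["path"].lstrip("/")
        if key not in working:
            raise PatchError(f"no such resource: {op['path']}")
        if op["op"] == "remove":
            del working[key]
        elif op["op"] == "replace":
            working[key] = op["value"]
        else:
            raise PatchError(f"unsupported operation: {op['op']}")
    # Commit: reached only if no operation failed.
    store.clear()
    store.update(working)
```

If one operation in the list fails, the exception propagates and the store keeps its previous state, so the client can report the failure for the patch as a whole.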

Conditional request headers such as If-Match (which carries an ETag), If-Modified-Since and If-Unmodified-Since are defined in RFC 7232. They can be used in HTTP requests to perform a modification only if the request is applied to the most recent version of a resource, which corresponds to optimistic locking in (distributed) databases.
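For a single resource, the check could look roughly like the sketch below. It assumes the ETag is derived from a hash of the record and that the client sends exactly one tag or `*` in If-Match (the real header may carry a list of tags). `compute_etag`, `conditional_delete` and the in-memory `store` are made-up names, and answering 428 when the header is missing is a design choice of this sketch, not something HTTP requires.

```python
import hashlib
import json


def compute_etag(record):
    """Derive a quoted entity tag from the record's current content."""
    body = json.dumps(record, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_delete(store, record_id, if_match):
    """Delete a record only if the client's tag matches the current state.

    Returns the HTTP status code a handler would send back.
    """
    if record_id not in store:
        return 404  # Not Found
    if if_match is None:
        return 428  # Precondition Required: refuse unconditional deletes
    if if_match != "*" and if_match != compute_etag(store[record_id]):
        return 412  # Precondition Failed: someone changed the record
    del store[record_id]
    return 204  # No Content
```

A client that receives 412 would fetch the record again, show the user what changed and retry with the fresh tag.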

So far, so good. But what do I use when I want to delete 1,000 records?

This depends on which framework you use. If it supports PATCH, I clearly vote for PATCH. If it does not, you are probably safer using POST than PUT: PUT has very restrictive semantics, whereas with POST the semantics are defined by you. For a batch delete, PUT can also be used by targeting the collection resource with an empty body, which removes every item and therefore clears the whole collection. If some of the items should remain in the collection, though, PATCH or POST is probably easier to use.
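One possible shape for such a batch delete, combined with optimistic concurrency, is sketched below. It assumes the request body (of a PATCH or POST to the collection) lists each record's id together with the ETag the client last saw, and that the server applies the batch all-or-nothing. The `batch_delete` function, the body layout and the `store` structure are hypothetical. The sketch answers 409 Conflict because the tags travel in the body rather than in an If-Match header; that choice is an assumption too.

```python
def batch_delete(store, items):
    """Delete all listed records, or none if any tag is stale.

    `store` maps id -> (etag, record); `items` is a list of
    {"id": ..., "etag": ...} dictionaries taken from the request body.
    Returns (status_code, response_body).
    """
    # First pass: validate every item without changing anything.
    conflicts = [
        item["id"]
        for item in items
        if item["id"] not in store or store[item["id"]][0] != item["etag"]
    ]
    if conflicts:
        return 409, {"conflicts": conflicts}
    # Second pass: every tag matched, so the deletes can be applied.
    for item in items:
        store.pop(item["id"], None)  # pop tolerates duplicate ids in the body
    return 204, None
```

This replaces 1,000 round trips with one request, and the list of conflicting ids tells the client which records need to be re-read before retrying.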

Roman Vottner answered Oct 13 '22 13:10