
How to create an IQueryable Web API that can pull data from several data sources?

I'm trying to figure out how to write an IQueryable data source that can pull and combine data from multiple sources (in this case Azure Tables, Azure Blobs, and ElasticSearch). I'm having a hard time figuring out where to start, though.

The idea is that a web service (in this case an ASP.NET Web API) can present a queryable OData interface, but when it gets queried it pulls data from multiple sources depending on what is requested. So large queries might hit the indexing service (ElasticSearch), which wouldn't necessarily have the full object available, but calls to get an individual object would go directly to the Azure Tables. From the service user's perspective, though, it's always just accessing the same data source.

While I would like to just use the index as our search service and the tables as our backup, I have a design requirement that it has to pull data from multiple sources, which greatly complicates this whole thing.

I'm wondering if anyone has any guidance on this or can point me towards the right technologies. Some of the big issues I'm seeing are:

  • The back-end objects aren't necessarily the same as the front-end object being queried. Multiple back-end objects may get combined into a single front-end one, or the front-end object may have computed values, so a LINQ query would have to be translated or mapped.
  • Changing data sources based on query parameters.

Here is a quick overview of the technology I'm working with:

  • ASP.NET Web API 2 web service running as an Azure Cloud Service
  • ElasticSearch running on SUSE VMs (on Azure)
  • Azure Tables
  • Azure Blobs
Riplikash asked Oct 31 '22 23:10


1 Answer

First, you need to separate the data access from the Web API project. The Web API project is merely an interface, so remove it from the equation. The solution to the problem should be the same regardless of whether the front end is Web API, an ASP.NET web page, an MVC solution, a WPF desktop application, etc.

You can then focus on the data problem. What you need is some form of "router" that picks the data source based on the parameters of the request. In this case, you are talking about 1 item = Azure and more than 1 item = the search index, with map-reduce when more than 1 item is involved. (I would set up the rules as a strategy or similar, so you can swap them out if you find that 1 versus 2+ is not a good condition for changing the routing.)
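As a rough sketch of what "rules as a strategy" could look like, assuming the only thing the router inspects is how many keys were requested: every type name below (DataRoute, QueryShape, IRoutingRule, SingleItemRule, Router) is made up for illustration and is not part of Web API, the Azure SDK, or any ElasticSearch client.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical: the routes the router can choose between.
public enum DataRoute { AzureTable, SearchIndex }

// Hypothetical: the parts of an incoming query the router needs for its decision.
public sealed class QueryShape
{
    public QueryShape(IReadOnlyCollection<string> requestedKeys)
    {
        RequestedKeys = requestedKeys;
    }

    public IReadOnlyCollection<string> RequestedKeys { get; }
}

// Strategy interface: a rule either picks a route or declines by returning null.
public interface IRoutingRule
{
    DataRoute? TryRoute(QueryShape query);
}

// The "1 item = Azure" condition, isolated so it can be swapped out later.
public sealed class SingleItemRule : IRoutingRule
{
    public DataRoute? TryRoute(QueryShape query)
    {
        return query.RequestedKeys.Count == 1 ? DataRoute.AzureTable : (DataRoute?)null;
    }
}

public sealed class Router
{
    private readonly IReadOnlyList<IRoutingRule> _rules;
    private readonly DataRoute _fallback;

    public Router(IEnumerable<IRoutingRule> rules, DataRoute fallback)
    {
        _rules = rules.ToList();
        _fallback = fallback;
    }

    // The first rule that returns a route wins; otherwise the fallback is used.
    public DataRoute Route(QueryShape query)
    {
        foreach (var rule in _rules)
        {
            var route = rule.TryRoute(query);
            if (route.HasValue) return route.Value;
        }
        return _fallback;
    }
}
```

With `new Router(new IRoutingRule[] { new SingleItemRule() }, DataRoute.SearchIndex)`, a one-key query goes to the table route and everything else falls through to the index. Changing the condition means adding or replacing a rule, not touching the router.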

Then you solve the data access problem for each methodology.
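To make that concrete, here is one possible shape, assuming each route gets its own class behind a shared interface and every class returns the same front-end object. All names are hypothetical, and the in-memory dictionaries and lists are stand-ins for the real Azure Tables, Azure Blobs, and ElasticSearch calls, which are deliberately not shown.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical front-end object that API consumers query.
public sealed class DocumentView
{
    public string Id { get; set; }
    public string Title { get; set; }
    public int? SizeInBytes { get; set; }   // computed from the blob; null when no blob is available
}

// Hypothetical back-end shapes, deliberately different from the front-end object.
public sealed class TableRow
{
    public string RowKey { get; set; }
    public string Title { get; set; }
    public string BlobName { get; set; }
}

public sealed class IndexHit
{
    public string Id { get; set; }
    public string Title { get; set; }
}

// One implementation per route; each maps its own back-end shape to DocumentView.
public interface IDocumentSource
{
    IReadOnlyList<DocumentView> Get(IEnumerable<string> ids);
}

// Combines a table row and its blob into a single front-end object.
public sealed class TableAndBlobSource : IDocumentSource
{
    private readonly IDictionary<string, TableRow> _rows;   // stand-in for Azure Tables
    private readonly IDictionary<string, byte[]> _blobs;    // stand-in for Azure Blobs

    public TableAndBlobSource(IDictionary<string, TableRow> rows, IDictionary<string, byte[]> blobs)
    {
        _rows = rows;
        _blobs = blobs;
    }

    public IReadOnlyList<DocumentView> Get(IEnumerable<string> ids)
    {
        var views = new List<DocumentView>();
        foreach (var id in ids)
        {
            TableRow row;
            if (!_rows.TryGetValue(id, out row)) continue;   // unknown ids are skipped

            byte[] blob = null;
            if (row.BlobName != null) _blobs.TryGetValue(row.BlobName, out blob);

            views.Add(new DocumentView { Id = row.RowKey, Title = row.Title, SizeInBytes = blob?.Length });
        }
        return views;
    }
}

// The index holds only part of the object, so SizeInBytes stays null here.
public sealed class SearchIndexSource : IDocumentSource
{
    private readonly IReadOnlyList<IndexHit> _hits;   // stand-in for an ElasticSearch query result

    public SearchIndexSource(IReadOnlyList<IndexHit> hits)
    {
        _hits = hits;
    }

    public IReadOnlyList<DocumentView> Get(IEnumerable<string> ids)
    {
        var wanted = new HashSet<string>(ids);
        return _hits.Where(h => wanted.Contains(h.Id))
                    .Select(h => new DocumentView { Id = h.Id, Title = h.Title })
                    .ToList();
    }
}
```

The mapping from back-end to front-end objects lives inside each source, so callers only ever see DocumentView and never need to know which store produced it.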

The system as a whole:

  1. User asks for data (user can be a real person or another system through the Web API)
  2. Query is partially parsed to determine routing path
  3. Router sends data request to proper class that handles data access for the route
  4. Data is returned
  5. Data is routed back to the user via whatever user interface is used (in this case Web API; see paragraph 1 for other options)
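The steps above could be wired together roughly as follows. This is only a sketch under stated assumptions: Item and ItemService are hypothetical names, routes are plain strings, and the handlers are delegates standing in for the real data access classes. The one real library call is LINQ's `AsQueryable()`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical front-end object; Source records which route produced it.
public sealed class Item
{
    public string Id { get; set; }
    public string Source { get; set; }
}

public sealed class ItemService
{
    // Step 2: inspects the request and names a route.
    private readonly Func<IReadOnlyList<string>, string> _chooseRoute;

    // Step 3: one data access handler per route name.
    private readonly IDictionary<string, Func<IReadOnlyList<string>, IEnumerable<Item>>> _handlers;

    public ItemService(
        Func<IReadOnlyList<string>, string> chooseRoute,
        IDictionary<string, Func<IReadOnlyList<string>, IEnumerable<Item>>> handlers)
    {
        _chooseRoute = chooseRoute;
        _handlers = handlers;
    }

    // Step 1: the caller (a Web API action, a desktop app, a test) asks for data.
    public IQueryable<Item> Query(IReadOnlyList<string> ids)
    {
        string route = _chooseRoute(ids);

        Func<IReadOnlyList<string>, IEnumerable<Item>> handler;
        if (!_handlers.TryGetValue(route, out handler))
            throw new InvalidOperationException("No handler registered for route '" + route + "'.");

        // Step 4: the chosen handler fetches and returns the data.
        List<Item> items = handler(ids).ToList();

        // Step 5: hand the result back in a queryable form.
        return items.AsQueryable();
    }
}
```

Note that `AsQueryable()` over a list is LINQ to Objects: any further filtering by the caller runs in memory on the rows already fetched and is not pushed down to the underlying store. Pushing filters down would need a custom query provider or explicit translation per source, which this sketch does not attempt.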

One caution: don't try to mix all types of persistence. A generic "I can pull data or a blob or a {name your favorite other persistent storage here}" layer often ends up becoming a garbage can.

Gregory A Beamer answered Nov 10 '22 10:11