
How do you query object collections in Java (Criteria/SQL-like)?

Suppose you have a collection of a few hundred in-memory objects and you need to query this List to return the objects matching some SQL-like or Criteria-like query. For example, you might have a List of Car objects and want to return all cars made during the 1960s with a license plate that starts with AZ, ordered by the name of the car model.

I know about JoSQL. Has anyone used it, or does anyone have experience with other or homegrown solutions?

stian asked Sep 18 '08 15:09




2 Answers

Filtering is one way to do this, as discussed in other answers.

Filtering does not scale, though. On the surface its time complexity appears to be O(n), which already means performance degrades as the number of objects in the collection grows. But because one or more tests must be applied to each object, depending on the query, the time complexity is more accurately O(n t), where t is the number of tests applied to each object.

So performance will degrade as additional objects are added to the collection, and/or as the number of tests in the query increases.
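For reference, the filtering approach applied to the query in the question could look like the minimal sketch below. The Car class and its field names (model, year, licensePlate) are assumptions made for illustration, not part of any library. Each car is visited once and up to two tests are applied to it, which is the O(n t) behaviour described above; the final sort adds its own cost on top.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class CarFilterExample {

    // Hypothetical Car type; the field names are assumptions for illustration.
    public static final class Car {
        public final String model;
        public final int year;
        public final String licensePlate;

        public Car(String model, int year, String licensePlate) {
            this.model = model;
            this.year = year;
            this.licensePlate = licensePlate;
        }
    }

    // Visits every car and applies up to two tests to each one,
    // then orders the matches by model name.
    public static List<Car> sixtiesCarsWithAzPlates(List<Car> cars) {
        return cars.stream()
                .filter(c -> c.year >= 1960 && c.year <= 1969)
                .filter(c -> c.licensePlate.startsWith("AZ"))
                .sorted(Comparator.comparing((Car c) -> c.model))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Car> cars = List.of(
                new Car("Mustang", 1965, "AZ-123"),
                new Car("Beetle", 1962, "AZ-456"),
                new Car("Civic", 1985, "AZ-789"),
                new Car("Mini", 1961, "TX-001"));
        for (Car c : sixtiesCarsWithAzPlates(cars)) {
            System.out.println(c.model);
        }
    }
}
```

For a few hundred objects this is usually fast enough; the cost only starts to matter as the collection or the number of tests grows.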

There is another way to do this, using indexing and set theory.

One approach is to build indexes on the fields of the objects stored in your collection that your queries will subsequently test.

Say you have a collection of Car objects and every Car object has a field color. Say your query is the equivalent of "SELECT * FROM cars WHERE Car.color = 'blue'". You could build an index on Car.color, which would basically look like this:

    'blue' -> {Car{name=blue_car_1, color='blue'}, Car{name=blue_car_2, color='blue'}}
    'red'  -> {Car{name=red_car_1, color='red'}, Car{name=red_car_2, color='red'}}

Then, given a query WHERE Car.color = 'blue', the set of blue cars can be retrieved in O(1) time. If the query contains additional tests, each car in that candidate set is then checked against the remaining tests. Since the candidate set is likely to be significantly smaller than the entire collection, the cost is less than O(n) in practical terms, although the worst case, where every object matches the indexed test, is still O(n). Performance does not degrade as much when additional objects are added to the collection. But this is still not perfect; read on.
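A color index of this kind can be sketched with a plain HashMap. This is only an illustration of the idea, not CQEngine's implementation or API; the class name ColorIndexExample, the method names, and the extra year field on Car are all assumptions.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class ColorIndexExample {

    // Hypothetical Car type; the year field is an assumption added for the second test.
    public static final class Car {
        public final String name;
        public final String color;
        public final int year;

        public Car(String name, String color, int year) {
            this.name = name;
            this.color = color;
            this.year = year;
        }
    }

    // The index: maps each color to the set of cars having that color.
    private final Map<String, Set<Car>> byColor = new HashMap<>();

    // Adding a car also updates the index.
    public void add(Car car) {
        byColor.computeIfAbsent(car.color, k -> new LinkedHashSet<>()).add(car);
    }

    // A single hash lookup returns the candidate set, without iterating the collection.
    public Set<Car> withColor(String color) {
        return byColor.getOrDefault(color, Collections.emptySet());
    }

    // Remaining tests are applied only to the (usually much smaller) candidate set.
    public List<Car> withColorMadeBefore(String color, int year) {
        return withColor(color).stream()
                .filter(c -> c.year < year)
                .collect(Collectors.toList());
    }
}
```

The trade-off is extra memory and a small cost on every add in exchange for cheaper reads, so it pays off when the collection is queried far more often than it is modified.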

Another approach is what I would call a standing query index. To explain: with conventional iteration and filtering, the collection is iterated and every object is tested to see whether it matches the query, so filtering is like running a query over a collection. A standing query index works the other way around: the collection is run over the query, but only once for each object in the collection, even though the collection may be queried any number of times.

A standing query index would be like registering a query with some sort of intelligent collection, such that as objects are added to and removed from the collection, the collection would automatically test each object against all of the standing queries which have been registered with it. If an object matches a standing query then the collection could add/remove it to/from a set dedicated to storing objects matching that query. Subsequently, objects matching any of the registered queries could be retrieved in O(1) time complexity.
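The standing query idea can be illustrated with a small wrapper class. Again, this is a hedged sketch of the concept rather than how CQEngine implements it; the class StandingQueryCollection, its method names, and the use of string names to identify queries are assumptions made for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical collection that keeps one result set per registered query.
public class StandingQueryCollection<T> {

    private final Set<T> objects = new LinkedHashSet<>();
    private final Map<String, Predicate<T>> queries = new HashMap<>();
    private final Map<String, Set<T>> results = new HashMap<>();

    // Registers a query and evaluates it once against the objects already stored.
    public void registerQuery(String name, Predicate<T> query) {
        queries.put(name, query);
        Set<T> matches = new LinkedHashSet<>();
        for (T object : objects) {
            if (query.test(object)) {
                matches.add(object);
            }
        }
        results.put(name, matches);
    }

    // Each new object is tested once against every standing query.
    public void add(T object) {
        if (!objects.add(object)) {
            return; // already present, result sets are up to date
        }
        for (Map.Entry<String, Predicate<T>> entry : queries.entrySet()) {
            if (entry.getValue().test(object)) {
                results.get(entry.getKey()).add(object);
            }
        }
    }

    // Removing an object also removes it from every result set.
    public void remove(T object) {
        if (!objects.remove(object)) {
            return;
        }
        for (Set<T> matches : results.values()) {
            matches.remove(object);
        }
    }

    // Retrieval is a single map lookup; no object is tested at query time.
    public Set<T> retrieve(String name) {
        return Collections.unmodifiableSet(
                results.getOrDefault(name, Collections.emptySet()));
    }
}
```

Note that the work has moved rather than disappeared: every add now costs one test per registered query, which is worthwhile only for queries that are run repeatedly.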

The information above is taken from CQEngine (Collection Query Engine). It is essentially a NoSQL query engine for retrieving objects from Java collections using SQL-like queries, without the overhead of iterating through the collection. It is built around the ideas above, plus some more. Disclaimer: I am the author. It's open source and available in Maven Central. If you find it helpful, please upvote this answer!

npgall answered Sep 22 '22 16:09


I have used Apache Commons JXPath in a production application. It allows you to apply XPath expressions to graphs of objects in Java.

Eric Weilnau answered Sep 24 '22 16:09