Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Querying on Java Object's without Database

Note: Little long question. I'm going to give a bounty for best answer.

What I'm trying to do is querying on Object. Here is the details. I have a file called employee.txt. So I parsed it and kept in the list

public static List<Employee> employeeList = new LinkedList<>();

Then here is my logic to query.

Take the query from user, and then parse it. The below is the logic to query through the list.

For ex: here is the query

select * from Employee where id > 10

My codes for that

String valueToCompare = split[5];  //10
EmployeeCriteria criteria = new EmployeeCriteria(
        isId, isName, isSalary, expression,
        valueToCompare);
result = EmployeeData.findAll(
        EmployeeData.employeeList, criteria);

Here is the findAll method

public static List<Employee> findAll(List<Employee> coll,
            ISearch<Employee> chk) {
        List<Employee> l = new LinkedList<Employee>();
        for (Employee obj : coll) {
            if (chk.search(new Employee(obj)))
                l.add(obj);
        }
        return l;
    }

And here is my search method

/**
     * Based on the type provided and for given expression it check against the
     * given value
     */

    @Override
    public boolean search(Employee obj) {
        if (expression.equals(EQUAL)) {
            if (isId()) {
                if (obj.getId() == Long.parseLong(valueToCompare)) {
                    return true;
                }
            } else if (isName()) {
                if (obj.getName().equals(valueToCompare)) {
                    return true;
                }
            } else if (isSalary()) {
                if (obj.getSalary() == Long.parseLong(valueToCompare)) {
                    return true;
                }
            } else {
                System.err.println(UserMessage.INVALIDCOLUMN_NAME);
            }

        } else if (expression.equals(NOT_EQUAL)) {
            if (isId()) {
                if (!(obj.getId() == Long.parseLong(valueToCompare))) {
                    return true;
                }
            } else if (isName()) {
                if (!(obj.getName().equals(valueToCompare))) {
                    return true;
                }
            } else if (isSalary()) {
                if (!(obj.getSalary() == Long.parseLong(valueToCompare))) {
                    return true;
                }
            } else {
                System.err.println(UserMessage.INVALIDCOLUMN_NAME);
            }

        } else if (expression.equals(GREATER)) {
            if (isId()) {
                if ((obj.getId() > Long.parseLong(valueToCompare))) {
                    return true;
                }
            } else if (isSalary()) {
                if ((obj.getSalary() > Long.parseLong(valueToCompare))) {
                    return true;
                }
            } else {
                System.err.println(UserMessage.INVALIDCOLUMN_NAME);
            }

        } else if (expression.equals(LESSER)) {
            if (isId()) {
                if ((obj.getId() < Long.parseLong(valueToCompare))) {
                    return true;
                }
            } else if (isSalary()) {
                if ((obj.getSalary() < Long.parseLong(valueToCompare))) {
                    return true;
                }
            } else {
                System.err.println(UserMessage.INVALID_IDENTIFIER);
            }

        }

        return false;
    }

Let me know if you want to see any other codes.

I just want to know,

In the first place LinkedList is correct data structure to use ? Is this performs well ?? Any enhancements to perform well ?

Any better way to achieve this ?

here are few example queries:

select * where ID > 100
select * where Name != Ramesh
select * where Salary < 500000
select Name order by Name
select ID

Thanks for any help. Bounty will be offered after 2 days. I can't do that now.

Note2: This is a test to check my data manage skills and I cannot use any database.

like image 542
Suresh Atta Avatar asked Sep 24 '15 02:09

Suresh Atta


3 Answers

No, this does not perform well at all. You are searching through N employees every time. So if you have 1 million employees, you will search all 1 million employees before returning the correct employee. Even worst, if it doesn't exist, you will have to search exhaustibly before you can know if it exists.

Is this for use in production? If so then just use SQLite or some other simple database. You want to write once and read multiple times with indexes. I cannot stress enough that what you are writing will have bugs and instead you should use something that was already tested.

Assuming this is not for production and you are just having fun, then you want to emulate what databases do in real life. They create indexes. Indexes are usually best described as Map<String, List<Employee>>.

The idea is that initially reading the data from disk is expensive. But you read it once. For each dimension, Name, Salary, ID, etc... you want create separate indexes.

So let's say you were creating an index of all employees by ID. You would want to do something like:

Map<String, Employee> employeesById = new HashMap<>();

for(Employee e : employees) { 
  employeesById.put(e.getId(), e);
}

The above assumes that employee ids are unique. If they are not then you would need create a List<Employee>. For example for index by name:

Map<String,List<Employee>> employeesByName = new HashMap<>();

for(Employee e : employees) { 
  employeesByName.get(e.getName()).add(e); // Make sure to create the array if it doesn't exist
}

So now, for reading, let's say you get SELECT * FROM employees where id = 123; you can simply just return employeesById.get("123").

This solution is O(1). As the file gets bigger, you will not have any performance loss. This is probably the fastest solution.

like image 198
Amir Raminfar Avatar answered Nov 09 '22 09:11

Amir Raminfar


To add onto Amir's answer, you can also use a TreeMap<> instead of a HashMap<>. Databases normally don't create indexes as hash maps but as balanced binary trees, which is what Java's TreeMap<> class is.

http://docs.oracle.com/javase/8/docs/api/java/util/TreeMap.html

HashMap<> has O(1) lookup on keys, but is exact match on key only. TreeMap<> has O(log n) lookup on keys, but allows ranged lookups higherKey, lowerKey) and map partitioning on key range (subMap).

The problem of duplicates can either be solved by not allowing them (unique key) or storing values in a list and linear search on the subset (non-unique key).

like image 2
sh0rug0ru Avatar answered Nov 09 '22 07:11

sh0rug0ru


Assuming you are not interested in any of the in-memory database solution and using Java 8, You should convert all your conditions to predicate and apply them on the stream . This will take advantage of Java 8's parallelism feature http://blog.takipi.com/new-parallelism-apis-in-java-8-behind-the-glitz-and-glamour/

So in short your code can be much cleaner and faster

 public static Predicate<Employee> isSalaryGreaterThan(Integer salary) {
        return p -> p.getSalary() > salary;
    }

Predicate predicate1 =   isSalaryGreaterThan(50000)
    employeeList.stream().filter(predicate1).filter(predicate2)...collect(Collectors.<Employee>toList())
like image 1
Chandra Avatar answered Nov 09 '22 09:11

Chandra