Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Forming triangles from points and relations

Tags:

c#

sql

sql-server

I want to generate triangles from points and optional relations between them. Not all points form triangles, but many of them do.

In the initial structure, I've got a database with the following tables:

Nodes(id, value)
Relations(id, nodeA, nodeB, value)
Triangles(id, relation1_id, relation2_id, relation3_id)

In order to generate triangles from both nodes and relations table, I've used the following query:

INSERT INTO Triangles
SELECT t1.id, t2.id , t3.id, 
FROM Relations t1, Relations t2, Relations t3
WHERE t1.id < t2.id AND t3.id > t1.id AND
(
t1.nodeA = t2.nodeA 
AND (t3.nodeA = t1.nodeB AND t3.nodeB = t2.nodeB
OR t3.nodeA = t2.nodeB AND t3.nodeB = t1.nodeB)

OR 

t1.nodeA = t2.nodeB
AND (t3.nodeA = t1.nodeB AND t3.nodeB = t2.nodeA
OR t3.nodeA = t2.nodeA AND t3.nodeB = t1.nodeB)
)

It's working perfectly on small sized data. (~< 50 points) In some cases however, I've got around 100 points all related to each other which leads to thousands of relations. So when the expected number of triangles is in the hundreds of thousands, or even in the millions, the query might take several hours.

My main problem is not in the select query, while I see it execute in Management Studio, the returned results slow. I received around 2000 rows per minute, which is not acceptable for my case.

As a matter of fact, the size of operations is being added up exponentionally and that is terribly affecting the performance.

I've tried doing it as a LINQ to object from my code, but the performance was even worse.

I've also tried using SqlBulkCopy on a reader from C# on the result, also with no luck.

So the question is... Any ideas or workarounds?

like image 832
SiN Avatar asked May 18 '10 15:05

SiN


People also ask

Can any 3 points form a triangle?

Approach: The key observation in the problem is three points form a triangle only when they don't lie on the straight line, that is an area formed by the triangle of these three points is not equal to zero. The above formula is derived from shoelace formula.

What is the rule to form triangle?

All you have to do is use the Triangle Inequality Theorem, which states that the sum of two side lengths of a triangle is always greater than the third side. If this is true for all three combinations of added side lengths, then you will have a triangle.


2 Answers

I think you just need this:

INSERT
INTO    triangles (relation1_id, relation2_id, relation3_id)
SELECT  r12.id, r23.id, r13.id
FROM    relations r12
JOIN    relations r23
ON      r23.nodeA = r12.nodeB
JOIN    relations r13
ON      r13.nodeA = r12.nodeA
        AND r13.nodeB = r23.nodeB
ON      r12.nodeA = n1.id
WHERE   NOT EXISTS
        (
        SELECT  relation1_id, relation2_id, relation3_id
        FROM    triangles
        INTERSECT
        SELECT  r12.id, r23.id, r13.id
        )

Make the following check constraint:

ALTER TABLE relations ADD CONSTRAINT CHECK (nodeA < nodeB)

This point is that since the relations are symmetric, you only need to store the relation once per pair and the triangle once per permutation.

The check constraint ensures that the relations are stored once per pair in the fixed order.

The query is designed so that the triangles are also stored in fixed order: 1 - 3 first, 2 - 3 second, 1 - 3 third, where 1, 2 and 3 are determined by the order of nodes the relations link.

Also note that the number of triangles grows as O(n^3), not exponentially (O(exp(N))).

For 100 nodes, there can be at most 100 * 99 * 98 / 6 = 161700 triangles.

like image 81
Quassnoi Avatar answered Oct 12 '22 02:10

Quassnoi


The query you are using has some problems:

  1. "t1.id < t2.id AND t3.id > t1.id" doesn't prevent t2 = t3. Change this to "WHERE t1.id < t2.id AND t3.id > t2.id"
  2. t1.nodeA might not exist in t2 at all.

Quassnoi's solution looks promising. It has problems as well. First of all, it is selecting node IDs, not relation IDs, into triangles. Second, it isn't checking that a relation between n1 and n3 exists. I'd rewrite the statement as:

INSERT
INTO    triangles (relation1_id, relation2_id, relation3_id)
SELECT  r12.id, r23.id, r13.id
FROM    relations r12
JOIN    relations r23
ON      r12.nodeB = r23.nodeA
JOIN    relations r13
ON      r12.nodeA = r13.nodeA AND r23.nodeB = r13.nodeB
WHERE   NOT EXISTS
        (
        SELECT  relation1_id, relation2_id, relation3_id
        FROM    triangles
        WHERE   relation1_id = r12.id AND relation2_id = r23.id AND relation3_id = r13.id
        )

This solution doesn't depend on any constraint on ordering of IDs in the relations table, and doesn't guarantee any order of IDs in the triangles table.

like image 30
Mark Nelson Avatar answered Oct 12 '22 03:10

Mark Nelson