Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Combining SQL Rows

I've got SQL Compact Database that contains a table of IP Packet Headers. The Table looks like this:

Table: PacketHeaders    

ID  SrcAddress  SrcPort  DestAddress  DestPort  Bytes
1   10.0.25.1   255      10.0.25.50   500       64
2   10.0.25.50  500      10.0.25.1    255       80
3   10.0.25.50  500      10.0.25.1    255       16
4   75.48.0.25  387      74.26.9.40   198       72
5   74.26.9.40  198      75.48.0.25   387       64
6   10.0.25.1   255      10.0.25.50   500       48

I need to perform a query to show 'conversations' going on across a local network. Packets going from A -> B is part of the same conversations as packets going from B -> A. I need to perform a query to show the on going conversations. Basically what I need is something that looks like this:

Returned Query:

SrcAddress  SrcPort  DestAddress  DestPort  TotalBytes  BytesA->B  BytesB->A
10.0.25.1   255      10.0.25.50   500       208         112        96
75.48.0.25  387      74.26.9.40   198       136         72         64

As you can see I need the query (or series of queries) to recognize that A->B is the same as B->A and break up the byte counts accordingly. I'm not a SQL guru by any means but any help on this would be greatly appreciated.

like image 585
lumberjack4 Avatar asked Feb 02 '10 17:02

lumberjack4


2 Answers

Try this:

SELECT
    T1.SrcAddress,
    T1.SrcPort,
    T1.DestAddress,
    T1.DestPort,
    T1.Bytes + COALESCE(T2.Bytes, 0) AS TotalBytes,
    T1.Bytes AS A_to_B,
    COALESCE(T2.Bytes, 0) AS B_to_A
FROM (
    SELECT SrcAddress, SrcPort, DestAddress, DestPort, SUM(Bytes) AS Bytes
    FROM PacketHeaders
    GROUP BY SrcAddress, SrcPort, DestAddress, DestPort) AS T1
LEFT JOIN (
    SELECT SrcAddress, SrcPort, DestAddress, DestPort, SUM(Bytes) AS Bytes
    FROM PacketHeaders
    GROUP BY SrcAddress, SrcPort, DestAddress, DestPort) AS T2
ON T1.SrcAddress = T2.DestAddress
AND T1.SrcPort = T2.DestPort
AND T1.DestAddress = T2.SrcAddress
AND T1.DestPort = T2.SrcPort
WHERE T1.SrcAddress < T1.DestAddress OR
    (T1.SrcAddress = T1.DestAddress AND T1.SrcPort = T1.DestPort) OR
    T2.DestAddress IS NULL

On this test data:

CREATE TABLE PacketHeaders (ID INT, SrcAddress NVARCHAR(100), SrcPort INT, DestAddress NVARCHAR(100), DestPort INT, Bytes INT);
INSERT INTO PacketHeaders (ID, SrcAddress, SrcPort, DestAddress, DestPort, Bytes) VALUES
(1, '10.0.25.1', 255, '10.0.25.50', 500, 64),
(2, '10.0.25.50', 500, '10.0.25.1', 255, 80),
(3, '10.0.25.50', 500, '10.0.25.1', 255, 16),
(4, '75.48.0.25', 387, '74.26.9.40', 198, 72),
(5, '74.26.9.40', 198, '75.48.0.25', 387, 64),
(6, '10.0.25.1', 255, '10.0.25.50', 500, 48),
(7, '10.0.25.2', 255, '10.0.25.50', 500, 48),
(8, '10.0.25.52', 255, '10.0.25.50', 500, 48);

This gives the following results:

'10.0.25.1', 255, '10.0.25.50', 500, 208, 112, 96
'10.0.25.2', 255, '10.0.25.50', 500, 48, 48, 0
'10.0.25.52', 255, '10.0.25.50', 500, 48, 48, 0
'74.26.9.40', 198, '75.48.0.25', 387, 136, 64, 72

The way it works is to first group one-way conversations and total the byte counts. This assures that every conversation will be represented exactly twice - once for each direction. This result is then self-joined to give the result you need, filtering the duplicates by enforcing that the (address, port) of A must be less than B. A left join is used to allow one-way conversations.

like image 130
Mark Byers Avatar answered Oct 17 '22 10:10

Mark Byers


I can see two basic ways of doing this...
1. Group it all up, ignoring the a->b and b->a and then self join the results.
2. Rearrange your data with the "lowest" ip address in the 'src' field, but also create a 'direction' field.

Option 2 is probably the way I'd go...

SELECT
    SrcAddress,
    SrcPort,
    DestAddress,
    DestPort,
    SUM(AtoB) + SUM(BtoA),
    SUM(AtoB),
    SUM(BtoA)
FROM
(
    SELECT
       CASE WHEN SrcAddress < DestAddress THEN SrcAddress  ELSE DestAddress END AS SrcAddress,
       CASE WHEN SrcAddress < DestAddress THEN SrcPort     ELSE DestPort    END AS SrcPort,
       CASE WHEN SrcAddress < DestAddress THEN DestAddress ELSE SrcAddress  END AS DestAddress,
       CASE WHEN SrcAddress < DestAddress THEN DestPort    ELSE ScrPort     END AS DestPort,
       CASE WHEN SrcAddress < DestAddress THEN Bytes       ELSE 0           END AS AtoB,
       CASE WHEN SrcAddress < DestAddress THEN 0           ELSE Bytes       END AS BtoA
    FROM
      PacketHeaders
)
    AS [data]
GROUP BY
    SrcAddress,
    SrcPort,
    DestAddress,
    DestPort

EDIT

A couple of other answers have version of what I called option 1. I'll have a go at it too rather than spamming comments on people's answers :(

SELECT
   ISNULL([AtoB].SrcAddress,  [BtoA].DestAddress)
   ISNULL([AtoB].SrcPort,     [BtoA].DestPort)
   ISNULL([AtoB].DestAddress, [BtoA].SrcAddress)
   ISNULL([AtoB].DestPort,    [BtoA].SrcPort)
   ISNULL([AtoB].Bytes,0) + ISNULL([BtoA].Bytes,0),
   ISNULL([AtoB].Bytes,0),
   ISNULL([BtoA].Bytes,0)
FROM
   (
      SELECT   SrcAddress, SrcPort, DestAddress, DestPort, SUM(Bytes) AS Bytes
      FROM     PacketHeaders
      WHERE    SrcAddress <= DestAddress
      GROUP BY SrcAddress, SrcPort, DestAddress, DestPort
   )
   AS [AtoB]
FULL OUTER JOIN
   (
      SELECT   SrcAddress, SrcPort, DestAddress, DestPort, SUM(Bytes) AS Bytes
      FROM     PacketHeaders
      WHERE    SrcAddress > DestAddress
      GROUP BY SrcAddress, SrcPort, DestAddress, DestPort
   )
   AS [BtoA]
      ON  [AtoB].SrcAddress  = [BtoA].DestPort
      AND [AtoB].SrcPort     = [BtoA].DestAddress
      AND [AtoB].DestAddress = [BtoA].SrcPort
      AND [AtoB].DestPort    = [BtoA].SrcAddress

But I did say I wouldn't do it that way...

like image 22
MatBailie Avatar answered Oct 17 '22 09:10

MatBailie