I have this table that represents packets from a traffic capture called PacketsByDirection (showing only relevant fields):
FrameNumber FrameTimeEpoch FlowID Direction
288 1430221042.150789000 29 Direction A
289 1430221042.150922000 29 Direction B
Now, this table has around 2 million rows (packets) and what I need to calculate is, for each packet, the difference of time between him and the previous packet with the same Direction and the same FlowID
I have done this with this query, plus adding indexes to the previous table to make the query faster.
SELECT t1.FrameNumber, flowid, direction,
FrameTimeEpoch - IFNULL((
SELECT MAX(FrameTimeEpoch)
FROM PacketsByDirection
WHERE flowid = t1.flowid
AND Direction LIKE t1.Direction
AND FrameNumber < t1.FrameNumber)
,FrameTimeEpoch) AS TimeFromLastPacketFromSameDirection
FROM PacketsByDirection AS t1
And the result is something like
FrameNumber FlowID Direction TimeFromLastPacketFromSameDirection
288 29 Direction A 0
289 29 Direction B 0
290 29 Direction A 5.422
291 29 Direction B 4.356
292 30 Direction A 0
293 30 Direction A 1.302
And so on. Now, this query takes around 1 hour for 600k rows, and now I'm working with millions of rows so I don't even want to try it. Here is the "explain" output of the query right now (that's a lot of iterations):
So my question is, is there another more efficient way to do this?
Thanks
EDIT: Here is the definition of the table
CREATE TABLE `packetsbydirection` (
`FrameNumber` int(11) NOT NULL DEFAULT '0',
`FrameTimeEpoch` varchar(45) NOT NULL,
`IPSrc` varchar(45) NOT NULL,
`TCPSrcPort` varchar(45) DEFAULT NULL,
`UDPSrcport` varchar(45) DEFAULT NULL,
`IPDst` varchar(45) NOT NULL,
`TCPDstport` varchar(45) DEFAULT NULL,
`UDPDstport` varchar(45) DEFAULT NULL,
`IPLength` varchar(45) NOT NULL,
`FlowID` int(11) NOT NULL,
`Direction` varchar(11) CHARACTER SET utf8 DEFAULT NULL,
KEY `Index2` (`Direction`),
KEY `Index3` (`FlowID`),
KEY `Index4` (`FrameNumber`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
Not sure if this could work at all, but maybe running numbers could be faster?
SELECT
FrameNumber,
case when FlowID <> @currflow or Direction <> @currdir then @diff := 0 else @diff := FrameTimeEpoch - @epoch end as TimeFromLastPacketFromSameDirection
, @currflow := FlowID, @currdir := Direction, @diff, @epoch := FrameTimeEpoch
FROM
packetsbydirection, (select @epoch := 0, @currflow :="", @currdir := "", @diff := 0) as tmp
ORDER BY FlowID, Direction, FrameTimeEpoch
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With