
Efficiency of select() vs. recv() with MSG_PEEK (asynchronous)

I would like to know what would be most efficient when checking for incoming data (asynchronously). Let's say I have 500 connections. I have 3 scenarios (that I can think of):

  1. Using select() to check FD_SETSIZE sockets at a time, then iterating over all of them to receive the data. (Wouldn't this require two calls to recv() for each socket returned: one with MSG_PEEK to size the buffer, then a second recv() to actually read? That would be the same as #3.)
  2. Using select() to check one socket at a time. (Wouldn't this also be like #3? It requires the same two calls to recv().)
  3. Using recv() with MSG_PEEK one socket at a time, allocating a buffer, then calling recv() again. Wouldn't this be better because we can skip all the calls to select()? Or is the overhead of one extra recv() call too much?

I've already coded scenarios 1 and 2, but I'm not sure which one to use. Sorry if I'm a bit unclear.

Thanks

Marlon asked May 20 '26 06:05



2 Answers

FD_SETSIZE is typically 1024, so you can check all 500 connections at once. Then you perform the two recv calls only on those which are ready -- say, for a very busy system, half a dozen of them each time around. With the other approaches you need about 500 more syscalls (the huge number of "failing" recv or select calls you perform on the many hundreds of sockets which will not be ready at any given time!-).
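
A minimal sketch of approach 1 in C, assuming POSIX sockets and that every descriptor is below FD_SETSIZE so one fd_set can hold them all. The name `read_ready_sockets` and its parameters are illustrative, not from the question. For brevity it does a single recv() into a caller-supplied buffer instead of the MSG_PEEK-then-recv() pair:

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: waits on all n sockets with one select() call, then
 * calls recv() only on the sockets reported readable.
 * Returns the number of sockets read from, 0 on timeout, -1 on error.
 * Assumes every descriptor is >= 0 and < FD_SETSIZE. */
int read_ready_sockets(const int *fds, size_t n, struct timeval *timeout,
                       char *buf, size_t buflen, size_t *total_bytes)
{
    fd_set readable;
    int maxfd = -1;
    int handled = 0;

    *total_bytes = 0;
    FD_ZERO(&readable);
    for (size_t i = 0; i < n; i++) {
        FD_SET(fds[i], &readable);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    /* One syscall covers every connection. */
    int ready = select(maxfd + 1, &readable, NULL, NULL, timeout);
    if (ready <= 0)
        return ready;

    for (size_t i = 0; i < n && handled < ready; i++) {
        if (!FD_ISSET(fds[i], &readable))
            continue;               /* idle socket: no recv() spent on it */
        ssize_t got = recv(fds[i], buf, buflen, 0);
        if (got > 0)
            *total_bytes += (size_t)got;
        handled++;                  /* got == 0 or -1 would mean close/error */
    }
    return handled;
}
```

Passing NULL as `timeout` makes select() block until at least one socket is readable; a zeroed `struct timeval` makes it return immediately.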

In addition, with approach 1 you can block until at least one connection is ready (no overhead in that case, which won't be rare in systems that aren't all that busy) -- with the other approaches, you'll need to be "polling", i.e., churning continuously, burning huge amounts of CPU to no good purpose (or, if you sleep a while after each loop of checks, then you'll have a delay in responding despite the system not being at all busy -- eep!-).
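
To make the cost of polling concrete, here is a rough sketch of what one pass of approach 3 might look like. The helper name `probe_all_with_peek` is hypothetical, and MSG_DONTWAIT is an assumption: it is available on Linux and the BSDs but not in every sockets API, where you would set the sockets non-blocking instead.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper illustrating approach 3: probe every socket with a
 * non-blocking one-byte peek. Stores the number of sockets with pending
 * data in *ready and returns the number of probes that found nothing. */
size_t probe_all_with_peek(const int *fds, size_t n, size_t *ready)
{
    size_t wasted = 0;
    char byte;

    *ready = 0;
    for (size_t i = 0; i < n; i++) {
        /* MSG_PEEK leaves the data queued; MSG_DONTWAIT (assumed to be
         * available) keeps an idle socket from blocking the loop. */
        ssize_t got = recv(fds[i], &byte, 1, MSG_PEEK | MSG_DONTWAIT);
        if (got > 0)
            (*ready)++;
        else if (got < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            wasted++;   /* a syscall spent only to learn "nothing here" */
    }
    return wasted;
}
```

With 500 mostly idle connections, nearly every probe ends up in the `wasted` branch on every pass, whereas a single select() with a NULL timeout simply sleeps until some socket becomes readable.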

That's why I consider polling to be an anti-pattern: frequently used, but nevertheless destructive. Sometimes you have absolutely no alternative (which basically tells you that you're having to interact with very badly designed systems -- alas, sometimes in this imperfect life you do have to!-), but when any decent alternative does exist, doing polling nevertheless is really a very bad design practice and should be avoided.

Alex Martelli answered May 21 '26 19:05



You can simply count the calls each of the three solutions needs in a couple of scenarios:

Scenario A (0/500 incoming data)

  • for solution #1, you invoke only a single select()
  • for solution #2, you need 500 select() calls
  • for solution #3, you need 500 recv() calls

Scenario B (250/500 incoming data)

  • for solution #1, single select() + (500 recv())
  • for solution #2, 500 select() + (500 recv())
  • for solution #3, 750 recv()

** This assumes that sockets with no incoming data (nothing to size a buffer for) are skipped.
The answer is obvious :)
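
The counts above can be written as small formulas. This is only a sketch under two stated assumptions: all n sockets fit in one fd_set, and each ready socket costs two recv() calls (MSG_PEEK, then the real read). The function names are made up for illustration.

```c
/* Syscalls for one pass over n sockets, r of which have incoming data. */

/* Solution #1: one select() over everything, then 2 recv() per ready socket.
 * Assumes n <= FD_SETSIZE so a single select() suffices. */
unsigned long calls_select_all(unsigned long n, unsigned long r)
{
    (void)n;            /* the socket count does not change the total */
    return 1 + 2 * r;
}

/* Solution #2: one select() per socket, then 2 recv() per ready socket. */
unsigned long calls_select_each(unsigned long n, unsigned long r)
{
    return n + 2 * r;
}

/* Solution #3: one MSG_PEEK recv() per socket, plus one real recv()
 * for each socket that had data. */
unsigned long calls_peek_each(unsigned long n, unsigned long r)
{
    return n + r;
}
```

For Scenario B, `calls_select_all(500, 250)` gives 501, `calls_select_each(500, 250)` gives 1000, and `calls_peek_each(500, 250)` gives 750, matching the list above. The gap widens as fewer sockets are ready.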

YeenFei answered May 21 '26 20:05



