I'm working on an app for iOS, where one iPhone has to live stream its camera recordings to another iPhone (to keep things simple, both are in the same Wi-Fi network).
The streaming should work without a physical interconnect (e.g. a server used for routing the stream to clients). In fact, the recording iPhone should be the server which serves the other iPhone (or more other iOS devices in the network) with the live stream.
So, what I need to do is:
1. Get the video data from the camera of the recording ("server") iPhone.
2. Send that data over the network to the connected client device(s).
3. Receive the data on the clients and display the live stream there.
What I have and what I'm stuck with:
I have already solved problem 1. I use an AVCaptureSession which constantly returns CMSampleBufferRefs (found here).
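For reference, my capture setup looks roughly like this (class, property and queue names are just placeholders):

```objc
#import <AVFoundation/AVFoundation.h>

// Sketch of the capture pipeline: an AVCaptureSession whose video data output
// delivers a CMSampleBufferRef for every captured frame.
@interface CaptureController : NSObject <AVCaptureVideoDataOutputSampleBufferDelegate>
@property (nonatomic, strong) AVCaptureSession *session;
@end

@implementation CaptureController

- (void)startCapture {
    self.session = [[AVCaptureSession alloc] init];
    self.session.sessionPreset = AVCaptureSessionPreset640x480; // keep the frames reasonably small

    AVCaptureDevice *camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    NSError *error = nil;
    AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:camera error:&error];
    if (input) [self.session addInput:input];

    AVCaptureVideoDataOutput *output = [[AVCaptureVideoDataOutput alloc] init];
    output.videoSettings = @{ (id)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_32BGRA) };
    output.alwaysDiscardsLateVideoFrames = YES; // drop frames instead of queueing them up
    [output setSampleBufferDelegate:self
                              queue:dispatch_queue_create("video.capture.queue", DISPATCH_QUEUE_SERIAL)];
    [self.session addOutput:output];

    [self.session startRunning];
}

// Called once per captured frame.
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    // sampleBuffer is the CMSampleBufferRef I'm talking about.
}

@end
```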
I'm not so sure yet what I need to do with the CMSampleBufferRef. I do know how to transform it into a CGImage or a UIImage (thanks to Benjamin Loulier's great blog post), but I have no idea what specifically I need to send, and whether I need to encode the frames somehow.
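The conversion I'm using (adapted from that blog post, assuming the capture output is configured for kCVPixelFormatType_32BGRA) looks roughly like this:

```objc
// Turn a BGRA CMSampleBufferRef into a UIImage via a CGBitmapContext.
- (UIImage *)imageFromSampleBuffer:(CMSampleBufferRef)sampleBuffer {
    CVImageBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CVPixelBufferLockBaseAddress(pixelBuffer, 0);

    void *baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer);
    size_t bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer);
    size_t width = CVPixelBufferGetWidth(pixelBuffer);
    size_t height = CVPixelBufferGetHeight(pixelBuffer);

    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    CGContextRef context = CGBitmapContextCreate(baseAddress, width, height, 8, bytesPerRow, colorSpace,
                                                 kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
    CGImageRef cgImage = CGBitmapContextCreateImage(context);

    CGContextRelease(context);
    CGColorSpaceRelease(colorSpace);
    CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);

    UIImage *image = [UIImage imageWithCGImage:cgImage];
    CGImageRelease(cgImage);
    return image;
}
```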
As mentioned by @jab in the answer linked above (this), it is possible to write those samples to a file with one or more AVAssetWriters. But then again he says those 5-second video snippets are to be uploaded to a server, which turns them into a streamable movie file (and that movie can then be streamed to an iOS device via HTTP Live Streaming, I suppose).
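As far as I understand it, that approach would boil down to something like the following (the output URL and first-frame timestamp are placeholders, error handling omitted; in a real implementation the writer and input would be kept around as properties):

```objc
// Sketch of writing captured CMSampleBufferRefs into a short H.264 movie snippet.
- (void)startSnippetAtURL:(NSURL *)outputURL firstSampleBuffer:(CMSampleBufferRef)firstSampleBuffer {
    NSError *error = nil;
    AVAssetWriter *writer = [AVAssetWriter assetWriterWithURL:outputURL
                                                     fileType:AVFileTypeQuickTimeMovie
                                                        error:&error];
    AVAssetWriterInput *videoInput =
        [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo
                                           outputSettings:@{ AVVideoCodecKey  : AVVideoCodecH264,
                                                             AVVideoWidthKey  : @640,
                                                             AVVideoHeightKey : @480 }];
    videoInput.expectsMediaDataInRealTime = YES;
    [writer addInput:videoInput];

    [writer startWriting];
    // Start the timeline at the timestamp of the first captured frame.
    [writer startSessionAtSourceTime:CMSampleBufferGetPresentationTimeStamp(firstSampleBuffer)];

    // In the capture callback, append each frame:
    //   if (videoInput.isReadyForMoreMediaData) [videoInput appendSampleBuffer:sampleBuffer];
    // After ~5 seconds, close the snippet:
    //   [videoInput markAsFinished];
    //   [writer finishWritingWithCompletionHandler:^{ /* snippet at outputURL is ready */ }];
}
```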
As I already indicated, my app (i.e. the video-capturing "server" device) has one or more clients connected to it and needs to send the video frames to them in real time.
One idea that came to my mind is to use a simple TCP connection, where the server sends every single frame in a serialised format to the connected clients in a loop. More specifically: once one buffered frame has been sent successfully to a client, the server takes the most recent frame as the next one to be sent.
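What I have in mind is a simple length-prefixed framing over a plain BSD socket, something like this (the socket is assumed to be connected already, error handling trimmed):

```objc
#include <sys/socket.h>
#include <arpa/inet.h>

// Send one frame as a 4-byte big-endian length prefix followed by the payload bytes.
static BOOL SendFrame(int sockfd, NSData *frameData) {
    uint32_t length = htonl((uint32_t)frameData.length);
    if (send(sockfd, &length, sizeof(length), 0) != sizeof(length)) return NO;

    const uint8_t *bytes = frameData.bytes;
    size_t remaining = frameData.length;
    while (remaining > 0) {
        ssize_t sent = send(sockfd, bytes, remaining, 0);
        if (sent <= 0) return NO;
        bytes += sent;
        remaining -= (size_t)sent;
    }
    return YES;
}
```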
Now: is this the right way to go about it? Or is there another protocol that is much better suited to this kind of task?
Remember: I want to keep it simple (simple for me, i.e. so that I don't need to study too many new programming aspects) and fast. I already know some things about TCP; I wrote servers and clients with it in C at school, so I'd prefer to apply the knowledge I already have to this project...
Last but not least, the receiving client:
I imagine, if I'm really going to use a TCP connection, that on the client side I receive frame after frame from the server, cast the received byte package into the format used (CMSampleBuffer, CGImage, UIImage) and just display it on a CALayer or UIImageView, right? The effect of a movie would be achieved simply by keeping that image view constantly updated.
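So on the client side I picture something roughly like this (again assuming the length-prefixed framing from above and JPEG-encoded frames; the image view property is just a placeholder):

```objc
#include <sys/socket.h>
#include <arpa/inet.h>

// Read one length-prefixed frame from the connected socket.
static NSData *ReceiveFrame(int sockfd) {
    uint32_t lengthBE = 0;
    if (recv(sockfd, &lengthBE, sizeof(lengthBE), MSG_WAITALL) != sizeof(lengthBE)) return nil;
    uint32_t length = ntohl(lengthBE);

    NSMutableData *frameData = [NSMutableData dataWithLength:length];
    if (recv(sockfd, frameData.mutableBytes, length, MSG_WAITALL) != (ssize_t)length) return nil;
    return frameData;
}

// On a background queue, in a loop:
//   NSData *frameData = ReceiveFrame(sockfd);
//   UIImage *frame = [UIImage imageWithData:frameData];   // if the server sends JPEG data
//   dispatch_async(dispatch_get_main_queue(), ^{
//       self.previewImageView.image = frame;              // previewImageView is a placeholder
//   });
```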
Please give me some ideas on how to reach this goal. It is very important, because it's part of my school-graduation project... Any sample code is also appreciated ;-) Or just refer me to another site, tutorial, Stack Overflow answer, etc.
If you have any questions about this, just leave a comment and I'll update the post.
Sounds OK?
Video frames are really big. You're going to have bandwidth problems streaming video from one device to another. You can compress the frames as JPEGs using UIImageJPEGRepresentation from a UIImage, but that's computationally expensive on the "server", and still may not make them small enough to stream well. You can also reduce your frame rate and/or resolution by dropping frames, downsampling the UIImages, and fiddling with the settings of your AVCaptureSession. Alternatively, you can send small (5-second) videos, which are hardware-compressed on the server and much easier on bandwidth, but will of course give you a 5-second lag in your stream.
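A rough sketch of the compress-and-drop-frames idea, reusing a sample-buffer-to-UIImage helper like the one mentioned in the question (the helper and the send method here are assumptions):

```objc
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    static NSUInteger frameCount = 0;
    if (frameCount++ % 3 != 0) return;      // crude frame-rate reduction: keep every 3rd frame

    UIImage *image = [self imageFromSampleBuffer:sampleBuffer]; // assumed helper
    NSData *jpegData = UIImageJPEGRepresentation(image, 0.4);   // low quality to save bandwidth
    [self sendFrameDataToClients:jpegData];                     // assumed transport method
}
```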
If you can require iOS 7, I'd suggest trying MultipeerConnectivity.framework. It's not terribly difficult to set up, and I believe it supports multiple clients. Definitely use UDP rather than TCP if you're going to roll your own networking: this is a textbook application for UDP, and it has lower overhead.
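Setting it up on the capturing device looks roughly like this (the service type and property names are assumptions, and the surrounding object is assumed to adopt MCSessionDelegate and MCNearbyServiceAdvertiserDelegate):

```objc
#import <MultipeerConnectivity/MultipeerConnectivity.h>

- (void)startHosting {
    self.peerID = [[MCPeerID alloc] initWithDisplayName:[[UIDevice currentDevice] name]];
    self.session = [[MCSession alloc] initWithPeer:self.peerID];
    self.session.delegate = self;

    self.advertiser = [[MCNearbyServiceAdvertiser alloc] initWithPeer:self.peerID
                                                        discoveryInfo:nil
                                                          serviceType:@"camera-stream"]; // assumed service type
    self.advertiser.delegate = self;   // accept invitations in the delegate callback
    [self.advertiser startAdvertisingPeer];
}

// Send one compressed frame to all connected clients.
- (void)sendFrameDataToClients:(NSData *)jpegData {
    if (self.session.connectedPeers.count == 0) return;
    NSError *error = nil;
    [self.session sendData:jpegData
                   toPeers:self.session.connectedPeers
                  withMode:MCSessionSendDataUnreliable // UDP-like, lowest overhead; losing frames is fine
                     error:&error];
}
```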
Frame by frame, just turn the JPEGs into UIImages and use a UIImageView. There's significant computation involved, but I believe you'll still be limited by bandwidth rather than CPU. If you're sending little videos, you can use MPMoviePlayerController. There will probably be small glitches between the videos as it "prepares" each one for playback, which will also mean it takes 5.5 seconds or so to play each 5-second video. I wouldn't recommend HTTP Live Streaming unless you can get a real server into the mix somewhere. Or you could use an ffmpeg pipeline -- feed videos in and pop individual frames out -- if you can/want to compile ffmpeg for iOS.
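For the frame-by-frame route over MultipeerConnectivity, the receiving side boils down to a delegate callback like this (the image view property is an assumption):

```objc
// MCSessionDelegate callback on the client: decode the JPEG and show it.
- (void)session:(MCSession *)session didReceiveData:(NSData *)data fromPeer:(MCPeerID *)peerID {
    UIImage *frame = [UIImage imageWithData:data];
    dispatch_async(dispatch_get_main_queue(), ^{
        self.streamImageView.image = frame; // streamImageView is a placeholder UIImageView
    });
}
```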
Let me know if you need clarification on any of these points. It's a lot of work but relatively straightforward.