The idea of doing remote rendering (typically for a video game) which is streamed to a client device is conceptually quite simple, barring obvious issues like lag for an interactive fast-paced game.
But how, technically, could you do it? My understanding is that streaming video not only caches ahead of the current playback position, but also that video files are compressed by looking ahead many frames.
Are there libraries that would let you feed an arbitrary "display feed" into a server-side video source, so that you could then play it on the client using regular Flash/HTML5 components? Avoiding the need for a custom app or bespoke browser plugin would be a significant benefit... i.e. the client-side web page doesn't know it's not a regular video.
It's a bit like a web-cam I suppose... but I want the 'camera' to be 'watching' the output of a window rendered to on the server.
I'm targeting Windows-based servers and C++ rendering apps.
Interesting problem. There are a number of aspects to consider, in no particular order:
The choice of container format for the rendered movie is very important. I think the main limitation is that the renderer is constrained to write the file sequentially: the file needs to be streamed to clients, so while the renderer is writing the file there will be a web server process reading it, potentially close behind EOF. The renderer cannot use random access to write the movie file, because any data that is already on disk might already have been sent to clients, so clearly everything that is written to disk has to be in final form.
It seems that the F4V format (successor to FLV from Adobe) fits the bill, as it can be written in a streaming-friendly fashion. This format is widely supported by clients; you just need the Flash player plugin installed. For iPhone/iPad you will need an alternative that does not involve Flash, so for iOS you can use MP4. Note that F4V derives from MP4; the two formats are very similar.
Of course, the 3D engine running on the server will need the ability to render to F4V/MP4, and this may require a custom export plugin for your engine.
Your server must be able to render and encode frames at a rate equal to or faster than the intended playback frame rate. Hardware acceleration is your friend.
Efficient video encoding formats work by encoding only the differences between frames, so to decode any given frame you may need to decode a few others first. One of the most interesting aspects of modern encoding formats is that they encode differences not only from past frames, but also from future frames. This clearly increases latency, as the encoder must postpone encoding a frame until it has received a few more frames. It seems that to reduce latency you would need to limit the 'future' side of the encoding (B-frames and lookahead) to a very small window, possibly at the cost of encoding efficiency and/or quality.
This is possibly a tough one if you want to avoid a custom playback plugin. Video players download streams into a buffer that is typically several seconds long, and only begin to play once the buffer is full. The idea is that a full buffer helps ride out network interruptions and slowdowns, but unfortunately a large buffer also increases latency. You will need to find out how many seconds of material the client players want in their playback buffer, and that determines how far ahead the server-side rendering/encoding process always needs to be. A custom playback plugin could reduce or eliminate the buffer to cut latency, but it would then be more sensitive to network hiccups.
I'm not sure how well an HTTP server will handle streaming a file while it is still being generated by another process; I suspect this is not something regular servers test for or intend to support. There is a little-known extension to FTP called "tail mode" that provides exactly the behavior you want: a tail-mode-enabled FTP server knows the file is growing, so it makes no assumption about its size and simply transfers bytes as they appear in the file. The server even waits for the file to grow if it finds it is consuming the file too fast and reaches EOF. You may need a customized HTTP server that supports a similar feature.
A dedicated streaming server may be a good option here. Links of interest are the open-source Darwin Streaming Server and the QuickTime Broadcaster streaming application. On the Adobe side of things there is Flash Media Server, which is a commercial product. Yet another option, from Microsoft, is the Smooth Streaming extension for IIS.
You didn't say anything about this, but I would imagine a good application of this technology would allow the client to send input events back to the server, which would then use them to affect the contents of the movie. This is effectively a game engine hosted entirely on the server, with only the input and display components running on the client. Once again, this will be challenging to do with low enough latency for the application to feel responsive. You will also have to encode a separate stream per client, as each client sees a different version of the movie. There are lots of problems here; a render/encoding farm might be necessary depending on the number of simultaneous clients to be supported. Having pre-rendered and pre-encoded chunks of animation that can be combined (in the style of the old Dragon's Lair games) might be a good compromise solution for this type of application.
There may not be an efficient solution to this problem in software... but there probably is in hardware: http://yhst-128017942510954.stores.yahoo.net/cube200-1ch-hdmi-enc2001.html
It should be possible to combine the H.264 encoder used by that device with a video card at much lower cost.
I'm working on a similar problem and I'll share what I've learned. While I don't know how to stream them out, I do know how to generate and encode multiple HD video streams on the server. I've tested two approaches: NVIDIA CUDA Video Encode (C Library) API and Intel Performance Primitives Video Encoder. The NVIDIA link takes you right to the example. The Intel page does not have internal anchors so you'll have to search for "Video Encoder".
Both encode video streams, up to and including HD, to H.264. Other formats are supported, but I am interested in H.264. To test performance, I set up prepared input video in YUV format and fed it to the encoders as fast as they would take it. Output from both encoders was 1080p.
Performance-wise, the Intel video encoder could encode a single stream at 0.5x real time with about a 12.5% load on a Xeon E5520 @ 2.27GHz, i.e. one core of eight at 100% load. Newer Xeons are much faster, but I don't know if they can hit real time yet.
The NVIDIA encoder on a GTS 450 could encode 1080p at 9-10x real time(!) with a 50% CPU load. The CPU load with the NVIDIA encoder appears to come primarily from copying data to and from the GPU.
What is particularly nice about the GPU solution is that it can take a GPU render surface as input; graphics are generated and encoded on the GPU, and only leave it to go out to the network. For details on using a render surface as an input, see CUDA by Example, an excellent and straightforward book on GPU programming. In that case I would expect the CPU load to drop by approximately half. Since there is no point in going faster than real time for real-time graphics, you could likely encode 8+ streams from render surfaces with adequate GPU resources, e.g. two GTS 450 cards, and perhaps many more if a resolution lower than 1080p is acceptable.