
How to efficiently implement YouTube's thumbnail preview feature?


I am trying to implement YouTube's thumbnail preview feature for my simple video player. Here is a screenshot of it:

[image: screenshot of the thumbnail preview]

The good thing: it works smoothly once the player has fetched all the thumbnail previews from an HTTP server.

The bad thing: fetching all the thumbnail previews takes a long time (20-30 seconds). For a 14-minute video (.mp4 file, ~110 MB) there are roughly 550 thumbnail previews (160x120).

What I am doing: when the user starts playing the video, I make "total_thumbnails" HTTP requests to the server to get all of them.

Also note:

  1. I make the multiple HTTP requests in an async task.
  2. I do not make a request, wait until the download completes, and only then make the next request.
  3. I fire all "total_thumbnails" HTTP requests blindly, so that all the requests get queued in the pipeline and the responses arrive in parallel.

Extra details: an HTTP server (lighttpd) will be running, from which my player fetches all the thumbnails as soon as the user selects a video.mp4 to play from the list. The player also uses the same server to fetch video.mp4 via HTTP streaming.

The problem: when I start playing the video and then quickly seek, I end up seeing this (the white thumbnail is the default one, shown when the thumbnail mapped to that time has not yet been fetched from the server):

[image: screenshot of the player showing the default white thumbnail]

The question: how can I efficiently fetch all (or some) of the thumbnail previews so that the user (most of the time) sees the thumbnail that matches the time being hovered or seeked to?

I have seen that on YouTube, as soon as the video starts (which is quick), the player is able to show all the correct thumbnails. No matter whether you drag the thumb to the last minute or hover over the end of the bar, you almost always see the correctly mapped thumbnail.

Are they downloading all the thumbnails at the same time, downloading a compressed series of thumbnail previews, or doing something else clever?

Has anyone worked on this?

CoolToshi45 asked May 16 '16 14:05



2 Answers

  1. Group multiple thumbnails into a single container image. Use canvas (I'm not a JS developer, but I believe this is the right word) to extract each thumbnail separately on the client side. YouTube uses container images of this kind. This works well with all sorts of protocols (e.g. with or without keep-alive enabled).
  2. Preprocess the thumbnails to reduce their size. Reduce the JPEG quality as far as is acceptable (~q=70). You may also try blurring the thumbnails a bit or reducing the number of colors.
  3. Optimize the order of downloads. For example, for a video of length 2:55:
    1. First, download a container image with 8 thumbs covering the full range of the video: 0:00, 0:25, 0:50, 1:15, 1:40, 2:05, 2:30, 2:55. This makes your service appear to "work instantly" for slow clients.
    2. Next, download 4 container images with 32 thumbs in total, covering the full range of the video more densely: 0:00, 0:06, 0:11, 0:17, ...
    3. Finally, gradually download all the other thumbnails in no particular order.
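
To make points 1 and 3 concrete, here is a minimal Python sketch. It assumes a row-by-row tile layout inside the container image and one thumbnail index per position on the timeline. The function names (`tile_rect`, `evenly_spaced`, `download_order`) are made up for illustration, and ascending order is used for the final pass only because some order has to be picked.

```python
from typing import List, Tuple


def tile_rect(index: int, columns: int,
              tile_w: int = 160, tile_h: int = 120) -> Tuple[int, int, int, int]:
    """Return (x, y, width, height) of thumbnail `index` inside a container
    image whose tiles are laid out row by row, `columns` tiles per row
    (the layout is an assumption, not something the answer specifies)."""
    row, col = divmod(index, columns)
    return (col * tile_w, row * tile_h, tile_w, tile_h)


def evenly_spaced(total: int, count: int) -> List[int]:
    """Pick `count` indices spread evenly over 0..total-1, endpoints included."""
    if total <= 0:
        return []
    if count < 2 or total < 2:
        return [0]
    return [round(i * (total - 1) / (count - 1)) for i in range(count)]


def download_order(total: int, passes: Tuple[int, ...] = (8, 32)) -> List[int]:
    """Coarse-to-fine order: a sparse pass, a denser pass, then everything else."""
    seen = set()
    order = []
    for count in passes:
        for idx in evenly_spaced(total, count):
            if idx not in seen:
                seen.add(idx)
                order.append(idx)
    # Remaining thumbnails; ascending order is an arbitrary choice here.
    for idx in range(total):
        if idx not in seen:
            order.append(idx)
    return order


# A 2:55 video has second offsets 0..175, i.e. 176 positions.
print(evenly_spaced(176, 8))       # [0, 25, 50, 75, 100, 125, 150, 175]
print(evenly_spaced(176, 32)[:4])  # [0, 6, 11, 17]
```

The two printed lists reproduce the timestamps from the example above (0:00, 0:25, ... and 0:00, 0:06, 0:11, 0:17, ...). On the client, `tile_rect` gives the source rectangle to crop out of the container image for a given thumbnail.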
gudok answered Sep 28 '22 01:09



"For a video (.mp4 file) of 14 minutes (~110 MB), roughly 550 thumbnails previews (160x120) are there"

The main factor here is probably that you make 550 separate requests to the HTTP server. I assume that you do it like this: request thumbnail k, wait for it to download, then request thumbnail k+1. This is very inefficient, because the HTTP server is sitting idle while thumbnail k is being downloaded and the next request is being uploaded.

Solution 1

Combine all 550 thumbnails into one big file, and request that instead of 550 individual files.

Maybe there is a good existing file format that you could reuse for this purpose. Tar comes to mind, but it’s probably a bad choice, because (1) it doesn’t seem to support random access (i.e. get the kth thumbnail directly), and (2) it adds 512 bytes of overhead per thumbnail.

Anyway, it should be easy to come up with your own file format. Something like this:

  • the first 4 bytes give n, which is the number of thumbnails in the file;
  • the next 4×n bytes give the offset of each thumbnail relative to the beginning of the file;
  • after that, just the thumbnails themselves, stitched end-to-end.
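
A minimal Python sketch of this format follows. The format description above does not fix the integer encoding, so little-endian unsigned 32-bit integers are an assumption here, and the function names `pack_thumbnails` and `read_thumbnail` are made up for illustration.

```python
import struct
from typing import Sequence


def pack_thumbnails(thumbs: Sequence[bytes]) -> bytes:
    """Build one blob: count, then one offset per thumbnail, then the data.

    Integers are little-endian unsigned 32-bit (an assumed encoding).
    """
    n = len(thumbs)
    header_size = 4 + 4 * n
    offsets = []
    position = header_size  # offsets are relative to the start of the file
    for thumb in thumbs:
        offsets.append(position)
        position += len(thumb)
    header = struct.pack(f"<I{n}I", n, *offsets)
    return header + b"".join(thumbs)


def read_thumbnail(blob: bytes, k: int) -> bytes:
    """Return the k-th thumbnail (0-based) without scanning the others."""
    (n,) = struct.unpack_from("<I", blob, 0)
    if not 0 <= k < n:
        raise IndexError(f"thumbnail {k} out of range (file holds {n})")
    (start,) = struct.unpack_from("<I", blob, 4 + 4 * k)
    if k + 1 < n:
        # A thumbnail ends where the next one begins.
        (end,) = struct.unpack_from("<I", blob, 4 + 4 * (k + 1))
    else:
        end = len(blob)
    return blob[start:end]
```

Because every offset sits at a fixed position in the header, the kth thumbnail can be read directly, which is the random access that tar lacks. The overhead is 4 bytes per thumbnail plus 4 bytes for the count.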

Solution 2

Use HTTP pipelining—a feature of HTTP/1.1 where you send many requests (perhaps all 550) at once, then read many responses at once. You don’t have to wait for each thumbnail to download before you request the next one.

You will need two things for this to work.

First, your HTTP client must support HTTP pipelining. I don’t know what the state of the art is on your platform, but in Python land this is a rare feature. One client that seems to support it is libcurl (via CURLMOPT_PIPELINING and CURLMOPT_MAX_PIPELINE_LENGTH), although newer libcurl releases have dropped HTTP/1.1 pipelining, so check the version you have. libcurl is available for most platforms and has bindings for most languages.

Second, you may need to change your Lighttpd configuration. By default, the server.max-keep-alive-requests variable is set to 16, which means that the server will close a connection after handling 17 requests on it, and you will have to establish a new one. You probably want this number to be larger.

Pipelining a large number of requests may (or may not) have bad side effects on both the client and the server, such as unexpected memory usage. Please test for yourself.

Also, if there are any HTTP intermediaries (like proxies) between the client and the server, they can break HTTP pipelining.
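
For illustration, pipelining over a raw socket can be sketched in Python roughly as follows. This is a minimal sketch, not a real HTTP client. It assumes every response carries a Content-Length header (no chunked encoding), it ignores status codes, and it assumes the whole batch of requests fits in the socket buffers. The function names are made up for this example.

```python
import socket
from typing import List, Sequence, Tuple


def build_pipelined_requests(host: str, paths: Sequence[str]) -> bytes:
    """Concatenate one GET request per path so they can be sent in one write."""
    return b"".join(
        f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")
        for path in paths
    )


def split_responses(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Split off every complete response body; return (bodies, leftover bytes).

    Assumes each response has a Content-Length header (no chunked encoding).
    """
    bodies = []
    while True:
        head_end = buffer.find(b"\r\n\r\n")
        if head_end < 0:
            break  # header block not complete yet
        length = None
        for line in buffer[:head_end].split(b"\r\n")[1:]:  # skip status line
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        if length is None:
            raise ValueError("response without Content-Length")
        body_start = head_end + 4
        if len(buffer) < body_start + length:
            break  # body not fully received yet
        bodies.append(buffer[body_start:body_start + length])
        buffer = buffer[body_start + length:]
    return bodies, buffer


def fetch_pipelined(host: str, port: int, paths: Sequence[str]) -> List[bytes]:
    """Send all requests first, then read the responses in request order."""
    bodies: List[bytes] = []
    pending = b""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(build_pipelined_requests(host, paths))
        while len(bodies) < len(paths):
            chunk = sock.recv(65536)
            if not chunk:
                break  # server closed the connection early
            done, pending = split_responses(pending + chunk)
            bodies.extend(done)
    return bodies
```

If the server closes the connection partway through (for example because of the keep-alive limit), `fetch_pipelined` returns fewer bodies than requested, and the caller would have to reconnect and request the rest.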

Measurements

I ran some quick and dirty tests. Lighttpd 1.4.31 serves 550 thumbnails (4.2M total), with server.max-keep-alive-requests = 600, from Berlin. Client is in Moscow. Average over 10 runs.

  • Simple approach (request-download-request, using Python’s http.client): 27.3 s.
  • Combined file (using Python’s http.client): 2.95 s.
  • Pipelining 550 requests (no real HTTP client, just raw sockets): 3.04 s.
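
As a quick back-of-the-envelope check on these numbers, the short Python snippet below derives the per-thumbnail cost of the simple approach and the speedup of the other two. Attributing the per-request cost mostly to round-trip latency is an interpretation, not something the tests measured.

```python
TOTAL_THUMBNAILS = 550

# Measured totals from the list above, in seconds.
sequential_s = 27.3
combined_s = 2.95
pipelined_s = 3.04

# Average cost of one request-download cycle in the simple approach.
per_request_ms = sequential_s / TOTAL_THUMBNAILS * 1000

speedup_combined = sequential_s / combined_s
speedup_pipelined = sequential_s / pipelined_s

print(f"{per_request_ms:.1f} ms per thumbnail")      # 49.6 ms per thumbnail
print(f"combined file: {speedup_combined:.1f}x")     # combined file: 9.3x
print(f"pipelining:    {speedup_pipelined:.1f}x")    # pipelining:    9.0x
```

At roughly 50 ms per thumbnail, the simple approach spends most of its time waiting between requests, which is why both alternatives end up about nine times faster.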
Vasiliy Faronov answered Sep 28 '22 01:09
