
How to get frame by frame from MP4? (MediaCodec)

I am working with OpenGL, and I would like to pack all my textures into an MP4 file in order to compress them.

Then I need to read them back from the MP4 on my Android device.

So I need to somehow decode the MP4 and fetch its frames one by one, on request.

I found MediaCodec:

https://developer.android.com/reference/android/media/MediaCodec

and MediaMetadataRetriever:

https://developer.android.com/reference/android/media/MediaMetadataRetriever

but I did not see an approach for requesting frames one by one...

If anyone has worked with MP4, please point me in the right direction.

P.S. I am working on the native side (JNI), so it does not matter whether the solution is in Java or native code; I just need to find a way.

EDIT1

I am making a kind of movie (just one 3D model), so I change the geometry as well as the textures every 32 milliseconds. It seems reasonable to use MP4 for the textures, because each new frame (every 32 ms) is very similar to the previous one...

Right now I use 400 frames per model. For geometry I use .mtr files, and for textures I use .pkm (because it is optimized for Android), so I have around 350 .mtr files (some files include a subindex) and 400 .pkm files...

This is why I want to use MP4 for the textures: one MP4 file is much smaller than 400 .pkm files.

EDIT2

Please take a look at EDIT1.

Actually, all I need to know is: is there an Android API that can read an MP4 frame by frame? Maybe some kind of getNextFrame() method?

Something like this:

MP4Player player = new MP4Player(PATH_TO_MY_MP4_FILE);

void readMP4(){
   Bitmap b;

   while(player.hasNext()){
      b = player.getNextFrame();

      ///.... my code here ...///
   }
}

EDIT3

I made the following implementation in Java:

public static void read(@NonNull final Context iC, @NonNull final String iPath)
{
    long time;

    int fileCount = 0;

    //Create a MediaPlayer only to query the duration (returned in milliseconds)
    MediaPlayer mp = MediaPlayer.create(iC, Uri.parse(iPath));
    time = mp.getDuration() * 1000L; //long literal avoids int overflow; now in microseconds
    mp.release(); //the player itself is not needed any further

    Log.e("TAG", String.format("TIME :: %s", time));

    MediaMetadataRetriever mRetriever = new MediaMetadataRetriever();
    mRetriever.setDataSource(iPath);

    long a = System.nanoTime();

    //frame rate 10.03 fps, so 1/10.03 s = 99700 microseconds per frame
    for (long i = 99700; i <= time; i += 99700)
    {
        Bitmap b = mRetriever.getFrameAtTime(i, MediaMetadataRetriever.OPTION_CLOSEST_SYNC);

        if (b == null)
        {
            Log.e("TAG", String.format("BITMAP STATE :: %s", "null"));
        }
        else
        {
            fileCount++;
        }

        long curTime = System.nanoTime();
        Log.e("TAG", String.format("EXECUTION TIME :: %s", curTime - a));
        a = curTime;
    }

    mRetriever.release(); //free the retriever's native resources

    Log.e("TAG", String.format("COUNT :: %s", fileCount));
}

and here are the execution times:

  E/TAG: EXECUTION TIME :: 267982039
  E/TAG: EXECUTION TIME :: 222928769
  E/TAG: EXECUTION TIME :: 289899461
  E/TAG: EXECUTION TIME :: 138265423
  E/TAG: EXECUTION TIME :: 127312577
  E/TAG: EXECUTION TIME :: 251179654
  E/TAG: EXECUTION TIME :: 133996500
  E/TAG: EXECUTION TIME :: 289730345
  E/TAG: EXECUTION TIME :: 132158270
  E/TAG: EXECUTION TIME :: 270951461
  E/TAG: EXECUTION TIME :: 116520808
  E/TAG: EXECUTION TIME :: 209071269
  E/TAG: EXECUTION TIME :: 149697230
  E/TAG: EXECUTION TIME :: 138347269

These times are in nanoseconds, i.e. roughly 200 milliseconds per frame... That is far too slow; I need around 30 milliseconds per frame.

So I think this method runs on the CPU. Is there a method that runs on the GPU?

EDIT4

I found out that there is a MediaCodec class:

https://developer.android.com/reference/android/media/MediaCodec

and I also found a similar question here: MediaCodec get all frames from video.

I understood that there is a way to read the video as bytes, but not as frames...

So the question still stands: is there a way to read an MP4 video frame by frame?

asked Jun 27 '19 by Aleksey Timoshchenko


2 Answers

The solution would look something like the ExtractMpegFramesTest, in which MediaCodec is used to generate "external" textures from video frames. In the test code, the frames are rendered to an off-screen pbuffer and then saved as PNG. You would just render them directly.
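
For reference, here is a stripped-down sketch of the kind of forward-only decode loop that test is built around: MediaExtractor feeding MediaCodec, with each decoded frame released to a Surface. EGL setup and error handling are omitted, and the Surface is assumed to be one you created around a SurfaceTexture (a sketch of that wiring follows further down):

import android.media.MediaCodec;
import android.media.MediaExtractor;
import android.media.MediaFormat;
import android.view.Surface;

public static void decodeToSurface(String path, Surface surface) throws Exception {
    MediaExtractor extractor = new MediaExtractor();
    extractor.setDataSource(path);

    // Select the first video track
    MediaFormat format = null;
    for (int i = 0; i < extractor.getTrackCount(); i++) {
        MediaFormat f = extractor.getTrackFormat(i);
        if (f.getString(MediaFormat.KEY_MIME).startsWith("video/")) {
            extractor.selectTrack(i);
            format = f;
            break;
        }
    }

    MediaCodec codec = MediaCodec.createDecoderByType(format.getString(MediaFormat.KEY_MIME));
    codec.configure(format, surface, null, 0); // decode straight to the Surface
    codec.start();

    MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
    boolean inputDone = false, outputDone = false;
    while (!outputDone) {
        if (!inputDone) {
            int inIndex = codec.dequeueInputBuffer(10_000);
            if (inIndex >= 0) {
                int size = extractor.readSampleData(codec.getInputBuffer(inIndex), 0);
                if (size < 0) {
                    codec.queueInputBuffer(inIndex, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                    inputDone = true;
                } else {
                    codec.queueInputBuffer(inIndex, 0, size, extractor.getSampleTime(), 0);
                    extractor.advance();
                }
            }
        }
        int outIndex = codec.dequeueOutputBuffer(info, 10_000);
        if (outIndex >= 0) {
            if ((info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) outputDone = true;
            // "true" renders this frame to the Surface, one frame per call
            codec.releaseOutputBuffer(outIndex, info.size > 0);
        }
    }
    codec.stop();
    codec.release();
    extractor.release();
}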

There are a few problems with this:

  1. MPEG video isn't designed to work well as a random-access database. A common GOP (group of pictures) structure has one "key frame" (essentially a JPEG image) followed by 14 delta frames, which just hold the difference from the previous decoded frame. So if you want frame N, you may have to decode frames N-14 through N-1 first. Not a problem if you're always moving forward (playing a movie onto a texture) or you only store key frames (at which point you've invented a clumsy database of JPEG images).
  2. As mentioned in comments and answers, you're likely to get some visual artifacts. How bad these look depends on the material and your compression rate. Since you're generating the frames, you may be able to reduce this by ensuring that, whenever there's a big change, the first frame is always a key frame.
  3. The firmware that MediaCodec interfaces with may want several frames before it starts producing output, even if you start at a key frame. Seeking around in a stream has a latency cost. See e.g. this post. (Ever wonder why DVRs have smooth fast-forward, but not smooth fast-backward?)
  4. MediaCodec frames passed through SurfaceTexture become "external" textures. These have some limitations vs. normal textures -- performance may be worse, can't use as color buffer in an FBO, etc. If you're just rendering it once per frame at 30fps this shouldn't matter.
  5. MediaMetadataRetriever's getFrameAtTime() method has less-than-desirable performance for the reasons noted above. You're unlikely to get better results by writing it yourself, although you can save a bit of time by skipping the step where it creates a Bitmap object. Also, you passed OPTION_CLOSEST_SYNC in, but that will only produce the results you want if all your frames are sync frames (again, clumsy database of JPEG images). You need to use OPTION_CLOSEST (see the snippet after this list).
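
Concretely, the fix for point 5 in the EDIT3 code is a one-constant change (mRetriever and i are the variables from the question's code):

// OPTION_CLOSEST decodes forward from the nearest preceding sync frame to the
// exact requested time, instead of snapping to the sync frame itself
Bitmap b = mRetriever.getFrameAtTime(i, MediaMetadataRetriever.OPTION_CLOSEST);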

If you're just trying to play a movie on a texture (or your problem can be reduced to that), Grafika has some examples. One that may be relevant is TextureFromCamera, which renders the camera video stream on a GLES rect that can be zoomed and rotated. You can replace the camera input with the MP4 playback code from one of the other demos. This'll work fine if you're only playing forward, but if you want to skip around or go backward you'll have trouble.
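
The glue between MediaCodec and GLES in all of these examples is a SurfaceTexture. A minimal sketch of that wiring (assumes this runs on your GL thread with an EGL context current; how you signal the render thread is up to you):

import android.graphics.SurfaceTexture;
import android.opengl.GLES11Ext;
import android.opengl.GLES20;
import android.view.Surface;

// Create a GL_TEXTURE_EXTERNAL_OES texture, wrap it in a SurfaceTexture, and
// hand the resulting Surface to MediaCodec.configure(). Each decoded frame
// then fires onFrameAvailable().
private SurfaceTexture mSurfaceTexture; // keep a reference; updateTexImage() is called on it

private Surface createDecoderSurface() {
    int[] tex = new int[1];
    GLES20.glGenTextures(1, tex, 0);
    GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, tex[0]);
    GLES20.glTexParameteri(GLES11Ext.GL_TEXTURE_EXTERNAL_OES,
            GLES20.GL_TEXTURE_MIN_FILTER, GLES20.GL_LINEAR);
    GLES20.glTexParameteri(GLES11Ext.GL_TEXTURE_EXTERNAL_OES,
            GLES20.GL_TEXTURE_MAG_FILTER, GLES20.GL_LINEAR);

    mSurfaceTexture = new SurfaceTexture(tex[0]);
    mSurfaceTexture.setOnFrameAvailableListener(st -> {
        // signal the render thread; it calls st.updateTexImage() with the EGL
        // context current, after which the external texture holds the new frame
    });
    return new Surface(mSurfaceTexture); // pass this to MediaCodec.configure()
}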

The problem you're describing sounds pretty similar to what 2D game developers deal with. Doing what they do is probably the best approach.

answered Sep 18 '22 by fadden


I can see why it might seem easy to have all your textures in a single file, but this is a really, really bad idea.

MP4 is a video container format whose codecs are highly optimised for a sequence of frames with a high level of similarity to adjacent frames, i.e. motion. It is also optimised to be decompressed in sequential order, so a 'random access' approach will be very inefficient.

To give a bit more detail, video codecs store key frames (roughly one per second, but the rate varies) and delta frames the rest of the time. The key frames are independently compressed just like separate images, but the delta frames are stored as the difference from one or more other frames. The algorithm assumes this difference will be fairly minimal after motion compensation has been performed.

So if you want to access a single delta frame, your code will have to decompress a nearby key frame and all the delta frames that connect it to the frame you want; this will be much slower than just using single-frame JPEGs.

In short, use JPEG or PNG to compress your textures and add them all to a single archive file to keep it tidy.
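
One way to do that (a sketch; the archive path and entry names below are made up, and I'm assuming a plain .zip so you get random access by entry name):

import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// ZipFile's central directory gives random access, so any frame can be loaded
// directly by name, with no dependence on neighbouring frames.
static Bitmap loadTexture(ZipFile archive, String name) throws IOException {
    ZipEntry entry = archive.getEntry(name);
    try (InputStream in = archive.getInputStream(entry)) {
        return BitmapFactory.decodeStream(in);
    }
}

// Usage (hypothetical path and entry name):
// ZipFile archive = new ZipFile("/sdcard/textures.zip");
// Bitmap frame = loadTexture(archive, "frame_0042.png");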

answered Sep 21 '22 by PeteBlackerThe3rd