 

Calculate average of pixels in the front buffer of the gpu without copying the front buffer back to system memory

I am preparing to build an Ambilight clone for my PC. For this purpose I need a way to calculate the average color of several areas of the screen.

The fastest way I have found so far is the following:

  pd3dDevice->CreateOffscreenPlainSurface(ddm.Width, ddm.Height, D3DFMT_A8R8G8B8, D3DPOOL_SCRATCH/*D3DPOOL_SYSTEMMEM*/, &pSurface, nullptr);
  pd3dDevice->GetFrontBufferData(0, pSurface);
  D3DLOCKED_RECT lockedRect;
  pSurface->LockRect(&lockedRect, nullptr, D3DLOCK_NO_DIRTY_UPDATE|D3DLOCK_NOSYSLOCK|D3DLOCK_READONLY);
  memcpy(pBits, (unsigned char*) lockedRect.pBits, dataLength);
  pSurface->UnlockRect();
  // calculate the average over pBits

However, it involves copying the whole front buffer back to system memory, which takes 33 ms on average. Obviously 33 ms is nowhere near the speed I need for a decent update rate. Therefore I am looking for a way to calculate the average over a region of the front buffer directly on the GPU, without copying the front buffer back to system memory.

edit: the bottleneck in the code snippet is pd3dDevice->GetFrontBufferData(0, pSurface);. The memcpy has no visible effect on performance.
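
For reference, the averaging step itself (the last comment in the snippet above) could look roughly like the following. This is only a sketch, not part of the original code: `AvgColor` and `averageRegion` are made-up names, and it assumes the buffer is 32-bit D3DFMT_A8R8G8B8, which is laid out in memory as B, G, R, A on little-endian machines. The `pitch` parameter is meant to be `lockedRect.Pitch`, which may be larger than `width * 4`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type: average colour of a region.
struct AvgColor { std::uint8_t r, g, b; };

// Averages the pixels in [left, right) x [top, bottom) of a 32-bit image.
//   bits  : first byte of the image (e.g. lockedRect.pBits)
//   pitch : bytes per row (e.g. lockedRect.Pitch); assumed positive
AvgColor averageRegion(const unsigned char* bits, int pitch,
                       int left, int top, int right, int bottom)
{
    assert(bits != nullptr && pitch > 0 && right > left && bottom > top);

    std::uint64_t sumR = 0, sumG = 0, sumB = 0;
    for (int y = top; y < bottom; ++y) {
        // Step by pitch, not by width * 4: rows may be padded.
        const unsigned char* row = bits + static_cast<std::ptrdiff_t>(y) * pitch;
        for (int x = left; x < right; ++x) {
            const unsigned char* px = row + x * 4;
            sumB += px[0];  // blue
            sumG += px[1];  // green
            sumR += px[2];  // red (px[3] is alpha and is ignored)
        }
    }

    const std::uint64_t n =
        static_cast<std::uint64_t>(right - left) * (bottom - top);
    return { static_cast<std::uint8_t>(sumR / n),
             static_cast<std::uint8_t>(sumG / n),
             static_cast<std::uint8_t>(sumB / n) };
}
```

Called once per screen area (for example `averageRegion(bits, lockedRect.Pitch, 0, 0, 100, 100)` for the top-left corner), this stays cheap compared to the readback, which matches the observation that the CPU side is not the bottleneck.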

edit:

Based on user3125280's answer I cooked up a piece of code that should take the top-left corner of the screen and average it. However, the result is always 0. What am I missing? Also note that pSurface is now in video memory, so GetFrontBufferData is just a copy within video RAM, which is super fast.

  pd3dDevice->CreateOffscreenPlainSurface(1, 1, D3DFMT_A8R8G8B8, D3DPOOL_SCRATCH, &pAvgSurface, nullptr);
  pd3dDevice->CreateOffscreenPlainSurface(ddm.Width, ddm.Height, D3DFMT_A8R8G8B8, D3DPOOL_DEFAULT, &pSurface, nullptr);

  pd3dDevice->GetFrontBufferData(0, pSurface);

  RECT r;
  r.right = 100;
  r.bottom = 100;
  r.left = 0;
  r.top = 0;
  pd3dDevice->StretchRect(pSurface, &r, pAvgSurface, nullptr, D3DTEXF_LINEAR);

  D3DLOCKED_RECT lockedRect;
  pAvgSurface->LockRect(&lockedRect, nullptr, D3DLOCK_NO_DIRTY_UPDATE|D3DLOCK_NOSYSLOCK|D3DLOCK_READONLY);
  unsigned int color = -1;
  memcpy((unsigned char*) &color, (unsigned char*) lockedRect.pBits, 4); //FIXME there has to be a better way than memcopy
  pAvgSurface->UnlockRect();

edit2: Apparently GetFrontBufferData requires the target to reside in system memory. So I am back to square one.

edit3: According to this the following should be possible in DX11.1:

  • Create a Direct3D 11.1 device. (Maybe earlier works too -- I haven't tried. I'm not sure there's a reason to use a D3D10/10.1/11 device anyway.)
  • Find the IDXGIOutput you want to duplicate, and call DuplicateOutput() to get an IDXGIOutputDuplication interface.
  • Call AcquireNextFrame() to wait for a new frame to arrive.
  • Process the received texture.
  • Call ReleaseFrame().
  • Repeat.

However, due to my nonexistent knowledge of DirectX I am having a hard time implementing it.

edit4: DuplicateOutput is not supported in operating systems older than Windows 8 :(

edit5: I did some experiments with the classical GetPixel API thinking that it may be fast enough for random sampling. Sadly it is not. GetPixel takes the same amount of time that GetFrontBufferData takes. I guess it internally calls GetFrontBufferData.

So for now I see two solutions:

  • Disable Aero and use GetFrontBufferData
  • Switch to Windows 8

Neither of them is really good :(

Arne Böckmann asked Dec 26 '13 11:12



1 Answer

This problem is (apparently) common in game code and the like. One interesting solution is the following: Efficient pixel shader sum of all pixels. This is particularly relevant to your exact situation, since you can use a larger mipmap texture to sum smaller segments of the display.

Get the screen into a texture

IDirect3DTexture9* texture; // needs to be created, of course
IDirect3DSurface9* dest = NULL; // to be our level0 surface of the texture
texture->GetSurfaceLevel(0, &dest);
pD3DDevice->StretchRect(pSurface, NULL, dest, NULL, D3DTEXF_LINEAR);

And then create a mipmap chain as here

// This code example assumes that m_pD3DDevice is a
// valid pointer to an IDirect3DDevice9 interface

IDirect3DTexture9 * pMipMap;
m_pD3DDevice->CreateTexture(256, 256, 5, 0, D3DFMT_R8G8B8,
                            D3DPOOL_MANAGED, &pMipMap,
                            nullptr); // pSharedHandle, required by the D3D9 signature

Of course you don't have to read the bottom mip level (the 1x1 level, which is the average of the whole texture). You could read a level a few steps higher to get the averages of individual sections. This is also quick, because texture mipmapping is important in games and graphics in general. Other filtering options may be available too.
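
To illustrate why a mip level gives you region averages, here is a small CPU-side simulation of a mipmap chain. It is only a sketch under assumptions: it is not Direct3D code, the names `Level`, `downsample` and `buildMipChain` are made up, it handles a single channel, and it assumes a plain 2x2 box filter on even (ideally power-of-two) dimensions. The filter a real driver uses when generating mip levels may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// One mip level of a single-channel image, stored row by row.
struct Level {
    int width;
    int height;
    std::vector<float> texels;
};

// Halves width and height by averaging each 2x2 block (box filter).
Level downsample(const Level& src)
{
    assert(src.width % 2 == 0 && src.height % 2 == 0);
    Level dst{ src.width / 2, src.height / 2, {} };
    dst.texels.resize(static_cast<std::size_t>(dst.width) * dst.height);

    auto at = [&src](int x, int y) { return src.texels[y * src.width + x]; };
    for (int y = 0; y < dst.height; ++y) {
        for (int x = 0; x < dst.width; ++x) {
            dst.texels[y * dst.width + x] =
                0.25f * (at(2 * x, 2 * y)     + at(2 * x + 1, 2 * y) +
                         at(2 * x, 2 * y + 1) + at(2 * x + 1, 2 * y + 1));
        }
    }
    return dst;
}

// Builds the chain: level 0 is the input, each further level is half the size.
// Stops as soon as either dimension reaches 1.
std::vector<Level> buildMipChain(Level base)
{
    std::vector<Level> chain;
    chain.push_back(std::move(base));
    while (chain.back().width > 1 && chain.back().height > 1) {
        Level next = downsample(chain.back());
        chain.push_back(std::move(next));
    }
    return chain;
}
```

With a 4x4 input, level 1 is 2x2 and each of its texels is the average of one quadrant, while level 2 is the single overall average. For an ambilight you would read whichever level has roughly one texel per LED zone, so only a handful of texels ever need to be copied back.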

For the second edit, try here: surfaces in GPU memory are read differently and can't be locked, so you'll need to use GetRenderTargetData or something similar. This can be used to copy your StretchRect destination surface to a surface created CPU-side in the system memory pool. As far as I know, GPU-side textures/surfaces can't be memcpy'ed directly.

user3125280 answered Nov 01 '22 18:11
