First, a visual example of what I am trying to achieve:
(Photo credit: https://unsplash.com/photos/pGcqw1ARGyg)
Using HTML5 video & canvas, how can I perform a 4-point perspective transform so that I can render just the "TV screen" part of the frame in the canvas? Why doesn't my implementation show the correct area?
I am trying to build a web page which works as follows:
The part I am struggling with is step 4. In order to make sure that I am only processing the relevant part of the image for each frame of the video, it is important that I "warp" the image so that it only shows the "TV screen" area and not the whole webcam picture.
Having done a bit of reading up, my understanding is that, when working with a canvas, I can't simply use a 3D CSS transform (e.g. https://developer.mozilla.org/en-US/docs/Web/CSS/transform-function/matrix3d). This suggests that maybe WebGL is more what I need to deal with the 3D aspect.
With that in mind, I attempted the following approach:
a) Capture the webcam using a video tag
b) Using three.js, create a 3D scene which is rendered into a canvas element (so that I can perform my image processing on the resultant canvas contents)
c) The three.js scene consists of:
- a flat mesh which shows the video on one side using a VideoTexture.
- a perspective camera, initially positioned so that it shows the whole webcam image
d) Allow the user to click the four corner points to define where their TV is, work out what the x/y coordinates are and save them
e) Calculate a perspective transform which would "stretch" the image out so that the correct area "fills the frame". In other words, stretch the four clicked "TV corner" points to the four corners of the viewport. I have been using this library: https://github.com/jlouthan/perspective-transform to calculate this.
f) My thinking is that, if the appropriate transform is applied to the mesh containing the video, and the camera stays in a fixed position, then the output canvas would contain the required image when looking at it in 2D.
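For reference, the calculation in step (e) can be sketched in plain JavaScript without any library. This is only an illustration of the general technique (solving for a 3x3 homography from four point pairs), not the code from the perspective-transform library. The function names computeHomography, applyHomography and solveLinearSystem are hypothetical, and the sketch assumes the bottom-right entry of the matrix can be fixed to 1.

```javascript
// Solve the n x n linear system A x = b using Gaussian elimination
// with partial pivoting. A is an array of rows, b an array of numbers.
function solveLinearSystem(A, b) {
  const n = b.length;
  const M = A.map((row, i) => [...row, b[i]]); // augmented matrix
  for (let col = 0; col < n; col++) {
    let pivot = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(M[r][col]) > Math.abs(M[pivot][col])) pivot = r;
    }
    if (Math.abs(M[pivot][col]) < 1e-12) {
      throw new Error("Degenerate point configuration");
    }
    [M[col], M[pivot]] = [M[pivot], M[col]];
    for (let r = col + 1; r < n; r++) {
      const f = M[r][col] / M[col][col];
      for (let c = col; c <= n; c++) M[r][c] -= f * M[col][c];
    }
  }
  // Back substitution
  const x = new Array(n).fill(0);
  for (let r = n - 1; r >= 0; r--) {
    let sum = M[r][n];
    for (let c = r + 1; c < n; c++) sum -= M[r][c] * x[c];
    x[r] = sum / M[r][r];
  }
  return x;
}

// src and dst are arrays of four [x, y] pairs. Returns the nine
// homography coefficients in row-major order, with the last fixed to 1.
function computeHomography(src, dst) {
  const A = [];
  const b = [];
  for (let i = 0; i < 4; i++) {
    const [x, y] = src[i];
    const [u, v] = dst[i];
    // u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), rearranged to be linear in h
    A.push([x, y, 1, 0, 0, 0, -u * x, -u * y]);
    b.push(u);
    // v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1)
    A.push([0, 0, 0, x, y, 1, -v * x, -v * y]);
    b.push(v);
  }
  return [...solveLinearSystem(A, b), 1];
}

// Apply the homography to a 2D point, including the divide by w.
function applyHomography(H, [x, y]) {
  const w = H[6] * x + H[7] * y + H[8];
  return [
    (H[0] * x + H[1] * y + H[2]) / w,
    (H[3] * x + H[4] * y + H[5]) / w,
  ];
}
```

Here src would be the four clicked "TV corner" points and dst the four corners of the viewport, both expressed in the same coordinate system as the mesh.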
Here is a link to my current attempt at the above. It shows the video and allows you to click the four corners. It seems like it works if you click points around the origin (in the centre) but the problem is that it shows the wrong area if you choose areas elsewhere in the image.
https://bitbucket.org/mattwilson1024/perspective-transform/src/master/
I'd be really grateful for any help working out why this isn't working as I expected, or for any pointers on whether there is a better/easier approach to achieve what I need.
The problem with the original implementation is in the way that transformMatrix was being created.
I was able to make it work by changing this:
transformMatrix.set(a1, a2, a3, 0,
b1, b2, b3, 0,
c1, c2, c3, 0,
0, 0, 0, 1);
to this:
transformMatrix.set(a1, a2, 0, a3,
b1, b2, 0, b3,
0, 0, 0, 1,
c1, c2, 0, c3);
This answer on the Math StackExchange was helpful for working this out.
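To illustrate why the layout matters, here is a minimal sketch in plain JavaScript (no three.js) that applies both 4x4 layouts to a point on the flat mesh. It assumes that a1..c3 are the row-major entries of the 3x3 perspective transform, that the mesh vertices lie at z = 0, and that the arguments are given in row-major order, as they are for Matrix4.set in three.js. The helper names brokenLayout, fixedLayout, transformPoint and applyHomography are hypothetical and exist only for this example.

```javascript
// The original (broken) layout: the 3x3 is copied into the top-left corner.
function brokenLayout([a1, a2, a3, b1, b2, b3, c1, c2, c3]) {
  return [
    [a1, a2, a3, 0],
    [b1, b2, b3, 0],
    [c1, c2, c3, 0],
    [0, 0, 0, 1],
  ];
}

// The corrected layout: the third column of the 3x3 moves to the
// fourth column, and its third row moves to the fourth (w) row.
function fixedLayout([a1, a2, a3, b1, b2, b3, c1, c2, c3]) {
  return [
    [a1, a2, 0, a3],
    [b1, b2, 0, b3],
    [0, 0, 0, 1],
    [c1, c2, 0, c3],
  ];
}

// Multiply the 4x4 matrix by the column vector (x, y, z, 1),
// then divide by the resulting w component.
function transformPoint(m, [x, y, z]) {
  const v = [x, y, z, 1];
  const out = m.map((row) => row.reduce((sum, e, i) => sum + e * v[i], 0));
  return [out[0] / out[3], out[1] / out[3], out[2] / out[3]];
}

// Reference result: the 3x3 transform applied directly to a 2D point.
function applyHomography(H, [x, y]) {
  const w = H[6] * x + H[7] * y + H[8];
  return [
    (H[0] * x + H[1] * y + H[2]) / w,
    (H[3] * x + H[4] * y + H[5]) / w,
  ];
}

// Arbitrary example coefficients, chosen only for illustration.
const H = [1.2, 0.1, 0.3, -0.2, 0.9, 0.5, 0.05, 0.1, 1];
const point = [0.4, -0.3];

const expected = applyHomography(H, point);
const fromFixed = transformPoint(fixedLayout(H), [...point, 0]);
const fromBroken = transformPoint(brokenLayout(H), [...point, 0]);

console.log("3x3 reference:", expected);
console.log("fixed layout: ", fromFixed.slice(0, 2));
console.log("broken layout:", fromBroken.slice(0, 2));
```

In the broken layout, a3 and b3 end up multiplying z, which is 0 on a flat mesh, and w always stays 1, so both the translation and the perspective divide are lost. In the corrected layout, x and y after the divide by w match the 3x3 transform.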
For the benefit of anyone finding this question in the future, I've updated the original question so that it points to an archive branch containing the broken code. The working version can be found here.