I'm working with this sample (https://github.com/google-ar/arcore-android-sdk/tree/master/samples/hello_ar_java), and I want to add the ability to record a video that includes the placed AR objects.
I have tried several approaches without success. Is there a recommended way to do it?
Creating a video from an OpenGL surface is a little involved, but doable. I think the easiest approach to understand is to use two EGL surfaces: one for the UI and one for the media encoder. The Grafika project on GitHub has a good example of the EGL-level calls needed. I used that as a starting point to work out the modifications needed to the HelloAR sample for ARCore. Since there are quite a few changes, I have broken them down into steps.
Make changes to support writing to external storage
To save the video, the app has to write the file somewhere accessible, which requires the external storage permission.
Declare the permission in the AndroidManifest.xml file:
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>
Then change CameraPermissionHelper.java to request the external storage permission as well as the camera permission. To do this, put the permissions in an array, pass that array when requesting the permissions, and iterate over it when checking the permission state:
private static final String REQUIRED_PERMISSIONS[] = {
Manifest.permission.CAMERA,
Manifest.permission.WRITE_EXTERNAL_STORAGE
};
public static void requestCameraPermission(Activity activity) {
ActivityCompat.requestPermissions(activity, REQUIRED_PERMISSIONS,
CAMERA_PERMISSION_CODE);
}
public static boolean hasCameraPermission(Activity activity) {
for(String p : REQUIRED_PERMISSIONS) {
if (ContextCompat.checkSelfPermission(activity, p) !=
PackageManager.PERMISSION_GRANTED) {
return false;
}
}
return true;
}
public static boolean shouldShowRequestPermissionRationale(Activity activity) {
for(String p : REQUIRED_PERMISSIONS) {
if (ActivityCompat.shouldShowRequestPermissionRationale(activity, p)) {
return true;
}
}
return false;
}
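The all-or-nothing check can be tried off-device. This is only a minimal sketch: `PermissionGate` and `hasAllPermissions` are hypothetical names that are not part of the sample, and the `Predicate<String>` stands in for the `ContextCompat.checkSelfPermission(...) == PERMISSION_GRANTED` test.

```java
import java.util.function.Predicate;

// Hypothetical stand-in for CameraPermissionHelper with no Android dependencies.
final class PermissionGate {
    // The string values that Manifest.permission.CAMERA and
    // Manifest.permission.WRITE_EXTERNAL_STORAGE resolve to.
    static final String[] REQUIRED_PERMISSIONS = {
        "android.permission.CAMERA",
        "android.permission.WRITE_EXTERNAL_STORAGE"
    };

    // Returns true only if every required permission is reported as granted.
    static boolean hasAllPermissions(Predicate<String> isGranted) {
        for (String p : REQUIRED_PERMISSIONS) {
            if (!isGranted.test(p)) {
                return false;
            }
        }
        return true;
    }
}
```

With only the camera permission granted the check fails, which is why the helper has to request both permissions together.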
Add recording to HelloARActivity
Add a simple button and text view to the UI at the bottom of activity_main.xml:
<Button
android:id="@+id/fboRecord_button"
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:layout_alignStart="@+id/surfaceview"
android:layout_alignTop="@+id/surfaceview"
android:onClick="clickToggleRecording"
android:text="@string/toggleRecordingOn"
tools:ignore="OnClick"/>
<TextView
android:id="@+id/nowRecording_text"
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:layout_alignBaseline="@+id/fboRecord_button"
android:layout_alignBottom="@+id/fboRecord_button"
android:layout_toEndOf="@+id/fboRecord_button"
android:text="" />
In HelloARActivity, add member variables for recording:
private VideoRecorder mRecorder;
private android.opengl.EGLConfig mAndroidEGLConfig;
Initialize mAndroidEGLConfig in onSurfaceCreated(). We'll use this config object to create the encoder surface.
EGL10 egl10 = (EGL10)EGLContext.getEGL();
javax.microedition.khronos.egl.EGLDisplay display = egl10.eglGetCurrentDisplay();
int v[] = new int[2];
egl10.eglGetConfigAttrib(display,config, EGL10.EGL_CONFIG_ID, v);
EGLDisplay androidDisplay = EGL14.eglGetCurrentDisplay();
int attribs[] = {EGL14.EGL_CONFIG_ID, v[0], EGL14.EGL_NONE};
android.opengl.EGLConfig myConfig[] = new android.opengl.EGLConfig[1];
EGL14.eglChooseConfig(androidDisplay, attribs, 0, myConfig, 0, 1, v, 1);
this.mAndroidEGLConfig = myConfig[0];
Refactor the onDrawFrame() method so that all the non-drawing code is executed first and the actual drawing is done in a method called draw(). This way, during recording, we can update the ARCore frame, process the input, draw to the UI, and then draw again to the encoder.
@Override
public void onDrawFrame(GL10 gl) {
if (mSession == null) {
return;
}
// Notify ARCore session that the view size changed so that
// the perspective matrix and
// the video background can be properly adjusted.
mDisplayRotationHelper.updateSessionIfNeeded(mSession);
try {
// Obtain the current frame from ARSession. When the
//configuration is set to
// UpdateMode.BLOCKING (it is by default), this will
// throttle the rendering to the camera framerate.
Frame frame = mSession.update();
Camera camera = frame.getCamera();
// Handle taps. Handling only one tap per frame, as taps are
// usually low frequency compared to frame rate.
MotionEvent tap = mQueuedSingleTaps.poll();
if (tap != null && camera.getTrackingState() == TrackingState.TRACKING) {
for (HitResult hit : frame.hitTest(tap)) {
// Check if any plane was hit, and if it was hit inside the plane polygon
Trackable trackable = hit.getTrackable();
if (trackable instanceof Plane
&& ((Plane) trackable).isPoseInPolygon(hit.getHitPose())) {
// Cap the number of objects created. This avoids overloading both the
// rendering system and ARCore.
if (mAnchors.size() >= 20) {
mAnchors.get(0).detach();
mAnchors.remove(0);
}
// Adding an Anchor tells ARCore that it should track this position in
// space. This anchor is created on the Plane to place the 3d model
// in the correct position relative both to the world and to the plane.
mAnchors.add(hit.createAnchor());
// Hits are sorted by depth. Consider only closest hit on a plane.
break;
}
}
}
// Get projection matrix.
float[] projmtx = new float[16];
camera.getProjectionMatrix(projmtx, 0, 0.1f, 100.0f);
// Get camera matrix and draw.
float[] viewmtx = new float[16];
camera.getViewMatrix(viewmtx, 0);
// Compute lighting from average intensity of the image.
final float lightIntensity = frame.getLightEstimate().getPixelIntensity();
// Visualize tracked points.
PointCloud pointCloud = frame.acquirePointCloud();
mPointCloud.update(pointCloud);
draw(frame,camera.getTrackingState() == TrackingState.PAUSED,
viewmtx, projmtx, camera.getDisplayOrientedPose(),lightIntensity);
if (mRecorder!= null && mRecorder.isRecording()) {
VideoRecorder.CaptureContext ctx = mRecorder.startCapture();
if (ctx != null) {
// draw again
draw(frame, camera.getTrackingState() == TrackingState.PAUSED,
viewmtx, projmtx, camera.getDisplayOrientedPose(), lightIntensity);
// restore the context
mRecorder.stopCapture(ctx, frame.getTimestamp());
}
}
// Application is responsible for releasing the point cloud resources after
// using it.
pointCloud.release();
// Check if we detected at least one plane. If so, hide the loading message.
if (mMessageSnackbar != null) {
for (Plane plane : mSession.getAllTrackables(Plane.class)) {
if (plane.getType() ==
com.google.ar.core.Plane.Type.HORIZONTAL_UPWARD_FACING
&& plane.getTrackingState() == TrackingState.TRACKING) {
hideLoadingMessage();
break;
}
}
}
} catch (Throwable t) {
// Avoid crashing the application due to unhandled exceptions.
Log.e(TAG, "Exception on the OpenGL thread", t);
}
}
private void draw(Frame frame, boolean paused,
float[] viewMatrix, float[] projectionMatrix,
Pose displayOrientedPose, float lightIntensity) {
// Clear screen to notify driver it should not load
// any pixels from previous frame.
GLES20.glClear(GLES20.GL_COLOR_BUFFER_BIT | GLES20.GL_DEPTH_BUFFER_BIT);
// Draw background.
mBackgroundRenderer.draw(frame);
// If not tracking, don't draw 3d objects.
if (paused) {
return;
}
mPointCloud.draw(viewMatrix, projectionMatrix);
// Visualize planes.
mPlaneRenderer.drawPlanes(
mSession.getAllTrackables(Plane.class),
displayOrientedPose, projectionMatrix);
// Visualize anchors created by touch.
float scaleFactor = 1.0f;
for (Anchor anchor : mAnchors) {
if (anchor.getTrackingState() != TrackingState.TRACKING) {
continue;
}
// Get the current pose of an Anchor in world space.
// The Anchor pose is
// updated during calls to session.update() as ARCore refines
// its estimate of the world.
anchor.getPose().toMatrix(mAnchorMatrix, 0);
// Update and draw the model and its shadow.
mVirtualObject.updateModelMatrix(mAnchorMatrix, scaleFactor);
mVirtualObjectShadow.updateModelMatrix(mAnchorMatrix, scaleFactor);
mVirtualObject.draw(viewMatrix, projectionMatrix, lightIntensity);
mVirtualObjectShadow.draw(viewMatrix, projectionMatrix, lightIntensity);
}
}
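The shape of that refactoring, reduced to its control flow, might look like the sketch below. It is an illustration only: `TwoPassFrame` and its methods are hypothetical, `update()` stands in for `mSession.update()` plus tap handling, and `draw()` stands in for the real rendering calls. The point is that state changes once per frame while drawing may happen twice.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one frame: state is updated once, then drawn per target.
final class TwoPassFrame {
    // Records the order of operations so the flow can be inspected.
    final List<String> log = new ArrayList<>();
    private int frameCount = 0;

    // Stands in for mSession.update() and input processing.
    private int update() {
        frameCount++;
        log.add("update " + frameCount);
        return frameCount;
    }

    // Stands in for draw(...); it only reads state, so calling it twice is safe.
    private void draw(String target, int frame) {
        log.add("draw " + target + " " + frame);
    }

    void onDrawFrame(boolean recording) {
        int frame = update();
        draw("display", frame);
        if (recording) {
            // Same frame data, second surface.
            draw("encoder", frame);
        }
    }
}
```

Because draw() does not modify the scene, the encoder receives exactly the frame the user sees on screen.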
Handle the toggling of recording:
public void clickToggleRecording(View view) {
Log.d(TAG, "clickToggleRecording");
if (mRecorder == null) {
File outputFile = new File(Environment.getExternalStoragePublicDirectory(
Environment.DIRECTORY_PICTURES) + "/HelloAR",
"fbo-gl-" + Long.toHexString(System.currentTimeMillis()) + ".mp4");
File dir = outputFile.getParentFile();
if (!dir.exists()) {
dir.mkdirs();
}
try {
mRecorder = new VideoRecorder(mSurfaceView.getWidth(),
mSurfaceView.getHeight(),
VideoRecorder.DEFAULT_BITRATE, outputFile, this);
mRecorder.setEglConfig(mAndroidEGLConfig);
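} catch (IOException e) {
Log.e(TAG,"Exception starting recording", e);
// Without a recorder there is nothing to toggle; return to avoid a
// NullPointerException on the toggleRecording() call below.
return;
}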
}
mRecorder.toggleRecording();
updateControls();
}
private void updateControls() {
Button toggleRelease = findViewById(R.id.fboRecord_button);
int id = (mRecorder != null && mRecorder.isRecording()) ?
R.string.toggleRecordingOff : R.string.toggleRecordingOn;
toggleRelease.setText(id);
TextView tv = findViewById(R.id.nowRecording_text);
if (id == R.string.toggleRecordingOff) {
tv.setText(getString(R.string.nowRecording));
} else {
tv.setText("");
}
}
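The output path logic can be checked separately from the Android code. In this sketch, `RecordingFiles` and `newRecordingFile` are hypothetical names, and the `picturesDir` parameter is assumed to stand in for the directory returned by `Environment.getExternalStoragePublicDirectory(Environment.DIRECTORY_PICTURES)`.

```java
import java.io.File;

// Hypothetical helper that mirrors how clickToggleRecording() names its output file.
final class RecordingFiles {
    // Builds <picturesDir>/HelloAR/fbo-gl-<hex timestamp>.mp4 without touching the disk.
    static File newRecordingFile(File picturesDir, long nowMillis) {
        File dir = new File(picturesDir, "HelloAR");
        return new File(dir, "fbo-gl-" + Long.toHexString(nowMillis) + ".mp4");
    }
}
```

Passing the timestamp in as a parameter keeps the naming deterministic, so each recording gets a distinct file as long as the clock advances between recordings.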
Have the activity implement the VideoRecorder.VideoRecorderListener interface to receive video recording state changes:
@Override
public void onVideoRecorderEvent(VideoRecorder.VideoEvent videoEvent) {
Log.d(TAG, "VideoEvent: " + videoEvent);
updateControls();
if (videoEvent == VideoRecorder.VideoEvent.RecordingStopped) {
mRecorder = null;
}
}
Implement the VideoRecorder class to feed images to the encoder
The VideoRecorder class feeds the images to the media encoder. It creates an off-screen EGLSurface from the input surface of the media encoder. The general approach during recording is to draw once for the UI display and then make exactly the same draw call for the media encoder surface.
The constructor takes recording parameters and a listener to push events to during the recording process.
public VideoRecorder(int width, int height, int bitrate, File outputFile,
VideoRecorderListener listener) throws IOException {
this.listener = listener;
mEncoderCore = new VideoEncoderCore(width, height, bitrate, outputFile);
mVideoRect = new Rect(0,0,width,height);
}
The first time a frame is captured, we need to create a new EGL surface for the encoder. On every capture we then notify the encoder that a new frame will be available soon, make the encoder surface the current EGL surface, and return so the caller can make the drawing calls.
public CaptureContext startCapture() {
if (mVideoEncoder == null) {
return null;
}
if (mEncoderContext == null) {
mEncoderContext = new CaptureContext();
mEncoderContext.windowDisplay = EGL14.eglGetCurrentDisplay();
// Create a window surface, and attach it to the Surface we received.
int[] surfaceAttribs = {
EGL14.EGL_NONE
};
mEncoderContext.windowDrawSurface = EGL14.eglCreateWindowSurface(
mEncoderContext.windowDisplay,
mEGLConfig,mEncoderCore.getInputSurface(),
surfaceAttribs, 0);
mEncoderContext.windowReadSurface = mEncoderContext.windowDrawSurface;
}
CaptureContext displayContext = new CaptureContext();
displayContext.initialize();
// Draw for recording, swap.
mVideoEncoder.frameAvailableSoon();
// Make the input surface current
// mInputWindowSurface.makeCurrent();
EGL14.eglMakeCurrent(mEncoderContext.windowDisplay,
mEncoderContext.windowDrawSurface, mEncoderContext.windowReadSurface,
EGL14.eglGetCurrentContext());
// If we don't set the scissor rect, the glClear() we use to draw the
// light-grey background will draw outside the viewport and muck up our
// letterboxing. Might be better if we disabled the test immediately after
// the glClear(). Of course, if we were clearing the frame background to
// black it wouldn't matter.
//
// We do still need to clear the pixels outside the scissor rect, of course,
// or we'll get garbage at the edges of the recording. We can either clear
// the whole thing and accept that there will be a lot of overdraw, or we
// can issue multiple scissor/clear calls. Some GPUs may have a special
// optimization for zeroing out the color buffer.
//
// For now, be lazy and zero the whole thing. At some point we need to
// examine the performance here.
GLES20.glClearColor(0f, 0f, 0f, 1f);
GLES20.glClear(GLES20.GL_COLOR_BUFFER_BIT);
GLES20.glViewport(mVideoRect.left, mVideoRect.top,
mVideoRect.width(), mVideoRect.height());
GLES20.glEnable(GLES20.GL_SCISSOR_TEST);
GLES20.glScissor(mVideoRect.left, mVideoRect.top,
mVideoRect.width(), mVideoRect.height());
return displayContext;
}
When the drawing is complete, the UI surface needs to be made current again for the EGL context:
public void stopCapture(CaptureContext oldContext, long timeStampNanos) {
if (oldContext == null) {
return;
}
GLES20.glDisable(GLES20.GL_SCISSOR_TEST);
EGLExt.eglPresentationTimeANDROID(mEncoderContext.windowDisplay,
mEncoderContext.windowDrawSurface, timeStampNanos);
EGL14.eglSwapBuffers(mEncoderContext.windowDisplay,
mEncoderContext.windowDrawSurface);
// Restore.
GLES20.glViewport(0, 0, oldContext.getWidth(), oldContext.getHeight());
EGL14.eglMakeCurrent(oldContext.windowDisplay,
oldContext.windowDrawSurface, oldContext.windowReadSurface,
EGL14.eglGetCurrentContext());
}
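The save, switch, and restore sequence that startCapture() and stopCapture() implement can be modelled without EGL. This is a sketch under stated assumptions: `SurfaceSwitchModel` is a hypothetical class, and a plain String names whichever surface would be current in the real code.

```java
// Hypothetical model of the capture sequence; a String names the current surface.
final class SurfaceSwitchModel {
    private String current = "display";
    private long lastPresentationTimeNanos = -1;

    String currentSurface() {
        return current;
    }

    long lastPresentationTimeNanos() {
        return lastPresentationTimeNanos;
    }

    // Mirrors startCapture(): remember what was current, then switch to the encoder.
    String startCapture() {
        String previous = current;
        current = "encoder";
        return previous;
    }

    // Mirrors stopCapture(): stamp the encoder frame, then restore the saved surface.
    void stopCapture(String previous, long timestampNanos) {
        if (previous == null) {
            return;
        }
        lastPresentationTimeNanos = timestampNanos;
        current = previous;
    }
}
```

The caller holds on to the value returned by startCapture() and hands it back to stopCapture(), which is the role the CaptureContext object plays in the real code.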
Add some bookkeeping methods:
public boolean isRecording() {
return mRecording;
}
public void toggleRecording() {
if (isRecording()) {
stopRecording();
} else {
startRecording();
}
}
protected void startRecording() {
mRecording = true;
if (mVideoEncoder == null) {
mVideoEncoder = new TextureMovieEncoder2(mEncoderCore);
}
if (listener != null) {
listener.onVideoRecorderEvent(VideoEvent.RecordingStarted);
}
}
protected void stopRecording() {
mRecording = false;
if (mVideoEncoder != null) {
mVideoEncoder.stopRecording();
}
if (listener != null) {
listener.onVideoRecorderEvent(VideoEvent.RecordingStopped);
}
}
public void setEglConfig(EGLConfig eglConfig) {
this.mEGLConfig = eglConfig;
}
public enum VideoEvent {
RecordingStarted,
RecordingStopped
}
public interface VideoRecorderListener {
void onVideoRecorderEvent(VideoEvent videoEvent);
}
The CaptureContext inner class keeps track of the display and surfaces, which makes it easy to switch between the multiple surfaces used with the EGL context:
public static class CaptureContext {
EGLDisplay windowDisplay;
EGLSurface windowReadSurface;
EGLSurface windowDrawSurface;
private int mWidth;
private int mHeight;
public void initialize() {
windowDisplay = EGL14.eglGetCurrentDisplay();
// Query each surface with its matching constant (the original had these swapped).
windowReadSurface = EGL14.eglGetCurrentSurface(EGL14.EGL_READ);
windowDrawSurface = EGL14.eglGetCurrentSurface(EGL14.EGL_DRAW);
int v[] = new int[1];
EGL14.eglQuerySurface(windowDisplay, windowDrawSurface, EGL14.EGL_WIDTH,
v, 0);
mWidth = v[0];
v[0] = -1;
EGL14.eglQuerySurface(windowDisplay, windowDrawSurface, EGL14.EGL_HEIGHT,
v, 0);
mHeight = v[0];
}
/**
* Returns the surface's width, in pixels.
* <p>
* If this is called on a window surface, and the underlying
* surface is in the process
* of changing size, we may not see the new size right away
* (e.g. in the "surfaceChanged"
* callback). The size should match after the next buffer swap.
*/
public int getWidth() {
if (mWidth < 0) {
int v[] = new int[1];
EGL14.eglQuerySurface(windowDisplay,
windowDrawSurface, EGL14.EGL_WIDTH, v, 0);
mWidth = v[0];
}
return mWidth;
}
/**
* Returns the surface's height, in pixels.
*/
public int getHeight() {
if (mHeight < 0) {
int v[] = new int[1];
EGL14.eglQuerySurface(windowDisplay, windowDrawSurface,
EGL14.EGL_HEIGHT, v, 0);
mHeight = v[0];
}
return mHeight;
}
}
Add VideoEncoder classes
The VideoEncoderCore and TextureMovieEncoder2 classes are both copied from Grafika.