I have an app that streams video using Kickflip and ButterflyTV libRTMP.
About 99% of the time the app works fine, but from time to time I get a native segmentation fault that I am not able to debug, since the messages are too cryptic:
01-24 10:52:25.576 199-199/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
01-24 10:52:25.576 199-199/? A/DEBUG: Build fingerprint: 'google/hammerhead/hammerhead:6.0.1/M4B30Z/3437181:user/release-keys'
01-24 10:52:25.576 199-199/? A/DEBUG: Revision: '11'
01-24 10:52:25.576 199-199/? A/DEBUG: ABI: 'arm'
01-24 10:52:25.576 199-199/? A/DEBUG: pid: 14302, tid: 14382, name: MuxerThread >>> tv.myapp.broadcast.dev <<<
01-24 10:52:25.576 199-199/? A/DEBUG: signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0x9fef1000
01-24 10:52:25.636 199-199/? A/DEBUG: Abort message: 'Setting to ready!'
01-24 10:52:25.636 199-199/? A/DEBUG: r0 9c6f9500 r1 9c6f94fc r2 9fee900c r3 00007ff4
01-24 10:52:25.636 199-199/? A/DEBUG: r4 9fee9010 r5 9fef0ffd r6 00007ff1 r7 9fef0d88
01-24 10:52:25.636 199-199/? A/DEBUG: r8 cfe40980 r9 9e0a6900 sl 00007ff4 fp 9c6f94fc
01-24 10:52:25.636 199-199/? A/DEBUG: ip 9c6f9058 sp 9c6f94dc lr 000000e9 pc b3a33cb6 cpsr 800f0030
01-24 10:52:25.650 199-199/? A/DEBUG: backtrace:
01-24 10:52:25.651 199-199/? A/DEBUG: #00 pc 00004cb6 /data/app/tv.myapp.broadcast.dev-2/lib/arm/librtmp-jni.so
01-24 10:52:25.651 199-199/? A/DEBUG: #01 pc 00005189 /data/app/tv.myapp.broadcast.dev-2/lib/arm/librtmp-jni.so (rtmp_sender_write_video_frame+28)
01-24 10:52:25.651 199-199/? A/DEBUG: #02 pc 00005599 /data/app/tv.myapp.broadcast.dev-2/lib/arm/librtmp-jni.so (Java_net_butterflytv_rtmp_1client_RTMPMuxer_writeVideo+60)
01-24 10:52:25.651 199-199/? A/DEBUG: #03 pc 014e84e7 /data/app/tv.myapp.broadcast.dev-2/oat/arm/base.odex (offset 0xa66000) (int net.butterflytv.rtmp_client.RTMPMuxer.writeVideo(byte[], int, int, int)+122)
01-24 10:52:25.651 199-199/? A/DEBUG: #04 pc 014dbd55 /data/app/tv.myapp.broadcast.dev-2/oat/arm/base.odex (offset 0xa66000) (void io.kickflip.sdk.av.muxer.RtmpMuxerMix.writeThread()+2240)
01-24 10:52:25.651 199-199/? A/DEBUG: #05 pc 014d8c41 /data/app/tv.myapp.broadcast.dev-2/oat/arm/base.odex (offset 0xa66000) (void io.kickflip.sdk.av.muxer.RtmpMuxerMix.access$000(io.kickflip.sdk.av.muxer.RtmpMuxerMix)+60)
01-24 10:52:25.651 199-199/? A/DEBUG: #06 pc 014d819f /data/app/tv.myapp.broadcast.dev-2/oat/arm/base.odex (offset 0xa66000) (void io.kickflip.sdk.av.muxer.RtmpMuxerMix$1.run()+98)
01-24 10:52:25.651 199-199/? A/DEBUG: #07 pc 721e78d1 /data/dalvik-cache/arm/system@[email protected] (offset 0x1ed6000)
Again, in a 2 hour stream this might not ever happen or it might happen 10 minutes into the stream. It is super hard to debug because I cannot force the bug to happen.
Is there any way to improve the debugging information I get? What exactly does SEGV_ACCERR mean? I've read that this "means you tried to access an address that you don't have permission to access", but I am unsure what that means in practice, as I can stream for hours without the bug happening.
Is there any way to catch the signal and just continue?
EDIT: to add more information, this is the part of the native library where the app crashes (found using ndk-stack):
JNIEXPORT jint JNICALL
Java_net_butterflytv_rtmp_1client_RTMPMuxer_writeVideo(JNIEnv *env, jobject instance,
jbyteArray data_, jint offset, jint length,
jint timestamp) {
jbyte *data = (*env)->GetByteArrayElements(env, data_, NULL);
jint result = rtmp_sender_write_video_frame(data, length, timestamp, 0, 0);
(*env)->ReleaseByteArrayElements(env, data_, data, 0);
return result;
}
int rtmp_sender_write_video_frame(uint8_t *data,
int size,
uint64_t dts_us,
int key,
uint32_t abs_ts)
{
uint8_t * buf;
uint8_t * buf_offset;
int val = 0;
int total;
uint32_t ts;
uint32_t nal_len;
uint32_t nal_len_n;
uint8_t *nal;
uint8_t *nal_n;
char *output ;
uint32_t offset = 0;
uint32_t body_len;
uint32_t output_len;
buf = data;
buf_offset = data;
total = size;
ts = (uint32_t)dts_us;
//ts = RTMP_GetTime() - start_time;
offset = 0;
nal = get_nal(&nal_len, &buf_offset, buf, total);
(...)
}
static uint8_t * get_nal(uint32_t *len, uint8_t **offset, uint8_t *start, uint32_t total)
{
uint32_t info;
uint8_t *q ;
uint8_t *p = *offset;
*len = 0;
if ((p - start) >= total)
return NULL;
while(1) {
info = find_start_code(p, 3);
if (info == 1)
break;
p++;
if ((p - start) >= total)
return NULL;
}
q = p + 4;
p = q;
while(1) {
info = find_start_code(p, 3);
if (info == 1)
break;
p++;
if ((p - start) >= total)
//return NULL;
break;
}
*len = (p - q);
*offset = p;
return q;
}
static uint32_t find_start_code(uint8_t *buf, uint32_t zeros_in_startcode)
{
uint32_t info;
uint32_t i;
info = 1;
if ((info = (buf[zeros_in_startcode] != 1)? 0: 1) == 0)
return 0;
for (i = 0; i < zeros_in_startcode; i++)
if (buf[i] != 0)
{
info = 0;
break;
};
return info;
}
The crash happens at buf[zeros_in_startcode] in find_start_code. I have removed a few android_log lines as well (I don't think this matters?).
To my understanding, this buffer should be accessible, so it makes no sense that it crashes only "sometimes".
PS. this is where I call the native code from Java:
private void writeThread() {
while (true) {
Frame frame = null;
synchronized (mBufferLock) {
if (!mConfigBuffer.isEmpty()) {
frame = mConfigBuffer.peek();
} else if (!mBuffer.isEmpty()) {
frame = mBuffer.remove();
}
if (frame == null) {
try {
mBufferLock.wait();
} catch (InterruptedException e) {
}
}
}
if (frame == null) {
continue;
} else if (frame instanceof Sentinel) {
break;
}
int writeResult = 0;
synchronized (mWriteFence) {
if (!mConnected) {
debug(WARN, "Skipping frame due to disconnection");
continue;
}
if (frame.getFrameType() == Frame.VIDEO_FRAME) {
writeResult = mRTMPMuxer.writeVideo(frame.getData(), frame.getOffset(), frame.getSize(), frame.getTime());
} else if (frame.getFrameType() == Frame.AUDIO_FRAME) {
writeResult = mRTMPMuxer.writeAudio(frame.getData(), frame.getOffset(), frame.getSize(), frame.getTime());
}
if (writeResult < 0) {
mRtmpListener.onDisconnected();
mConnected = false;
} else {
//Now we remove the config frame, only if sending was successful!
if (frame.isConfig()) {
synchronized (mBufferLock) {
mConfigBuffer.remove();
}
}
}
}
}
}
Note that the crash happens even when I don't send audio at all.
"You can store the data in a byte[]. This allows very fast access from managed code. On the native side, however, you're not guaranteed to be able to access the data without having to copy it."
See https://developer.android.com/training/articles/perf-jni.html
Some musings and things to try:
- The frame data may have been removed, damaged, locked, or moved by the time the native code reads it.
- Try passing the frame variable info (using ByteBuffer) to mRTMPMuxer.writeVideo.
- Unlike byte buffers, in a direct ByteBuffer the storage is not allocated on the managed heap, and can always be accessed directly from native code.
Java side:
//allocates memory from the native heap
ByteBuffer data = ByteBuffer.allocateDirect(frame.getData().length);
data.clear();
//System.gc();
//copy data into the direct buffer (put, not get)
data.put(frame.getData(), 0, frame.getData().length);
//data = (frame.getData() == null) ? null : frame.getData().clone();
int offset = frame.getOffset();
int size = frame.getSize();
int time = frame.getTime();
writeResult = mRTMPMuxer.writeVideo(data, offset, size, time);
Native side (the Java declaration of writeVideo must be changed to take a ByteBuffer as well):
JNIEXPORT jint JNICALL
Java_net_butterflytv_rtmp_1client_RTMPMuxer_writeVideo(JNIEnv *env, jobject instance,
        jobject data_, //NOT jbyteArray data_
        jint offset, jint length, jint timestamp) {
    //GetDirectBufferAddress NOT GetByteArrayElements
    jbyte *data = (*env)->GetDirectBufferAddress(env, data_);
    jint result = rtmp_sender_write_video_frame(data, length, timestamp, 0, 0);
    //no ReleaseByteArrayElements call is needed for a direct buffer
    return result;
}
Debugging
Some code from the SO question "Catching exceptions thrown from native code". Note that try/catch only handles C++ exceptions; SIGSEGV is a signal, not an exception, so this will not intercept the crash itself. It also requires compiling the file as C++ and having a JNIEnv *env available inside the function:
static uint32_t find_start_code(uint8_t *buf, uint32_t zeros_in_startcode){
    //...
    try {
        if ((info = (buf[zeros_in_startcode] != 1)? 0: 1) == 0) return 0;//your code
    }
    // You can catch std::exception for more generic error handling
    catch (const std::exception &e){
        throwJavaException (env, e.what());//see method below
    }
    //...
}
Then a new method:
void throwJavaException(JNIEnv *env, const char *msg)
{
    // You can put your own exception here
    jclass c = env->FindClass("java/lang/RuntimeException");
    if (NULL == c)
    {
        //B plan: null pointer ...
        c = env->FindClass("java/lang/NullPointerException");
    }
    env->ThrowNew(c, msg);
}
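On the question of catching the signal: a handler can be installed with sigaction, but simply continuing past a SIGSEGV is generally not safe, because returning from the handler re-runs the faulting instruction. What a handler can do is record extra information first. The following is a minimal sketch in plain C, not something taken from either library. The names segv_code_name, segv_logger and install_segv_logger are hypothetical, and the assumption is that the previously installed handler (on Android, presumably the one that produces the crash dump) should still run afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Handler that was installed before ours; restored so the usual crash
 * report is still produced (assumption: one exists and is wanted). */
static struct sigaction previous_action;

/* Maps the si_code of a SIGSEGV to its symbolic name. */
static const char *segv_code_name(int code)
{
    switch (code) {
    case SEGV_MAPERR: return "SEGV_MAPERR"; /* address not mapped */
    case SEGV_ACCERR: return "SEGV_ACCERR"; /* mapped, access not permitted */
    default:          return "SEGV_(other)";
    }
}

/* write() is async-signal-safe; printf-style logging is not guaranteed to be. */
static void safe_write(const char *s)
{
    ssize_t n = write(STDERR_FILENO, s, strlen(s));
    (void)n;
}

static void segv_logger(int sig, siginfo_t *info, void *ucontext)
{
    (void)ucontext;
    safe_write("SIGSEGV: ");
    safe_write(segv_code_name(info->si_code));
    safe_write("\n");
    /* Put the previous handler back and return: the faulting instruction
     * runs again and the original crash handling takes over. */
    sigaction(sig, &previous_action, NULL);
}

/* Returns 0 on success, -1 on failure (as sigaction does). */
static int install_segv_logger(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = segv_logger;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, &previous_action);
}
```

Calling install_segv_logger() once, for example from JNI_OnLoad, would be enough. The handler deliberately does not try to resume the stream; it only adds a line of diagnostics before the normal crash path.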
Don't get too hung up on SEGV_ACCERR; you have a segmentation fault, SIGSEGV (caused by a program trying to read or write an illegal memory location; a read in your case).
From siginfo.h:
SEGV_MAPERR means you tried to access an address that doesn't map to anything. SEGV_ACCERR means you tried to access an address that you don't have permission to access.
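One possible reading of the trace, offered as a hypothesis rather than a confirmed diagnosis: the fault address 0x9fef1000 is page-aligned and equals r5 (0x9fef0ffd) plus 3. That would be consistent with find_start_code reading buf[3] just past the end of the frame, since get_nal only checks (p - start) >= total before calling it and so allows reads up to three bytes beyond the buffer. Such an over-read would only fault when the buffer happens to end right before an inaccessible page, which would fit the "sometimes" behaviour. A minimal sketch of a bounds-checked search follows; is_start_code_at and find_next_start_code are hypothetical names, not part of librtmp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the 4-byte start code 00 00 00 01 begins at buf[pos],
 * else 0. Never reads at or beyond buf[total]. */
static int is_start_code_at(const uint8_t *buf, size_t pos, size_t total)
{
    assert(buf != NULL);
    if (total < 4 || pos > total - 4)
        return 0;   /* fewer than 4 bytes remain, so no match is possible */
    return buf[pos] == 0 && buf[pos + 1] == 0 &&
           buf[pos + 2] == 0 && buf[pos + 3] == 1;
}

/* Returns the index of the first start code at or after `from`,
 * or `total` if there is none. */
static size_t find_next_start_code(const uint8_t *buf, size_t from, size_t total)
{
    size_t pos;
    for (pos = from; pos < total; pos++)
        if (is_start_code_at(buf, pos, total))
            return pos;
    return total;
}
```

The key difference from the original loop is that the length check happens before any byte is read, so the last three positions of the frame are rejected instead of dereferenced.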
This may be of interest:
Q: I noticed that there was RTMP support. But a patch which remove RTMP had been merged.
Q: Could you tell me why ?
A: We don't think RTMP serves the mobile broadcasting use case as well as HLS,
A: and so we don't want to dedicate our limited resources towards supporting it.
see: https://github.com/Kickflip/kickflip-android-sdk/issues/33
I suggest you register an issue with:
https://github.com/Kickflip/kickflip-android-sdk/issues
https://github.com/ButterflyTV/LibRtmp-Client-for-Android/issues