Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it OK to use longjmp to break out of qsort?

In In the middle of qsort, is there a way to stop it? some comments mentioned using setjmp/longjmp to break out of a call to qsort() from the comparison function.

The language specification doesn't seem to say what happens in this case. The only constraint on the comparison function explicitly mentioned is that it must return consistent results.

Does the lack of mention mean it's implicitly allowed or is it undefined behavior because the result is not explicitly defined?

I think it's obvious that the state of the array being sorted would be indeterminate. But could it have more wide-ranging impact?

like image 842
Barmar Avatar asked Sep 03 '25 17:09

Barmar


2 Answers

I argue that the behavior is undefined under many, if not all, conditions. (To be precise, I don't think the longjmp in and of itself necessarily causes UB, but I think it will be difficult or impossible to avoid having the program's subsequent execution cause UB.)

This is based on 7.1.4p4 (C23 N3220):

The functions in the standard library are not guaranteed to be reentrant and may modify objects with static or thread storage duration.

Now, the standard does not define "reentrant". But at the least, this forbids a comparison function from calling qsort recursively. So it seems to permit an implementation like

void qsort(...) {
    _Thread_local bool inside_qsort = false;
    if (inside_qsort)
        summon_nasal_demons();
    inside_qsort = true;
    // ... body ...
    inside_qsort = false;
}

If longjmp were called from the comparison function, 7.13.3.1/2-3 (as mentioned in other answers) actually guarantees that inside_qsort must retain the value true, such that any future calls to qsort would summon the demons. So at the very least, I think we can say that if you longjmp out of qsort, you cannot safely call qsort again during the execution of the program.

A trickier question is whether you can safely call any other library function. The common situation where the non-reentrancy of 7.1.4p4 comes up is signal handlers. In that setting, it's understood that it forbids calling any library function from a signal handler that may have interrupted a library function - not just the same one. For instance, even if you can prove that the only library function your signal handler could have interrupted was fgetc, that does not mean the handler can safely call fscanf.

Based on that logic, we might argue that if you longjmp out of qsort, it is undefined behavior to call any library function from that point onward, including such functions as exit and abort, which are hard to avoid (recall that returning from main is equivalent to calling exit, 5.1.2.3.4p1).

On the other hand, that same logic would lead us to the conclusion that it is unsafe to call any library function from within a qsort comparison function. And it's obvious that at least some library functions were intended to be guaranteed safe in that context, such for instance strcmp, although this is nowhere explicitly stated. So perhaps there would be some library functions that you could safely call after longjmp out of qsort, but the standard never says what they are.

like image 188
Nate Eldredge Avatar answered Sep 05 '25 06:09

Nate Eldredge


The language specification doesn't seem to say what happens in this case. The only constraint on the comparison function explicitly mentioned is that it must return consistent results.

qsort() is distinguished by being one of the very few standard library functions that relies on a user-supplied callback. You are right that its specifications don't say what happens if a call to the comparison callback does not return, so there is at least room for question. But this seems only slightly distinguished from the case where the same thing happens in a user-defined callback-based sort function. I don't see a good reason not to expect the behavior to follow longjmp()'s specifications:

The longjmp function restores the environment saved by the most recent invocation of the setjmp macro in the same invocation of the program with the corresponding jmp_buf argument. [...]

All accessible objects have values, and all other components of the abstract machine have state, as of the time the longjmp function was called, except that the representation of objects of automatic storage duration that are local to the function containing the invocation of the corresponding setjmp macro that do not have volatile-qualified type and have been changed between the setjmp invocation and longjmp call is indeterminate.

(C23 7.13.3.1/2-3)

I take that as a definition of what happens, so the behavior is not undefined.

As you observe, however, we must allow that qsort() may have made unspecified changes of substantially any nature to the contents of the array prior to the longjmp call. It may also be that qsort() dynamically allocates memory, which would presumably be leaked by the longjmp. These are the kinds of issues that attend longjmp calls generally.

like image 25
John Bollinger Avatar answered Sep 05 '25 07:09

John Bollinger



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!