In In the middle of qsort, is there a way to stop it? some comments mentioned using setjmp/longjmp
to break out of a call to qsort()
from the comparison function.
The language specification doesn't seem to say what happens in this case. The only constraint on the comparison function explicitly mentioned is that it must return consistent results.
Does the lack of mention mean it's implicitly allowed or is it undefined behavior because the result is not explicitly defined?
I think it's obvious that the state of the array being sorted would be indeterminate. But could it have more wide-ranging impact?
I argue that the behavior is undefined under many, if not all, conditions. (To be precise, I don't think the longjmp
in and of itself necessarily causes UB, but I think it will be difficult or impossible to avoid having the program's subsequent execution cause UB.)
This is based on 7.1.4p4 (C23 N3220):
The functions in the standard library are not guaranteed to be reentrant and may modify objects with static or thread storage duration.
Now, the standard does not define "reentrant". But at the least, this forbids a comparison function from calling qsort
recursively. So it seems to permit an implementation like
void qsort(...) {
_Thread_local bool inside_qsort = false;
if (inside_qsort)
summon_nasal_demons();
inside_qsort = true;
// ... body ...
inside_qsort = false;
}
If longjmp
were called from the comparison function, 7.13.3.1/2-3 (as mentioned in other answers) actually guarantees that inside_qsort
must retain the value true
, such that any future calls to qsort
would summon the demons. So at the very least, I think we can say that if you longjmp
out of qsort
, you cannot safely call qsort
again during the execution of the program.
A trickier question is whether you can safely call any other library function. The common situation where the non-reentrancy of 7.1.4p4 comes up is signal handlers. In that setting, it's understood that it forbids calling any library function from a signal handler that may have interrupted a library function - not just the same one. For instance, even if you can prove that the only library function your signal handler could have interrupted was fgetc
, that does not mean the handler can safely call fscanf
.
Based on that logic, we might argue that if you longjmp
out of qsort
, it is undefined behavior to call any library function from that point onward, including such functions as exit
and abort
, which are hard to avoid (recall that returning from main
is equivalent to calling exit
, 5.1.2.3.4p1).
On the other hand, that same logic would lead us to the conclusion that it is unsafe to call any library function from within a qsort
comparison function. And it's obvious that at least some library functions were intended to be guaranteed safe in that context, such for instance strcmp
, although this is nowhere explicitly stated. So perhaps there would be some library functions that you could safely call after longjmp
out of qsort
, but the standard never says what they are.
The language specification doesn't seem to say what happens in this case. The only constraint on the comparison function explicitly mentioned is that it must return consistent results.
qsort()
is distinguished by being one of the very few standard library functions that relies on a user-supplied callback. You are right that its specifications don't say what happens if a call to the comparison callback does not return, so there is at least room for question. But this seems only slightly distinguished from the case where the same thing happens in a user-defined callback-based sort function. I don't see a good reason not to expect the behavior to follow longjmp()
's specifications:
The
longjmp
function restores the environment saved by the most recent invocation of thesetjmp
macro in the same invocation of the program with the correspondingjmp_buf
argument. [...]All accessible objects have values, and all other components of the abstract machine have state, as of the time the
longjmp
function was called, except that the representation of objects of automatic storage duration that are local to the function containing the invocation of the correspondingsetjmp
macro that do not havevolatile
-qualified type and have been changed between thesetjmp
invocation andlongjmp
call is indeterminate.
(C23 7.13.3.1/2-3)
I take that as a definition of what happens, so the behavior is not undefined.
As you observe, however, we must allow that qsort()
may have made unspecified changes of substantially any nature to the contents of the array prior to the longjmp
call. It may also be that qsort()
dynamically allocates memory, which would presumably be leaked by the longjmp
. These are the kinds of issues that attend longjmp
calls generally.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With