
Who determines the ordering of characters

I have a question about the program below:

char ch;
ch = 'z';
while(ch >= 'a')
{
    printf("char is  %c and the value is %d\n", ch, ch);
    ch = ch-1;
}

Why is this program not guaranteed to print the whole set of lowercase letters? If C makes few guarantees about the internal ordering of characters, then who determines that ordering, and how?

Karthik Balaguru asked Jul 26 '10 04:07




1 Answer

The compiler implementor chooses the underlying character set. About the only things the standard requires are that a certain minimal set of characters be available and that the digit characters 0 through 9 be contiguous and in ascending order.

The required characters for a C99 execution environment are A through Z, a through z, 0 through 9 (which must be together and in order), any of !"#%&'()*+,-./:;<=>?[\]^_{|}~, space, horizontal tab, vertical tab, form-feed, alert, backspace, carriage return and new line. This remains unchanged in the current draft of C1x, the next iteration of that standard.

Everything else depends on the implementation.
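The one ordering guarantee mentioned above, that the digits are contiguous and ascending, is what makes the usual digit-to-number idiom safe on any conforming implementation. A minimal sketch (the function name digit_value is just an illustrative choice, not something from the standard):

```c
/* Returns 0..9 for a decimal digit character, or -1 for anything else.
   Relies only on the guarantee that '0'..'9' are contiguous and
   ascending, so it does not depend on ASCII. */
int digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return -1;
}
```

The same trick with 'a' or 'A' as the base has no such guarantee behind it.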

For example, code like:

int isUpperAlpha(char c) {
    return (c >= 'A') && (c <= 'Z');
}

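will misbehave on a mainframe that uses EBCDIC, where the upper-case letters are split into three non-contiguous ranges (A-I, J-R and S-Z), so the range test also accepts the non-letter codes that fall in the gaps.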

Truly portable code will take that into account. All other code should document its dependencies.
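One way to take it into account, as a rough sketch, is to test membership in an explicit list of letters instead of comparing against the endpoints. The name is_upper_alpha_portable is hypothetical, and the standard isupper() from <ctype.h> is usually the better choice in real code:

```c
#include <string.h>

/* Returns nonzero if c is one of the 26 upper-case Latin letters.
   Looks the character up in an explicit list, so it makes no
   assumption about how the letters are laid out in the character set. */
int is_upper_alpha_portable(char c)
{
    static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    /* strchr also matches the terminating '\0', so reject it first. */
    return c != '\0' && strchr(upper, c) != NULL;
}
```

This trades a pair of comparisons for a short linear search, which is rarely a concern for a 26-character table.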

A more portable implementation of your example would be something along the lines of:

static char chrs[] = "zyxwvutsrqponmlkjihgfedcba";
char *pCh = chrs;
while (*pCh != 0) {
    printf ("char is %c and the value is %d\n", *pCh, *pCh);
    pCh++;
}

If you want a truly portable solution, you should probably use islower() from <ctype.h>, since code that checks only the Latin characters won't be portable to (for example) Greek text using Unicode as its underlying character set.
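A rough sketch of the islower() approach: walk every value an unsigned char can hold, from highest to lowest, and keep the ones the current locale classifies as lower case. The function name collect_lowercase and the caller-supplied buffer are assumptions made for illustration. Note that islower() only covers single-byte characters; wide or multibyte text would need something like iswlower() instead.

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Stores in out every character value for which islower() is true in
   the current locale, highest code first, and returns how many were
   stored. out must have room for UCHAR_MAX + 1 entries. */
size_t collect_lowercase(unsigned char *out)
{
    size_t n = 0;
    int c;

    /* islower() takes an int holding an unsigned char value (or EOF),
       so iterating over 0..UCHAR_MAX avoids passing negative chars. */
    for (c = UCHAR_MAX; c >= 0; c--) {
        if (islower(c))
            out[n++] = (unsigned char)c;
    }
    return n;
}
```

The caller can then print each entry with the same printf format as the original loop. In the default "C" locale this yields exactly the 26 Latin lowercase letters, in whatever order the execution character set assigns them.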

paxdiablo answered Nov 12 '22 01:11
