
How does conditional expression compare strings?

Tags: bash

#!/usr/bin/env bash
echo 'Using conditional expression:'
[[ ' ' < '0' ]] && echo ok || echo not ok
[[ ' a' < '0a' ]] && echo ok || echo not ok
echo 'Using test:'
[ ' ' \< '0' ] && echo ok || echo not ok
[ ' a' \< '0a' ] && echo ok || echo not ok

The output is:

Using conditional expression:
ok
not ok
Using test:
ok
ok

bash --version: GNU bash, version 4.2.45(1)-release (x86_64-pc-linux-gnu)

uname -a: Linux linuxmint 3.8.0-19-generic

asked Jul 30 '13 09:07 by updogliu


2 Answers

The Bash manual says:

When used with [[, the < and > operators sort lexicographically using the current locale. The test command sorts using ASCII ordering.

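In other words, [[ compares strings with strcoll(3), which honors the current locale, while test (that is, [) compares them with strcmp(3), which compares byte values.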

Use the following program (strcoll_strcmp.c) to test this:

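#include <stdio.h>
#include <string.h>
#include <locale.h>

int main(int argc, char **argv)
{
    /* Adopt the locale from the environment (LC_ALL, LC_COLLATE, LANG);
       without this call strcoll() behaves as in the "C" locale. */
    setlocale(LC_ALL, "");

    if (argc != 3) {
        fprintf(stderr, "Usage: %s str1 str2\n", argv[0]);
        return 1;
    }

    /* Locale-aware comparison, as used by [[ in bash 4.1 and later. */
    printf("strcoll('%s', '%s'): %d\n",
           argv[1], argv[2], strcoll(argv[1], argv[2]));
    /* Byte-wise comparison, as used by test / [. */
    printf("strcmp('%s', '%s'): %d\n",
           argv[1], argv[2], strcmp(argv[1], argv[2]));

    return 0;
}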

Note the difference:

$ LC_ALL=C ./strcoll_strcmp ' a' '0a'
strcoll(' a', '0a'): -16
strcmp(' a', '0a'): -16

$ LC_ALL=en_US.UTF-8 ./strcoll_strcmp ' a' '0a'
strcoll(' a', '0a'): 10
strcmp(' a', '0a'): -16

I'm not sure exactly why the en_US.UTF-8 locale orders these two strings this way; it must follow from the locale's collation rules. I think the exact rules are described in ISO 14651 (Method for comparing character strings and description of the common template tailorable ordering) and its accompanying template table. Glibc keeps this data in its source tree under libc/localedata/locales.
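
The same difference can be observed from bash itself, without compiling anything. The following is only a sketch: compare_in_locale is a hypothetical helper name, and en_US.UTF-8 is assumed to be installed (check with locale -a). If it is not, bash prints a setlocale warning and the second line is not meaningful.

```bash
#!/usr/bin/env bash
# compare_in_locale LOCALE A B   (hypothetical helper, not part of bash)
# Prints how [[ A < B ]] and [ A \< B ] evaluate with LC_ALL set to LOCALE.
compare_in_locale() (
    # The function body is a subshell, so the locale change does not leak out.
    LC_ALL=$1
    [[ "$2" < "$3" ]] && cond=true || cond=false
    [ "$2" \< "$3" ] && tst=true || tst=false
    printf '%s: [[ ]]=%s test=%s\n' "$1" "$cond" "$tst"
)

compare_in_locale C ' a' '0a'
# The next result depends on the bash version and the installed locale data.
compare_in_locale en_US.UTF-8 ' a' '0a'
```

In the C locale both forms should agree, because strcoll(3) then compares byte values just as strcmp(3) does. Any disagreement on the second line comes from the locale's collation rules.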

answered Sep 22 '22 18:09 by spbnick


The behaviour that you're observing can be explained by the following from the manual:

bash-4.1 and later use the current locale’s collation sequence and strcoll(3).

You seem to be looking for a comparison based on ASCII collation. You can get that behavior from [[ by enabling either the compat32 or the compat40 shell option with shopt.

$ cat test
shopt -s compat40
echo 'Using conditional expression:'
[[ ' ' < '0' ]] && echo ok || echo not ok
[[ ' a' < '0a' ]] && echo ok || echo not ok
echo 'Using test:'
[ ' ' \< '0' ] && echo ok || echo not ok
[ ' a' \< '0a' ] && echo ok || echo not ok
$ bash test
Using conditional expression:
ok
ok
Using test:
ok
ok

From the manual:

compat32
If set, Bash changes its behavior to that of version 3.2 with respect to locale-specific string comparison when using the ‘[[’ conditional command’s ‘<’ and ‘>’ operators. Bash versions prior to bash-4.0 use ASCII collation and strcmp(3); bash-4.1 and later use the current locale’s collation sequence and strcoll(3). 
compat40
If set, Bash changes its behavior to that of version 4.0 with respect to locale-specific string comparison when using the ‘[[’ conditional command’s ‘<’ and ‘>’ operators (see previous item) and the effect of interrupting a command list. 
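
If changing a compatibility option for the whole script is undesirable, another approach (not taken from the manual excerpt above, so treat it as an assumption-laden sketch) is to force the C locale only around the comparison. ascii_lt is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# ascii_lt A B: succeed if A sorts before B in byte (ASCII) order.
# Hypothetical helper; the parentheses make the body a subshell,
# so the LC_ALL assignment is discarded when the function returns.
ascii_lt() (
    LC_ALL=C
    [[ "$1" < "$2" ]]
)

ascii_lt ' a' '0a' && echo ok || echo 'not ok'   # prints "ok"
```

This costs a subshell per comparison, but it leaves both the caller's locale and its compat settings untouched.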

answered Sep 19 '22 18:09 by devnull