Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how is czech "ch" letter stored in mysql and how to get it using substr?

Even "ch" appears as two letters, in czech its considered as one letter and its order in alphabet is after H letter (so correct order is a,b,c,d,e,f,g,h,ch,i,j (I skipped some national characters). But when I do substr (colname, 1, 1) on column containing words begining with ch Im getting only "C"

this sql: SELECT SUBSTRING(title, 1, 1) AS title_truncated FROM node node WHERE node.type in ('termin') GROUP BY title_truncated ORDER BY title_truncated ASC"

returns: A, B, C, D, E, F, G, H, I, J (so no ch).

btw database is using utf8_czech_ci

like image 696
peter Avatar asked Dec 07 '25 04:12

peter


2 Answers

Ch is not a character in itself in Unicode, it is a digraph.

As such, it seems impossible for a database collation to properly map the difference. What @Ladislav says in the comment, and the user in this mySQL internals discussion, seems to support this.

You will probably have to work around this manually, e.g. in your example, using a IF clause that tests for the presence of "Ch", and returns two characters if that is the case.

Reference: utf8_czech_ci collation table (mySQL 6)

like image 103
Pekka Avatar answered Dec 08 '25 17:12

Pekka


Even though ch is considered a single sorting "letter" in Czech, it isn't considered a single "character" in any other way. It is stored and printed as two characters whenever it is encountered.

The collation setting in MySQL affects how strings are sorted, trying to sort individual characters is not very meaningful in many languages. E.g. č comes after (IIRC) c but and ne are equivalent so word ordering depends on the following letters.

I don't understand the underlying problem that you are trying to solve but I think the easiet approach might be to avoid using substring and to sort by title and only output the first "letter" if it changes when you are processing the results.

like image 22
CB Bailey Avatar answered Dec 08 '25 17:12

CB Bailey