
How to encode East-European (Polish) characters using simple escape sequences?

I'm developing an embedded application in C that has to conform to the MISRA standards. It will use strings containing Polish characters (ąęćłńśźż). I tried encoding them using octal/hex escape sequences:

dictionary[archive_error] = "B" "\x88" "ąd pamieci";

but those are prohibited by Rule 4.1 of MISRA-C:2004, which is a required rule.

My question is: is it possible to encode this character set using only the simple escape sequences of ISO/IEC 9899, and if so, how?

Michał Szydłowski asked Oct 20 '22 14:10



1 Answer

It is not clear which MISRA version you are using.

Rule 4.1 of MISRA-C:2004 simply prohibits non-standard escape sequences. MISRA-C:2004 TC1 later changed it to ban all hexadecimal and octal escape sequences (they have implementation-defined behavior unless you are careful). Apparently the rule and its supposed correction were a bit of a goof-up by the committee.

The rule has been properly fixed in the latest MISRA-C:2012, where rule 4.1 states that octal and hexadecimal escape sequences shall be terminated, either by the start of a new escape sequence or by the end of the string literal, just as in your example.
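To illustrate why termination matters, here is a minimal sketch. The names `msg_split`, `msg_split_len` and `msg_split_second_byte` are hypothetical, the byte 0x88 is simply taken from the question (which glyph it maps to depends on the target's code page), and the rest of the text is plain ASCII to keep the example independent of the source file encoding. A hexadecimal escape consumes every hex digit that follows it, so `"\x88ad"` would be read as the single escape `\x88ad` rather than 0x88 followed by `a` and `d`. Ending the literal right after the escape avoids that:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical message: 'B', the byte 0x88 (value taken from the question),
 * then ASCII text. The literal is split right after the escape, so the
 * 'a' and 'd' that follow cannot be absorbed into the hexadecimal escape. */
static const char msg_split[] = "B" "\x88" "ad pamieci";

/* Returns the number of bytes before the terminating NUL. */
static size_t msg_split_len(void)
{
    return strlen(msg_split);
}

/* Returns the second byte of the message as an unsigned value,
 * regardless of whether plain char is signed on the target. */
static unsigned int msg_split_second_byte(void)
{
    return (unsigned int)(unsigned char)msg_split[1];
}
```

Adjacent string literals are concatenated by the compiler, so the split costs nothing at run time: the array holds exactly the same bytes as a single literal would.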

So the code you have posted does not conform to MISRA-C:2004, but it conforms fully to MISRA-C:2012. If you are using the former, I'd just raise a deviation and refer to MISRA-C:2012 rule 4.1.

Otherwise, a work-around is to simply use character literals mixed with integers, instead of string literals:

dictionary[archive_error] = {'B', 0x88u, 'a', ... , '\0'};
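As a rough sketch of that work-around: a brace-enclosed list is only valid as an initializer in C, not on the right-hand side of an assignment, so the example below defines a separate array instead of assigning to `dictionary[archive_error]`. The names `archive_error_text` and `text_length` are hypothetical, the text is shortened to four characters, and 0x88u is just the byte value from the question; which glyph it displays depends on the target's code page.

```c
#include <stddef.h>

/* Hypothetical table entry: the text is spelled out element by element,
 * so no hexadecimal escape sequence appears in the source.
 * unsigned char is used so that 0x88u fits without a signed conversion. */
static const unsigned char archive_error_text[] = {
    (unsigned char)'B', 0x88u, (unsigned char)'a', (unsigned char)'d',
    (unsigned char)'\0'
};

/* Counts bytes up to, but not including, the terminating zero. */
static size_t text_length(const unsigned char *text)
{
    size_t n = 0u;
    while (text[n] != 0u) {
        n++;
    }
    return n;
}
```

If the existing `dictionary` holds pointers, an entry could then point at such an array, with whatever cast your display routine and your MISRA checker accept; that part depends on declarations the question does not show.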
Lundin answered Oct 22 '22 05:10
