
ASN1 UTF-8 string Decoding

I am writing an ASN.1 parser in C (using the Ericsson ASN1 specification document). I want to decode the UTF-8 string type, but I can't find information about it online, and the document I'm using does not describe UTF-8 strings in detail. Can anybody provide some code, or explain how to decode it?

I am new to ASN.1.

user3148326 asked Mar 08 '15 17:03




1 Answer

If you're trying to parse ASN.1, an excellent introductory resource is Kaliski's ‘A Layman’s Guide to a Subset of ASN.1, BER, and DER’ (available at various places on the web, in HTML and PDF). However, that document doesn't mention the UTF8String type.

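The extra information you need is that UTF8String has universal tag number 12 (decimal, or 0c in hex), and that its contents are simply the bytes of the string in the UTF-8 encoding. As with any other BER/DER value, the tag byte is followed by the length of the contents in bytes (not in characters), and then by the contents themselves.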

Thus the string ‘Helló’ would be encoded as

0c 06 48 65 6c 6c c3 b3
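
Reading that example byte by byte: 0c is the tag, 06 is the length, and the remaining six bytes are the UTF-8 encoding of the string (ó takes two bytes, c3 b3). A minimal sketch of a decoder for this in C might look like the following. All the names here (`utf8_view`, `decode_utf8string`, `utf8_is_valid`) are made up for illustration and don't come from any library. It assumes a primitive, definite-length encoding, which is what DER always produces; BER also permits constructed and indefinite-length forms, and this sketch simply rejects those.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of these come from a real library. */
#define ASN1_TAG_UTF8STRING 0x0c  /* universal tag 12, primitive form */

/* A decoded UTF8String: a view into the input buffer, not NUL-terminated. */
typedef struct {
    const uint8_t *bytes;  /* first content byte */
    size_t len;            /* number of content bytes */
} utf8_view;

/*
 * Decode one primitive, definite-length UTF8String from buf[0..buf_len).
 * Returns the total number of bytes consumed (tag + length + contents)
 * and fills *out, or returns 0 on any error.
 */
size_t decode_utf8string(const uint8_t *buf, size_t buf_len, utf8_view *out)
{
    size_t pos = 0, len = 0;

    if (buf_len < 2 || buf[pos++] != ASN1_TAG_UTF8STRING)
        return 0;

    uint8_t first = buf[pos++];
    if (first < 0x80) {
        len = first;              /* short form: the byte is the length */
    } else {
        size_t n = first & 0x7f;  /* long form: n length bytes follow */
        if (n == 0 || n > sizeof(size_t) || n > buf_len - pos)
            return 0;             /* indefinite, too large, or truncated */
        while (n--)
            len = (len << 8) | buf[pos++];
    }

    if (len > buf_len - pos)
        return 0;                 /* contents run past the end of the buffer */

    out->bytes = buf + pos;
    out->len = len;
    return pos + len;
}

/* Return 1 if s[0..len) is well-formed UTF-8, otherwise 0. */
int utf8_is_valid(const uint8_t *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        uint8_t c = s[i++];
        uint32_t cp, min;
        int extra;

        if (c < 0x80)
            continue;             /* single-byte (ASCII) character */
        else if ((c & 0xe0) == 0xc0) { extra = 1; cp = c & 0x1f; min = 0x80; }
        else if ((c & 0xf0) == 0xe0) { extra = 2; cp = c & 0x0f; min = 0x800; }
        else if ((c & 0xf8) == 0xf0) { extra = 3; cp = c & 0x07; min = 0x10000; }
        else
            return 0;             /* stray continuation or invalid lead byte */

        if ((size_t)extra > len - i)
            return 0;             /* sequence truncated */
        while (extra--) {
            if ((s[i] & 0xc0) != 0x80)
                return 0;         /* not a continuation byte */
            cp = (cp << 6) | (s[i++] & 0x3f);
        }
        /* Reject overlong forms, surrogates, and values above U+10FFFF. */
        if (cp < min || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff))
            return 0;
    }
    return 1;
}
```

Calling `decode_utf8string` on the eight bytes above should give a view of the six content bytes, which you can then check with `utf8_is_valid` and copy into a NUL-terminated buffer if you need a C string. Returning a view rather than a copy keeps the sketch free of allocation; whether that suits your parser depends on how long the input buffer lives.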

(I'm presuming, by the way, that the ‘Ericsson ASN1 specification document’ discusses standard ASN.1, and not some variant.)

Norman Gray answered Sep 19 '22 13:09
