Using ASCII delimiters (29-31) in modern programming

Tags:

ascii

I'm currently building a hash key string (collapsed from a map) in which the values are delimited by the ASCII unit separator, character 31 (0x1F).

This nicely solves the problem of guessing which ASCII characters won't appear in the string values, and I don't need to worry about escaping or quoting values.
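A minimal sketch of the kind of key building described above, assuming the map is a `std::map<std::string, std::string>` and that no value ever contains the byte 0x1F. The names `make_key` and `kUnitSep` are hypothetical and only for illustration:

```cpp
#include <map>
#include <string>

// ASCII unit separator (decimal 31, hex 1F).
constexpr char kUnitSep = '\x1F';

// Collapse the values of a map into a single key string, separated by
// kUnitSep. Assumption: no value contains kUnitSep itself.
std::string make_key(const std::map<std::string, std::string>& fields) {
    std::string key;
    bool first = true;
    for (const auto& kv : fields) {
        if (!first) {
            key += kUnitSep;  // separator goes between values only
        }
        key += kv.second;
        first = false;
    }
    return key;
}
```

Because `std::map` iterates in key order, the same set of entries always yields the same key string, whatever order they were inserted in.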

However, reading about its history, it appears to be a relic from the 1960s, and I haven't seen many examples where strings are built and tokenised using this special character, so it all seems too easy.

Are there any issues with using this delimiter in a modern application?

I'm currently doing this in a non-Unicode C++ application, but I'm interested to know how this applies more generally in other languages such as Java and C#, and with Unicode.

211

asked Dec 30 '12 18:12

TownCube

2 Answers

The 128 characters of ASCII are carried over unchanged into the Unicode standard, including characters 0 through 31. The only reason you don't see the special ASCII control characters used in strings very often is human interfacing limitations: they do not visualize well (if at all) when displayed on screen or written to a file, and you can't easily type them on a keyboard either. They're also not allowed in un-escaped form within various popular 'human readable' file formats, such as XML.

For logical processing tasks within a program that do not need end-user interaction, however, they are perfectly suitable for whatever use you can find for them. Your particular use sounds novel and efficient and I think you should definitely run with it.
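To illustrate the Unicode point: in UTF-8, U+001F is encoded as the single byte 0x1F, and every byte of a multi-byte sequence has its high bit set, so a plain byte-wise split cannot cut a character in half. A minimal sketch, assuming the input is UTF-8 held in a `std::string` (the function name `split_on_unit_sep` is hypothetical):

```cpp
#include <string>
#include <vector>

// Split a byte string on the ASCII unit separator (0x1F).
// Safe for UTF-8 input because 0x1F never occurs inside a
// multi-byte UTF-8 sequence.
std::vector<std::string> split_on_unit_sep(const std::string& s) {
    std::vector<std::string> out;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type pos = s.find('\x1F', start);
        if (pos == std::string::npos) {
            out.push_back(s.substr(start));  // last (or only) field
            return out;
        }
        out.push_back(s.substr(start, pos - start));
        start = pos + 1;  // skip past the separator
    }
}
```

Note that an empty input yields one empty field rather than an empty list; whether that is the desired behaviour depends on the application.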

198

answered Oct 01 '22 01:10

jstine

Your application is free to accept whatever binary format it pleases. However, if you need to embed arbitrary binary data in your input, you need to escape whatever delimiters or other special codes your format uses. This is true regardless of which ones you choose.
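As a rough sketch of what such escaping could look like, assuming the values may contain arbitrary bytes: each occurrence of the separator or of the escape marker inside a value is prefixed with the escape marker. The choice of ASCII ESC (0x1B) as the marker and the names `join_escaped` and `split_escaped` are illustrative assumptions, not an established format:

```cpp
#include <cstddef>
#include <string>
#include <vector>

constexpr char kSep = '\x1F';  // ASCII unit separator
constexpr char kEsc = '\x1B';  // escape marker (arbitrary choice for this sketch)

// Join fields with kSep, prefixing any kSep or kEsc inside a field with kEsc.
std::string join_escaped(const std::vector<std::string>& fields) {
    std::string out;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i != 0) out += kSep;
        for (char c : fields[i]) {
            if (c == kSep || c == kEsc) out += kEsc;
            out += c;
        }
    }
    return out;
}

// Inverse of join_escaped for a non-empty list of fields.
std::vector<std::string> split_escaped(const std::string& s) {
    std::vector<std::string> fields(1);
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == kEsc && i + 1 < s.size()) {
            fields.back() += s[++i];  // take the next byte literally
        } else if (s[i] == kSep) {
            fields.emplace_back();    // unescaped separator starts a new field
        } else {
            fields.back() += s[i];
        }
    }
    return fields;
}
```

If the values are guaranteed never to contain 0x1F, this extra layer is unnecessary; it only matters once arbitrary binary data can appear in the input.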

I'd also not ignore Unicode. It's 2012; by now it's rather silly to work with an outdated model for dealing with text. If your input data is textual, handle it as such.

The one issue that comes to mind is why invent another format instead of using XML or JSON; or if you need a compact encoding, a "binary" variant of those two (Fast Infoset, msgpack, who knows what else), or ASN.1? There's probably a whole bunch of other issues that you'll encounter when rolling your own that the design and tooling for those formats already solved.

answered Oct 01 '22 00:10

millimoose
