
How to safely split strings?

When we want to split a string for whatever reason, we (or at least I) tend to split on the pipe character (|), since it is very rare for a user or the application itself to put it in a string ... but what happens if it does appear?

Well, the application will simply crash :)

I found out that a colleague uses a non-printable character as the separator instead, for example:

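string combinedString = String.Format(
         "{1}{0}{2}{0}{3}{0}{4}",
         (char)2,          // separator: the non-printable STX character (U+0002)
         myFirstString,
         mySecondString,
         myThirdString,
         myFourthString);

and when we want to extract the whole string into its parts:

string[] parts = combinedString.Split((char)2);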

Is this safe? Should I adopt this way of splitting strings? Is there any other safe technique?

balexandre asked Dec 05 '22 01:12


1 Answer

It may be “safer” than the pipe because it is rarer, but both ways are suboptimal because they limit you to a subset of possible strings: any string that contains the separator character can no longer be round-tripped.
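
To make the problem concrete, here is a minimal sketch of the naive join-and-split round trip. The NaiveSplitDemo class and its RoundTrip method are names made up for this illustration. As soon as one input contains the separator, you get back more parts than you put in, no matter whether the separator is | or (char)2:

```csharp
using System;

public static class NaiveSplitDemo
{
    // Hypothetical helper: joins the items with the separator, then splits again.
    public static string[] RoundTrip(string[] items, char separator)
    {
        string joined = string.Join(separator.ToString(), items);
        return joined.Split(separator);
    }
}
```

Calling NaiveSplitDemo.RoundTrip(new[] { "a|b", "c" }, '|') returns three parts ("a", "b", "c") instead of the original two, and the same happens with (char)2 if any input contains that character.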

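Consider using a proper encoding, one that unambiguously encodes a list of arbitrary strings. The simplest in terms of coding is probably to serialize a string[]. You could use XmlSerializer or something else. BinaryFormatter would also do the job, but be aware that it is marked obsolete in current .NET versions and is discouraged for security reasons.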
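
As a rough sketch of the serialization route, assuming XmlSerializer from System.Xml.Serialization is acceptable for your case, something like the following would do. The StringListXml class and its method names are made up for this example. One caveat: XML 1.0 cannot represent some control characters (such as (char)2), so try it against your real data before relying on it:

```csharp
using System.IO;
using System.Xml.Serialization;

public static class StringListXml
{
    // Hypothetical helper: serializes the whole array into one XML string.
    public static string Serialize(string[] items)
    {
        var serializer = new XmlSerializer(typeof(string[]));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, items);
            return writer.ToString();
        }
    }

    // Hypothetical helper: reads the array back from the XML string.
    public static string[] Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(string[]));
        using (var reader = new StringReader(xml))
        {
            return (string[])serializer.Deserialize(reader);
        }
    }
}
```

The serializer takes care of escaping, so a pipe or an exclamation mark inside a string needs no special handling. The price is a much longer result than a delimiter-based format.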

If the result has to be a string, and it has to be a short one, then you could try something like this:

  • Encoding: (list of strings to single string)
    • Replace every ! with !e and every | with !p in every string. Now, none of the strings contains a | and you can easily reverse this.
    • Concatenate the strings using | as a separator.
  • Decoding: (single string back to list of strings)
    • Split on the | character.
    • Replace all !p with | and !e with ! in every string. This recovers the original strings.
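
The steps above can be sketched in C# roughly like this. PipeListCodec, Encode and Decode are names invented for the example, not part of any library. Note that the order of the replacements matters: escape ! before | when encoding, and undo !p before !e when decoding.

```csharp
using System;
using System.Linq;

public static class PipeListCodec
{
    // Encoding: escape "!" as "!e" first, then "|" as "!p", then join with "|".
    public static string Encode(string[] items)
    {
        return string.Join("|",
            items.Select(s => s.Replace("!", "!e").Replace("|", "!p")).ToArray());
    }

    // Decoding: split on "|", then undo "!p" first and "!e" second.
    public static string[] Decode(string encoded)
    {
        return encoded.Split('|')
            .Select(s => s.Replace("!p", "|").Replace("!e", "!"))
            .ToArray();
    }
}
```

For example, PipeListCodec.Encode(new[] { "a|b", "!p", "", "plain!" }) gives "a!pb|!ep||plain!e", and Decode turns that back into the original four strings. One limitation of this sketch: an empty list and a list containing a single empty string both encode to "", so if you need to tell them apart you have to handle that case separately.
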

Timwi answered Dec 07 '22 13:12