I'm having serious problems with string-handling. As my problems are rather hard to describe, I will start with some demo code reproducing them:
Dim s1 As String = "hi"
Dim c(30) As Char
c(0) = "h"
c(1) = "i"
Dim s2 As String = CStr(c)
s2 = s2.Trim()
If Not s1 = s2 Then
    MsgBox(s1 + " != " + s2 + Environment.NewLine + _
           "Anything here won't be printed anyway..." + Environment.NewLine + _
           "s1.length: " + s1.Length.ToString + Environment.NewLine + _
           "s2.length: " + s2.Length.ToString + Environment.NewLine)
End If
The resulting message box shows only "hi != hi"; none of the text after s2 appears.
The reason that this comparison fails is that s2 has the length 31 (from the original array-size) while s1 has the length 2.
I stumble over this kind of problem quite often when reading string-information out of byte-arrays, for example when handling ID3Tags from MP3s or other encoded (ASCII, UTF8, ...) information with pre-specified length.
Is there any fast and clean way to prevent this problem?
What is the easiest way to "trim" s2 to the string shown by the debugger?
I changed the variable names for clarity:
Dim myChars(30) As Char
myChars(0) = "h"c ' the c suffix makes a Char literal; under Option Strict
myChars(1) = "i"c ' a String ("h") cannot be narrowed to Char implicitly
Dim myStrA As New String(myChars)
Dim myStrB As String = CStr(myChars)
The short answer is this:
Under the hood, strings are character arrays. The last two lines both create a string, one using .NET code, the other using a VB function. The thing is that, although the array has 31 elements, only 2 were initialized.
The rest are null/Nothing, which for a Char means Chr(0) or NUL. A .NET String stores its own length and does not need a terminator, but the native code that displays text in a MessageBox, the debugger, the Output window, etc. treats NUL as the end of the string, so only the characters up to the first NUL are shown. Text appended to the string will not display either.
Concepts
Since the strings above are created directly from a char array, the length is that of the original array. NUL is a valid Char, so the NULs are included in the string:
Console.WriteLine(myStrA.Length) ' == 31
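To make the extra length visible, here is a minimal sketch that counts the NUL characters in a string built from a partly filled array. The module name NulLengthDemo and the helper CountNuls are made up for illustration; they are not part of the original answer.

```vbnet
Option Strict On
Imports System
Imports Microsoft.VisualBasic

Module NulLengthDemo
    ' Hypothetical helper: counts the Chr(0) characters in a string.
    Function CountNuls(text As String) As Integer
        Dim count As Integer = 0
        For Each ch As Char In text
            If ch = ControlChars.NullChar Then count += 1
        Next
        Return count
    End Function

    Sub Main()
        Dim myChars(30) As Char          ' 31 elements, all Chr(0) initially
        myChars(0) = "h"c
        myChars(1) = "i"c
        Dim myStr As New String(myChars)

        Console.WriteLine(myStr.Length)       ' 31
        Console.WriteLine(CountNuls(myStr))   ' 29
        Console.WriteLine(myStr = "hi")       ' False (binary comparison sees the NULs)
    End Sub
End Module
```

The comparison result assumes the default Option Compare Binary.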
So, why doesn't Trim remove the NUL characters? MSDN (and IntelliSense) tells us:
[Trim] Removes all leading and trailing white-space characters from the current String object.
The trailing null/Chr(0) characters are not white-space like Tab, Lf, Cr or Space; they are control characters.
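This can be checked directly with the Char classification methods. The sketch below (module name is hypothetical) shows that NUL is classified as a control character rather than white-space, so the parameterless Trim leaves the length unchanged:

```vbnet
Option Strict On
Imports System
Imports Microsoft.VisualBasic

Module TrimWhitespaceDemo
    Sub Main()
        Dim nul As Char = ControlChars.NullChar

        Console.WriteLine(Char.IsWhiteSpace(nul))               ' False
        Console.WriteLine(Char.IsControl(nul))                  ' True
        Console.WriteLine(Char.IsWhiteSpace(ControlChars.Tab))  ' True

        ' "hi" followed by 29 NULs, like the string built from the array
        Dim padded As String = "hi" & New String(nul, 29)
        Console.WriteLine(padded.Trim().Length)                 ' 31
    End Sub
End Module
```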
However, String.Trim has an overload which allows you to specify the characters to remove:
myStrA = myStrA.Trim(Convert.ToChar(0))
' using VB namespace constant
myStrA = myStrA.Trim(Microsoft.VisualBasic.ControlChars.NullChar)
You can specify multiple chars:
' nuls and spaces:
myStrA = myStrA.Trim(Convert.ToChar(0), " "c)
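Note that Trim only removes characters at the two ends. If a fixed-length field might hold leftover data after the first NUL (an assumption about your input, not something stated in the question), cutting at the first NUL may be closer to what the debugger shows. A minimal sketch, with CutAtNul as a hypothetical helper name:

```vbnet
Option Strict On
Imports System
Imports Microsoft.VisualBasic

Module CutAtNulDemo
    ' Hypothetical helper: returns the part of text before the first Chr(0),
    ' or the whole text if it contains no Chr(0).
    Function CutAtNul(text As String) As String
        Dim pos As Integer = text.IndexOf(ControlChars.NullChar)
        If pos < 0 Then Return text
        Return text.Substring(0, pos)
    End Function

    Sub Main()
        Dim myChars(30) As Char
        myChars(0) = "h"c
        myChars(1) = "i"c
        myChars(5) = "x"c   ' stale data after the first NUL

        Dim raw As New String(myChars)
        Console.WriteLine(raw.Trim(ControlChars.NullChar).Length)  ' 6: the inner NULs stay
        Console.WriteLine(CutAtNul(raw))                           ' hi
    End Sub
End Module
```

Which of the two you want depends on whether anything after the first NUL is meaningful in your data.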
Strings can be indexed / iterated as a char array:
For n As Int32 = 0 To myStrA.Length - 1
    Console.WriteLine("{0} is '{1}'", n, myStrA(n)) ' or myStrA.Chars(n)
Next
0 is 'h'
1 is 'i'
2 is '
(The output window will not even print the trailing CRLF.) You cannot change the string's char array to change the string data however:
myStrA(2) = "!"c
This will not compile because the Chars property is read-only; strings are immutable.
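If you do need to change a character, the usual route is to build a new string. The sketch below shows two ways, copying to a Char array and using a StringBuilder; the module and helper names are invented for the example.

```vbnet
Option Strict On
Imports System
Imports System.Text

Module ReplaceCharDemo
    ' Hypothetical helper: returns a copy of text with the character at index replaced.
    Function WithCharAt(text As String, index As Integer, replacement As Char) As String
        Dim chars() As Char = text.ToCharArray()   ' a copy, not the string's own storage
        chars(index) = replacement
        Return New String(chars)
    End Function

    Sub Main()
        Dim original As String = "hi?"
        Dim changed As String = WithCharAt(original, 2, "!"c)
        Console.WriteLine(original)   ' hi?  (unchanged)
        Console.WriteLine(changed)    ' hi!

        ' StringBuilder allows in-place edits before producing a string
        Dim sb As New StringBuilder("hi?")
        sb.Chars(2) = "!"c
        Console.WriteLine(sb.ToString())   ' hi!
    End Sub
End Module
```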
See also:
ASCII table
If you want to create strings from a byte array, e.g. ID3v2.4.0 text with ISO-8859-1 encoding, then this should work (for characters in the ASCII range):
Dim s1 As String = "Test"
Dim b() As Byte = New Byte() {84, 101, 115, 116, 0, 0, 0}
Dim s2 As String = System.Text.ASCIIEncoding.ASCII.GetString(b).Trim(ControlChars.NullChar)
If s1 = s2 Then Stop
According to http://id3.org/id3v2.4.0-structure, other encodings may be present, and the code would need to be adjusted if one of them is used.
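As a rough sketch of what that adjustment could look like: as far as I read the structure document, the text encoding byte is 0 for ISO-8859-1, 1 for UTF-16 with a byte order mark, 2 for UTF-16BE without one and 3 for UTF-8. Please check that mapping against the spec before relying on it. DecodeText is a hypothetical helper, and the fallback for a missing byte order mark is my own assumption.

```vbnet
Option Strict On
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module Id3TextDemo
    ' Hypothetical helper: decodes text bytes according to an encoding byte,
    ' then strips trailing NUL terminators/padding.
    Function DecodeText(encodingByte As Byte, data() As Byte) As String
        Dim text As String
        Select Case encodingByte
            Case 0
                text = Encoding.GetEncoding("ISO-8859-1").GetString(data)
            Case 1
                ' UTF-16 with byte order mark: FE FF = big-endian, FF FE = little-endian
                If data.Length >= 2 AndAlso data(0) = &HFE AndAlso data(1) = &HFF Then
                    text = Encoding.BigEndianUnicode.GetString(data, 2, data.Length - 2)
                ElseIf data.Length >= 2 AndAlso data(0) = &HFF AndAlso data(1) = &HFE Then
                    text = Encoding.Unicode.GetString(data, 2, data.Length - 2)
                Else
                    ' Assumption: no byte order mark found, treat as little-endian
                    text = Encoding.Unicode.GetString(data)
                End If
            Case 2
                text = Encoding.BigEndianUnicode.GetString(data)
            Case 3
                text = Encoding.UTF8.GetString(data)
            Case Else
                Throw New ArgumentException("Unknown text encoding byte.")
        End Select
        Return text.TrimEnd(ControlChars.NullChar)
    End Function

    Sub Main()
        Dim latin1() As Byte = {84, 101, 115, 116, 0, 0, 0}
        Console.WriteLine(DecodeText(0, latin1))   ' Test

        Dim utf16() As Byte = {&HFF, &HFE, 84, 0, 101, 0, 115, 0, 116, 0, 0, 0}
        Console.WriteLine(DecodeText(1, utf16))    ' Test
    End Sub
End Module
```

TrimEnd is used here instead of Trim so that only the padding at the end is removed.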