
String length with embedded null in VB.NET

I have a simple question: how does VB.NET determine a string's length and decide where the string ends?
I know that in C (and its family) a null character marks the end of a string. In VB6 a null character has no effect on where the string ends, but in VB.NET the behavior seems unclear.
Consider this code in VB6:

Private Sub Command1_Click()
Dim Str As String
Str = "Death is a good user," & Chr(0) & " Yes I'm good"
RichTextBox1.Text = Str
RichTextBox1.Text = RichTextBox1.Text & vbNewLine & Len(Str)
End Sub

This is what happens when this code runs: [screenshot]

And it's alright. This is similar code in C:

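#include <stdio.h>
#include <string.h>

int main(int argc, char* argv[])
{
    /* The literal contains an embedded null right after the comma. */
    const char *p = "Death is a good user,\0 Yes I'm good";

    /* printf("%s") and strlen() both stop at the first null; %zu matches size_t. */
    printf("String:%s\nString length:%zu\n", p, strlen(p));

    return 0;
}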

And this is what happens: [screenshot]

Which is fine too, according to C rules. But here is the same code in VB.NET:

Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    Dim str As String = "Death is a good user," & Chr(0) & " Yes I'm good"
    RichTextBox1.Text = str
    RichTextBox1.Text &= vbNewLine & str.Length
End Sub

And this is what happens: [screenshot]

That doesn't seem right!

Edit 1: Writing the string to a file seems to work correctly: [screenshot]

Edit 2: Mark and tcarvin suggest that it may be a problem in the UI control, but that doesn't explain why VB6 shows the whole string!

Edit 3: I know that Windows and its API, UI, and so on are written in C, so it would be normal for them to behave like C, but as I showed above, they don't.

undone asked Dec 16 '22 17:12


2 Answers

In VB.NET (and C#), strings are treated much as they are in VB6: they have an explicit length that does not depend on any particular character or characters they contain.

As for the RichTextBox, it simply appears that it does not support an embedded null character.

tcarvin answered Jan 07 '23 03:01


There are 3 distinct string types used in your snippets by the underlying runtime support libraries:

  • BSTR, used by VB6. It is a COM Automation type, used by all ActiveX controls, that can store a Unicode string and includes an explicit length. A BSTR can therefore store embedded zeros.
  • C strings, used by the C language. No explicit length is stored; a zero indicates end-of-string. The winapi is C based and uses C strings in its functions.
  • System.String, the .NET string type, used in all .NET code. Similar to a BSTR, it also has an explicit length field and can thus store strings with embedded zeros.
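
The difference between a counted string and a C string can be sketched in plain C. This is only an illustration: the counted_string struct and the function names below are hypothetical and are not the real layout of a BSTR or a System.String, which differ in detail (for example, both store Unicode text).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string: the length is stored next to the data,
   loosely analogous to the explicit length kept by BSTR and System.String. */
struct counted_string {
    size_t length;      /* characters in the string, embedded zeros included */
    const char *data;   /* the data is not required to stop at the first zero */
};

/* The string from the question, with a zero in the middle. */
static const char sample_text[] = "Death is a good user,\0 Yes I'm good";

/* Build a counted view over the sample. sizeof also counts the trailing
   terminator the compiler appends to the literal, so subtract one. */
struct counted_string make_sample(void)
{
    struct counted_string s = { sizeof sample_text - 1, sample_text };
    return s;
}

/* Length as a counted string reports it: read the stored field. */
size_t counted_length(struct counted_string s)
{
    return s.length;
}

/* Length as C code sees it: scan until the first zero. */
size_t c_length(struct counted_string s)
{
    return strlen(s.data);
}
```

For the sample text, counted_length(make_sample()) yields 35 while c_length(make_sample()) yields 21: the same bytes, measured two different ways.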

In all three cases interop needs to be used by the underlying runtime support libraries to make the string visible:

  • VB6 uses an ActiveX control for the RichTextBox. Exactly what the control looks like is hard to guess; it is pretty specific to VB6 and is named richtx32.ocx. It does use the native Windows richedit control as well (riched32.dll), so the ActiveX control very much acts as a wrapper that makes the native Windows control usable in a VB6 app. You've conclusively demonstrated that it honors the behavior of a BSTR and handles embedded zeros, like any ActiveX control should.

  • The C program uses the C Runtime Library, which in turn implements printf() by calling a winapi console function, WriteConsole(). This api function is C based, but the buck already stopped at printf(): an embedded zero is a string terminator for that function. No surprises here.

  • The Winforms program uses the .NET RichTextBox class, a managed wrapper for the riched20.dll native Windows control. The underlying mechanism is pinvoke: almost all properties and methods of the class are implemented by pinvoking SendMessage() to send messages like EM_SETTEXTEX, the message that alters the text displayed by the control. This is also a C based api, so a zero acts as a string terminator. Unlike the richtx32.ocx wrapper, the .NET RichTextBox wrapper class makes no effort to properly handle strings with embedded zeros. It simply passes the string as-is and leaves it up to the pinvoke marshaller to convert the .NET string to a C string, which has no option other than to truncate the string at the zero since C strings don't have a length field.
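
The truncation described in the last bullet can be modelled in C. This is a simplified sketch, not the actual pinvoke marshaller or RichTextBox code; marshal_to_c_string and receiver_visible_length are hypothetical names standing in for the conversion step and for a C based api that is handed only a pointer.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for a C-style API (for example a text-setting message handler):
   it receives only a pointer, so it has to scan for the terminator. */
size_t receiver_visible_length(const char *text)
{
    return strlen(text);
}

/* Hypothetical marshalling step: copy `length` characters of a counted
   string into `out` and append a terminator.
   Returns 0 on success, -1 if `out` is too small. */
int marshal_to_c_string(const char *data, size_t length,
                        char *out, size_t out_size)
{
    if (out_size < length + 1)
        return -1;
    memcpy(out, data, length);   /* every character is copied, zeros included */
    out[length] = '\0';
    return 0;
}
```

Marshalling the 35-character string from the question into a large enough buffer copies all 35 characters, yet receiver_visible_length() on that buffer returns 21. The text after the embedded zero is still in memory, but a receiver that only gets a pointer has no way to know it is there.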

Hans Passant answered Jan 07 '23 04:01