
C# Char from Int used as String - the real equivalent of VB Chr()

I am trying to find a clear answer to my question, and it is not a duplicate of any other question on the site. I have read many posts and related questions on SO and several other sites. For example, this one is the key answer (many others are marked as duplicates and redirect to it): What's the equivalent of VB's Asc() and Chr() functions in C#?

I was converting a VBA macro to C#. In VBA, chr(7) can simply be concatenated to a string, as if chr() yielded a string. Why can't this be done in C#?

Unfortunately, the answer is not clear, and many answers state that this is correct usage:

string mystring=(char)7;

Yet it gives me a compiler error as it does not evaluate as a string.

I had to use this to make it work:

string mystring=((char)7).ToString();

This would really be the equivalent of the VB Chr() function, since Chr() in VB evaluates to a string.

My question is this: do I always need to convert the char to a string explicitly, or are there some cases where it converts implicitly?

UPDATE:

Per @Dirk's answer, this also works:

string mystring = "" + (char)7;

This does not lessen my mystery. If concatenation works, why is there no implicit cast?

I would like a full explanation of the difference between the VB Chr() and its equivalents in C#. I would appreciate any reference I can read up on; even examples would do. Thanks in advance.

ib11 asked May 02 '16 06:05


4 Answers

You are opening Pandora's box with this question. Chr() is a legacy function in VB.NET; any modern code should be using ChrW() instead. The difference is in how the character value is interpreted: ChrW() assumes the character code is Unicode (W = wide). Chr() rolls back the clock to the previous century, a stone age without Unicode where characters were either in the ASCII character set (0..127) or "extended" characters (128..255) that belonged to a code page. Many, many different code pages were in common use. This was a very significant disaster: programs could not properly interpret text generated by another machine located in a different country. Or even in the same country; Japan had multiple code pages in common use with none of them dominant. The result was mojibake.

I'll assume you mean ChrW(), nobody likes mojibake. Not C# either. Using Char.ToString() is fine, the alternative is to use the string constructor that takes a char:

  string mystring = new string((char)7, 1);

Or the more general form you might prefer:

  public static string ChrW(int code) {
      return new string((char)code, 1);
  }

This is not the only way to do it; using literals is possible as well and likely to be what you prefer over a helper method. It is also the basic reason that C# does not need a helper function like Chr(). ASCII control code 7 is the bell character, which BEEPs when you write it to the console, and you can use an escape for that:

  string mystring = "\a";

Not exactly memorable; this comes from Unix. Other ones are "\b" for backspace, "\t" for a tab, "\r" for a carriage return and "\n" for a line feed. A classic trick to erase the last typed character in a console window is Console.Write("\b \b");. The Environment.NewLine property should be noted as well, and that is about as far as you should push it with control characters.

And last but not least, the \U and \u specifiers that let you encode any character:

  string mystring = "\u0007";

Not obvious from the example, but the \u value needs to be hexadecimal, exactly four hex digits. \U takes eight hex digits and is needed when you use code points from the upper Unicode planes, above U+FFFF.
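To make the equivalence concrete, here is a small sketch that builds the bell character in each of the ways mentioned in this answer and then shows a code point above U+FFFF. The class and method names (BellForms, AllForms, Grinning) are made up for illustration, and char.ConvertFromUtf32 is an additional standard option not mentioned above.

```csharp
using System;

public static class BellForms
{
    // Every entry is the same one-character string containing U+0007.
    public static string[] AllForms() => new[]
    {
        ((char)7).ToString(),      // cast, then ToString()
        new string((char)7, 1),    // string constructor: the char repeated once
        "\a",                      // alert escape
        "\u0007",                  // \u takes exactly four hex digits
        "\U00000007",              // \U takes exactly eight hex digits
        char.ConvertFromUtf32(7),  // also handles code points above U+FFFF
    };

    // A code point above U+FFFF does not fit in a single char;
    // the resulting string holds a surrogate pair (two chars).
    public static string Grinning() => "\U0001F600";
}
```

All six forms compare equal as strings, so which one to use is mostly a readability choice. For code points above U+FFFF a (char) cast cannot work at all, which is where \U or char.ConvertFromUtf32 is needed.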

Hans Passant answered Nov 09 '22 17:11


If you absolutely have to use the Chr method for legacy reasons or whatever, the best thing is to call it as a normal method (Microsoft.VisualBasic.Strings.Chr).

If you don't want to import VisualBasic, or want to see how it works, Reflector gives a nice piece of code:

public static char Chr(int CharCode)
{
    char ch;
    if ((CharCode < -32768) || (CharCode > 0xffff))
    {
        throw new ArgumentException(Utils.GetResourceString("Argument_RangeTwoBytes1", new string[] { "CharCode" }));
    }
    if ((CharCode >= 0) && (CharCode <= 0x7f))
    {
        return Convert.ToChar(CharCode);
    }
    try
    {
        int num;
        Encoding encoding = Encoding.GetEncoding(Utils.GetLocaleCodePage());
        if (encoding.IsSingleByte && ((CharCode < 0) || (CharCode > 0xff)))
        {
            throw ExceptionUtils.VbMakeException(5);
        }
        char[] chars = new char[2];
        byte[] bytes = new byte[2];
        Decoder decoder = encoding.GetDecoder();
        if ((CharCode >= 0) && (CharCode <= 0xff))
        {
            bytes[0] = (byte) (CharCode & 0xff);
            num = decoder.GetChars(bytes, 0, 1, chars, 0);
        }
        else
        {
            bytes[0] = (byte) ((CharCode & 0xff00) >> 8);
            bytes[1] = (byte) (CharCode & 0xff);
            num = decoder.GetChars(bytes, 0, 2, chars, 0);
        }
        ch = chars[0];
    }
    catch (Exception exception)
    {
        throw exception;
    }
    return ch;
}

For an ASCII character, it just calls Convert.ToChar, which is equivalent to (char)CharCode. The first interesting thing is the call to Utils.GetLocaleCodePage:

internal static int GetLocaleCodePage()
{
    return Thread.CurrentThread.CurrentCulture.TextInfo.ANSICodePage;
}

Though one might expect it to be the same as Encoding.Default, it creates an encoding associated with the culture of the current thread, not the system. The rest is just stuffing the code into an array and using the encoding to decode it.

This method has one major caveat, as usual when dealing with encoding - it heavily depends on the current locale, and changing the culture of the current thread breaks all conversions for codes outside ASCII. But still, if that's what you want to do, here's a rough and short equivalent:

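public static char Chr(int code)
{
    var encoding = Encoding.GetEncoding(Thread.CurrentThread.CurrentCulture.TextInfo.ANSICodePage);
    // One byte for 0..0xff; otherwise two bytes, high byte first, as in the original.
    // (BitConverter.GetBytes would give the low byte first on little-endian machines.)
    byte[] bytes = (code >= 0 && code <= 0xff)
        ? new[] { (byte)code }
        : new[] { (byte)(code >> 8), (byte)code };
    return encoding.GetChars(bytes)[0];
}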

This lacks some checks of the original method, especially the single-byte and range check.
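The locale dependence is easy to see by decoding the same byte with two different code pages. The sketch below is only an illustration: CodePageDemo and Decode are hypothetical names, and it assumes .NET 5 or later, where legacy code pages such as 1251 and 1252 have to be registered through CodePagesEncodingProvider before Encoding.GetEncoding can find them.

```csharp
using System;
using System.Text;

public static class CodePageDemo
{
    // Decodes a single byte using the given Windows code page.
    public static char Decode(byte code, int codePage)
    {
        // Assumption: running on .NET 5+, where legacy code pages
        // are not available until this provider is registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage).GetChars(new[] { code })[0];
    }
}
```

With this helper, Decode(0xE9, 1252) gives U+00E9 (é) while Decode(0xE9, 1251) gives U+0439 (й), whereas a code in the ASCII range such as 0x41 decodes to 'A' under both. That is the same split Chr makes between codes 0..127 and everything else.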

Then there's a much simpler and much better method in VB.NET - ChrW for Unicode:

public static char ChrW(int CharCode)
{
    if ((CharCode < -32768) || (CharCode > 0xffff))
    {
        throw new ArgumentException(Utils.GetResourceString("Argument_RangeTwoBytes1", new string[] { "CharCode" }));
    }
    return Convert.ToChar((int) (CharCode & 0xffff));
}

This again falls back to ToChar:

public static char ToChar(int value)
{
    if ((value < 0) || (value > 0xffff))
    {
        throw new OverflowException(Environment.GetResourceString("Overflow_Char"));
    }
    return (char) value;
}

As you can see, ChrW is just the same as a plain old char conversion... except for negative values! Although the character code has to fit into two bytes, it may have come from either a signed or an unsigned short, so the method makes sure it yields the right character for both types of origin. If you want to take that into account, just use (char)(CharCode & 0xffff).

So as you can see, Chr is just Encoding.GetChars using the encoding of the current thread's culture, and ChrW is just (char)CharCode, except that both functions also handle negative values. There is no other difference.
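A minimal C# stand-in for ChrW, mirroring the range check and the masking in the decompiled code above, could look like this. VbCompat is a hypothetical class name, and the exception message is invented rather than taken from the VB runtime.

```csharp
using System;

public static class VbCompat
{
    // Sketch of a ChrW equivalent; not the actual VB runtime implementation.
    public static char ChrW(int code)
    {
        if (code < -32768 || code > 0xffff)
            throw new ArgumentException("Code must be between -32768 and 65535.", nameof(code));

        // Masking maps -32768..-1 onto 0x8000..0xFFFF, as if a signed
        // short had been reinterpreted as an unsigned one.
        return (char)(code & 0xffff);
    }
}
```

For non-negative codes this is exactly (char)code; the mask only matters for negative input, so ChrW(-1) yields U+FFFF, where Convert.ToChar(-1) would throw an OverflowException.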


As for the original part of your question, you can't convert from char to string because... there is no possible conversion. They don't inherit from each other, so you can't cast between them; neither type has a user-defined conversion operator to the other; and string is not a primitive value type, so there is no built-in conversion either. VB.NET might allow you to do this, but all in all, it allows many worse things thanks to its ancient versions.

TL;DR Is (char) equivalent to Chr? Only for ASCII character codes (0 to 127), otherwise no. And Chr stops working if the current encoding and the code encoding differ, which matters if you use non-ASCII characters.
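The conversion rules can be summed up in a few one-liners. This is just an illustrative sketch (CharConversions and its method names are made up); the commented-out lines are the ones the compiler rejects.

```csharp
using System;

public static class CharConversions
{
    public static int ToCode(char c) => c;                // char -> int: implicit
    public static char ToChar(int code) => (char)code;    // int -> char: explicit cast required
    public static string ToText(char c) => c.ToString();  // char -> string: a method call, not a conversion

    // Neither of these compiles, because no char -> string conversion exists:
    // public static string Bad1(char c) => c;
    // public static string Bad2(char c) => (string)c;
}
```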

IS4 answered Nov 09 '22 16:11


Just to simplify the syntax, the following AChar struct handles the conversions.

string A = (AChar)65;
Console.WriteLine(A); // output is "A"

The following struct represents a character and defines conversions from the ASCII code page:

struct AChar
{
    public static implicit operator AChar(char value) => new AChar { Value = value };

    public static explicit operator AChar(string value)
    {
        if (string.IsNullOrEmpty(value))
            return '\x0000';

        if (value.Length > 1)
            throw new InvalidCastException("String contains more than 1 character.");

        return value[0];
    }

    public static explicit operator AChar(long value)
    {
        if (value < 0 || value > 0x7F) // ASCII is 0..127; Encoding.ASCII decodes larger bytes as '?'
            throw new InvalidCastException("Char code is out of ASCII range.");

        return (AChar)Encoding.ASCII.GetString(new[] { (byte)value });
    }

    public static implicit operator AChar(byte value) => (AChar)(long)value;
    public static explicit operator AChar(int value) => (AChar)(long)value;

    public static implicit operator char(AChar aChar) => aChar.Value;
    public static implicit operator string(AChar aChar) => aChar.Value.ToString();

    public static bool operator==(AChar left, AChar right) =>
        left.Value == right.Value;

    public static bool operator!=(AChar left, AChar right) =>
        left.Value != right.Value;

    public static bool operator >(AChar left, AChar right) =>
        left.Value > right.Value;

    public static bool operator >=(AChar left, AChar right) =>
        left.Value >= right.Value;

    public static bool operator <(AChar left, AChar right) =>
        left.Value < right.Value;

    public static bool operator <=(AChar left, AChar right) =>
        left.Value <= right.Value;

    public override string ToString() => this;

    public override int GetHashCode() =>    
        Value.GetHashCode();

    public override bool Equals(object obj) =>
        obj is AChar && ((AChar)obj).Value == Value;

    char Value { get; set; }
}

Convert your character code to AChar first; it is compatible with both char and string in C#.

Dmitry Nogin answered Nov 09 '22 17:11


The other answers are pretty complete. There is also this C# trick that you can use to get it in the mood for characters:

string mystring = "" + (char)7;

This works in general for more types that are not directly assignable to a string. It may prove less ugly to you and lets you do more concatenation on the same line.
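As for why this works when a plain assignment does not: no implicit cast is involved. When one operand of + is a string, C# resolves the expression to its predefined string concatenation operators, which accept any object on the other side and call ToString() on it. The sketch below (ConcatDemo and its method names are hypothetical) shows that, along with the pitfall that + between two chars is integer addition.

```csharp
using System;

public static class ConcatDemo
{
    // string + char resolves to the predefined string concatenation operator.
    public static string Prefixed(char c) => "" + c;

    // char + char is integer arithmetic: both operands are promoted to int.
    public static int Sum(char a, char b) => a + b;

    // Evaluated left to right: 'a' + 'b' is the int 195, then 195 + "!" concatenates.
    public static string Surprise() => 'a' + 'b' + "!";

    // Starting with a string keeps every step a concatenation.
    public static string Intended() => "" + 'a' + 'b' + "!";
}
```

Surprise() returns "195!" while Intended() returns "ab!", so the leading "" has to come first for the trick to behave as expected.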

Dirk Bester answered Nov 09 '22 16:11