Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

decimal.TryParse is happily accepting badly formatted number strings

Tags:

c#

parsing

Is there a way to make the C# TryParse() functions a little more... strict ?

Right now, if you pass in a string containing numbers, the correct decimal & thousand separator characters, it often just seems to accept them, even if the format doesn't make sense, eg: 123''345'678

I'm looking for a way to make TryParse not be successful if the number isn't in the right format.

So, I'm based in Zurich, and if I do this:

decimal exampleNumber = 1234567.89m;
Trace.WriteLine(string.Format("Value {0} gets formatted as: \"{1:N}\"", exampleNumber, exampleNumber));

...then, with my regional settings, I get this...

Value 1234567.89 gets formatted as: "1'234'567.89"

So you can see that, for my region, the decimal place character is a full-stop and the thousand-separator is an apostrophe.

Now, let's create a simple function to test whether a string can be parsed into a decimal:

private void ParseTest(string str)
{
    decimal val = 0;
    if (decimal.TryParse(str, out val))
        Trace.WriteLine(string.Format("Parsed \"{0}\" as {1}", str, val));
    else
        Trace.WriteLine(string.Format("Couldn't parse: \"{0}\"", str));
}

Okay, let's call this function with a few strings.

Which of the following strings would you think would get successfully parsed by this function ?

Below are the results I got:

ParseTest("123345.67");         //  1. Parsed "123345.67" as 123345.67
ParseTest("123'345.67");        //  2. Parsed "123'345.67" as 123345.67
ParseTest("123'345'6.78");      //  3. Parsed "123'345'6.78" as 1233456.78
ParseTest("1''23'345'678");     //  4. Parsed "1''23'345'678" as 123345678
ParseTest("'1''23'345'678");    //  5. Couldn't parse: "'1''23'345'678"
ParseTest("123''345'678");      //  6. Parsed "123''345'678" as 123345678
ParseTest("123'4'5'6.7.89");    //  7. Couldn't parse: "123'4'5'6.7.89"
ParseTest("'12'3'45'678");      //  8. Couldn't parse: "'12'3'45'678"

I think you can see my point.

To me, only the first two strings should've parsed successfully. The others should've all failed, as they don't have 3-digits after a thousand separator, or have two apostrophes together.

Even if I change the ParseTest to be a bit more specific, the results are exactly the same. (For example, it happily accepts "123''345'678" as a valid decimal.)

private void ParseTest(string str)
{
    decimal val = 0;
    var styles = (NumberStyles.AllowDecimalPoint | NumberStyles.AllowThousands);

    if (decimal.TryParse(str, styles, CultureInfo.CurrentCulture, out val))
        Trace.WriteLine(string.Format("Parsed \"{0}\" as {1}", str, val));
    else
        Trace.WriteLine(string.Format("Couldn't parse: \"{0}\"", str));
}

So, is there a straightforward way to not allow badly formatted strings to be accepted by TryParse ?

Update

Thanks for all of the suggestions.

Perhaps I should clarify: what I'm looking for is for the first two of these strings to be valid, but the third one to be rejected.

ParseTest("123345.67");
ParseTest("123'456.67");
ParseTest("12'345'6.7");

Surely there must be a way to use "NumberStyles.AllowThousands" so it can optionally allow thousand-separators but make sure the number format does make sense ?

Right now, if I use this:

if (decimal.TryParse(str, styles, CultureInfo.CurrentCulture, out val))

I get these results:

Parsed "123345.67" as 123345.67
Parsed "123'456.67" as 123456.67
Parsed "12'345'6.7" as 123456.7

And if I use this:

if (decimal.TryParse(str, styles, CultureInfo.InvariantCulture, out val))

I get these results:

Parsed "123345.67" as 123345.67
Couldn't parse: "123'456.67"
Couldn't parse: "12'345'6.7"

This is my problem... regardless of CultureInfo settings, that third string should be rejected, and the first two accepted.

like image 801
Mike Gledhill Avatar asked Oct 13 '15 12:10

Mike Gledhill


People also ask

What does Decimal TryParse do?

TryParse(String, Decimal) Converts the string representation of a number to its Decimal equivalent. A return value indicates whether the conversion succeeded or failed.

Can you TryParse a string?

TryParse(String, Single) Converts the string representation of a number to its single-precision floating-point number equivalent. A return value indicates whether the conversion succeeded or failed.


3 Answers

The easiest way to tell if it is correctly formatted based on the current culture would be to compare the resulting number after formatting with the original string.

//input = "123,456.56" -- true
//input = "123,4,56.56" -- false
//input = "123456.56" -- true
//input = "123,,456.56" -- false
string input = "123456.56";
decimal value;

if(!decimal.TryParse(input, out value))
{
    return false;
}

return (value.ToString("N") == input || value.ToString() == input);

This will succeed for inputs that completely omit thousand separators and inputs that specify correct thousand separators.

If you need it to accept a range of decimal places then you would need to grab the number of characters after the decimal separator and append it to the "N" format string.

like image 67
ndonohoe Avatar answered Oct 14 '22 12:10

ndonohoe


Putting together all the useful suggestions here, here's what I ended up using.

It's not perfect, but, for my corporate app, it does at least reject numeric-strings which "don't look right".

Before I present my code, here's the differences between what my TryParseExact function will accept, and what the regular decimal.TryParse would accept:

enter image description here

And here's my code.

I'm sure there's a more efficient way of doing some of this, using regex or something, but this is sufficient for my needs, and I hope it helps other developers:

    public static bool TryParseExact(string str, out decimal result)
    {
        //  The regular decimal.TryParse() is a bit rubbish.  It'll happily accept strings which don't make sense, such as:
        //      123'345'6.78
        //      1''23'345'678
        //      123''345'678
        //
        //  This function does the same as TryParse(), but checks whether the number "makes sense", ie:
        //      - has exactly zero or one "decimal point" characters
        //      - if the string has thousand-separators, then are there exactly three digits inbetween them 
        // 
        //  Assumptions: if we're using thousand-separators, then there'll be just one "NumberGroupSizes" value.
        //
        //  Returns True if this is a valid number
        //          False if this isn't a valid number
        // 
        result = 0;

        if (str == null || string.IsNullOrWhiteSpace(str)) 
            return false;

        //  First, let's see if TryParse itself falls over, trying to parse the string.
        decimal val = 0;
        if (!decimal.TryParse(str, out val))
        {
            //  If the numeric string contains any letters, foreign characters, etc, the function will abort here.
            return false;
        }

        //  Note: we'll ONLY return TryParse's result *if* the rest of the validation succeeds.

        CultureInfo culture = CultureInfo.CurrentCulture;
        int[] expectedDigitLengths = culture.NumberFormat.NumberGroupSizes;         //  Usually a 1-element array:  { 3 }
        string decimalPoint = culture.NumberFormat.NumberDecimalSeparator;          //  Usually full-stop, but perhaps a comma in France.
        string thousands = culture.NumberFormat.NumberGroupSeparator;               //  Usually a comma, but can be apostrophe in European locations.

        int numberOfDecimalPoints = CountOccurrences(str, decimalPoint);
        if (numberOfDecimalPoints != 0 && numberOfDecimalPoints != 1)
        {
            //  You're only allowed either ONE or ZERO decimal point characters.  No more!
            return false;
        }

        int numberOfThousandDelimiters = CountOccurrences(str, thousands);
        if (numberOfThousandDelimiters == 0)
        {
            result = val;
            return true;
        }

        //  Okay, so this numeric-string DOES contain 1 or more thousand-seperator characters.
        //  Let's do some checks on the integer part of this numeric string  (eg "12,345,67.890" -> "12,345,67")
        if (numberOfDecimalPoints == 1)
        {
            int inx = str.IndexOf(decimalPoint);
            str = str.Substring(0, inx);
        }

        //  Split up our number-string into sections: "12,345,67" -> [ "12", "345", "67" ]
        string[] parts = str.Split(new string[] { thousands }, StringSplitOptions.None);

        if (parts.Length < 2)
        {
            //  If we're using thousand-separators, then we must have at least two parts (eg "1,234" contains two parts: "1" and "234")
            return false;
        }

        //  Note: the first section is allowed to be upto 3-chars long  (eg for "12,345,678", the "12" is perfectly valid)
        if (parts[0].Length == 0 || parts[0].Length > expectedDigitLengths[0])
        {
            //  This should catch errors like:
            //      ",234"
            //      "1234,567"
            //      "12345678,901"
            return false;
        }

        //  ... all subsequent sections MUST be 3-characters in length
        foreach (string oneSection in parts.Skip(1))
        {
            if (oneSection.Length != expectedDigitLengths[0])
                return false;
        }

        result = val;
        return true;
    }

    public static int CountOccurrences(string str, string chr)
    {
        //  How many times does a particular string appear in a string ?
        //
        int count = str.Length - str.Replace(chr, "").Length;
        return count;
    }

Btw, I created the table image above in Excel, and noticed that it's actually hard to paste values like this into Excel:

1'234567.89

Does Excel complain above this value, or try to store it as text ? Nope, it also happily accepts this as a valid number, and pastes it as "1234567.89".

Anyway, job done.. thanks to everyone for their help & suggestions.

like image 23
Mike Gledhill Avatar answered Oct 14 '22 11:10

Mike Gledhill


It's because parsing just skips the NumberFormatInfo.NumberGroupSeparator string and completely ignores the NumberFormatInfo.NumberGroupSizes property. However, you can implement such a validation:

static bool ValidateNumberGroups(string value, CultureInfo culture)
{
    string[] parts = value.Split(new string[] { culture.NumberFormat.NumberGroupSeparator }, StringSplitOptions.None);
    foreach (string part in parts)
    {
        int length = part.Length;
        if (culture.NumberFormat.NumberGroupSizes.Contains(length) == false)
        {
            return false;
        }
    }

    return true;
}

It's still not completely perfect, as the MSDN says:

The first element of the array defines the number of elements in the least significant group of digits immediately to the left of the NumberDecimalSeparator. Each subsequent element refers to the next significant group of digits to the left of the previous group. If the last element of the array is not 0, the remaining digits are grouped based on the last element of the array. If the last element is 0, the remaining digits are not grouped.

For example, if the array contains { 3, 4, 5 }, the digits are grouped similar to "55,55555,55555,55555,4444,333.00". If the array contains { 3, 4, 0 }, the digits are grouped similar to "55555555555555555,4444,333.00".

But you can see the point now.

like image 39
György Kőszeg Avatar answered Oct 14 '22 11:10

György Kőszeg