Convert a float-formatted char[] to float

I have a char[] salary that contains data taken from a string. I want to convert char[] salary to float, but the method I'm using seems to be extremely slow:

float ff = float.Parse(new string(salary));

According to Visual Studio's Performance Profiler, this line takes up far too much of the processing time.

So I'd like to know if there's a faster way to do this, since performance matters here. The char[] is formatted like so:

[ '1', '3', '2', ',', '2', '9']

It is basically a JSON-like float with each digit (and the comma) stored as one element of a char[].

EDIT:

I've reformatted the code and it seems like the performance hit is actually in the conversion from char[] to string, not the parsing from string to float.

Washington A. Ramos asked Dec 03 '22 19:12


1 Answer

On the float-parsing side of things, there are some gains to be had based on which overload of float.Parse() you call and what you pass to it. I ran some more benchmarks comparing these overloads (note that I changed the decimal separator character from ',' to '.' just so I could specify CultureInfo.InvariantCulture).

For example, calling an overload that takes an IFormatProvider is good for about a 10% performance increase. Specifying NumberStyles.Float ("lax") for the NumberStyles parameter effects a change in performance of about a percentage point in either direction, and, making some assumptions about our input data, specifying only NumberStyles.AllowDecimalPoint ("strict") nets a few points performance increase. (The float.Parse(string) overload uses NumberStyles.Float | NumberStyles.AllowThousands.)
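
Since the original input uses ',' as the decimal separator, here is a minimal sketch of how the strict overload could be pointed at that format without changing the data. The names CommaFloatParsing, CommaDecimalFormat, ParseViaString and ParseViaSpan are made up for illustration and are not part of my benchmarks. The span-based variant assumes .NET Core 2.1 or later, so it will not compile against the .NET Framework 4.7.2 and .NET Core 2.0 runtimes measured here, and I have not benchmarked either method.

```csharp
using System;
using System.Globalization;

public static class CommaFloatParsing
{
    // Assumption: the only non-invariant feature of the input is a ',' decimal separator.
    private static readonly NumberFormatInfo CommaDecimalFormat =
        new NumberFormatInfo { NumberDecimalSeparator = "," };

    // Works on any runtime, but still allocates a string from the char[].
    public static float ParseViaString(char[] salary)
    {
        return float.Parse(new string(salary), NumberStyles.AllowDecimalPoint, CommaDecimalFormat);
    }

    // Assumes .NET Core 2.1 or later: the ReadOnlySpan<char> overload reads the
    // char[] in place, so no intermediate string is created.
    public static float ParseViaSpan(char[] salary)
    {
        return float.Parse(new ReadOnlySpan<char>(salary), NumberStyles.AllowDecimalPoint, CommaDecimalFormat);
    }
}
```

Because the question's edit points at the char[]-to-string conversion as the real cost, the span-based variant is probably the one worth measuring first where the runtime allows it.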

On the subject of making assumptions about your input data, if you know the text you're working with has certain characteristics (single-byte character encoding, no invalid numbers, no negatives, no exponents, no need to handle NaN or positive/negative infinity, etc.) you might do well to parse from the bytes directly and forgo any unneeded special-case handling and error checking. I included a very simple implementation in my benchmarks, and its indexer-based version was able to get a float from a byte[] roughly 19x faster on .NET Framework and 12x faster on .NET Core than float.Parse(string) could get a float from a string!
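
The same idea can be applied to the question's char[] directly, which also avoids building a string. The following is only a sketch under the assumptions listed above, plus two more: the separator is ',' and the input has few enough digits that a long does not overflow. SalaryParser and ParseCommaDecimal are hypothetical names, and this variant differs from the benchmarked parser in that it accumulates all digits as an integer and divides once at the end.

```csharp
public static class SalaryParser
{
    // Hypothetical helper, not part of the benchmark code.
    // Assumes the input contains only '0'-'9' and at most one ','
    // (no sign, exponent, whitespace, NaN or infinity) and performs no validation.
    public static float ParseCommaDecimal(char[] chars)
    {
        long digits = 0;      // all digits, read as one integer
        long divisor = 1;     // 10^(number of digits after the separator)
        var seenSeparator = false;

        foreach (var c in chars)
        {
            if (c == ',')
            {
                seenSeparator = true;
                continue;
            }

            digits = 10 * digits + (c - '0');

            if (seenSeparator)
                divisor *= 10;
        }

        // A single division keeps the rounding error small.
        return (float) ((double) digits / divisor);
    }
}
```

A call would look like SalaryParser.ParseCommaDecimal(salary). Anything outside the stated assumptions (a stray space, a minus sign, an empty array) silently yields a wrong value rather than an exception, so this only makes sense when the input is trusted.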

Here are my benchmark results...

BenchmarkDotNet=v0.11.0, OS=Windows 10.0.17134.165 (1803/April2018Update/Redstone4)
Intel Core i7 CPU 860 2.80GHz (Max: 2.79GHz) (Nehalem), 1 CPU, 8 logical and 4 physical cores
Frequency=2732436 Hz, Resolution=365.9738 ns, Timer=TSC
.NET Core SDK=2.1.202
  [Host] : .NET Core 2.0.9 (CoreCLR 4.6.26614.01, CoreFX 4.6.26614.01), 64bit RyuJIT
  Clr    : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3131.0
  Core   : .NET Core 2.0.9 (CoreCLR 4.6.26614.01, CoreFX 4.6.26614.01), 64bit RyuJIT


                                                        Method | Runtime |       Mean | Scaled |
-------------------------------------------------------------- |-------- |-----------:|-------:|
                                           float.Parse(string) |     Clr | 145.098 ns |   1.00 |
                        'float.Parse(string, IFormatProvider)' |     Clr | 134.191 ns |   0.92 |
                     'float.Parse(string, NumberStyles) [Lax]' |     Clr | 145.884 ns |   1.01 |
                  'float.Parse(string, NumberStyles) [Strict]' |     Clr | 139.417 ns |   0.96 |
    'float.Parse(string, NumberStyles, IFormatProvider) [Lax]' |     Clr | 133.800 ns |   0.92 |
 'float.Parse(string, NumberStyles, IFormatProvider) [Strict]' |     Clr | 127.413 ns |   0.88 |
                       'Custom byte-to-float parser [Indexer]' |     Clr |   7.657 ns |   0.05 |
                    'Custom byte-to-float parser [Enumerator]' |     Clr | 566.440 ns |   3.90 |
                                                               |         |            |        |
                                           float.Parse(string) |    Core | 154.369 ns |   1.00 |
                        'float.Parse(string, IFormatProvider)' |    Core | 138.668 ns |   0.90 |
                     'float.Parse(string, NumberStyles) [Lax]' |    Core | 155.644 ns |   1.01 |
                  'float.Parse(string, NumberStyles) [Strict]' |    Core | 150.221 ns |   0.97 |
    'float.Parse(string, NumberStyles, IFormatProvider) [Lax]' |    Core | 142.591 ns |   0.92 |
 'float.Parse(string, NumberStyles, IFormatProvider) [Strict]' |    Core | 135.000 ns |   0.87 |
                       'Custom byte-to-float parser [Indexer]' |    Core |  12.673 ns |   0.08 |
                    'Custom byte-to-float parser [Enumerator]' |    Core | 584.236 ns |   3.78 |

...from running this code (requires BenchmarkDotNet assembly)...

using System;
using System.Globalization;
using BenchmarkDotNet.Attributes;

namespace StackOverflow_51584129
{
    [ClrJob()]
    [CoreJob()]
    public class FloatParsingBenchmarks
    {
        private const string InputString = "132.29";
        private static readonly byte[] InputBytes = System.Text.Encoding.ASCII.GetBytes(InputString);

        private static readonly IFormatProvider ParsingFormatProvider = CultureInfo.InvariantCulture;
        private const NumberStyles LaxParsingNumberStyles = NumberStyles.Float;
        private const NumberStyles StrictParsingNumberStyles = NumberStyles.AllowDecimalPoint;
        private const char DecimalSeparator = '.';

        [Benchmark(Baseline = true, Description = "float.Parse(string)")]
        public float SystemFloatParse()
        {
            return float.Parse(InputString);
        }

        [Benchmark(Description = "float.Parse(string, IFormatProvider)")]
        public float SystemFloatParseWithProvider()
        {
            return float.Parse(InputString, CultureInfo.InvariantCulture);
        }

        [Benchmark(Description = "float.Parse(string, NumberStyles) [Lax]")]
        public float SystemFloatParseWithLaxNumberStyles()
        {
            return float.Parse(InputString, LaxParsingNumberStyles);
        }

        [Benchmark(Description = "float.Parse(string, NumberStyles) [Strict]")]
        public float SystemFloatParseWithStrictNumberStyles()
        {
            return float.Parse(InputString, StrictParsingNumberStyles);
        }

        [Benchmark(Description = "float.Parse(string, NumberStyles, IFormatProvider) [Lax]")]
        public float SystemFloatParseWithLaxNumberStylesAndProvider()
        {
            return float.Parse(InputString, LaxParsingNumberStyles, ParsingFormatProvider);
        }

        [Benchmark(Description = "float.Parse(string, NumberStyles, IFormatProvider) [Strict]")]
        public float SystemFloatParseWithStrictNumberStylesAndProvider()
        {
            return float.Parse(InputString, StrictParsingNumberStyles, ParsingFormatProvider);
        }

        [Benchmark(Description = "Custom byte-to-float parser [Indexer]")]
        public float CustomFloatParseByIndexing()
        {
            // FOR DEMONSTRATION PURPOSES ONLY!
            // This code has been written for and only tested with
            // parsing the ASCII string "132.29" in byte form
            var currentIndex = 0;
            var boundaryIndex = InputBytes.Length;
            char currentChar;
            var wholePart = 0;

            while (currentIndex < boundaryIndex && (currentChar = (char) InputBytes[currentIndex++]) != DecimalSeparator)
            {
                var currentDigit = currentChar - '0';

                wholePart = 10 * wholePart + currentDigit;
            }

            var fractionalPart = 0F;
            var nextFractionalDigitScale = 0.1F;

            while (currentIndex < boundaryIndex)
            {
                currentChar = (char) InputBytes[currentIndex++];
                var currentDigit = currentChar - '0';

                fractionalPart += currentDigit * nextFractionalDigitScale;
                nextFractionalDigitScale *= 0.1F;
            }

            return wholePart + fractionalPart;
        }

        [Benchmark(Description = "Custom byte-to-float parser [Enumerator]")]
        public float CustomFloatParseByEnumerating()
        {
            // FOR DEMONSTRATION PURPOSES ONLY!
            // This code has been written for and only tested with
            // parsing the ASCII string "132.29" in byte form
            var wholePart = 0;
            var enumerator = InputBytes.GetEnumerator();

            while (enumerator.MoveNext())
            {
                var currentChar = (char) (byte) enumerator.Current;

                if (currentChar == DecimalSeparator)
                    break;

                var currentDigit = currentChar - '0';
                wholePart = 10 * wholePart + currentDigit;
            }

            var fractionalPart = 0F;
            var nextFractionalDigitScale = 0.1F;

            while (enumerator.MoveNext())
            {
                var currentChar = (char) (byte) enumerator.Current;
                var currentDigit = currentChar - '0';

                fractionalPart += currentDigit * nextFractionalDigitScale;
                nextFractionalDigitScale *= 0.1F;
            }

            return wholePart + fractionalPart;
        }

        public static void Main()
        {
            BenchmarkDotNet.Running.BenchmarkRunner.Run<FloatParsingBenchmarks>();
        }
    }
}
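
A note on the Enumerator rows in the results: one likely contributor to their slowness is that Array.GetEnumerator() returns the non-generic IEnumerator, whose Current property is typed as object, so every byte is boxed on access. I have not profiled this to confirm it is the whole story. The sketch below contrasts that pattern with a plain foreach over the array, which the C# compiler turns into an index-based loop; EnumerationStyles and its methods are illustrative names only.

```csharp
using System.Collections;

public static class EnumerationStyles
{
    // Non-generic enumerator: Current returns object, so each byte is boxed
    // and then unboxed by the cast.
    public static int SumNonGeneric(byte[] bytes)
    {
        var sum = 0;
        IEnumerator enumerator = bytes.GetEnumerator();

        while (enumerator.MoveNext())
            sum += (byte) enumerator.Current;

        return sum;
    }

    // foreach over an array: compiled to an index-based loop,
    // with no enumerator object and no boxing.
    public static int SumForeach(byte[] bytes)
    {
        var sum = 0;

        foreach (var b in bytes)
            sum += b;

        return sum;
    }
}
```

Both methods return the same value; only the way the elements are reached differs.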
Lance U. Matthews answered Dec 16 '22 21:12