Strtod in C# – Part 2: The Implementation

Samuel Cragg

30 Jan 2012 (CPOL)

The implementation of strtod in C#

The way we’re going to parse the number from the string is the one described in “What Every Computer Scientist Should Know About Floating-Point Arithmetic”, which I linked to last time (note that the quote is for a single precision number, hence nine digits; we’ll need to use more for a double):

First read in the 9 decimal digits as an integer N, ignoring the decimal point. [..] Next, find the appropriate power 10^P necessary to scale N. This will be a combination of the exponent of the decimal number, together with the position of the (up until now) ignored decimal point. [..] Finally, multiply (or divide if P < 0) N and 10^|P|.
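To make the quoted recipe concrete, here is a minimal sketch of the final scaling step. The class and method names (ScaleExample, Scale) are made up for illustration and are not part of the article's code, and the simple multiply or divide shown here is not guaranteed to be correctly rounded for every input.

```csharp
using System;

static class ScaleExample
{
    // Hypothetical helper: combines the integer N with the power of ten P,
    // multiplying when P is positive and dividing when it is negative.
    public static double Scale(long n, int p)
    {
        double power = Math.Pow(10, Math.Abs(p));
        return p < 0 ? n / power : n * power;
    }
}
```

For the input "12.345e2", the digits give N = 12345, three digits follow the decimal point and the written exponent is 2, so P = 2 - 3 = -1. ScaleExample.Scale(12345, -1) then divides 12345 by 10, giving 1234.5.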

The style I’ve used to parse the string is a series of static functions that take the string and the current index, and return the index just past the data they consumed, passing any extracted data back through an out parameter. In code, that looks like this:

private static int SkipWhiteSpace(string str, int index);
private static int ParseSign(string str, int index, out int value);

I could have made the string and index member variables, but if the parsing failed, I would need to roll back the index. I also toyed with the idea of using an iterator, but I couldn’t see an advantage. Also, when parsing the significand, two values need to be returned: the digits themselves and an exponent that takes the decimal point into account. Sure, I could have used a Tuple to accomplish this, but I went for the above to keep everything consistent.
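The bodies of these helpers are only in the download, so the following is a rough sketch of what functions in this style might look like, not the article's actual code. The class name ParsingSketch is invented, the exponent is assumed to be the count of digits after the decimal point, and the sketch does not limit the number of digits, so a very long input would overflow the long.

```csharp
using System;

static class ParsingSketch
{
    // Returns the index of the first non-whitespace character at or after index.
    public static int SkipWhiteSpace(string str, int index)
    {
        while (index < str.Length && char.IsWhiteSpace(str[index]))
        {
            index++;
        }
        return index;
    }

    // Reads an optional '+' or '-'; value is -1 for '-', otherwise +1.
    public static int ParseSign(string str, int index, out int value)
    {
        value = 1;
        if (index < str.Length && (str[index] == '+' || str[index] == '-'))
        {
            if (str[index] == '-')
            {
                value = -1;
            }
            index++;
        }
        return index;
    }

    // Reads digits with at most one decimal point. significand receives the
    // digits as an integer and exponent the number of digits after the point.
    // Assumption: no digit limit is enforced, so long inputs can overflow.
    public static int ParseSignificand(string str, int index,
                                       out long significand, out int exponent)
    {
        significand = 0;
        exponent = 0;
        int start = index;
        int digits = 0;
        bool seenPoint = false;
        for (; index < str.Length; index++)
        {
            char c = str[index];
            if (c == '.' && !seenPoint)
            {
                seenPoint = true;
            }
            else if (c >= '0' && c <= '9')
            {
                significand = (significand * 10) + (c - '0');
                digits++;
                if (seenPoint)
                {
                    exponent++;
                }
            }
            else
            {
                break;
            }
        }

        // A lone "." is not a number, so report that nothing was consumed.
        return digits > 0 ? index : start;
    }
}
```

Because every function returns the new index instead of mutating shared state, a failed parse simply returns the index it was given and the caller has nothing to roll back.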

I’m not going to go through each method here (you can download the updated code here), but I’ll go through the main sequence of events.

First, we validate the arguments. This gives the private methods one less thing to worry about, as they know the values passed in are sane.
Skip any whitespace, remembering the position of the first non-whitespace character for later.
Try to parse a number. This means searching for a sign, skipping any leading zeros and then parsing the significand. Parsing the significand gives us a number and an exponent, the latter accounting for any decimal point in the number. If we’ve parsed any digits (including the leading zeros), then we can check for an exponent and adjust it by the exponent of the significand. Finally, we can create a double from the sign, significand (stored in a 64-bit integer) and exponent.
If we didn’t parse any digits, then we need to search for +/- infinity or NaN, starting from the first non-whitespace character.
Failing that, we couldn’t parse any number, so we set the end to the starting index and return 0.

It’s actually quite difficult to explain in words, so here’s a code snippet:

public static double Parse(string input, int start, out int end)
{
    if (input == null)
    {
        throw new ArgumentNullException("input");
    }
    if (start < 0)
    {
        throw new ArgumentOutOfRangeException("start", 
                "Value cannot be negative.");
    }
    if (start > input.Length)
    {
        throw new ArgumentOutOfRangeException("start", 
              "Value cannot be greater than input.Length.");
    }

    int endOfWhitespace = SkipWhiteSpace(input, start);

    long significand;
    int significandsExponent, sign;
    int startOfDigits = ParseSign(input, endOfWhitespace, out sign);
    int index = SkipLeadingZeros(input, startOfDigits);
    index = ParseSignificand(input, index, out significand, out significandsExponent);

    // Have we parsed a number?
    if ((index - startOfDigits) > 0)
    {
        int exponent;
        end = ParseExponent(input, index, out exponent);
        return MakeDouble(significand * sign, exponent - significandsExponent);
    }

    // Not a number, is it a constant?
    double value;
    end = ParseNamedConstant(input, endOfWhitespace, out value);
    if (end != endOfWhitespace)
    {
        return value;
    }

    // If we're here then we couldn't parse anything.
    end = start;
    return default(double);
}
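Parse trusts ParseExponent to leave the index alone when there is no valid exponent, for example in "1e" or "1e+", where the 'e' must not be consumed. The article doesn't show that method, so this is only a sketch of how it might behave. The class name ExponentSketch and the clamping threshold are assumptions made for illustration.

```csharp
using System;

static class ExponentSketch
{
    // Reads an optional exponent such as "e10" or "E-3". Returns the index
    // after the exponent, or the original index if none was found.
    public static int ParseExponent(string str, int index, out int exponent)
    {
        exponent = 0;
        int i = index;
        if (i >= str.Length || (str[i] != 'e' && str[i] != 'E'))
        {
            return index;
        }
        i++;

        int sign = 1;
        if (i < str.Length && (str[i] == '+' || str[i] == '-'))
        {
            if (str[i] == '-')
            {
                sign = -1;
            }
            i++;
        }

        int firstDigit = i;
        int value = 0;
        while (i < str.Length && str[i] >= '0' && str[i] <= '9')
        {
            // Stop accumulating once the value is far outside the range of a
            // double, so a very long run of digits cannot overflow the int.
            if (value < 100000)
            {
                value = (value * 10) + (str[i] - '0');
            }
            i++;
        }

        // "e" or "e+" with no digits is not an exponent: roll back.
        if (i == firstDigit)
        {
            return index;
        }

        exponent = sign * value;
        return i;
    }
}
```

With this behaviour, parsing "1e+" stops after the "1" and reports an exponent of 0, which matches how strtod treats an incomplete exponent.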

One thing that caught me out (but was picked up by the unit tests) was creating the double from the significand and the exponent when the exponent is less than -308. Doubles this small are denormalized (subnormal): they sit below the smallest normal value of roughly 2.2e-308 and extend down to about 4.9e-324. MakeDouble was rounding them down to zero, so I needed to add a check to handle them.
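The article doesn't show the check itself, so the following is a guess at one way to avoid the problem rather than the actual MakeDouble. It assumes the failure comes from the power of ten leaving the range of a double (10^324 is infinity, and dividing by infinity gives zero), and works around it by scaling in steps of at most 10^308. The class name ScalingSketch is invented, and the result is not guaranteed to be correctly rounded.

```csharp
using System;

static class ScalingSketch
{
    // Largest power of ten that is still a finite double is 10^308.
    private const int MaxStep = 308;

    // Hypothetical version of MakeDouble: significand already carries the sign.
    public static double MakeDouble(long significand, int exponent)
    {
        double value = significand;
        if (exponent >= 0)
        {
            // May legitimately overflow to infinity for huge exponents.
            return value * Math.Pow(10, exponent);
        }

        int scale = -exponent;
        while (scale > MaxStep)
        {
            // Divide in two or more steps so that no single divisor
            // overflows to infinity and wipes the result out.
            value /= 1e308;
            scale -= MaxStep;
        }
        return value / Math.Pow(10, scale);
    }
}
```

For example, ScalingSketch.MakeDouble(5, -324) produces a tiny non-zero denormalized value, whereas the single-step expression 5 / Math.Pow(10, 324) evaluates to zero.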

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)