
Ignore 'E' when reading double with sscanf

Tags: c++, c, scanf

I have input such as "(50.1003781N, 14.3925125E)". These are a latitude and a longitude.

I want to parse this with

sscanf(string,"(%lf%c, %lf%c)",&a,&b,&c,&d);

but when %lf sees the E after the number, it consumes it as the start of an exponent, as if the number were written in exponential form. Is there a way to disable this?

asked Apr 10 '15 14:04 by lllook




2 Answers

I think you'll need to do manual parsing, probably using strtod(). This shows that strtod() behaves sanely when it comes up against the trailing E (at least on Mac OS X 10.10.3 with GCC 4.9.1 — but likely everywhere).

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const char latlong[] = "(50.1003781N, 14.3925125E)";
    char *eptr;
    double d;
    errno = 0;      // Necessary in general, but probably not necessary at this point
    d = strtod(&latlong[14], &eptr);
    if (eptr != &latlong[14])
        printf("PASS: %10.7f (%s)\n", d, eptr);
    else
        printf("FAIL: %10.7f (%s) - %d: %s\n", d, eptr, errno, strerror(errno));

    return 0;
}

Compilation and run:

$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror latlong.c -o latlong
$ ./latlong
PASS: 14.3925125 (E))
$

Basically, you'll skip white space, check for an (, strtod() a number, check for N or S or lower case versions, comma, strtod() a number, check for W or E, check for ) maybe allowing white space before it.

Upgraded code, with moderately general strtolatlon() function based on strtod() et al. The 'const cast' is necessary in the functions such as strtod() which take a const char * input and return a pointer into that string via a char **eptr variable.

#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CONST_CAST(type, value) ((type)(value))

extern int strtolatlon(const char *str, double *lat, double *lon, char **eptr);

int strtolatlon(const char *str, double *lat, double *lon, char **eptr)
{
    const char *s = str;
    char *end;
    while (isspace((unsigned char)*s))
        s++;
    if (*s != '(')
        goto error;
    *lat = strtod(++s, &end);
    if (s == end || *lat > 90.0 || *lat < 0.0)
        goto error;
    int c = toupper((unsigned char)*end++);
    if (c != 'N' && c != 'S')  // I18N
        goto error;
    if (c == 'S')
        *lat = -*lat;
    if (*end != ',')
        goto error;
    s = end + 1;
    *lon = strtod(s, &end);
    if (s == end || *lon > 180.0 || *lon < 0.0)
        goto error;
    c = toupper((unsigned char)*end++);
    if (c != 'W' && c != 'E')  // I18N
        goto error;
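    // NB: east is negated here, which matches fmt_latlon() below; the more common convention treats east as positive
    if (c == 'E')
        *lon = -*lon;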
    if (*end != ')')
        goto error;
    if (eptr != 0)
        *eptr = end + 1;
    return 0;

error:
    if (eptr != 0)
        *eptr = CONST_CAST(char *, str);
    errno = EINVAL;
    return -1;
}

int main(void)
{
    const char latlon1[] = "(50.1003781N, 14.3925125E)";
    const char latlon2[] = "   (50.1003781N, 14.3925125E) is the position!";
    char *eptr;
    double d;
    errno = 0;      // Necessary in general, but probably not necessary at this point
    d = strtod(&latlon1[14], &eptr);
    if (eptr != &latlon1[14])
        printf("PASS: %10.7f (%s)\n", d, eptr);
    else
        printf("FAIL: %10.7f (%s) - %d: %s\n", d, eptr, errno, strerror(errno));

    printf("Converting <<%s>>\n", latlon2);
    double lat;
    double lon;
    int rc = strtolatlon(latlon2, &lat, &lon, &eptr);
    if (rc == 0)
        printf("Lat: %11.7f, Lon: %11.7f; trailing material: <<%s>>\n", lat, lon, eptr);
    else
        printf("Conversion failed\n");

    return 0;
}

Sample output:

PASS: 14.3925125 (E))
Converting <<   (50.1003781N, 14.3925125E) is the position!>>
Lat:  50.1003781, Lon: -14.3925125; trailing material: << is the position!>>

That is not comprehensive testing, but it is illustrative and close to production quality. You might need to worry about infinities, for example, in true production code. I don't often use goto, but this is a case where the use of goto simplified the error handling. You could write the code without it; if I had more time, maybe I would upgrade it. However, with seven places where errors are diagnosed and 4 lines required for reporting the error, the goto provides reasonable clarity without great repetition.

Note that the strtolatlon() function explicitly identifies errors via its return value; there is no need to guess whether it succeeded or not. You can enhance the error reporting if you wish to identify where the error is. But doing that depends on your error reporting infrastructure in a way this does not.
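One way to identify where the error is could look like the sketch below. It is not the strtolatlon() function above: the name scn_latlon_diag(), the enum latlon_error codes and the errpos parameter are all hypothetical, and the hemisphere letters are returned separately instead of being folded into the sign. Each check returns its own code and reports the position at which parsing stopped.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical error codes, one per check, so the caller can tell which step failed. */
enum latlon_error
{
    LL_OK = 0,
    LL_NO_OPEN_PAREN,
    LL_BAD_LATITUDE,
    LL_BAD_NS,
    LL_NO_COMMA,
    LL_BAD_LONGITUDE,
    LL_BAD_EW,
    LL_NO_CLOSE_PAREN
};

/* Record where parsing stopped (if the caller asked for it) and pass the code through. */
static enum latlon_error fail(const char **errpos, const char *where, enum latlon_error code)
{
    if (errpos != 0)
        *errpos = where;
    return code;
}

/* On success, *errpos points just past the ')'; on failure, at the offending text. */
enum latlon_error scn_latlon_diag(const char *str, double *lat, char *ns,
                                  double *lon, char *ew, const char **errpos)
{
    const char *s = str;
    char *end;

    while (isspace((unsigned char)*s))
        s++;
    if (*s != '(')
        return fail(errpos, s, LL_NO_OPEN_PAREN);
    s++;
    *lat = strtod(s, &end);
    if (end == s || !(*lat >= 0.0 && *lat <= 90.0))     /* also rejects NaN */
        return fail(errpos, s, LL_BAD_LATITUDE);
    s = end;
    *ns = (char)toupper((unsigned char)*s);
    if (*ns != 'N' && *ns != 'S')
        return fail(errpos, s, LL_BAD_NS);
    s++;
    if (*s != ',')
        return fail(errpos, s, LL_NO_COMMA);
    s++;
    *lon = strtod(s, &end);                             /* strtod() skips the blank */
    if (end == s || !(*lon >= 0.0 && *lon <= 180.0))
        return fail(errpos, s, LL_BAD_LONGITUDE);
    s = end;
    *ew = (char)toupper((unsigned char)*s);
    if (*ew != 'E' && *ew != 'W')
        return fail(errpos, s, LL_BAD_EW);
    s++;
    if (*s != ')')
        return fail(errpos, s, LL_NO_CLOSE_PAREN);
    if (errpos != 0)
        *errpos = s + 1;
    return LL_OK;
}
```

A caller could map the returned code to a message and print the text at errpos; how that is reported still depends on the surrounding error-reporting infrastructure.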

Also, the strtolatlon() function will accept some odd-ball formats such as (+0.501003781E2N, 143925125E-7E). If that's a problem, you'll need to write your own fussier variant of strtod() that only accepts fixed-point notation. On the other hand, there's a meme/guideline "Be generous in what you accept; be strict in what you produce". That implies that what's here is more or less OK (it might be good to allow optional white space before the N, S, E, W letters, the comma and the close parenthesis). The converse code, latlontostr() or fmt_latlon() (with strtolatlon() renamed to scn_latlon(), perhaps) or whatever, would be careful about what it produces, only generating upper-case letters, and always using the fixed format, etc.

#include <assert.h>
#include <stdio.h>

int fmt_latlon(char *buffer, size_t buflen, double lat, double lon, int dp)
{
    assert(dp >= 0 && dp < 15);
    assert(lat >=  -90.0 && lat <=  90.0);
    assert(lon >= -180.0 && lon <= 180.0);
    assert(buffer != 0 && buflen != 0);
    char ns = 'N';
    if (lat < 0.0)
    {
        ns = 'S';
        lat = -lat;
    }
    char ew = 'W';
    if (lon < 0.0)
    {
        ew = 'E';
        lon = -lon;
    }
    int nbytes = snprintf(buffer, buflen, "(%.*f%c, %.*f%c)", dp, lat, ns, dp, lon, ew);
    if (nbytes < 0 || (size_t)nbytes >= buflen)
        return -1;
    return 0;
}
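Going back to the fussier variant of strtod() mentioned above, a minimal sketch might look like this. The name strtod_fixed() and the 64-byte buffer limit are assumptions made for illustration, and it assumes the "C" locale's '.' decimal point. It accepts only digits with an optional fractional part (no sign, no exponent, no leading white space), copies that span into a local buffer, and lets strtod() convert only the copy, so a trailing E can never be read as an exponent.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical strict variant of strtod(): fixed-point notation only.
 * On success, *end points at the first character not consumed.
 * On failure, returns 0.0 and sets *end to str, as strtod() does.
 */
double strtod_fixed(const char *str, char **end)
{
    const char *s = str;
    char buffer[64];            /* arbitrary limit for this sketch */
    size_t ndigits = 0;

    while (isdigit((unsigned char)*s))
    {
        s++;
        ndigits++;
    }
    if (*s == '.')
    {
        s++;
        while (isdigit((unsigned char)*s))
        {
            s++;
            ndigits++;
        }
    }

    size_t len = (size_t)(s - str);
    if (ndigits == 0 || len >= sizeof(buffer))
    {
        if (end != 0)
            *end = (char *)str;
        return 0.0;
    }

    /* Convert only the validated span, so nothing after it can be consumed. */
    memcpy(buffer, str, len);
    buffer[len] = '\0';
    if (end != 0)
        *end = (char *)s;
    return strtod(buffer, 0);
}
```

Because this version does not skip white space, a parser built on it would have to skip the blank after the comma itself before converting the longitude.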

Note that 1 unit at 7 decimal places of a degree (10⁻⁷°) corresponds to about a centimetre on the ground (oriented along a meridian; the distance represented by a degree along a parallel of latitude varies with the latitude, of course).
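The arithmetic behind that figure can be checked with a short sketch. It assumes a spherical Earth with a mean radius of 6371 km, which is an approximation chosen for this example, and the function name metres_per_unit() is hypothetical. One degree along a meridian is then 2πR/360, roughly 111.2 km, so 10⁻⁷ of a degree is roughly 1.1 cm.

```c
/* Mean Earth radius in metres; a spherical approximation assumed for this sketch. */
#define EARTH_RADIUS_M 6371000.0
#define PI 3.14159265358979323846

/* Length in metres, along a meridian, of one unit in the dp-th decimal place of a degree. */
double metres_per_unit(int dp)
{
    double metres_per_degree = 2.0 * PI * EARTH_RADIUS_M / 360.0;
    double unit = 1.0;
    for (int i = 0; i < dp; i++)
        unit /= 10.0;           /* unit becomes 10^-dp degrees */
    return metres_per_degree * unit;
}
```

Under these assumptions metres_per_unit(7) comes out at a little over 0.011 m, which is where the "about a centimetre" figure comes from.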

answered Oct 10 '22 14:10 by Jonathan Leffler


Process the string first using

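char *p;
// Replace E/e with a placeholder that is neither a hemisphere letter nor part of any number syntax.
// The string must be writable; note that this also rewrites a genuine exponent such as 1.5E2.
while((p = strchr(string, 'E')) != NULL) *p = '#';
while((p = strchr(string, 'e')) != NULL) *p = '#';

// scan it using your approach

sscanf(string,"(%lf%c, %lf%c)",&a,&b,&c,&d);

// get back the original characters (converted to uppercase).

if (b == '#') b = 'E';
if (d == '#') d = 'E';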

strchr() is declared in the C header <string.h>.

Note: This is really a C approach, not a C++ approach. But, by using sscanf() you are really using a C approach.
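A different sscanf()-only route, not taken from the answer above, is to read each number as text with a scanset and convert it afterwards. The sketch below assumes the same input format as the question; the function name parse_latlon_scanset() and the 31-character limit are hypothetical. Because the scanset admits only digits and '.', the read stops in front of N, S, E or W and the input string is left unmodified.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns 0 on success, -1 on any mismatch. */
int parse_latlon_scanset(const char *string, double *a, char *b, double *c, char *d)
{
    char num1[32];
    char num2[32];
    char close = '\0';
    char *end;

    /* The final %c captures the closing parenthesis so that it can be checked. */
    if (sscanf(string, "(%31[0123456789.]%c, %31[0123456789.]%c%c",
               num1, b, num2, d, &close) != 5)
        return -1;
    if (close != ')')
        return -1;

    /* The scanset would accept text such as "1.2.3"; strtod() must consume all of it. */
    *a = strtod(num1, &end);
    if (end == num1 || *end != '\0')
        return -1;
    *c = strtod(num2, &end);
    if (end == num2 || *end != '\0')
        return -1;
    return 0;
}
```

The letters come back in b and d exactly as written, so a caller would still need to validate them and decide how to handle lower case.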

answered Oct 10 '22 13:10 by Peter