Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Faster input scanning

Tags:

string

go

I am attempting to solve the SPOJ question that can be found here

Following is my solution:

package main

import "fmt"
import "bufio"
import "os"

func main() {
    var n, k int
    var num int
    var divisible int

    in := bufio.NewReader(os.Stdin)

    fmt.Fscan(in, &n)
    fmt.Fscan(in, &k)

    for n > 0 {
        fmt.Fscan(in, &num)

        if num%k == 0 {
            divisible++
        }

        n--
    }

    fmt.Println(divisible)
}

The code works fine. The issue here is that I get a timeout when I execute it in SPOJ.

I was first using only fmt.Scan but I then came across this thread that suggested that I use bufio instead for faster input scanning.

But I still get a timeout issue. I am only looping to get all the inputs and within this loop itself I determine whether the input is divisible or not. So, I believe that its not the loop but the input scanning that's taking time. How can I improve this to read the input faster? Or is the issue somewhere else?

like image 480
callmekatootie Avatar asked Jul 10 '15 05:07

callmekatootie


3 Answers

I coded 3 versions to compare them. The first using fmt.Scanf("%d", &v), the second converting numbers from bytes (like @icza), and the third converting using strconv.Atoi. To use the functions I initializated scanner in that way:

scanner := bufio.NewScanner(os.Stdin)
scanner.Split(bufio.ScanWords)

So, every time I call scanner.Scan, it returns a token splitted by spaces. And the functions follow:

func scanFromBytes(scanner *bufio.Scanner) (n int) {
    scanner.Scan()

    buf := scanner.Bytes()
    for _, v := range buf {
        n = n*10 + int(v-'0')
    }

    return
}

And:

func scanAtoi(scanner *bufio.Scanner) (n int) {
    scanner.Scan()
    
    n, _ = strconv.Atoi(scanner.Text())

    return
}

I have tested with a big file (40k tests), reading about 8 integers per test. The fmt.Scanf solution, takes about 1.9s, as expected (more than the others). In the two functions I got about 0.8s. But the scanAtoi always takes about 0.05s less than scanFromBytes, except for the very first time (maybe some caching occurs).

like image 155
Alex Lucchesi Avatar answered Oct 19 '22 10:10

Alex Lucchesi


You can use bufio.Scanner to read lines from the input.

And since we're always reading numbers, we can create a highly optimized converter to get the number. We should avoid using Scanner.Text() which creates a string as we can obtain the number just from the raw bytes returned by Scanner.Bytes(). Scanner.Text() returns the same token as Scanner.Bytes() but it first converts to string which is obviously slower and generates "garbage" and work for the gc.

So here is a converter function which obtains an int from the raw bytes:

func toInt(buf []byte) (n int) {
    for _, v := range buf {
        n = n*10 + int(v-'0')
    }
    return
}

This toInt() works because the []byte contains the UTF-8 encoded byte sequence of the string representation of the decimal format of the number, which contains only digits in the range of '0'..'9' whose UTF-8 encoded bytes are mapped one-to-one (one byte is used for one digit). The mapping from digit to byte is simply a shift: '0' -> 48, '1' -> 49 etc.

Using this your complete application:

package main

import (
    "bufio"
    "fmt"
    "os"
)

func main() {
    var n, k, c int
    scanner := bufio.NewScanner(os.Stdin)

    scanner.Scan()
    fmt.Sscanf(scanner.Text(), "%d %d", &n, &k)

    for ;n > 0; n-- {
        scanner.Scan()
        if toInt(scanner.Bytes())%k == 0 {
            c++
        }
    }

    fmt.Println(c)
}

func toInt(buf []byte) (n int) {
    for _, v := range buf {
        n = n*10 + int(v-'0')
    }
    return
}

This solution is about 4 times faster than calling strconv.Atoi() for example.

Notes:

In the above solution I assumed input is valid, that is it always contains valid numbers and contains at least n lines after the first (which gives us n and k).

If the input is closed after n+1 lines, we can use a simplified for (and we don't even need to decrement and rely on n):

for scanner.Scan() {
    if toInt(scanner.Bytes())%k == 0 {
        c++
    }
}
like image 41
icza Avatar answered Oct 19 '22 10:10

icza


Try using bufio.Scanner (as suggested in the thread you mentioned):

fmt.Scan(&n)
fmt.Scan(&k)

scanner := bufio.NewScanner(os.Stdin)
for n > 0 {
    scanner.Scan()
    k, _ := strconv.Atoi(scanner.Text())
    ...
like image 2
kostya Avatar answered Oct 19 '22 11:10

kostya