How to use bufio.ScanWords

Tags: go

How do I use bufio.ScanWords and bufio.ScanLines functions to count words and lines?

I tried:

fmt.Println(bufio.ScanWords([]byte("Good day everyone"), false))

Prints:

5 [71 111 111 100] <nil>

Not sure what that means?

zianwar asked Apr 17 '17 10:04



3 Answers

To count words:

input := "Spicy jalapeno pastrami ut ham turducken.\n Lorem sed ullamco, leberkas sint short loin strip steak ut shoulder shankle porchetta venison prosciutto turducken swine.\n Deserunt kevin frankfurter tongue aliqua incididunt tri-tip shank nostrud.\n"
scanner := bufio.NewScanner(strings.NewReader(input))
// Set the split function for the scanning operation.
scanner.Split(bufio.ScanWords)
// Count the words.
count := 0
for scanner.Scan() {
    count++
}
if err := scanner.Err(); err != nil {
    fmt.Fprintln(os.Stderr, "reading input:", err)
}
fmt.Printf("%d\n", count)

To count lines:

input := "Spicy jalapeno pastrami ut ham turducken.\n Lorem sed ullamco, leberkas sint short loin strip steak ut shoulder shankle porchetta venison prosciutto turducken swine.\n Deserunt kevin frankfurter tongue aliqua incididunt tri-tip shank nostrud.\n"

scanner := bufio.NewScanner(strings.NewReader(input))
// Set the split function for the scanning operation.
scanner.Split(bufio.ScanLines)
// Count the lines.
count := 0
for scanner.Scan() {
    count++
}
if err := scanner.Err(); err != nil {
    fmt.Fprintln(os.Stderr, "reading input:", err)
}
fmt.Printf("%d\n", count)
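As for the output shown in the question: bufio.ScanWords is a bufio.SplitFunc, and calling it directly returns the three values a Scanner uses internally: how many bytes to advance, the token found (a []byte, which fmt.Println prints as a list of byte values), and an error. The second argument, atEOF, tells the function whether more data may follow. Here is a minimal sketch of that, assuming a hypothetical helper named describe that only converts the token to a string for readability:

```go
package main

import (
	"bufio"
	"fmt"
)

// describe is a hypothetical helper for this sketch. It calls
// bufio.ScanWords directly and returns the token as a string.
func describe(input string, atEOF bool) (advance int, token string, err error) {
	adv, tok, err := bufio.ScanWords([]byte(input), atEOF)
	return adv, string(tok), err
}

func main() {
	cases := []struct {
		input string
		atEOF bool
	}{
		{"Good day everyone", false}, // a space ends the first word
		{"everyone", false},          // no space yet: the function asks for more data
		{"everyone", true},           // at EOF the remaining bytes are the last word
	}
	for _, c := range cases {
		advance, token, err := describe(c.input, c.atEOF)
		fmt.Printf("%q atEOF=%v -> advance=%d token=%q err=%v\n",
			c.input, c.atEOF, advance, token, err)
	}
}
```

In normal use you would not call the split function yourself; you pass it to scanner.Split as above and let the Scanner handle buffering and the atEOF flag.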
Alex Efimov answered Oct 19 '22 08:10


This is Exercise 7.1 in the book The Go Programming Language.

This is an extension of @repler's solution:

package main

import (
    "bufio"
    "fmt"
    "os"
    "strings"
)

type byteCounter int
type wordCounter int
type lineCounter int

func main() {
    var c byteCounter
    c.Write([]byte("Hello This is a line"))
    fmt.Println("Byte Counter ", c)

    var w wordCounter
    w.Write([]byte("Hello This is a line"))
    fmt.Println("Word Counter ", w)

    var l lineCounter
    l.Write([]byte("Hello \nThis \n is \na line\n.\n.\n"))
    fmt.Println("Line Counter ", l)

}

func (c *byteCounter) Write(p []byte) (int, error) {
    *c += byteCounter(len(p))
    return len(p), nil
}

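func (w *wordCounter) Write(p []byte) (int, error) {
    count := retCount(p, bufio.ScanWords)
    *w += wordCounter(count)
    // io.Writer expects the number of bytes consumed, not the word count.
    return len(p), nil
}

func (l *lineCounter) Write(p []byte) (int, error) {
    count := retCount(p, bufio.ScanLines)
    *l += lineCounter(count)
    // io.Writer expects the number of bytes consumed, not the line count.
    return len(p), nil
}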

func retCount(p []byte, fn bufio.SplitFunc) (count int) {
    s := string(p)
    scanner := bufio.NewScanner(strings.NewReader(s))
    scanner.Split(fn)
    count = 0
    for scanner.Scan() {
        count++
    }
    if err := scanner.Err(); err != nil {
        fmt.Fprintln(os.Stderr, "reading input:", err)
    }
    return
}
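One thing worth checking with any counter built on io.Writer: the io.Writer documentation says Write must return a non-nil error if it returns n < len(p). A Write that returns the word count instead of len(p) may surface as io.ErrShortWrite when used through helpers such as io.Copy. Below is a rough sketch of a counter that follows the contract. The names words and countViaCopy are made up for this example and are not part of the code above.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// words is a hypothetical counter type used only for this sketch.
type words int

// Write adds the number of words in p to the counter and, on success,
// reports len(p) bytes as consumed, as the io.Writer contract expects.
func (w *words) Write(p []byte) (int, error) {
	sc := bufio.NewScanner(bytes.NewReader(p))
	sc.Split(bufio.ScanWords)
	for sc.Scan() {
		*w++
	}
	if err := sc.Err(); err != nil {
		// Words found before the error have already been counted.
		return 0, err
	}
	return len(p), nil
}

// countViaCopy feeds s to a words counter through io.Copy.
func countViaCopy(s string) (int, error) {
	var w words
	_, err := io.Copy(&w, strings.NewReader(s))
	return int(w), err
}

func main() {
	n, err := countViaCopy("Hello This is a line")
	fmt.Println(n, err) // 5 <nil>
}
```

Note that each Write is scanned on its own, so a word split across two Write calls would be counted twice. That is fine for the exercise, but keep it in mind for streamed input.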
Nagri answered Oct 19 '22 06:10


This is Exercise 7.1 in the book The Go Programming Language.

This is my solution:

package main

import (
    "bufio"
    "fmt"
)

// WordCounter counts words.
type WordCounter int

// LineCounter counts lines.
type LineCounter int

type scanFunc func(p []byte, EOF bool) (advance int, token []byte, err error)

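// scanBytes counts the tokens fn finds in p, treating p as the complete input.
func scanBytes(p []byte, fn scanFunc) (cnt int) {
    for len(p) > 0 {
        advance, token, err := fn(p, true)
        if err != nil || advance <= 0 {
            break
        }
        // A non-nil token counts even when it is empty (a blank line, for example).
        if token != nil {
            cnt++
        }
        p = p[advance:]
    }
    return cnt
}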

func (c *WordCounter) Write(p []byte) (int, error) {
    cnt := scanBytes(p, bufio.ScanWords)
    *c += WordCounter(cnt)
    // io.Writer expects the number of bytes consumed, not the word count.
    return len(p), nil
}

func (c WordCounter) String() string {
    return fmt.Sprintf("contains %d words", c)
}

func (c *LineCounter) Write(p []byte) (int, error) {
    cnt := scanBytes(p, bufio.ScanLines)
    *c += LineCounter(cnt)
    // io.Writer expects the number of bytes consumed, not the line count.
    return len(p), nil
}

func (c LineCounter) String() string {
    return fmt.Sprintf("contains %d lines", c)
}

func main() {
    var c WordCounter
    fmt.Println(c)

    fmt.Fprintf(&c, "This is a sentence.")
    fmt.Println(c)

    c = 0
    fmt.Fprintf(&c, "This")
    fmt.Println(c)

    var l LineCounter
    fmt.Println(l)

    fmt.Fprintf(&l, `This is another
line`)
    fmt.Println(l)

    l = 0
    fmt.Fprintf(&l, "This is another\nline")
    fmt.Println(l)

    fmt.Fprintf(&l, "This is one line")
    fmt.Println(l)
}
abel_abel answered Oct 19 '22 08:10