How do I use the bufio.ScanWords and bufio.ScanLines functions to count words and lines?
I tried:
fmt.Println(bufio.ScanWords([]byte("Good day everyone"), false))
Prints:
5 [103 111 111 100] <nil>
I'm not sure what that output means.
From the bufio package documentation: ScanWords is a split function for a Scanner that returns each space-separated word of text, with surrounding spaces deleted. It will never return an empty string. The definition of space is set by unicode.IsSpace.
The Scanner type provided by the bufio package helps process a stream of data by splitting it into tokens and removing the space between them. If we are interested only in words, the scanner retrieves each one in sequence, for example “foo”, “bar” and “baz”.
By default, the maximum size of the buffer the Scanner uses underneath is 64 * 1024 bytes, so a single token cannot be longer than this limit. A program that scans a longer token stops with the error bufio.Scanner: token too long.
To count words:
input := "Spicy jalapeno pastrami ut ham turducken.\n Lorem sed ullamco, leberkas sint short loin strip steak ut shoulder shankle porchetta venison prosciutto turducken swine.\n Deserunt kevin frankfurter tongue aliqua incididunt tri-tip shank nostrud.\n"
scanner := bufio.NewScanner(strings.NewReader(input))
// Set the split function for the scanning operation.
scanner.Split(bufio.ScanWords)
// Count the words.
count := 0
for scanner.Scan() {
	count++
}
if err := scanner.Err(); err != nil {
	fmt.Fprintln(os.Stderr, "reading input:", err)
}
fmt.Printf("%d\n", count)
To count lines:
input := "Spicy jalapeno pastrami ut ham turducken.\n Lorem sed ullamco, leberkas sint short loin strip steak ut shoulder shankle porchetta venison prosciutto turducken swine.\n Deserunt kevin frankfurter tongue aliqua incididunt tri-tip shank nostrud.\n"
scanner := bufio.NewScanner(strings.NewReader(input))
// Set the split function for the scanning operation.
scanner.Split(bufio.ScanLines)
// Count the lines.
count := 0
for scanner.Scan() {
	count++
}
if err := scanner.Err(); err != nil {
	fmt.Fprintln(os.Stderr, "reading input:", err)
}
fmt.Printf("%d\n", count)
This is Exercise 7.1 in the book The Go Programming Language.
This is an extension of @repler's solution:
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

type byteCounter int
type wordCounter int
type lineCounter int

func main() {
	var c byteCounter
	c.Write([]byte("Hello This is a line"))
	fmt.Println("Byte Counter ", c)

	var w wordCounter
	w.Write([]byte("Hello This is a line"))
	fmt.Println("Word Counter ", w)

	var l lineCounter
	l.Write([]byte("Hello \nThis \n is \na line\n.\n.\n"))
	fmt.Println("Line Counter ", l)
}
func (c *byteCounter) Write(p []byte) (int, error) {
	*c += byteCounter(len(p))
	return len(p), nil
}

// Write adds the number of words in p to the counter. It returns len(p),
// not the word count, because io.Writer expects the number of bytes consumed.
func (w *wordCounter) Write(p []byte) (int, error) {
	count := retCount(p, bufio.ScanWords)
	*w += wordCounter(count)
	return len(p), nil
}

// Write adds the number of lines in p to the counter.
func (l *lineCounter) Write(p []byte) (int, error) {
	count := retCount(p, bufio.ScanLines)
	*l += lineCounter(count)
	return len(p), nil
}

// retCount returns the number of tokens that fn splits p into.
func retCount(p []byte, fn bufio.SplitFunc) (count int) {
	s := string(p)
	scanner := bufio.NewScanner(strings.NewReader(s))
	scanner.Split(fn)
	count = 0
	for scanner.Scan() {
		count++
	}
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "reading input:", err)
	}
	return
}
This is Exercise 7.1 in the book The Go Programming Language.
This is my solution:
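package main

import (
	"bufio"
	"fmt"
)

// WordCounter counts words.
type WordCounter int

// LineCounter counts lines.
type LineCounter int

// scanFunc has the same signature as bufio.SplitFunc.
type scanFunc func(p []byte, atEOF bool) (advance int, token []byte, err error)

// scanBytes applies fn to p repeatedly and returns the number of tokens found.
func scanBytes(p []byte, fn scanFunc) (cnt int) {
	for len(p) > 0 {
		advance, token, err := fn(p, true)
		if err != nil || advance == 0 {
			break
		}
		// A nil token means fn only skipped input (such as trailing spaces).
		// An empty but non-nil token is a real one, such as a blank line.
		if token != nil {
			cnt++
		}
		p = p[advance:]
	}
	return cnt
}

// Write returns len(p) because io.Writer expects the number of bytes consumed.
func (c *WordCounter) Write(p []byte) (int, error) {
	cnt := scanBytes(p, bufio.ScanWords)
	*c += WordCounter(cnt)
	return len(p), nil
}

func (c WordCounter) String() string {
	return fmt.Sprintf("contains %d words", c)
}

func (c *LineCounter) Write(p []byte) (int, error) {
	cnt := scanBytes(p, bufio.ScanLines)
	*c += LineCounter(cnt)
	return len(p), nil
}

func (c LineCounter) String() string {
	return fmt.Sprintf("contains %d lines", c)
}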
func main() {
	var c WordCounter
	fmt.Println(c)
	fmt.Fprintf(&c, "This is a sentence.")
	fmt.Println(c)
	c = 0
	fmt.Fprintf(&c, "This")
	fmt.Println(c)

	var l LineCounter
	fmt.Println(l)
	fmt.Fprintf(&l, `This is another
line`)
	fmt.Println(l)
	l = 0
	fmt.Fprintf(&l, "This is another\nline")
	fmt.Println(l)
	fmt.Fprintf(&l, "This is one line")
	fmt.Println(l)
}