The reading part isn't concurrent but the processing is. I phrased the title this way because I'm most likely to search for this problem again using that phrase. :)
I'm getting a deadlock after trying to go beyond the examples so this is a learning experience for me. My goals are these:
func() that does some regex work.Here is the playground link. I tried to write helpful comments, hopefully this makes sense. My design could be completely wrong so don't hesitate to refactor.
package main
import (
  "bufio"
  "fmt"
  "regexp"
  "strings"
  "sync"
)
func telephoneNumbersInFile(path string) int {
  file := strings.NewReader(path)
  var telephone = regexp.MustCompile(`\(\d+\)\s\d+-\d+`)
  // do I need buffered channels here?
  jobs := make(chan string)
  results := make(chan int)
  // I think we need a wait group, not sure.
  wg := new(sync.WaitGroup)
  // start up some workers that will block and wait?
  for w := 1; w <= 3; w++ {
    wg.Add(1)
    go matchTelephoneNumbers(jobs, results, wg, telephone)
  }
  // go over a file line by line and queue up a ton of work
  scanner := bufio.NewScanner(file)
  for scanner.Scan() {
    // Later I want to create a buffer of lines, not just line-by-line here ...
    jobs <- scanner.Text()
  }
  close(jobs)
  wg.Wait()
  // Add up the results from the results channel.
  // The rest of this isn't even working so ignore for now.
  counts := 0
  // for v := range results {
  //   counts += v
  // }
  return counts
}
func matchTelephoneNumbers(jobs <-chan string, results chan<- int, wg *sync.WaitGroup, telephone *regexp.Regexp) {
  // Decreasing internal counter for wait-group as soon as goroutine finishes
  defer wg.Done()
  // eventually I want to have a []string channel to work on a chunk of lines not just one line of text
  for j := range jobs {
    if telephone.MatchString(j) {
      results <- 1
    }
  }
}
func main() {
  // An artificial input source.  Normally this is a file passed on the command line.
  const input = "Foo\n(555) 123-3456\nBar\nBaz"
  numberOfTelephoneNumbers := telephoneNumbersInFile(input)
  fmt.Println(numberOfTelephoneNumbers)
}
Multiple threads can also read data from the same FITS file simultaneously, as long as the file was opened independently by each thread. This relies on the operating system to correctly deal with reading the same file by multiple processes.
You're almost there, just need a little bit of work on goroutines' synchronisation. Your problem is that you're trying to feed the parser and collect the results in the same routine, but that can't be done.
I propose the following:
The relevant changes could look like this:
// Go over a file line by line and queue up a ton of work
go func() {
    scanner := bufio.NewScanner(file)
    for scanner.Scan() {
        jobs <- scanner.Text()
    }
    close(jobs)
}()
// Collect all the results...
// First, make sure we close the result channel when everything was processed
go func() {
    wg.Wait()
    close(results)
}()
// Now, add up the results from the results channel until closed
counts := 0
for v := range results {
    counts += v
}
Fully working example on the playground: http://play.golang.org/p/coja1_w-fY
Worth adding you don't necessarily need the WaitGroup to achieve the same, all you need to know is when to stop receiving results. This could be achieved for example by scanner advertising (on a channel) how many lines were read and then the collector reading only specified number of results (you would need to send zeros as well though).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With