tm_map has parallel::mclapply error in R 3.0.1 on Mac

I am using R 3.0.1 on Platform: x86_64-apple-darwin10.8.0 (64-bit)

I am trying to use tm_map from the tm package, but when I run this code:

tm_map(crude, stemDocument)

I get this error:

Warning message:
In parallel::mclapply(x, FUN, ...) :
  all scheduled cores encountered errors in user code

Does anyone know a solution for this?

2 Answers

I suspect you don't have the SnowballC package installed, which seems to be required. tm_map is supposed to run stemDocument on all the documents using mclapply. Try just running the stemDocument function on one document, so you can extract the error:
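A minimal sketch of that check, assuming the `crude` sample corpus from the question (it ships with tm) and wrapping the call in `tryCatch` only so the message is easy to inspect:

```r
library(tm)
data("crude")  # sample corpus bundled with tm

# Run the transformation on one document, outside tm_map(),
# and capture whatever happens so the real message is visible.
result <- tryCatch(
  stemDocument(crude[[1]]),
  error = function(e) e
)
if (inherits(result, "error")) print(conditionMessage(result))
```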


For me, I got an error:

Error in loadNamespace(name) : there is no package called ‘SnowballC’

So I just went ahead and installed SnowballC and it worked. Clearly, SnowballC should be a dependency.

I just ran into this. It took me a bit of digging but I found out what was happening.

  1. I had a line of code 'rdevel <- tm_map(rdevel, asPlainTextDocument)'

  2. Running this produced the error

    In parallel::mclapply(x, FUN, ...) :
      all scheduled cores encountered errors in user code

  3. It turns out that 'tm_map' calls some code in 'parallel' which attempts to figure out how many cores you have. To see what it's thinking, type

    > getOption("mc.cores", 2L)
    [1] 2

  4. Aha moment! Tell the 'tm_map' call to only use one core!

    > rdevel <- tm_map(rdevel, asPlainTextDocument, mc.cores=1)
    Error in match.fun(FUN) : object 'asPlainTextDocument' not found
    > rdevel <- tm_map(rdevel, asPlainTextDocument, mc.cores=4)
    Warning message:
    In parallel::mclapply(x, FUN, ...) :
      all scheduled cores encountered errors in user code

So with more than one core, 'parallel' does not show the underlying error message; it only reports that every core hit an error. Not helpful, parallel! The real problem was that I forgot the dot: the function name is supposed to be 'as.PlainTextDocument'!
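The same masking can be reproduced without tm. This is only a sketch: 'failing_fun' is a hypothetical stand-in for whatever function is passed to 'tm_map', and the multi-core branch assumes a Unix-like system, since 'mclapply' cannot fork on Windows.

```r
library(parallel)

# Hypothetical worker that always fails, standing in for a broken FUN passed to tm_map().
failing_fun <- function(x) stop("real error for item ", x)

# One core: mclapply() falls back to a plain lapply(), so the real error is raised.
serial_msg <- tryCatch(
  mclapply(1:4, failing_fun, mc.cores = 1),
  error = function(e) conditionMessage(e)
)
print(serial_msg)

# Several cores (forking is not available on Windows): the call returns normally
# with a warning, and the real messages are tucked inside "try-error" objects.
if (.Platform$OS.type == "unix") {
  parallel_res <- suppressWarnings(mclapply(1:4, failing_fun, mc.cores = 2))
  print(all(vapply(parallel_res, inherits, logical(1), what = "try-error")))
}
```

Inspecting the elements of the multi-core result is another way to get at the hidden message without rerunning on a single core.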

So, if you get this warning, add 'mc.cores=1' to the 'tm_map' call and run it again to see the real error message.
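If passing the argument through 'tm_map' is not convenient, a possible alternative is to change the session default instead. This sketch assumes the core count is read from the 'mc.cores' option, as the 'getOption' call earlier suggests; whether a given tm version honors it is not something I have checked.

```r
# Remember the previous setting so it can be restored afterwards.
old_opts <- options(mc.cores = 1)

# Confirm what parallel code would now see as the default core count.
cores_during <- getOption("mc.cores", 2L)
print(cores_during)

# ... rerun the failing tm_map() call here to see the real error ...

# Put the original setting back once debugging is done.
options(old_opts)
```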

