Dictionary() is not supported anymore in tm package. How to emend code?

Tags: r, tm

I just noticed that after updating to tm v. 0.5-10 the function Dictionary() is no longer supported. Is this a mistake? Or was it deprecated? Am I supposed to use another function to create a dictionary?

Since I now have many lines of code to emend, what's the best way to proceed without re-engineering everything?

Asked by CptNemo on Feb 14 '14 at 22:02


2 Answers

As IShouldBuyABoat says, you haven't given us any clue about how you're using Dictionary so we can't really give you any specific answers (do update your question with more details).

In any case, the answer to your question of 'how to update my code' is probably 'just delete Dictionary and it should be fine', as you can see here:

library(tm)
data(crude)

Find out what Dictionary did in earlier versions of the tm package:

methods(Dictionary)
getAnywhere(Dictionary.DocumentTermMatrix)
# function(x) structure(Terms(x), class = c("Dictionary", "character"))
getAnywhere(Dictionary.character)
# function (x)  structure(x, class = c("Dictionary", "character"))

Both methods only attach a "Dictionary" class to a character vector, so it was a fairly pointless function and removing it seems quite sensible. But how do you update code that depended on it?

You may have used Dictionary like this:

myDictionary <- Dictionary(c("some", "tokens", "that", "I", "am", "interested", "in"))
inspect(DocumentTermMatrix(crude, list(dictionary = myDictionary)))

Now that this function is no longer available, you'd do this instead, using a plain character vector:

myTerms <- c("some", "tokens", "that", "I", "am", "interested", "in")
inspect(DocumentTermMatrix(crude, list(dictionary = myTerms)))

The output of these two examples is identical; the first was run with tm version 0.5-9 and the second with version 0.5-10.

The instruction in the NEWS to use Terms applies if you want to get all the words in a document term matrix, like so:

Terms(DocumentTermMatrix(crude))
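
If your old code built its dictionary from an existing matrix, i.e. called Dictionary() on a DocumentTermMatrix, the replacement would presumably look something like the sketch below. The names dtm, myTerms and dtmSubset are only for illustration, and the subset crude[1:5] is an arbitrary choice.

```r
library(tm)
data(crude)

dtm <- DocumentTermMatrix(crude)

# Old (tm <= 0.5-9): myDictionary <- Dictionary(dtm)
# New: Terms() returns the matrix's terms as a plain character vector.
myTerms <- Terms(dtm)

# Reuse those terms as the dictionary for another document-term matrix.
dtmSubset <- DocumentTermMatrix(crude[1:5], list(dictionary = myTerms))
inspect(dtmSubset)
```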

If none of that helps you then you'll need to supply more detail about what you're trying to do.

Answered by Ben on Oct 30 '22 at 16:10


If you are using Dictionary as @Ben suggested, I think you could create a dummy function called Dictionary which just returns the character vector you pass to it.

Dictionary <- function(x) {
    # Pass a character vector through unchanged; reject anything else.
    if (is.character(x)) {
        return(x)
    }
    stop("x is not a character vector")
}

However, in the longer term, it's probably better to roll up your sleeves and refactor the code.
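
As a stopgap while refactoring, a shim like this could be defined only when the installed tm no longer provides Dictionary(), so the same script still runs on older and newer versions. This is a rough sketch rather than anything taken from the tm documentation. The DocumentTermMatrix branch is an assumption based on the Dictionary.DocumentTermMatrix method quoted in the other answer, and the shim returns a plain character vector rather than an object of class "Dictionary".

```r
library(tm)

# Define a fallback only if the installed tm does not provide Dictionary().
if (!exists("Dictionary", mode = "function")) {
  Dictionary <- function(x) {
    # Old character method: the terms themselves are the dictionary.
    if (is.character(x)) {
      return(x)
    }
    # Old DocumentTermMatrix method: the matrix's terms are the dictionary.
    if (inherits(x, "DocumentTermMatrix")) {
      return(Terms(x))
    }
    stop("x must be a character vector or a DocumentTermMatrix")
  }
}
```

Once every call site passes a character vector or Terms() output directly, the shim can be deleted.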

Answered by wds on Oct 30 '22 at 17:10