Base word stemming instead of root word stemming in R

Question

Is there a way to get the base word instead of the root word when stemming text with NLP tools in R?

Code:

> #Loading libraries
> library(tm)
> library(slam)
> 
> #Vector
> Vec=c("happyness happies happys","sky skies")
> 
> #Creating Corpus
> Txt=Corpus(VectorSource(Vec))
> 
> #Stemming
> Txt=tm_map(Txt, stemDocument)
> 
> #Checking result
> inspect(Txt)
A corpus with 2 text documents

The metadata consists of 2 tag-value pairs and a data frame
Available tags are:
  create_date creator 
Available variables in the data frame are:
  MetaID 

[[1]]
happi happi happi

[[2]]
sky sky

>

Can I get the base word "happy" instead of the root word "happi" for "happyness happies happys" in R?

cyborg · Accepted Answer

You're probably looking for a stemmer. Here are some stemmers listed in the CRAN Task View: Natural Language Processing:

RWeka is an interface to Weka, a collection of machine learning algorithms for data mining tasks written in Java. Its tokenization and stemming functionality is especially useful for natural language processing.
Snowball provides the Snowball stemmers, which include the Porter stemmer and several stemmers for other languages. See the Snowball webpage for details.
Rstem is an alternative interface to a C version of Porter's word stemming algorithm.
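
As the output in the question shows, a Porter-style stemmer returns the stem "happi" rather than the dictionary word "happy". One simple workaround is to map stems back to a chosen base word after stemming. The following is only a minimal sketch in base R: the lookup table `base_words` and the helper `to_base` are hypothetical names made up for this illustration, not part of tm or of any package listed above, and the table has to be built by hand for your own vocabulary.

```r
# Hypothetical lookup table (an assumption for this sketch): stem -> base word
base_words <- c(happi = "happy", sky = "sky")

# Replace every token that appears in the lookup table by its base word;
# tokens without an entry are left unchanged.
to_base <- function(text, lookup) {
  tokens <- strsplit(text, " ", fixed = TRUE)[[1]]
  hits <- tokens %in% names(lookup)
  tokens[hits] <- unname(lookup[tokens[hits]])
  paste(tokens, collapse = " ")
}

# Stemmed documents as printed by inspect(Txt) in the question
stemmed <- c("happi happi happi", "sky sky")

result <- vapply(stemmed, to_base, character(1),
                 lookup = base_words, USE.NAMES = FALSE)
print(result)
# [1] "happy happy happy" "sky sky"
```

A hand-written table only scales to small vocabularies; for larger ones the same mapping step could be fed from a dictionary or a lemmatizer instead.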

Tags: r, nlp, stemming

AVSuresh
