August 19, 2015

Stem completion in R

If you want to create a word cloud in R, check out http://www.r-bloggers.com/text-mining/

Stemming the words works well, but the stem completion step from that tutorial,

myCorpus <- tm_map(myCorpus, stemCompletion, dictionary = dictCorpus)

didn't work for me.

So I tried a different approach.

After creating the document-term matrix as described in the link above, pass the myNames character vector to the stemCompletion function, with dictCorpus (the copy of the corpus) as the dictionary.

That is:

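# Term frequencies from the document-term matrix, most frequent first
m = as.matrix(myDtm);
v = sort(colSums(m), decreasing=TRUE);
# Complete each term, using the corpus copy dictCorpus as the dictionary
myNames = names(v);
myNames <- stemCompletion(myNames, dictCorpus, type = "prevalent")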
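
To make the idea concrete, here is a rough base-R sketch of what completing a stem with its most frequent dictionary match looks like. It is a simplified illustration and not tm's actual implementation; the complete_stems helper and the sample words are hypothetical.

```r
# Hypothetical helper (not part of tm): complete each stem with the
# dictionary word that starts with the stem and occurs most often.
complete_stems <- function(stems, dictionary_words) {
  vapply(stems, function(stem) {
    # Keep dictionary words that start with the stem.
    candidates <- dictionary_words[startsWith(dictionary_words, stem)]
    # No match: return NA for this stem.
    if (length(candidates) == 0) return(NA_character_)
    # Count the candidates and take the most frequent one.
    counts <- sort(table(candidates), decreasing = TRUE)
    names(counts)[1]
  }, character(1))
}

# Made-up dictionary words, for illustration only.
dictionary_words <- c("miners", "miners", "mined", "mining", "text", "texts", "text")
completed <- complete_stems(c("mine", "text", "xyz"), dictionary_words)
print(completed)
```

In this toy data the stem mine can only match words that begin with "mine", so it never completes to mining. That is the kind of result that still has to be corrected by hand.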
Even after all this, I still found some words that needed manual correction.
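# Look up the completed word in myNames (names(v) still holds the uncompleted terms)
k = which(myNames == "miners");
myNames[k] = "mining";
# Pair the corrected words with their frequencies
d = data.frame(word=myNames, freq=v);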
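
If several words need fixing, a named lookup vector keeps the corrections in one place. This is only a minimal sketch: the words and corrections vectors below are made-up examples, and in the code above words would be myNames.

```r
# Hypothetical sample of completed words, standing in for myNames.
words <- c("miners", "data", "analysi", "miners")

# Hypothetical corrections: names are the words to replace,
# values are their replacements.
corrections <- c(miners = "mining", analysi = "analysis")

# Replace only the words that have an entry in the lookup vector.
needs_fix <- words %in% names(corrections)
words[needs_fix] <- unname(corrections[words[needs_fix]])
print(words)
```

Words without an entry in the lookup vector are left untouched, so the same few lines can be rerun as new corrections are added.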
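And there we go with the word cloud, drawn from the corrected words and their frequencies:

# input$freq and input$max are Shiny inputs
wordcloud_rep(d$word, d$freq, scale=c(4,0.5),
              min.freq = input$freq, max.words=input$max,
              colors=brewer.pal(8, "Dark2"))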
