RmecabKo {RmecabKo} | R Documentation |
mecab-ko and mecab-ko-dic are based on a C++ library, and POS tagging with them is useful when the spacing of the source text is incorrect.
To integrate mecab-ko with R, the Rcpp package provides the basic framework.
It is based on the Eunjeon Project.
On Mac OS X and Linux, you need to install mecab-ko and mecab-ko-dic before installing this package in R.
mecab-ko: https://bitbucket.org/eunjeon/mecab-ko
mecab-ko-dic: https://bitbucket.org/eunjeon/mecab-ko-dic
On Windows, the install_mecab(mecabLocation) function installs mecab-ko-msvc and mecab-ko-dic-msvc in a user-specified directory.
On Windows, the package operates through system commands and file I/O, so analysis is slower than on Linux-based operating systems.
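Because the installation route differs by platform, a script shared across machines may want to call install_mecab() only on Windows. The following is a minimal sketch of such a guard. The helpers needs_install_mecab() and setup_mecab() are hypothetical names introduced for illustration and are not part of RmecabKo. Only install_mecab() and its mecabLocation argument come from the package.

```r
# Hypothetical helper (not part of RmecabKo): TRUE only on Windows,
# where install_mecab() is the documented installation route.
# .Platform$OS.type is "windows" or "unix" in base R.
needs_install_mecab <- function(os_type = .Platform$OS.type) {
  identical(os_type, "windows")
}

# Hypothetical wrapper: call install_mecab() only where it applies.
# Returns TRUE (invisibly) if the installer was called, FALSE otherwise.
setup_mecab <- function(mecabLocation, os_type = .Platform$OS.type) {
  if (!needs_install_mecab(os_type)) {
    message("Not Windows: install mecab-ko and mecab-ko-dic on the system instead.")
    return(invisible(FALSE))
  }
  if (!requireNamespace("RmecabKo", quietly = TRUE)) {
    stop("RmecabKo is not installed.")
  }
  RmecabKo::install_mecab(mecabLocation)
  invisible(TRUE)
}
```

With this sketch, calling setup_mecab("D:/Rlibs/mecab") does nothing but print a message on Mac OS X or Linux, where mecab-ko and mecab-ko-dic are expected to be installed beforehand.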
Junhewk Kim
Wonsup Yoon, mecab-ko VC++ builds at https://github.com/Pusnow/mecab-ko-msvc, https://github.com/Pusnow/mecab-ko-dic-msvc
## Not run:
# install.packages("devtools")
devtools::install_github("junhewk/RmecabKo")

# On Windows platform only
install_mecab("D:/Rlibs/mecab")

phrase <- # Some Korean character vectors

# For full POS tagging
pos(phrase)

# For noun extraction only
nouns(phrase)

# For tokenizing of selective morphemes
tokens_words(phrase)

# For n-grams tokenizing
tokens_ngram(phrase)
## End(Not run)