token_morph {RmecabKo} | R Documentation |
These tokenizer functions split text into all morphemes, selected morphemes, or nouns.
token_morph(phrase, strip_punct = FALSE, strip_numeric = FALSE)
token_words(phrase, strip_punct = FALSE, strip_numeric = FALSE)
token_nouns(phrase, strip_punct = FALSE, strip_numeric = FALSE)
phrase |
A character vector or a list of character vectors to be tokenized into morphemes.
If |
strip_punct |
Logical. Set to TRUE to remove punctuation from the phrase. |
strip_numeric |
Logical. Set to TRUE to remove numbers from the phrase. |
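The effect of the two flags can be approximated with base R regular expressions. This is a minimal sketch for illustration only: it does not call RmecabKo, and the package may apply the stripping differently (for example, to tokens rather than to the raw phrase). The sample string and the variable names are hypothetical.

```r
# Illustration only: a base R approximation of the two flags,
# not the RmecabKo implementation.
txt <- "R 4.0 was released in 2020, wasn't it?"

# Roughly the intent of strip_punct = TRUE: drop punctuation characters.
no_punct <- gsub("[[:punct:]]", "", txt)

# Roughly the intent of strip_numeric = TRUE: drop digits.
no_digits <- gsub("[[:digit:]]", "", txt)

print(no_punct)
print(no_digits)
```

Both flags default to FALSE, so punctuation and numbers are kept unless you ask for them to be removed.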
A list of character vectors containing the tokens, with one element in the list.
See the examples on GitHub.
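## Not run:
# Any Korean sentence works here; this one is only an illustrative placeholder.
txt <- "안녕하세요. 만나서 반갑습니다."

token_morph(txt)                        # all morphemes
token_words(txt, strip_punct = FALSE)   # selected morphemes, punctuation kept
token_nouns(txt, strip_numeric = TRUE)  # nouns only, numbers removed
## End(Not run)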