Commit 1df76a5

rtext : initialize, replace default tokenizer
1 parent fc528d4 commit 1df76a5

3 files changed: 4 additions and 3 deletions

R/rtext.R

Lines changed: 1 addition & 1 deletion
@@ -209,7 +209,7 @@ rtext <-
   function(
     text = NULL,
     text_file = NULL,
-    tokenizer = rtext_tokenizer$words,
+    tokenizer = function(x){text_tokenize(x, "\n", non_token = TRUE)},
     encoding = "UTF-8",
     id = NULL,
     tokenize_by = NULL,

R/rtext_tools.R

Lines changed: 2 additions & 1 deletion
@@ -7,7 +7,8 @@ dp_storage <- new.env(parent = emptyenv())
 #' @export
 rtext_tokenizer <- list(
   words = function(x){text_tokenize_words(x, non_token = TRUE )},
-  words2 = function(x){text_tokenize_words(x, non_token = FALSE)}
+  words2 = function(x){text_tokenize_words(x, non_token = FALSE)},
+  lines = function(x){text_tokenize(x, "\n", non_token = TRUE)}
 )

man/rtext_tokenizer.Rd

Lines changed: 1 addition & 1 deletion
Generated file; diff not rendered by default.
