Updates for quosure changes

topepo · topepo · commit c4a52a5805a0 · 2018-10-18T11:14:04.000-04:00
diff --git a/vignettes/articles/Scratch.Rmd b/vignettes/articles/Scratch.Rmd
@@ -64,14 +64,14 @@ A row for "unknown" modes is not needed in this object.
 
 Now, we enumerate the _main arguments_ for each engine. `parsnip` standardizes the names of arguments across different models and engines. For example, random forest and boosting use multiple trees to create the ensemble. Instead of using different argument names, `parsnip` standardizes on `trees` and the underlying code translates to the actual arguments used by the different functions. 
 
-In our case, the MDA argument name will be "subclasses". 
+In our case, the MDA argument name will be "sub_classes". 
 
 Here, the object name will have the suffix `_arg_key` and will have columns for the engines and rows for the arguments. The entries of the data frame are the actual argument names for each engine (or `NA` when an engine doesn't have that argument). Ours:
 
 ```{r arg-key}
 mixture_da_arg_key <- data.frame(
-  mda      =   "subclasses",
-  row.names =  "subclasses",
+  mda      =   "sub_classes",
+  row.names =  "sub_classes",
   stringsAsFactors = FALSE
 )
 ```
@@ -89,27 +89,25 @@ The internals of `parsnip` will use these objects during the creation of the mod
 This is a fairly simple function that can follow a basic template. The main arguments to our function will be:
 
  * The mode. If the model can do more than one mode, you might default this to "unknown". In our case, since it is only a classification model, it makes sense to default it to that mode. 
- * The argument names (`subclasses` here). These should be defaulted to `NULL`.
- * An argument, `others`, that can be used to pass in other arguments to the underlying model fit functions. 
- * `...`, although they are not currently used. We encourage developers to move the `...` after mode so that users are encouraged to use named arguments to the model specification. 
+ * The argument names (`sub_classes` here). These should be defaulted to `NULL`.
+ * `...` is used to pass in other arguments to the underlying model fit functions. 
 
 A basic version of the function is:
 
 ```{r model-fun}
 mixture_da <-
-  function(mode = "classification", ...,  subclasses = NULL, others = list()) {
-    
-    # start with some basic error traps
-    check_empty_ellipse(...)
-    
+  function(mode = "classification",  sub_classes = NULL, ...) {
+    # Check for correct mode
     if (!(mode %in% mixture_da_modes))
       stop("`mode` should be one of: ",
            paste0("'", mixture_da_modes, "'", collapse = ", "),
            call. = FALSE)
     
-    args <- list(subclasses = subclasses)
-    
-    # save the other arguments but remove them if they are null. 
+    # Capture the arguments in quosures
+    others <- enquos(...)
+    args <- list(sub_classes = enquo(sub_classes))
+
+    # Save the other arguments but remove them if they are null. 
     no_value <- !vapply(others, is.null, logical(1))
     others <- others[no_value]
     
@@ -233,7 +231,7 @@ For example:
 library(parsnip)
 library(tidyverse)
 
-mixture_da(subclasses = 2) %>%
+mixture_da(sub_classes = 2) %>%
   translate(engine = "mda")
 ```
 
@@ -248,7 +246,7 @@ iris_split <- initial_split(iris, prop = 0.90)
 iris_train <- training(iris_split)
 iris_test  <-  testing(iris_split)
 
-mda_spec <- mixture_da(subclasses = 2)
+mda_spec <- mixture_da(sub_classes = 2)
 
 mda_fit <- mda_spec %>%
   fit(Species ~ ., data = iris_train, engine = "mda")
@@ -278,7 +276,7 @@ There are some models (e.g. `glmnet`, `plsr`, `Cubist`, etc.) that can make pred
 For example, if I fit a linear regression model via `glmnet` and get four values of the regularization parameter (`lambda`):
 
 ```{r glmnet}
-linear_reg(others = list(nlambda = 4)) %>%
+linear_reg(nlambda = 4) %>%
   fit(mpg ~ ., data = mtcars, engine = "glmnet") %>%
   predict(new_data = mtcars[1:3, -1])
 ```
@@ -302,7 +300,7 @@ logistic_reg() %>% translate(engine = "glm")
 
 # but you can change it:
 
-logistic_reg(others = list(family = expr(binomial(link = "probit")))) %>% 
+logistic_reg(family = binomial(link = "probit")) %>% 
   translate(engine = "glm")
 ```
 
@@ -322,13 +320,23 @@ translate.rand_forest <- function (x, engine, ...){
   # Run the general method to get the real arguments in place
   x <- translate.default(x, engine, ...)
   
+  # Make code easier to read
+  arg_vals <- x$method$fit$args
+  
   # Check and see if they make sense for the engine and/or mode:
   if (x$engine == "ranger") {
-    if (any(names(x$method$fit$args) == "importance")) 
-      if (is.logical(x$method$fit$args$importance)) 
+    if (any(names(arg_vals) == "importance")) 
+      # We want to check the type of `importance`, but it is a quosure, so we
+      # first get its expression. If it is a logical, `quo_get_expr` returns
+      # the actual logical value rather than an expression. The `isTRUE`
+      # wrapper is there in case the value is not atomic.
+      if (isTRUE(is.logical(quo_get_expr(arg_vals$importance)))) 
         stop("`importance` should be a character value. See ?ranger::ranger.", 
              call. = FALSE)
+    if (x$mode == "classification" && !any(names(arg_vals) ==  "probability")) 
+      arg_vals$probability <- TRUE
   }
+  x$method$fit$args <- arg_vals
   x
 }
 ```
diff --git a/vignettes/parsnip_Intro.Rmd b/vignettes/parsnip_Intro.Rmd
@@ -77,24 +77,23 @@ The arguments to the default function are:
 args(rand_forest)
 ```
 
-However, there might be other arguments that you would like to change or allow to vary. These are accessible using the `others` option. This is a named list of arguments in the form of the underlying function being called. For example, `ranger` has an option to set the internal random number seed. To set this to a specific value: 
+However, there might be other arguments that you would like to change or allow to vary. These can be passed through the `...` slot as named arguments, using the argument names of the underlying function being called. For example, `ranger` has an option to set the internal random number seed. To set this to a specific value: 
 
 ```{r rf-seed}
 rf_with_seed <- rand_forest(
-  trees = 2000, mtry = varying(), 
-  others = list(seed = 63233), 
+  trees = 2000, 
+  mtry = varying(), 
+  seed = 63233, 
   mode = "regression"
 )
 rf_with_seed
 ```
 
-If the model function contains the ellipses (`...`), these additional arguments can be passed along using `others`. 
-
 ### Process
 
 To fit the model, you must:
 
-* define the model, including the _mode_,
+* have a defined model, including the _mode_,
 * have no `varying()` parameters, and
 * specify a computational engine. 
 
@@ -123,44 +122,52 @@ translate(rf_with_seed, engine = "randomForest")
 
 These models can be fit using the `fit` function. Only the model object is returned. 
 
-```r
+```{r, eval = FALSE}
 fit(rf_mod, mpg ~ ., data = mtcars, engine = "ranger")
 ```
 
 ```
-## parsnip model object
-## 
-## Ranger result
-## 
-## Call:
-##  ranger::ranger(formula = mpg ~ ., data = mtcars, num.trees = 2000,      num.threads = 1, verbose = FALSE, seed = sample.int(10^5, 1)) 
-## 
-## Type:                             Regression 
-## Number of trees:                  2000 
-## Sample size:                      32 
-## Number of independent variables:  10 
-## Mtry:                             3 
-## Target node size:                 5 
-## Variable importance mode:         none 
-## Splitrule:                        variance 
-## OOB prediction error (MSE):       5.71 
-## R squared (OOB):                  0.843
+#> parsnip model object
+#> 
+#> Ranger result
+#> 
+#> Call:
+#>  ranger::ranger(formula = formula, data = data, num.trees = ~2000,      num.threads = 1, verbose = FALSE, seed = sample.int(10^5,          1)) 
+#> 
+#> Type:                             Regression 
+#> Number of trees:                  2000 
+#> Sample size:                      32 
+#> Number of independent variables:  10 
+#> Mtry:                             3 
+#> Target node size:                 5 
+#> Variable importance mode:         none 
+#> Splitrule:                        variance 
+#> OOB prediction error (MSE):       5.71 
+#> R squared (OOB):                  0.843
 ```
 
 
-```r
+```{r, eval = FALSE}
 fit(rf_mod, mpg ~ ., data = mtcars, engine = "randomForest")
 ```
 
 ```
-## parsnip model object
-## 
-## Call:
-##  randomForest(x = as.data.frame(x), y = y, ntree = 2000) 
-##                Type of random forest: regression
-##                      Number of trees: 2000
-## No. of variables tried at each split: 3
-## 
-##           Mean of squared residuals: 5.6
-##                     % Var explained: 84.1
+#> parsnip model object
+#> 
+#> 
+#> Call:
+#>  randomForest(x = as.data.frame(x), y = y, ntree = ~2000) 
+#>                Type of random forest: regression
+#>                      Number of trees: 2000
+#> No. of variables tried at each split: 3
+#> 
+#>           Mean of squared residuals: 5.6
+#>                     % Var explained: 84.1
 ```
+
+Note that, in the case of the `ranger` fit, the call object shows `num.trees = ~2000`. The tilde is the consequence of `parsnip` using quosures to process the model specification's arguments. 
+
+Normally, when a function is executed, the function's arguments are immediately evaluated. In the case of `parsnip`, the model specification's arguments are _not_; the expression is captured along with the environment where it should be evaluated. That is what a quosure does. 
+
+`parsnip` uses these expressions to make a model fit call that is evaluated. The tilde in the call above reflects that the argument was captured using a quosure. 
+
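
The quosure behaviour described in the last hunk can be illustrated outside of `parsnip` with a minimal sketch. It assumes only that the `rlang` package is installed; `capture_arg()` is a hypothetical helper written for this note and is not part of `parsnip`. It shows that `enquo()` captures an argument's expression without evaluating it, and that evaluation happens later, when `eval_tidy()` is called.

```r
library(rlang)

# Hypothetical helper (not a parsnip function): capture the `trees`
# argument as a quosure instead of evaluating it immediately.
capture_arg <- function(trees = NULL) {
  enquo(trees)
}

# Case 1: the argument is a variable. The quosure stores the symbol `n`
# together with the environment where `n` should be looked up.
n <- 2000
q_var <- capture_arg(trees = n)
var_expr <- quo_get_expr(q_var)      # the symbol `n`, not its value
var_value <- eval_tidy(q_var)        # evaluated only now

# Because evaluation is deferred, a later change to `n` is picked up.
n <- 500
var_value_later <- eval_tidy(q_var)

# Case 2: the argument is a constant. The captured "expression" is then
# the value itself, which is why a type check on `quo_get_expr()` works.
q_const <- capture_arg(trees = 2000)
const_expr <- quo_get_expr(q_const)
const_is_numeric <- is.numeric(const_expr)
```

The second case is the situation relied on in `translate.rand_forest()` above, where `is.logical(quo_get_expr(...))` is used to detect a literal logical value. Printing a quosure, or a call that contains one, is what produces the leading tilde seen in `num.trees = ~2000`.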