rlang 0.4.0 introduced the curly-curly {{ operator to simplify writing functions around tidyverse pipelines. The minor update 0.4.3 of rlang makes it possible to use { and {{ to create result names in tidyverse verbs taking pairs of names and expressions.
Install the latest version of rlang to make the new feature globally available throughout the tidyverse:
install.packages("rlang")

Tunnelling data-variables with curly-curly
With the {{ operator you can tunnel data-variables (i.e. columns from the data frames) through arg-variables (function arguments):
library(tidyverse)
mean_by <- function(data, by, var) {
  data %>%
    group_by({{ by }}) %>%
    summarise(avg = mean({{ var }}, na.rm = TRUE))
}

The tunnel makes it possible to supply variables from the data frame to your wrapper function:
iris %>% mean_by(Species, Sepal.Width)
#> # A tibble: 3 x 2
#> Species avg
#> <fct> <dbl>
#> 1 setosa 3.43
#> 2 versicolor 2.77
#> 3 virginica 2.97

Without a tunnel, the ambiguity between data-variables and arg-variables causes R to complain about objects not found:
mean_by_no_tunnel <- function(data, by, var) {
  data %>%
    group_by(by) %>%
    summarise(avg = mean(var, na.rm = TRUE))
}
iris %>% mean_by_no_tunnel(Species, Sepal.Width)
#> Error: Must group by variables found in `.data`
#> * Column `by` is not found

That’s because of the ambiguity between the function argument by and the data-variable Species. R has no way of knowing that you meant the variable from the data frame.
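To make the tunnel less mysterious, it can help to see the longer form that `{{` abbreviates. The sketch below assumes the dplyr and rlang packages are installed; `mean_by_enquo()` is a hypothetical name used only for illustration. It captures each argument with `enquo()` and injects it with `!!`, which is the two-step pattern that `{{ var }}` performs in one go.

```r
library(dplyr)
library(rlang)

# Hypothetical variant of mean_by() that spells out the two steps behind `{{`:
# enquo() captures the expression the caller supplied for the argument,
# and `!!` injects that expression into the data-masked call.
mean_by_enquo <- function(data, by, var) {
  by <- enquo(by)
  var <- enquo(var)

  data %>%
    group_by(!!by) %>%
    summarise(avg = mean(!!var, na.rm = TRUE))
}

iris %>% mean_by_enquo(Species, Sepal.Width)
```

This should return the same three-row summary as the `{{` version of `mean_by()`, with the columns Species and avg.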
Custom result names
In the example above, the result name is hard-coded to avg. This is an informative generic name, but returning a more specific name that reflects the context might make the function more helpful. For this reason, tidy eval functions taking dots (like dplyr::mutate(), dplyr::group_by(), or dplyr::summarise()) now support glue strings as result names.
Glue strings are implemented in the glue package. They are a flexible way of composing a string from components, interpolating R code within the string:
library(glue)
#>
#> Attaching package: 'glue'
#> The following object is masked from 'package:dplyr':
#>
#> collapse
name <- "Bianca"
glue("The result of `1 + 2` is {1 + 2}, so says {name}.")
#> The result of `1 + 2` is 3, so says Bianca.

You can now use glue strings in result names. Note that for technical reasons you need the walrus operator := instead of the usual =.
suffix <- "foo"
iris %>% summarise("prefix_{suffix}" := mean(Sepal.Width))
#> prefix_foo
#> 1 3.057333

In addition to normal glue interpolation with {, you can also tunnel data-variables through function arguments with {{ inside the string:
mean_by <- function(data, by, var) {
  data %>%
    group_by({{ by }}) %>%
    summarise("{{ var }}" := mean({{ var }}, na.rm = TRUE))
}
iris %>% mean_by(Species, Sepal.Width)
#> # A tibble: 3 x 2
#> Species Sepal.Width
#> <fct> <dbl>
#> 1 setosa 3.43
#> 2 versicolor 2.77
#> 3 virginica 2.97

And you can combine both forms of interpolation in the same glue string:
mean_by <- function(data, by, var, prefix = "avg") {
  data %>%
    group_by({{ by }}) %>%
    summarise("{prefix}_{{ var }}" := mean({{ var }}, na.rm = TRUE))
}
iris %>% mean_by(Species, Sepal.Width)
#> # A tibble: 3 x 2
#> Species avg_Sepal.Width
#> <fct> <dbl>
#> 1 setosa 3.43
#> 2 versicolor 2.77
#> 3 virginica 2.97

You can learn more about tunnelling variables in this RStudio::conf 2020 talk.
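The same naming syntax should carry over to the other verbs mentioned earlier, such as dplyr::mutate(). As a minimal sketch, assuming rlang 0.4.3 or later and dplyr are installed, the hypothetical helper `center_var()` below (an illustration, not part of any package) adds a column whose name is built from the tunnelled variable and a `suffix` argument:

```r
library(dplyr)

# Hypothetical helper: adds a column holding `var` centred on its mean.
# "{{ var }}" interpolates the name of the tunnelled data-variable and
# "{suffix}" interpolates the ordinary string argument.
center_var <- function(data, var, suffix = "centred") {
  data %>%
    mutate("{{ var }}_{suffix}" := {{ var }} - mean({{ var }}, na.rm = TRUE))
}

# Inspect the column names of the result; the new column is appended last.
iris %>% center_var(Sepal.Width) %>% names()
```

Because the name is derived from the caller's input, calling the helper on a different column yields a different result name without any change to the function.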
Acknowledgements
Read about other bugfixes and features from the 0.4.3 release in the changelog. Many thanks to all the contributors for this release!
@chendaniely, @clauswilke, @DavisVaughan, @enoshliang, @hadley, @ianmcook, @jennybc, @krlmlr, @lionel-, @moodymudskipper, @neelan29, @nick-youngblut, @nteetor, @romainfrancois, @TylerGrantSmith, @vspinu, and @yutannihilation






