---
title: "Introduction to `scaffolder` Package"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to scaffolder}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(eval = FALSE)
```

The **scaffolder** package provides tools to automate the scaffolding of R interfaces to modules, classes, functions, and documentation written in other programming languages.

# Examples

## Scaffold an R wrapper for a Python function

The following example requires TensorFlow 2.0 for demonstration purposes. See the [installation instructions](https://tensorflow.rstudio.com/installation/).

Once it is installed, we can call `scaffold_py_function_wrapper()` to generate an R wrapper for the `tensorflow.nn.top_k()` function:

```
library(scaffolder)
library(tensorflow)

scaffold_py_function_wrapper("tf$nn$top_k")
```

This generates the following R wrapper, which includes all the function parameters, their default values, and the associated docstring for each parameter. It can serve as a good starting point when building R interfaces to Python modules.

```
#' @title top_k
#'
#' @description Finds values and indices of the `k` largest entries for the last dimension.
#'
#' @details If the input is a vector (rank=1), finds the `k` largest entries in the vector
#' and outputs their values and indices as vectors. Thus `values[j]` is the
#' `j`-th largest entry in `input`, and its index is `indices[j]`. For matrices (resp. higher rank input), computes the top `k` entries in each
#' row (resp. vector along the last dimension). Thus, values.shape = indices.shape = input.shape[:-1] + [k] If two elements are equal, the lower-index element appears first.
#'
#' @param input 1-D or higher `Tensor` with last dimension at least `k`.
#' @param k 0-D `int32` `Tensor`. Number of top elements to look for along the last dimension (along each row for matrices).
#' @param sorted If true the resulting `k` elements will be sorted by the values in descending order.
#' @param name Optional name for the operation.
#'
#' @return values: The `k` largest elements along each last dimensional slice. indices: The indices of `values` within the last dimension of `input`.
#'
#' @export
top_k <- function(input, k = 1L, sorted = TRUE, name = NULL) {

  python_function_result <- tf$nn$top_k(
    input = input,
    k = k,
    sorted = sorted,
    name = name
  )

}
```

Note that the generated wrapper will often require additional editing (e.g. converting Python list literals in the docs to R lists, or coercing R numeric values to Python integers via `as.integer()` where required). It is intended as a starting point for an R wrapper rather than a wrapper that can be used without modification.

## Customized Scaffolding

We can customize the scaffolding process to avoid the additional editing mentioned above. For example, we can implement a function that casts every parameter whose default value is an integer literal (ending in `L`) to an integer. This is useful when a package maintainer wants to automate the generation of R wrappers and reduce future maintenance effort, e.g. by keeping the docstrings and default values of all parameters up to date.

```
library(stringr)

process_int_param_fn <- function(param, docs) {
  # Extract the names of parameters whose default value is an integer literal.
  # The explicit character class avoids `[A-z]`, which also matches
  # punctuation such as `[`, `\` and `^`.
  int_params <- gsub(
    " = [-]?[0-9]+L",
    "",
    str_extract_all(docs$signature, "[A-Za-z_][A-Za-z0-9_]* = [-]?[0-9]+L")[[1]])
  # Explicitly cast the parameter to integer if it is in the list obtained above
  if (param %in% int_params) {
    param <- paste0("as.integer(", param, ")")
  }
  param
}
```

Because the default value of the parameter `k` is `1L`, this function wraps `k` in `as.integer()` so that it is cast to an integer before being passed to `tf$nn$top_k` for execution. We also add a post-processing step that returns the Python function result:
```
library(scaffolder)
library(tensorflow)

custom_scaffold_py_function_wrapper(
  "tf$nn$top_k",
  r_function = "top_k",
  process_param_fn = process_int_param_fn,
  postprocess_fn = function() { "return(python_function_result)" })
```

This will generate the following R code:

```
#' @title top_k
#'
#' @description Finds values and indices of the `k` largest entries for the last dimension.
#'
#' @details If the input is a vector (rank=1), finds the `k` largest entries in the vector
#' and outputs their values and indices as vectors. Thus `values[j]` is the
#' `j`-th largest entry in `input`, and its index is `indices[j]`. For matrices (resp. higher rank input), computes the top `k` entries in each
#' row (resp. vector along the last dimension). Thus, values.shape = indices.shape = input.shape[:-1] + [k] If two elements are equal, the lower-index element appears first.
#'
#' @param input 1-D or higher `Tensor` with last dimension at least `k`.
#' @param k 0-D `int32` `Tensor`. Number of top elements to look for along the last dimension (along each row for matrices).
#' @param sorted If true the resulting `k` elements will be sorted by the values in descending order.
#' @param name Optional name for the operation.
#'
#' @return values: The `k` largest elements along each last dimensional slice. indices: The indices of `values` within the last dimension of `input`.
#'
#' @export
top_k <- function(input, k = 1L, sorted = TRUE, name = NULL) {

  python_function_result <- tf$nn$top_k(
    input = input,
    k = as.integer(k),
    sorted = sorted,
    name = name
  )
  return(python_function_result)
}
```

This is the same R code that we generated previously, with two differences:

* The parameter `k` is cast to an integer automatically.
* The result of executing the underlying Python function, `python_function_result`, is returned by the generated wrapper.

Users can also customize several other parts of the scaffolding of R wrapper functions.
Please check out the documentation via `?custom_scaffold_py_function_wrapper` for more details.
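
To see what a parameter-processing callback such as `process_int_param_fn` does in isolation, the sketch below applies a base-R variant of it to a hypothetical `docs` list. It assumes that `docs$signature` holds the R-style signature string of the wrapped function. The names `mock_docs` and `cast_int_param` are illustrative and not part of scaffolder, and no TensorFlow installation is needed to run it.

```r
# Hypothetical stand-in for the `docs` argument: only the `signature`
# field is modelled, and its exact format is an assumption.
mock_docs <- list(
  signature = "top_k(input, k = 1L, sorted = TRUE, name = NULL)"
)

# Base-R variant of the integer-casting callback (uses regmatches()
# instead of stringr, so it has no package dependencies).
cast_int_param <- function(param, docs) {
  pattern <- "[A-Za-z_][A-Za-z0-9_]* = -?[0-9]+L"
  matches <- regmatches(docs$signature, gregexpr(pattern, docs$signature))[[1]]
  # Strip the " = <integer>L" suffix to keep only the parameter names
  int_params <- sub(" = -?[0-9]+L$", "", matches)
  # Wrap the parameter in as.integer() only if its default is an integer
  if (param %in% int_params) {
    param <- paste0("as.integer(", param, ")")
  }
  param
}

params <- c("input", "k", "sorted", "name")
processed <- vapply(params, cast_int_param, character(1),
                    docs = mock_docs, USE.NAMES = FALSE)
print(processed)
# "input" "as.integer(k)" "sorted" "name"
```

Only `k` is rewritten, because it is the only parameter whose default value is an integer literal. Testing a callback this way, against a small hand-written signature, can help catch regular-expression mistakes before running the full scaffolding step.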