Introduction to the scaffolder Package

The scaffolder package provides a comprehensive set of tools to automate the process of scaffolding interfaces to modules, classes, functions, and documentation written in other programming languages.

Examples

Scaffold R wrappers for Python functions

The following example requires TensorFlow 2.0 to be installed. Please check out the installation instructions here.

Once it’s installed, we can execute scaffold_py_function_wrapper() to generate an R wrapper for the tensorflow.nn.top_k() function:

library(scaffolder)
library(tensorflow)

scaffold_py_function_wrapper("tf$nn$top_k")

This generates the following R wrapper, which includes all the function parameters, their default values, and the associated doc-string for each parameter. It can serve as a good starting point when building R interfaces to Python modules.

#' @title top_k
#'
#' @description Finds values and indices of the `k` largest entries for the last dimension.
#'
#' @details If the input is a vector (rank=1), finds the `k` largest entries in the vector
#' and outputs their values and indices as vectors. Thus `values[j]` is the
#' `j`-th largest entry in `input`, and its index is `indices[j]`. For matrices (resp. higher rank input), computes the top `k` entries in each
#' row (resp. vector along the last dimension). Thus, values.shape = indices.shape = input.shape[:-1] + [k] If two elements are equal, the lower-index element appears first.
#'
#' @param input 1-D or higher `Tensor` with last dimension at least `k`.
#' @param k 0-D `int32` `Tensor`. Number of top elements to look for along the last dimension (along each row for matrices).
#' @param sorted If true the resulting `k` elements will be sorted by the values in descending order.
#' @param name Optional name for the operation.
#'
#' @return values: The `k` largest elements along each last dimensional slice. indices: The indices of `values` within the last dimension of `input`.
#'
#' @export
top_k <- function(input, k = 1L, sorted = TRUE, name = NULL) {

  python_function_result <- tf$nn$top_k(
    input = input,
    k = k,
    sorted = sorted,
    name = name
  )

}

Note that the generated wrapper will often require additional editing (e.g. converting Python list literals in the docs to R lists, or coercing R numeric values to Python integers via as.integer() where required). It is intended as a starting point for an R wrapper rather than a wrapper that can be used without modification.
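To illustrate why the as.integer() edit mentioned above is often needed, here is a minimal base R sketch. A bare numeric literal such as 3 is a double in R, and reticulate converts R doubles to Python floats, so a Python function that expects an integer may reject it. The helper coerce_k() is hypothetical and only shows the kind of coercion one might add by hand; it is not part of scaffolder.

```r
# A bare numeric literal in R is a double, not an integer
typeof(3)                  # "double"
typeof(3L)                 # "integer"
is.integer(as.integer(3))  # TRUE

# Hypothetical helper (not part of scaffolder): the kind of coercion a
# hand-edited wrapper might apply before passing `k` on to Python
coerce_k <- function(k = 1L) {
  as.integer(k)
}

# A caller who writes k = 3 instead of k = 3L still ends up with an integer
identical(coerce_k(3), 3L)  # TRUE
```

With this coercion in place, callers can write k = 3 without remembering the L suffix.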

Customized Scaffolding

We can customize the scaffolding process to avoid the additional editing mentioned above. For example, we can implement a function that casts to integer every parameter whose default value carries the “L” integer suffix. This is useful when a package maintainer wants to automate the generation of R wrappers and reduce future maintenance effort, e.g. by keeping the doc-strings and default values of all parameters up to date.

library(stringr)

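process_int_param_fn <- function(param, docs) {
  # Find "name = <integer>L" pairs in the signature, then strip the default
  # value so that only the parameter names remain.
  # The character class is spelled out because "[A-z]" would also match
  # the punctuation characters that sit between "Z" and "a" in ASCII.
  int_params <- gsub(
    " = [-]?[0-9]+L",
    "",
    str_extract_all(docs$signature, "[A-Za-z0-9_.]+ = [-]?[0-9]+L")[[1]])
  # Explicitly cast the parameters found above to integer
  if (param %in% int_params) {
    param <- paste0("as.integer(", param, ")")
  }
  param
}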
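To make the behaviour of such a parameter hook concrete, the sketch below applies the same idea to a hand-written signature string. It is a base R variant (using regmatches() and gregexpr() instead of stringr) so that it runs without extra packages. The name process_int_param_base and the exact format of docs$signature are assumptions made for illustration; the real docs object passed by scaffolder may contain more fields.

```r
# Hypothetical base R variant of the hook above (no stringr dependency)
process_int_param_base <- function(param, docs) {
  # Extract every "name = <integer>L" pair from the signature string
  matches <- regmatches(
    docs$signature,
    gregexpr("[A-Za-z0-9_.]+ = -?[0-9]+L", docs$signature)
  )[[1]]
  # Drop the " = <default>" part to keep only the parameter names
  int_params <- sub(" = .*$", "", matches)
  if (param %in% int_params) {
    param <- paste0("as.integer(", param, ")")
  }
  param
}

# Assumed shape of the docs argument: a list with a `signature` string
docs <- list(signature = "top_k(input, k = 1L, sorted = TRUE, name = NULL)")

process_int_param_base("k", docs)       # "as.integer(k)"
process_int_param_base("sorted", docs)  # "sorted"
```

Only k has an integer default in this signature, so it is the only parameter that gets wrapped.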

In the call below, since the default value of parameter k is 1L, k is wrapped in as.integer() so that it is cast to an integer before being passed to tf$nn$top_k. We also use postprocess_fn to return the Python function result.

library(scaffolder)
library(tensorflow)

custom_scaffold_py_function_wrapper(
  "tf$nn$top_k",
  r_function = "top_k",
  process_param_fn = process_int_param_fn,
  postprocess_fn = function() { "return(python_function_result)" })

This will generate the following R code:

#' @title top_k
#'
#' @description Finds values and indices of the `k` largest entries for the last dimension.
#'
#' @details If the input is a vector (rank=1), finds the `k` largest entries in the vector
#' and outputs their values and indices as vectors. Thus `values[j]` is the
#' `j`-th largest entry in `input`, and its index is `indices[j]`. For matrices (resp. higher rank input), computes the top `k` entries in each
#' row (resp. vector along the last dimension). Thus, values.shape = indices.shape = input.shape[:-1] + [k] If two elements are equal, the lower-index element appears first.
#'
#' @param input 1-D or higher `Tensor` with last dimension at least `k`.
#' @param k 0-D `int32` `Tensor`. Number of top elements to look for along the last dimension (along each row for matrices).
#' @param sorted If true the resulting `k` elements will be sorted by the values in descending order.
#' @param name Optional name for the operation.
#'
#' @return values: The `k` largest elements along each last dimensional slice. indices: The indices of `values` within the last dimension of `input`.
#'
#' @export
top_k <- function(input, k = 1L, sorted = TRUE, name = NULL) {

  python_function_result <- tf$nn$top_k(
    input = input,
    k = as.integer(k),
    sorted = sorted,
    name = name
  )
  return(python_function_result)
}

This is the same R code as generated previously, with two differences:

  • The parameter k is cast to integer automatically.
  • The result of the underlying Python function, python_function_result, is returned by the generated wrapper.
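The two differences above correspond to the two hooks passed to custom_scaffold_py_function_wrapper(). The toy generator below is a simplified illustration of how such hooks could shape the generated function body. It is not scaffolder's actual implementation, and toy_scaffold_body() and its arguments are hypothetical names introduced only for this sketch.

```r
# Hypothetical toy generator, NOT scaffolder's implementation: it assembles
# the body of a wrapper from parameter names and two user-supplied hooks
toy_scaffold_body <- function(py_call, params, docs,
                              process_param_fn = function(param, docs) param,
                              postprocess_fn = function() "") {
  # Each argument line is "name = <processed name>"
  args <- vapply(
    params,
    function(p) paste0("    ", p, " = ", process_param_fn(p, docs)),
    character(1)
  )
  lines <- c(
    paste0("  python_function_result <- ", py_call, "("),
    paste(args, collapse = ",\n"),
    "  )",
    # Whatever the post-processing hook returns is appended after the call
    paste0("  ", postprocess_fn())
  )
  paste(lines, collapse = "\n")
}

# A stand-in parameter hook that casts only `k`
cast_k <- function(param, docs) {
  if (param == "k") paste0("as.integer(", param, ")") else param
}

body <- toy_scaffold_body(
  "tf$nn$top_k",
  params = c("input", "k", "sorted", "name"),
  docs = list(),
  process_param_fn = cast_k,
  postprocess_fn = function() "return(python_function_result)"
)
cat(body, "\n")
```

The parameter hook changes how each argument is passed on, while the post-processing hook contributes the code placed after the Python call.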

Users can also customize several other parts of the scaffolding of R wrapper functions. Please check out the documentation via ?custom_scaffold_py_function_wrapper for more details.