---
title: "Introduction to `luajr`"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Introduction to `luajr`}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
luajr allows you to run [Lua](https://www.lua.org) code from R.
Lua is a lightweight, simple, and fast scripting language that is used in a
variety of settings. The standard Lua interpreter is already reasonably fast,
but there is also a just-in-time compiler for Lua called
[LuaJIT](https://luajit.org) that is even faster. luajr uses LuaJIT.
This is **not** a guide to Lua or LuaJIT; it is a quick-start guide to luajr for
people who already know how to program in Lua. See the
[Lua web site](https://www.lua.org) for resources related to coding in Lua.
# Running Lua code: `lua()` and `lua_shell()`
```{r setup}
library(luajr)
```
To get a feel for luajr or to run "one-off" Lua code from your R project, use
`lua()` and `lua_shell()`.
When you pass a character string to `lua()`, it is run as Lua code:
```{r}
lua("return 'Hello ' .. 'world!'")
```
Assignments to global variables will persist between calls to `lua()`:
```{r}
lua("my_animal = 'walrus'")
lua("return my_animal")
```
This is because luajr maintains a "default Lua state" which holds all global
variables. This default Lua state is opened the first time a package function
is used. You can create your own, separate Lua states, or reset the default Lua
state (see [Lua states](#states), below).
Assignments to local variables will *not* persist between calls to `lua()`:
```{r}
lua("local my_animal = 'donkey'")
lua("return my_animal")
```
In this case, the second line returns `"walrus"` because the local variable
`my_animal` goes out of scope after the first call to `lua()` ends, so the
second call to `lua()` is referring back to the global variable `my_animal` from
before.
You can include more than one statement in the code run by `lua()`:
```{r}
lua("local my_veg = 'potato'; local my_dish = my_veg .. ' pie'; return my_dish")
```
You can also use the `filename` argument to `lua()` to load and run a Lua
source file, instead of running the contents of a string.
Call `lua_shell()` to open an interactive Lua shell at the R prompt. This can
can be helpful for debugging or for testing Lua statements.
# Calling Lua functions from R: `lua_func()`
The key piece of functionality for luajr is probably `lua_func()`. This allows
you to call Lua functions from R.
The first argument to `lua_func()`, `func`, is a string that should evaluate to
a Lua function. `lua_func()` then returns an R function that can be used to call
that Lua function from R. For example, you can use `lua_func()` to access an
existing Lua function from R:
```{r}
luatype = lua_func("type")
luatype(TRUE)
```
Here, `"type"` is just referring to the built-in Lua function `type` which
returns a string describing the Lua type of the value passed to it. You can also
use `lua_func()` to refer to a previously defined function in the default Lua
state:
```{r}
lua("function squared(x) return x^2 end")
lua("return squared(4)")
sq = lua_func("squared")
sq(8)
```
Or you can use `lua_func()` to define an anonymous Lua function:
```{r}
timestwo = lua_func("function(x) return x*2 end")
timestwo(123)
```
Under the hood, `lua_func()` just takes its first parameter (a string), adds
`"return "` to the front of it, executes it as Lua code, and registers the
result as the function.
The second argument to `lua_func()`, `argcode`, is also very important.
`argcode` determines how the arguments passed to the function from R are
translated into Lua values for use inside the function.
The permissible arg codes are:
* `'s'`: **s**implest Lua type (the default)
* `'a'`: **a**rray type
* `'1'`: same as `'s'`, but require that the argument has length 1
* ...
* `'9'`: same as `'s'`, but require that the argument has length 9
* `'v'`: pass by **v**alue
* `'r'`: pass by **r**eference
The kinds of R values that can be passed to Lua functions, and their behaviour
under different arg codes, is summarized in the following table:
|R type |Example R value |**arg code** 's' | 'a' | '1' |'2' | 'v' | 'r' |
|:--------------:|:--------------:|-----------------|-----------------|------------|-----------------|---------------------------------------------|-----------------------------------------------|
|`NULL` |`NULL` |`nil` |`nil` |`nil` |`nil` |`nil` |`nil` |
|`logical(1)` |`TRUE` |`true` |`{true}` |`true` |**error** |`luajr.logical({true})` |`luajr.logical_r({true})` |
|`integer(1)` |`1L` |`1` |`{1}` |`1` |**error** |`luajr.integer({1})` |`luajr.integer_r({1})` |
|`numeric(1)` |`3.14159` |`3.14159` |`{3.14159}` |`3.14149` |**error** |`luajr.numeric({3.14159})` |`luajr.numeric_r({3.14159})` |
|`character(1)` |`"howdy"` |`"howdy"` |`{"howdy"}` |`"howdy"` |**error** |`luajr.character({"howdy"})` |`luajr.character_r({"howdy"})` |
|`logical(nn)` |`c(TRUE, FALSE)`|`{true, false}` |`{true, false}` |**error** |`{true, false}` |`luajr.logical({true, false})` |`luajr.logical_r({true, false})` |
|`integer(nn)` |`1:2` |`{1, 2}` |`{1.0, 2.0}` |**error** |`{1.0, 2.0}` |`luajr.integer({1, 2})` |`luajr.integer_r({1, 2})` |
|`numeric(nn)` |`exp(0:1)` |`{1, 2.71828...}`|`{1, 2.71828...}`|**error** |`{1, 2.71828...}`|`luajr.numeric({1, 2.71828...})` |`luajr.numeric_r({1, 2.71828...})` |
|`character(nn)` |`letters[1:2]` |`{"a", "b"}` |`{"a", "b"}` |**error** |`{"a", "b"}` |`luajr.character({"a", "b"})` |`luajr.character_r({"a", "b"})` |
|`list(...)` |`list(1, b='b')`|`{[1]=1, b='b'}` |**error** |**error** |**error** |`x = luajr.list(); x[1] = luajr.numeric({1});`
`x.b = luajr.character({'b'}); return x`|`x = luajr.list(); x[1] = luajr.numeric_r({1});`
`x.b = luajr.character_r({'b'}); return x`|
|external pointer|`lua_open()` |`userdata:…` |`userdata:…` |`userdata:…`|`userdata:…` |`userdata:…` |`userdata:…` |
Above, `nn` stands for an integer that is greater than 1; in the examples, it
stands specifically for 2.
There should be one character in `argcode` for every argument of the function,
but the string is "recycled" when there are more arguments passed than
characters in the `argcode` string. So, for example, just passing `"s"` as
`argcode` means all parameters will be passed as the **s**implest Lua type,
while if `argcode` is `"sr"`, then the first argument has arg code `"s"`, the
second argument has arg code `"r"`, the third argument has arg code `"s"`, etc.
When a vector (logical vector, integer vector, numeric vector, or character
vector) is passed from R to Lua by **r**eference, modifications made to the
elements of that passed-in vector persist back in the R calling frame. For
example:
```{r}
values = c(1.0, 2.0, 3.0)
keep = lua_func("function(x) x[1] = 999 end", "v") # passed by value
keep(values)
print(values)
change = lua_func("function(x) x[1] = 999 end", "r") # passed by reference
change(values)
print(values)
```
Vectors can never be resized by reference; only their already-existing elements
can be changed by reference.
Lists are always passed by value, but their vector elements can be either passed
by value or by reference depending on the arg code. In order to make changes to
a list, it needs to be returned to the calling function. For example:
```{r}
x = list(1)
f1 = lua_func("function(x) x[1][1] = 999; x.a = 42; end", "v")
f1(x)
print(x)
f2 = lua_func("function(x) x[1][1] = 999; x.a = 42; end", "r")
f2(x)
print(x)
```
Note that in the examples above, we modify `x[1][1]`, not just `x[1]`. That is
because, even when lists are passed by reference, we need to modify specific
elements of their vector elements if we want to make changes by reference.
Setting `x[1] = 999` would only reassign an element of the list `x`, and lists
cannot be passed by reference itself, so this would have no effect on the R
object passed in. By contrast, setting `x[1][1] = 999` modifies the passed-in
vector `x[1]`, which can be passed in by reference, so this does change the R
object that is passed in.
To modify a list, we have to return its new value from the function:
```{r}
x = list(1)
f3 = lua_func("function(x) x[1][1] = 999; x.a = 42; return x; end", "v")
x = f3(x)
print(x)
f4 = lua_func("function(x) x[1] = luajr.numeric({888, 999}); return x; end", "v")
x = f4(x)
print(x)
```
Here, because we are returning a modified value, we can add elements to the list
(`f3`) and change whole entries of the list (`f4`).
# Benchmarking
In the same way that using R packages
[cpp11](https://CRAN.R-project.org/package=cpp11) or
[Rcpp](https://CRAN.R-project.org/package=Rcpp) allows you to
bridge your R code with C++ code that runs faster than R, you can use `luajr` to
bridge your R code with Lua code that runs faster than R.
In general, C/C++ code runs about 5-1,000 times faster than the equivalent R
code.
In my experience, luajr code often presents a similar speedup, about the same as
C/C++ code, or in the worst case, maybe half as fast. Sometimes luajr code can
be faster than C/C++, though usually it isn't quite as good.
Why use luajr then? Rcpp and cpp11 require a C++ toolchain (e.g. gcc, clang,
etc.) and requires long compilation times, whereas luajr doesn't. This means
that luajr is usable when a C++ compiler isn't available, or when compilation
times are prohibitive or an annoyance.
In the following section, we look at two aspects of benchmarking. In the first
example, we compare different ways of passing vectors to Lua functions relative
to R. In the second example, we compare more fundamentally the difference in
running a whole algorithm in Lua versus R.
## Parameter passing example: sum of squares
For reasonably long vectors, passing by reference (`'r'`) is faster than passing
by value (`'v'`), which is faster than passing by simplify (`'s'`, `'a'`, or
`'1'`–`'9'`). But for relatively short vectors, passing by simplify can
avoid some overhead, and for very simple operations on very short vectors, it
might not be worth using Lua at all.
To illustrate the above points, here is some code that takes a numeric vector
and calculates the sum of squares, comparing a pure R function with three
alternatives written in Lua, respectively passing the vector by reference, by
value, and by simplify.
```{r}
v1 = rnorm(1e1)
v4 = rnorm(1e4)
v7 = rnorm(1e7)
lua("sum2 = function(x) local s = 0; for i=1,#x do s = s + x[i]*x[i] end; return s end")
sum2 = function(x) sum(x*x)
sum2_r = lua_func("sum2", "r")
sum2_v = lua_func("sum2", "v")
sum2_s = lua_func("sum2", "s")
# Comparing the results of each function:
sum2(v1) # Pure R version
sum2_r(v1) # luajr pass-by-reference
sum2_v(v1) # luajr pass-by-value
sum2_s(v1) # luajr pass-by-simplify
```
The time taken to sum `v1`, `v4` and `v7`, depending upon the function kind, on
a 2019-era MacBook Pro is summarised in this table:
|Call |Number of summands | Method | Median runtime |
|:---------------|:-----------------:|:-------------------------:|---------------:|
|`sum2(v1)` | 10 | Pure R | **1 µs** |
|`sum2_r(v1)` | 10 | `luajr` pass by reference | 6 µs |
|`sum2_v(v1)` | 10 | `luajr` pass by value | 7 µs |
|`sum2_s(v1)` | 10 | `luajr` pass by simplify | 4 µs |
| | | | |
|`sum2(v4)` | 10,000 | Pure R | 63 µs |
|`sum2_r(v4)` | 10,000 | `luajr` pass by reference | **18 µs** |
|`sum2_v(v4)` | 10,000 | `luajr` pass by value | 93 µs |
|`sum2_s(v4)` | 10,000 | `luajr` pass by simplify | 78 µs |
| | | | |
|`sum2(v7)` | 10,000,000 | Pure R | 49,000 µs |
|`sum2_r(v7)` | 10,000,000 | `luajr` pass by reference | **13,000 µs** |
|`sum2_v(v7)` | 10,000,000 | `luajr` pass by value | 96,000 µs |
|`sum2_s(v7)` | 10,000,000 | `luajr` pass by simplify | 124,000 µs |
When the vector is 10 elements long, the R version wins handily.
When the vector is 10,000 elements long, the pass-by-reference Lua version is
fastest, but the other methods are all comparable.
When the vector is 10,000,000 elements long, the story is similar.
This is an example where luajr doesn't add that much speed—the function
is relatively short, and there is a certain amount of overhead in invoking it,
while the R function doesn't have the overhead of transferring control between
languages.
## Processing time example: logistic map
Consider the following:
```{r}
logistic_map_R = function(x0, burn, iter, A)
{
result_x = numeric(length(A) * iter)
j = 1
for (a in A) {
x = x0
for (i in 1:burn) {
x = a * x * (1 - x)
}
for (i in 1:iter) {
result_x[j] = x
x = a * x * (1 - x)
j = j + 1
}
}
return (list2DF(list(a = rep(A, each = iter), x = result_x)))
}
logistic_map_L = lua_func(
"function(x0, burn, iter, A)
local dflen = #A * iter
local result = luajr.dataframe()
result.a = luajr.numeric_r(dflen, 0)
result.x = luajr.numeric_r(dflen, 0)
local j = 1
for k,a in pairs(A) do
local x = x0
for i = 1, burn do
x = a * x * (1 - x)
end
for i = 1, iter do
result.a[j] = a
result.x[j] = x
x = a * x * (1 - x)
j = j + 1
end
end
return result
end", "sssr")
```
Here we are comparing two different versions (R versus Lua) of running a
parameter sweep of the logistic map, a chaotic dynamical system popularized by
Bob May in a [1976 Nature article](https://sites.ifi.unicamp.br/aguiar/files/2014/10/May-Nature-1976.pdf).
The output looks like this:
```{r}
logistic_map = logistic_map_L(0.5, 50, 100, 200:385/100)
plot(logistic_map$a, logistic_map$x, pch = ".")
```
The times taken by each function on a 2019-era MacBook Pro are as follows:
|Call | Method | Median runtime |
|---------------------------------------------|-------------------|---------------:|
|`logistic_map_R(0.5, 50, 100, 200:385/100))` | R function | 1900 µs |
|`logistic_map_L(0.5, 50, 100, 200:385/100))` | Lua function | **200 µs** |
The version written in Lua is around 10 times faster than the version in R.
The speedup was much more notable in an earlier test where the R version first
created the data frame and then performed the iteration, i.e. with the line
`result$x[j] = x` instead of `result_x[j] = x`. The median runtime for that R
version was two orders of magnitude slower; the extra overhead associated with
`data.frame` methods was pointed out by Tim Taylor.
# Working with Lua States: `lua_open()`, `lua_reset()` {#states}
All the functions mentioned above (`lua()`, `lua_shell()`, and `lua_func()`)
can also take an argument `L` that specifies a particular Lua state that the
function operates in.
When `L = NULL` (the default) the functions operate on the default Lua state.
But you can also open alternative Lua states using `lua_open()`, and then by
passing the result as the parameter `L`, specify that the function operates in
that specific state. For example:
```{r}
L1 = lua_open()
lua("a = 2")
lua("a = 4", L = L1)
lua("return a")
lua("return a", L = L1)
```
There is no `lua_close` in luajr because Lua states are closed automatically
when they are garbage collected in R.
`lua_reset()` resets the default Lua state:
```{r}
lua("a = 2")
lua("return a")
lua_reset()
lua("return a")
#> NULL
```
To reset a non-default Lua state `L` returned by `lua_open()`, just do
`L = lua_open()` again. The memory previously used by `L` will be cleaned up at
the next garbage collection.
# Further reading
For notes on the `luajr` Lua module -- which contains functions and types for
interacting with R from Lua code -- see `vignette("luajr-module")`.