Programming in R

Defining Functions

In R, functions are "first class" objects.

In [1]:
g = function(x) {
    sin(x) * exp(-0.01*x)  # do not end with an assignment: y = ...
}                          # if you do, finish with: return(y)

g
g(1)
Out[1]:
function (x) 
{
    sin(x) * exp(-0.01 * x)
}
Out[1]:
0.833098208613807

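The value of the last evaluated expression is returned.
NOTE: An assignment returns its value invisibly, so a function that ends with an assignment prints nothing when called at top level!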
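
A small sketch of what happens when a function body ends in an expression versus an assignment. The names f_expr and f_assign are hypothetical and used only for illustration:

```r
# f_expr and f_assign are illustrative names, not part of the original notes.
f_expr <- function(x) {
    x^2              # last evaluated expression: returned visibly
}

f_assign <- function(x) {
    y <- x^2         # body ends in an assignment: value is returned invisibly
}

f_expr(3)            # auto-prints the result at top level
f_assign(3)          # prints nothing at top level
v <- f_assign(3)     # the value can still be captured
print(v)
```

Using an explicit return(y), or simply y, as the last line avoids the silent result.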

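R uses lexical scoping: a free variable such as a is looked up in the environment where the function was defined, at the time the function is called: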

In [2]:
a = 0.05
g = function(x) sin(x) * exp(-a*x)
g(1)

a = 0.10
g(1)
Out[2]:
0.800431960612864
Out[2]:
0.761394433245753
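
To make a function independent of later changes to a global variable, the value can be captured in a closure. The following is a minimal sketch; make_damped is a hypothetical factory name, not something from the original notes:

```r
# make_damped is a hypothetical factory, introduced only for illustration.
make_damped <- function(a) {
    force(a)                            # evaluate a now, not lazily later
    function(x) sin(x) * exp(-a * x)    # a is found in the enclosing call
}

g05 <- make_damped(0.05)
g10 <- make_damped(0.10)

a <- 99      # a global a has no effect on the two closures
g05(1)
g10(1)
```

Each closure keeps its own a, so g05(1) and g10(1) should reproduce the two values shown in Out[2].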

Arguments with default values are allowed.

In [3]:
g = function(x, a=0.1)
        return(sin(x) * exp(-a*x))
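
A short sketch of how such a function can be called (the function is redefined here so the block runs on its own):

```r
g <- function(x, a = 0.1) sin(x) * exp(-a * x)

g(1)                  # a is omitted, so the default a = 0.1 is used
g(1, 0.05)            # arguments matched by position
g(a = 0.05, x = 1)    # named arguments may be given in any order
```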

As objects, functions can be defined within functions, and they can be passed as arguments to other functions.

In [4]:
curve(g, 0, 6*pi, col=4)
grid()

NOTE: Some functions, such as curve() or integrate(), require the function to be vectorized.

In [5]:
integrate(g, 0, 2*pi, a=0.1)
Out[5]:
0.461893 with absolute error < 3.3e-14
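
A function that only works on a single number can be wrapped with Vectorize(). The sketch below assumes a hypothetical scalar-only function h; calling h directly on a vector is an error in recent R versions (a warning in older ones) because if() expects a single condition:

```r
# h is a hypothetical scalar-only function: if() needs a single TRUE/FALSE.
h <- function(x) if (x < pi) sin(x) else 0

# Vectorize() returns a wrapper that applies h to each element in turn.
hv <- Vectorize(h)

hv(c(0, pi/2, 4))          # elementwise: 0, 1, 0
integrate(hv, 0, 2*pi)     # close to 2, the integral of sin over [0, pi]
```

Vectorize() is convenient but slow for large inputs; where possible, a natively vectorized form such as ifelse(x < pi, sin(x), 0) is preferable.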

Arguments are passed by value, not by reference, so a function cannot change them in the caller.
R copies an argument only when the function modifies it (copy-on-modify), so passing a big data frame costs extra memory only if it is changed inside the function.
Arguments are also evaluated lazily, i.e., only when they are first used.
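
A minimal sketch of the by-value behaviour, using a hypothetical helper add_col that is not part of the original notes:

```r
# add_col is a hypothetical helper used only for this illustration.
add_col <- function(df) {
    df$z <- df$x * 2     # modifies the function's local copy only
    df                   # the modified copy must be returned explicitly
}

d  <- data.frame(x = 1:3)
d2 <- add_col(d)

names(d)     # "x"       the caller's data frame is unchanged
names(d2)    # "x" "z"   the change is visible only in the returned value
```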

To return several values, collect them in a list -- R has no multiple assignment.

In [6]:
min_ind <- function(x) {
    m <- min(x)
    i <- which(x == m)
    list(min=m, inds=i)
}

x = c(4,5,6,3,7,8,3,3,9)
res = min_ind(x)
cat("The minimum is", res$min, "and the indices are:", res$inds, "\n")
The minimum is 3 and the indices are: 4 7 8 
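
The components of a returned list can be accessed by name with $ or [[ ]], or all at once with with(). A small sketch, where range_info is a hypothetical helper:

```r
# range_info is a hypothetical helper, not part of the original notes.
range_info <- function(x) {
    list(lo = min(x), hi = max(x), span = max(x) - min(x))
}

info <- range_info(c(4, 5, 6, 3, 7, 8, 3, 3, 9))

info$span               # one component by name
info[["lo"]]            # the same mechanism with [[ ]]
with(info, hi - lo)     # an expression using several components
```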

Flow Control

for (i in 1:n) {            while (...) {               if (...) {
    ...                         ...                         ...
}                           }                           } else {
                                                            ...
for (a in vec) {            repeat {                    }
    ...                         ...
}                               break
                            }
In [7]:
mySqrt = function(a, n=2) {
    x0 = Inf
    x1 = 1
    while (abs(x1-x0) > 1e-15) {
        x0 = x1
        x1 = x0 - (x0^n - a) / n / x0^(n-1)
    }
    x1
}

mySqrt(2, 5)
2^0.2
Out[7]:
1.14869835499704
Out[7]:
1.14869835499704
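
The while loop above stops only when two iterates agree to within 1e-15. A variant with repeat/break and an iteration cap may be safer when convergence is not guaranteed. This is a sketch only; nth_root, tol and max_iter are hypothetical names and values:

```r
# nth_root is a hypothetical variant of mySqrt with a relative tolerance
# and an iteration cap; tol and max_iter are illustrative defaults.
nth_root <- function(a, n = 2, tol = 1e-12, max_iter = 100) {
    x <- 1
    iter <- 0
    repeat {
        x_new <- x - (x^n - a) / (n * x^(n - 1))   # Newton step for x^n = a
        iter <- iter + 1
        # stop on convergence or when the iteration budget is used up
        if (abs(x_new - x) <= tol * abs(x_new) || iter >= max_iter) break
        x <- x_new
    }
    list(root = x_new, iterations = iter)
}

r <- nth_root(2, 5)
r$root
r$iterations
```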

apply Functions

In [8]:
A = matrix(1:12, 4, 3)
A
apply(A, 1, mean)
apply(A, 2, mean)
Out[8]:
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12
Out[8]:
  1. 5
  2. 6
  3. 7
  4. 8
Out[8]:
  1. 2.5
  2. 6.5
  3. 10.5
In [9]:
X = list(x=1, y=c('a','b','c'), z=rep(0,100))
lapply(X, length)
sapply(X, length)
Out[9]:
$x
1
$y
3
$z
100
Out[9]:
  x   y   z 
  1   3 100 
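
Two related functions from base R, shown as a brief sketch: vapply() works like sapply() but checks each result against a declared template, and mapply() iterates over several vectors in parallel.

```r
X <- list(x = 1, y = c("a", "b", "c"), z = rep(0, 100))

# vapply: every result must be a single integer, otherwise an error is raised
vapply(X, length, integer(1))

# mapply: the function is called with the i-th element of each vector
mapply(function(base, power) base^power, 1:3, c(2, 2, 3))   # 1, 4, 27
```

Inside functions, vapply() is often preferred over sapply() because the type of its result does not depend on the input.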

S3 Methods

In [10]:
minind <- function(x) {
    m <- min(x)
    i <- which(x == m)
    r = list(min=m, inds=i)
    class(r) <- "minind"
    r
}

print.minind <- function(x, ...) {
    # use the method's own argument, not the global variable res
    cat("The minimum is", x$min, "and the indices are:", x$inds, "\n")
    invisible(x)
}
In [11]:
x <- c(4,5,6,3,7,8,3,3,9)

res = minind(x)
res
str(res)
Out[11]:
The minimum is 3 and the indices are: 4 7 8 
List of 2
 $ min : num 3
 $ inds: int [1:3] 4 7 8
 - attr(*, "class")= chr "minind"
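
print() is a generic function that dispatches on the class attribute. Generics of one's own can be defined with UseMethod(). A minimal sketch, in which n_hits is a hypothetical generic and the minind object is built by hand so the block runs on its own:

```r
# n_hits is a hypothetical generic, used only to illustrate S3 dispatch.
n_hits <- function(obj, ...) UseMethod("n_hits")

n_hits.default <- function(obj, ...) NA_integer_         # fallback method
n_hits.minind  <- function(obj, ...) length(obj$inds)    # method for "minind"

res <- structure(list(min = 3, inds = c(4L, 7L, 8L)), class = "minind")

n_hits(res)      # dispatches to n_hits.minind
n_hits(1:10)     # no specific method, so n_hits.default is used
```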

Debugging

For debugging, it is best to use the features that RStudio provides.

Also:

options(error=recover)     # options(error=NULL)
trace(fun)                 # untrace(fun)
debug(fun)                 # undebug(fun), also: debugonce(fun)
setBreakpoint(file, line)  # clear=TRUE

These functions will call the browser: browser()