Basic Introduction to R

Hans W Borchers
ABB Corporate Research, Ladenburg
January 19, 2016

Computing With R

Data types in R are numeric (double precision, i.e. 64-bit floating point): 10.0, 10, 1e01,
complex: 1 + 1i,
Boolean (logical): TRUE, FALSE,
and character, enclosed in ' or ".

Variable names can include alphanumeric characters, _, or . (sometimes with special meaning).
The assignment operator is <-, but = can also be used in most situations.
# starts a comment that runs to the end of the line.
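
A minimal sketch of the naming and assignment rules above. The variable names (x, y, my.var_2, m) are made up for illustration only.

```r
# Both operators create a variable in the current environment.
x <- 10                  # the usual assignment operator
y = 1e01                 # '=' also assigns at the top level
my.var_2 <- x + y        # names may mix letters, digits, '.' and '_'

# Inside a function call, '=' names an argument instead of assigning:
m <- mean(x = c(1, 2, 3))   # 'x' is the argument name of mean(); the variable x is unchanged
```

This is one of the situations where = and <- are not interchangeable: writing mean(x <- c(1, 2, 3)) would overwrite the variable x as a side effect.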

Arithmetic Operators: +, -, *, /, ^

Comparison Operators: ==, !=, >, >=, <, <=

Logical Operators: !, &, | (elementwise); && and || are the short-circuit forms for single values, as used in control flow
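
The difference between the elementwise and the short-circuit operators can be sketched as follows. The vectors a and b and the variable ok are illustrative names, not part of the original notes.

```r
a <- c(TRUE, FALSE, TRUE)
b <- c(TRUE, TRUE, FALSE)

a & b    # elementwise AND: TRUE FALSE FALSE
a | b    # elementwise OR:  TRUE TRUE TRUE
!a       # elementwise NOT: FALSE TRUE FALSE

# '&&' and '||' work on single values and stop as soon as the result is known:
x <- NULL
ok <- !is.null(x) && x > 0   # the right-hand side is never evaluated here
ok                           # FALSE
```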

Mathematical Functions:
All common mathematical functions are available in R: abs, sqrt, sin, ..., exp, log, ..., round, etc.

0/0
1 + NA
sqrt(-1+0i)

[1] NaN

[1] NA

[1] 0+1i

NaN means "Not a Number". More important in R is NA for "Not Available", used for missing data.
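
A short sketch of how NA and NaN can be told apart and how missing values are usually handled. The vectors x and y are made-up examples.

```r
x <- c(1, NA, 3, 0/0)

is.na(x)     # TRUE for both NA and NaN: FALSE TRUE FALSE TRUE
is.nan(x)    # TRUE only for NaN:        FALSE FALSE FALSE TRUE

y <- c(1, NA, 3)
mean(y)               # NA, because missing values propagate
mean(y, na.rm=TRUE)   # 2, missing values are dropped first
```

Many summary functions (sum, mean, max, ...) accept the na.rm argument in the same way.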

Using the Workspace

e = exp(1)
ls()
ls.str()

getwd()
#setwd()

e :  num 2.72

Characters

cat("The answer is:", 42, "\n")
paste("Ich", "bin", "ein", "Berliner")
c('z', 'Z') %in% letters  # or LETTERS

The answer is: 42

substr("Ich bin ein Berliner", 9, 11)
strsplit("Ich bin ein Berliner", ' ')
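
A few more base string functions, as a sketch that builds on the two calls above. The names s and parts are chosen here for illustration.

```r
s <- "Ich bin ein Berliner"
nchar(s)                          # 20 characters
substr(s, 9, 11)                  # "ein"
parts <- strsplit(s, ' ')[[1]]    # strsplit returns a list; [[1]] extracts the vector
parts                             # "Ich" "bin" "ein" "Berliner"
paste(rev(parts), collapse=" ")   # "Berliner ein bin Ich"
toupper(parts[4])                 # "BERLINER"
```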

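R integrates the PCRE library for Perl-compatible regular expressions (selected with perl=TRUE); by default, the regex functions use POSIX-style extended regular expressions.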

res <- gregexpr("\\w+", "Eine Rose ist eine Rose ist eine Rose.")
str(res)

List of 1
 $ : atomic [1:8] 1 6 11 15 20 25 29 34
  ..- attr(*, "match.length")= int [1:8] 4 4 3 4 4 3 4 4
  ..- attr(*, "useBytes")= logi TRUE
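
The match positions returned by gregexpr are rarely used directly. A common next step, sketched here with base R's regmatches, is to extract the matched substrings themselves. The names s and words are illustrative.

```r
s <- "Eine Rose ist eine Rose ist eine Rose."
res <- gregexpr("\\w+", s)
words <- regmatches(s, res)[[1]]   # character vector of the matched words
words
length(words)                      # 8, one per match position
table(words)                       # how often each word occurs
```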

Vectors and Matrices

Generating vectors and matrices
In R, vectors are defined with the 'concatenate' function c(),
with the 'colon' operator, e.g. 1:5, or with the seq function.
Matrices are generated with, e.g., the matrix function.

b = c(1, 2, 3, 4, 5, 6)
A = matrix(runif(36), nrow=6, byrow=TRUE)
A
(x = qr.solve(A, b))
A %*% x
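
Because A is random above, the result changes on every run. As a sketch of the same solve step with a result that can be checked by hand, the following uses a small deterministic matrix. That matrix is an assumption made here for illustration, not part of the original example.

```r
# Hypothetical test matrix: identity plus a matrix of ones (well conditioned).
b <- c(1, 2, 3, 4, 5, 6)
A <- diag(6) + matrix(1, nrow=6, ncol=6)
x <- qr.solve(A, b)                # solves A x = b via a QR decomposition
x                                  # -2 -1 0 1 2 3 (up to rounding)
max(abs(A %*% x - b))              # residual, should be close to zero
all.equal(as.vector(A %*% x), b)   # comparison with a numerical tolerance
```

For square systems, solve(A, b) is the more common call; qr.solve also handles over- and underdetermined systems.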

Operations and Functions
Vectors or matrices (of equal length or size, respectively) can be added, multiplied, or raised to a power: + - * / ^.
These operations work elementwise (not as matrix operations). If one operand is shorter, its elements are recycled cyclically to fill the required length!

%*% is matrix multiplication.
Functions on arrays: length, sum, prod, sort, mean, median, ...
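
A small sketch contrasting elementwise operations, recycling, and the matrix product. The objects v and M are made up for illustration.

```r
v <- c(1, 2, 3, 4)
v * v            # elementwise: 1 4 9 16
v + c(10, 20)    # the shorter vector is recycled: 11 22 13 24
v ^ 2            # a single number is recycled too: 1 4 9 16

M <- matrix(1:4, nrow=2)   # filled column by column: rows are (1,3) and (2,4)
M * M            # elementwise product: rows (1,9) and (4,16)
M %*% M          # matrix product:      rows (7,15) and (10,22)

sum(v); prod(v); mean(v)   # 10, 24, 2.5
```

R issues a warning when the longer length is not a multiple of the shorter one, which helps catch unintended recycling.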

'Yoga' of Indexing
x[i] is the i-th element of a vector x (indexing starts at 1, not 0!).
A[i,j] is the element in row i and column j of matrix A. A[i,] is the i-th row, A[,j] the j-th column of matrix A.
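
The indexing rules above, together with negative and logical indices, can be sketched as follows. The example objects x and A are assumptions made for illustration.

```r
x <- c(10, 20, 30, 40, 50)
x[2]             # 20
x[c(1, 5)]       # 10 50
x[-1]            # everything but the first element: 20 30 40 50
x[x > 25]        # logical index: 30 40 50

A <- matrix(1:6, nrow=2)   # 2 x 3 matrix, filled column by column
A[2, 3]          # 6
A[1, ]           # first row: 1 3 5
A[, 2]           # second column: 3 4
```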

Lists

Lists are collections of objects of possibly different data types; these objects are known as the list's components.

L <- list(a=c(1,2,3,4,5), b=TRUE, c=c('a','b','c'), d=diag(1,3))
L
names(L)
L[1:2]
L[[1]][1]

L$d
L[["d"]]

L[[1]][2]

lapply(L, length)
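
lapply always returns a list. When every result is a single value, sapply simplifies it to a vector. The sketch below also shows how components can be added and removed; the component name e and the variable n are hypothetical.

```r
L <- list(a=c(1,2,3,4,5), b=TRUE, c=c('a','b','c'), d=diag(1,3))

n <- sapply(L, length)   # named integer vector instead of a list
n                        # a=5, b=1, c=3, d=9 (a 3 x 3 matrix has 9 elements)

L$e <- "new"             # add a component
L$b <- NULL              # remove a component
names(L)                 # "a" "c" "d" "e"
```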

Data frames are lists in which all components have the same length.

dframe <- data.frame(a=0, b=1:26, c=letters, d=LETTERS)

str(dframe)

'data.frame':	26 obs. of  4 variables:
 $ a: num  0 0 0 0 0 0 0 0 0 0 ...
 $ b: int  1 2 3 4 5 6 7 8 9 10 ...
 $ c: Factor w/ 26 levels "a","b","c","d",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ d: Factor w/ 26 levels "A","B","C","D",..: 1 2 3 4 5 6 7 8 9 10 ...
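
A sketch of common ways to inspect and subset the data frame above. Note that the Factor columns in the str output correspond to stringsAsFactors=TRUE, the default before R 4.0.0; newer versions keep c and d as character unless that argument is set. The name sub is illustrative.

```r
dframe <- data.frame(a=0, b=1:26, c=letters, d=LETTERS)

dim(dframe)          # 26 rows, 4 columns
head(dframe, 3)      # first three rows
dframe$b[1:3]        # a column is a vector: 1 2 3

# Rows are selected with a logical condition, columns by name:
sub <- dframe[dframe$b > 24, c("b", "d")]
sub                  # rows 25 and 26, with letters "Y" and "Z"
```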

Factors

Factors are categorical variables with a finite number of different values, called levels.

gender <- factor(c('m', 'f', 'm', 'm', 'f', 'm', 'f', 'f'))
gender

levels(gender)
nlevels(gender)

as.numeric(gender)
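
What these calls return can be sketched as follows: levels are sorted alphabetically by default, and the numeric codes index into the levels. The relabelling at the end is an illustrative addition.

```r
gender <- factor(c('m', 'f', 'm', 'm', 'f', 'm', 'f', 'f'))

levels(gender)       # "f" "m"
as.integer(gender)   # codes into the levels: 2 1 2 2 1 2 1 1
table(gender)        # counts per level: f 4, m 4

# Levels can be renamed in place; the underlying codes stay the same.
levels(gender) <- c("female", "male")
as.character(gender)[1:2]   # "male" "female"
```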

Output of A from the 'Vectors and Matrices' section (a 6 x 6 matrix of uniform random numbers, so the values differ from run to run):

0.6940234    0.7228398    0.2423202    0.3766033    0.6095528    0.7730926
0.803969404  0.390726755  0.005829482  0.123837501  0.688995443  0.927701381
0.6264596    0.1171935    0.4070221    0.5008644    0.2762211    0.5603193
0.71603676   0.05542268   0.34226408   0.01408316   0.02551473   0.39958570
0.3479528    0.6012284    0.3703129    0.4143563    0.6644754    0.6812745
0.0172455    0.6718268    0.9503481    0.9650054    0.4444700    0.4217130