# Basic Introduction to R

Hans W Borchers
ABB Corporate Research, Ladenburg
January 19, 2016

## Computing With R

Datatypes in R are numeric (double precision): 10.0, 10, 1e01,
complex: 1 + 1i,
Boolean: TRUE, FALSE,
and character, enclosed in ' or ".
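
As a small sketch of the types listed above, the base function `typeof` reports how a value is stored. The integer literal `10L` is an addition for contrast and is not mentioned in the list above.

```r
# Inspect the storage type of a few literal values.
typeof(10)        # "double": plain numbers are stored in double precision
typeof(10L)       # "integer": the L suffix requests an integer
typeof(1 + 1i)    # "complex"
typeof(TRUE)      # "logical"
typeof("R")       # "character"

# Single and double quotes create the same string.
identical('R', "R")   # TRUE
```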

Variable names can include alphanumeric characters, _, or . (the dot sometimes carries special meaning).
The assignment operator is <-, but = can also be used in most situations.
# starts a comment that runs to the end of the line.

Arithmetic Operators: +, -, *, /, ^

Comparison Operators: ==, !=, >, >=, <, <=

Logical Operators: !, &, | (elementwise); && and || are the scalar, short-circuiting forms used in conditions
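
The difference between the single and doubled logical operators may be easier to see in a short sketch. The vectors `x`, `y` and the variable `v` are made up for illustration.

```r
x <- c(TRUE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE)

x & y     # elementwise AND: TRUE FALSE FALSE
x | y     # elementwise OR:  TRUE TRUE TRUE
!x        # elementwise NOT: FALSE TRUE FALSE

# && and || work on single values and stop as soon as the result is known,
# so the right-hand side below is never evaluated.
v <- NULL
is.null(v) || v > 0    # TRUE
```

The doubled forms are the usual choice inside `if` conditions, where a single TRUE or FALSE is expected.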

Mathematical Functions:
All common mathematical functions are available in R: abs, sqrt, sin, ..., exp, log, ..., round, etc.

In :
0/0
1 + NA
sqrt(-1+0i)

Out:
NaN
Out:
 NA
Out:
 0+1i

NaN means "Not a Number". More important in R is NA for "Not Available", used for missing data.
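
A minimal sketch of how missing values behave in practice, using only base functions; the vector `v` is invented for the example.

```r
v <- c(3, NA, 5)

is.na(v)                 # FALSE TRUE FALSE
sum(is.na(v))            # 1: number of missing values
mean(v)                  # NA: missing values propagate through computations
mean(v, na.rm = TRUE)    # 4: drop missing values before averaging

# NaN counts as missing, but NA is not a NaN.
is.na(0/0)     # TRUE
is.nan(NA)     # FALSE
```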

Using the Workspace

In :
e = exp(1)
ls()
ls.str()

getwd()
#setwd()

Out:
'e'
Out:
e :  num 2.72
Out:
'/Users/hwb/Documents/meetup/RinLadenburg/notebooks'

### Characters

In :
cat("The answer is:", 42, "\n")
paste("Ich", "bin", "ein", "Berliner")
c('z', 'Z') %in% letters  # or LETTERS

The answer is: 42

Out:
'Ich bin ein Berliner'
Out:
1. TRUE
2. FALSE
In :
substr("Ich bin ein Berliner", 9, 11)
strsplit("Ich bin ein Berliner", ' ')

Out:
'ein'
Out:
1. 'Ich'
2. 'bin'
3. 'ein'
4. 'Berliner'
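
The string functions above combine naturally. The following sketch assumes nothing beyond base R; note that `strsplit` actually returns a list with one component per input string, so `[[1]]` is used to extract the character vector.

```r
# strsplit returns a list; take its first component to get the words.
words <- strsplit("Ich bin ein Berliner", " ")[[1]]

length(words)                    # 4
nchar(words)                     # 3 3 3 8
paste(words, collapse = "-")     # "Ich-bin-ein-Berliner"
paste0("x", 1:3)                 # "x1" "x2" "x3"
toupper(substr("Ich bin ein Berliner", 9, 11))   # "EIN"
```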

R integrates the PCRE library for regular expressions; functions such as gregexpr use it when called with perl = TRUE.
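
As a hedged sketch of a typical workflow, match positions found by `gregexpr` can be turned back into strings with `regmatches`. The variable names `s`, `m` and `words` and the replacement word are chosen for illustration only.

```r
s <- "Eine Rose ist eine Rose ist eine Rose."

# Find all runs of word characters, using the PCRE engine.
m <- gregexpr("\\w+", s, perl = TRUE)

# regmatches extracts the matched substrings; [[1]] selects the first input string.
words <- regmatches(s, m)[[1]]
length(words)          # 8
words[1:3]             # "Eine" "Rose" "ist"

grepl("^R", words)     # which words start with a capital R
sub("Rose", "Tulpe", s)     # replaces the first match only
gsub("Rose", "Tulpe", s)    # replaces all matches
```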

In :
res <- gregexpr("\\w+", "Eine Rose ist eine Rose ist eine Rose.")
str(res)

List of 1
 $ : atomic [1:8] 1 6 11 15 20 25 29 34
  ..- attr(*, "match.length")= int [1:8] 4 4 3 4 4 3 4 4
  ..- attr(*, "useBytes")= logi TRUE

### Vectors and Matrices

Generating vectors and matrices

Vectors and matrices are defined in R with the help of the 'concatenate' function c(), with the 'colon' operator, e.g. 1:5, or with the seq function. Matrices are generated with, e.g., the matrix function.

In :
b = c(1, 2, 3, 4, 5, 6)
A = matrix(runif(36), nrow=6, byrow=TRUE)
A
(x = qr.solve(A, b))
A %*% x

Out:
 0.694023   0.72284    0.24232     0.376603   0.609553   0.773093
 0.803969   0.390727   0.00582948  0.123838   0.688995   0.927701
 0.62646    0.117194   0.407022    0.500864   0.276221   0.560319
 0.716037   0.0554227  0.342264    0.0140832  0.0255147  0.399586
 0.347953   0.601228   0.370313    0.414356   0.664475   0.681275
 0.0172455  0.671827   0.950348    0.965005   0.44447    0.421713
Out:
1. 12.9030124608126
2. -11.0184779518448
3. 20.3019166563725
4. -10.0486491125022
5. 37.4855482348609
6. -31.0118489278822
Out:
 1
 2
 3
 4
 5
 6

Operations and Functions

Vectors or matrices (of equal length or size, respectively) can be added, multiplied, or raised to a power: + - * / ^. These operations work elementwise (not as matrix operations). The shorter operand is recycled cyclically when more elements are needed! %*% is matrix multiplication.

Functions on arrays: length, sum, prod, sort, mean, median, ...

'Yoga' of Indexing

x[i] the i-th element of a vector x (indexing starts with 1, not 0!).

A[i,j] the element in row i and column j of matrix A.

A[i,] the i-th row, A[,j] the j-th column of matrix A.

### Lists

Lists are collections of objects of different data types, known as their components.

In :
L <- list(a=c(1,2,3,4,5), b=TRUE, c=c('a','b','c'), d=diag(1,3))
L
names(L)
L[1:2]
L[]

Out:
$a
1. 1
2. 2
3. 3
4. 4
5. 5
$b
TRUE
$c
1. 'a'
2. 'b'
3. 'c'
$d
 1 0 0
 0 1 0
 0 0 1
Out:
1. 'a'
2. 'b'
3. 'c'
4. 'd'
Out:
$a
1. 1
2. 2
3. 3
4. 4
5. 5
$b
TRUE
Out:
1
In :
L$d
L[["d"]]

Out:
 1 0 0
 0 1 0
 0 0 1
Out:
 1 0 0
 0 1 0
 0 0 1
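
The three ways of accessing list components differ in what they return. The sketch below rebuilds the same list so that it runs on its own; the added component `e` is purely illustrative.

```r
L <- list(a = c(1, 2, 3, 4, 5), b = TRUE, c = c('a', 'b', 'c'), d = diag(1, 3))

class(L["a"])      # "list": single brackets return a sub-list
class(L[["a"]])    # "numeric": double brackets return the component itself
identical(L[["a"]], L$a)    # TRUE: $ is shorthand for [[ ]] with a name

# Indexing can be chained: first pick the component, then index into it.
L[["a"]][2]        # 2
L$d[2, 2]          # 1

# Assigning to a new name appends a component.
L$e <- "new"
names(L)           # "a" "b" "c" "d" "e"
```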
In :
L[]

Out:
2
In :
lapply(L, length)

Out:
$a
5
$b
1
$c
3
$d
9

Data frames are lists whose components all have the same length.

In :
dframe <- data.frame(a=0, b=1:26, c=letters, d=LETTERS)

In :
str(dframe)

'data.frame':	26 obs. of  4 variables:
 $ a: num  0 0 0 0 0 0 0 0 0 0 ...
 $ b: int  1 2 3 4 5 6 7 8 9 10 ...
 $ c: Factor w/ 26 levels "a","b","c","d",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ d: Factor w/ 26 levels "A","B","C","D",..: 1 2 3 4 5 6 7 8 9 10 ...
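
A short sketch of what "a list of equal-length components" means for a data frame. One assumption is made explicit here: since R 4.0.0, `data.frame` no longer converts strings to factors by default, so `stringsAsFactors = TRUE` is passed to reproduce the Factor columns shown in the output above.

```r
# Recreate the data frame; the scalar 0 is recycled to all 26 rows.
dframe <- data.frame(a = 0, b = 1:26, c = letters, d = LETTERS,
                     stringsAsFactors = TRUE)

is.list(dframe)           # TRUE: a data frame is a list of columns
sapply(dframe, length)    # every column has length 26
dim(dframe)               # 26 4

# Columns are reached like list components, cells like matrix elements.
dframe$b[1:3]             # 1 2 3
class(dframe[["c"]])      # "factor"
as.character(dframe[2, "d"])   # "B"
```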


### Factors

Factors are categorical variables with a finite number of different values, called levels.

In :
gender <- factor(c('m', 'f', 'm', 'm', 'f', 'm', 'f', 'f'))
gender

Out:
1. m
2. f
3. m
4. m
5. f
6. m
7. f
8. f
In :
levels(gender)
nlevels(gender)

Out:
1. 'f'
2. 'm'
Out:
2
In :
as.numeric(gender)

Out:
1. 2
2. 1
3. 2
4. 2
5. 1
6. 2
7. 1
8. 1
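
The integer codes returned by `as.numeric` are positions in `levels()`, which are sorted alphabetically by default. The sketch below illustrates this, along with a common pitfall; the names `gender2` and `f` are made up for the example.

```r
gender <- factor(c('m', 'f', 'm', 'm', 'f', 'm', 'f', 'f'))

table(gender)             # counts per level: f 4, m 4
as.character(gender)      # back to the original labels

# An explicit level order changes the integer codes.
gender2 <- factor(gender, levels = c('m', 'f'))
as.numeric(gender2)       # 1 2 1 1 2 1 2 2

# Pitfall: for a factor of number-like labels, as.numeric gives the codes,
# not the numbers. Convert via character first.
f <- factor(c("10", "20", "10"))
as.numeric(f)                  # 1 2 1
as.numeric(as.character(f))    # 10 20 10
```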