6 Working in R

6.1 Arithmetic operators

Symbol Operation
+ addition
- subtraction
* multiplication
/ division
^ exponentiation

PEMDAS (the standard order of operations) applies when writing R code:

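# Arithmetic operators in action:
5 + 5 / 2   # division first: 5 + 2.5 = 7.5
3 * 2^2     # exponentiation first: 3 * 4 = 12
(3 * 2)^2   # parentheses first: 6^2 = 36
# Arithmetic operators using objects:
z <- 5
w <- c(3, 7, 9, 2)
s <- w[3]   # third element of w, which is 9
z + s       # 5 + 9 = 14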
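
A few precedence cases are easy to get wrong, so a short sketch may help. The expressions below are illustrative and not part of the original exercises; they rely only on base R operators.

```r
# Exponentiation binds more tightly than the unary minus
-2^2        # -4, read as -(2^2)
(-2)^2      # 4

# ^ is evaluated right to left
2^3^2       # 512, read as 2^(3^2)

# * and / share the same precedence and are evaluated left to right
6 / 2 * 3   # 9
```

When in doubt, adding parentheses makes the intended order explicit and costs nothing.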

6.2 Arithmetic operators practice

Using what you know about parentheses and PEMDAS, do each of the following in one line of code:

Number Exercise
1. Assign the variable x to be a vector containing the values 5,5,6,2
2. Assign the variable y to be a vector containing the values 3,3,1,7
3. Add x and y
4. Subtract y from x
5. Assign d as y divided by x
6. Multiply z by s then add five (z and s from section 6.1)
7. Add 5 to z then multiply by s
8. Take z to the fifth power and then add 2
9. Divide s by three, then add 33, then take that sum to the 0.5 power

6.3 Missing values (NA)

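# Arithmetic on a vector that contains a missing value
dat.1 <- c(-1,NA,1,1,-1)
dat.1 + 2                        # the NA stays NA; every other element increases by 2
dat.1 + rep(2, length(dat.1))    # same result, with the 2 repeated explicitly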
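
The example above suggests a general pattern: an operation that involves NA usually returns NA. The following is a minimal sketch of that behaviour, using a hypothetical vector named temps that does not appear elsewhere in this chapter.

```r
# Hypothetical vector with one missing value
temps <- c(12, NA, 15)

temps * 2    # the NA stays NA; the other elements are doubled
sum(temps)   # NA, because one of the inputs is unknown
NA > 1       # comparisons with NA are also NA
```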

6.4 Dealing with missing values

  • In R, missing values are represented by “NA”.

  • Undefined values (like dividing zero by zero) are represented by “NaN”, which stands for “not a number”.

  • Often missing values are represented with numbers: -1, 99, -9999, etc. This is a problem and should be avoided, because R treats those codes as real data. You will likely need to use indexing to replace these values with NA before using arithmetic operations.
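
The difference between NA, NaN, and Inf can be checked directly at the console. This is a small sketch using base R only; the object names undefined.value and infinite.value are made up for illustration.

```r
undefined.value <- 0 / 0   # NaN: the result is undefined
infinite.value  <- 1 / 0   # Inf: a nonzero number divided by zero is not NaN

is.nan(undefined.value)    # TRUE
is.na(undefined.value)     # TRUE: is.na() also flags NaN
is.nan(NA)                 # FALSE: NA is missing, not "not a number"
is.na(infinite.value)      # FALSE: Inf is a value, not a missing one
```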

The “is.na” function and the “na.rm” argument:

  • Sometimes we do not know whether there are missing values in our data.

  • We can use the is.na function to test for missing values:

# is.na tests for missing values
dat.1 <- c(-1,NA,1,1,-1)
is.na(dat.1)
which(is.na(dat.1))
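
Because is.na returns a logical vector, it can also be used to count or drop missing values. The lines below are one possible sketch of that idea, reusing dat.1 from above.

```r
dat.1 <- c(-1, NA, 1, 1, -1)

anyNA(dat.1)           # TRUE if at least one value is missing
sum(is.na(dat.1))      # number of missing values (each TRUE counts as 1)
mean(is.na(dat.1))     # proportion of values that are missing
dat.1[!is.na(dat.1)]   # keep only the non-missing values
```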

Many summary functions, such as mean, accept a logical na.rm argument that removes missing values from our data before the function is executed:

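dat.1 <- c(-1,NA,1,1,-1)
mean(dat.1)                # returns NA because one value is missing
mean(dat.1, na.rm = TRUE)  # drops the NA first; TRUE is safer than the abbreviation T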
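
The na.rm argument is not specific to mean. As a brief sketch, here it is with a few other base R summary functions. Note that na.rm only affects the calculation; dat.1 itself still contains the NA afterwards.

```r
dat.1 <- c(-1, NA, 1, 1, -1)

sum(dat.1)                  # NA
sum(dat.1, na.rm = TRUE)    # 0
max(dat.1, na.rm = TRUE)    # 1
sd(dat.1, na.rm = TRUE)     # standard deviation of the four non-missing values

anyNA(dat.1)                # still TRUE: the vector was not modified
```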

6.5 Dealing with problematic values

Sometimes “no data” is coded as -1, and that can really screw things up because R does not know that -1 means “no data”. However, we can replace the -1 with NA. There are many ways to do this, but here is one way.

# In this example, -1 is coded as a missing data field. Think back to our logical arguments and subsetting exercises.
dat.2 <- c(2,-1,3,4,5)
dat.2 == -1                 # logical vector: TRUE where the value is -1
dat.2[dat.2 == -1]          # the elements that equal -1
dat.2[dat.2 == -1] <- NA    # replace those elements with NA
# dat.2 now holds the same values as this vector:
dat.1 <- c(2,NA,3,4,5)
# is.na tests for missing values
is.na(dat.1)
# returns the position(s) of the NA element(s) in the vector
which(is.na(dat.1))
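
When several different codes stand for “no data”, the same indexing idea can be extended with the %in% operator. This is a sketch under the assumption that -1, 99, and -9999 are all missing-data codes; the names dat.3 and missing.codes are hypothetical.

```r
# Hypothetical data where -1, 99, and -9999 all mean "no data"
dat.3 <- c(2, -1, 3, 99, 5, -9999)
missing.codes <- c(-1, 99, -9999)

# TRUE wherever an element matches any of the codes
dat.3 %in% missing.codes

# Replace every matching element with NA in one step
dat.3[dat.3 %in% missing.codes] <- NA
dat.3                       # 2 NA 3 NA 5 NA

mean(dat.3, na.rm = TRUE)   # mean of the remaining values 2, 3, and 5
```

This approach is only safe when none of the codes can occur as a real measurement, which is worth checking before replacing anything.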