# Vectorization, Recycling, and Indexing in R

Many functions and almost all the *operators* (like `+` and `*`, etc.) are vectorized.

They operate very quickly on each element of an atomic vector.
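As a minimal sketch of what "vectorized" means, the snippet below squares every element of a vector twice: once with an explicit loop and once with the vectorized `^` operator. The variable names (`v`, `squares_loop`, `squares_vec`) are made up for illustration.

```r
v <- c(2, 4, 6, 8)

# Explicit loop: visit each element in turn
squares_loop <- numeric(length(v))
for (i in seq_along(v)) {
  squares_loop[i] <- v[i]^2
}

# Vectorized: the operator is applied to every element in one call
squares_vec <- v^2

identical(squares_loop, squares_vec)
#> [1] TRUE
```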

Goals: we want to learn about:

- Vectorization
- Recycling
- Indexing

These three ideas are fundamental to R.

We will also discuss:

- Comparison operators
- Logical operators
- Mathematical operators

## Binary Comparison Operators

These are “binary” because they involve *two* arguments.

They operate *elementwise* on vectors and return *logical vectors*:

```
x < y # less than
x > y # greater than
x <= y # less than or equal to
x >= y # greater than or equal to
x == y # equal to
x != y # not equal to
```

`==` is the “*comparison equals*”, which tests for equality. (Be careful not to use `=`, which, in today’s versions of R, is actually interpreted as leftwards assignment.)
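To see why the warning matters, here is a small sketch (the variable `z` is hypothetical) contrasting the two. The comparison leaves `z` alone, while the single `=` silently overwrites it.

```r
z <- 5

z == 3   # comparison: asks "is z equal to 3?" and leaves z alone
#> [1] FALSE
z
#> [1] 5

z = 3    # assignment: silently overwrites z
z
#> [1] 3
```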

### Binary Comparison Examples

With numeric vectors

```
x <- c(1,2,5)
y <- c(4,4,3)
x == y
#> [1] FALSE FALSE FALSE
x != y
#> [1] TRUE TRUE TRUE
x < y
#> [1] TRUE TRUE FALSE
```

With strings

```
a <- c("izzy", "jazz", "tyler")
b <- c("devon", "vanessa", "hilary")
a < b # alphabetical order
#> [1] FALSE TRUE FALSE
```

Here is a tricky combination of both. Can you parse it?

```
(a < b) <= (x == y) # trickier...notice the parentheses to force precedence
#> [1] TRUE FALSE TRUE
```

### Binary Comparison Between a Vector and a Scalar

Check this out:

```
x <- 1:10 # the colon operator returns a sequence
x
#> [1] 1 2 3 4 5 6 7 8 9 10
x <= 3
#> [1] TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
# compare this to:
x <= c(3,3,3,3,3,3,3,3,3,3)
#> [1] TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
```

What is going on here?

### Comparison With Vectors of Different Lengths

Try this one:

```
x <- 1:10
x
#> [1] 1 2 3 4 5 6 7 8 9 10
x > c(1,7)
#> [1] FALSE FALSE TRUE FALSE TRUE FALSE TRUE TRUE TRUE TRUE
# compare this to:
x > c(1,7, 1,7, 1,7, 1,7, 1,7)
#> [1] FALSE FALSE TRUE FALSE TRUE FALSE TRUE TRUE TRUE TRUE
```

To understand what is going on here, we need to talk about *recycling*.

## Recycling of Vectors in R

A *very, super-wickedly important* concept: R likes to operate on vectors of the same length, so if it encounters two vectors of different lengths in a binary operation, it merely replicates (*recycles*) the shorter vector until it is the same length as the longer vector, then it does the operation.

If the recycled smaller vector has to be “chopped off” to make it the length of the longer vector, you will get a warning, but it will still return a result:

```
x <- c(1,2,3)
y <- c(1,10)
x * y
#> Warning in x * y: longer object length is not a multiple of shorter object
#> length
#> [1] 1 20 3
```
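One way to picture recycling is to do the stretching by hand. The sketch below (with made-up vectors) uses `rep()` with `length.out` to build the recycled version of the shorter vector explicitly, then checks that the result matches what R does implicitly.

```r
x <- 1:6
y <- c(10, 20)

# What R does implicitly: stretch y to the length of x ...
y_recycled <- rep(y, length.out = length(x))
y_recycled
#> [1] 10 20 10 20 10 20

# ... and then operate elementwise
identical(x * y, x * y_recycled)
#> [1] TRUE
```

Because 6 is a multiple of 2, nothing has to be chopped off and no warning is issued.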

### We Will See Recycling in Many Contexts

Recycling occurs wherever two or more vectors get operated on elementwise, not just with comparison operators. It also happens (as we saw above) with mathematical operators. And it also happens with indexing operators when indexing by logical vectors (you’ll see that later)!!

You gotta know it! Here are some more examples:

```
x <- 1:20
x * c(1,0) # turns the even numbers to 0
#> [1] 1 0 3 0 5 0 7 0 9 0 11 0 13 0 15 0 17 0 19 0
x * c(0, 0, 1) # turns non-multiples of 3 to 0
#> Warning in x * c(0, 0, 1): longer object length is not a multiple of
#> shorter object length
#> [1] 0 0 3 0 0 6 0 0 9 0 0 12 0 0 15 0 0 18 0 0
x < ((1:4)^2) # recycling c(1, 4, 9, 16)
#> [1] FALSE TRUE TRUE TRUE FALSE FALSE TRUE TRUE FALSE FALSE FALSE
#> [12] TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
```

## Combinations of Comparisons; Logical Ops

**A Weather Example**

- Suppose two variables, `temp` (in degrees Celsius) and `precip` (in mm), are each a vector of length 365.
- Tell me how you test for:
  - All days with temp less than 10 and precip greater than 5
  - Days with temp greater than 15 or with no precip (or both)
  - Days with temp greater than 15 or with no precip (but not both)

### Logical Operators-I

These operate on `logical`s and return `logical`s. `numeric` and `complex` vectors are coerced to `logical` before applying these.

- Unary operators (those that operate elementwise on a single vector)
  - `!` turns `TRUE` to `FALSE` and `FALSE` to `TRUE`

```
x <- c(T, T, F, F) # you can use abbreviations for TRUE and FALSE...
x
#> [1] TRUE TRUE FALSE FALSE
!x
#> [1] FALSE FALSE TRUE TRUE
```

### Logical Operators-II

- Binary operators (operate elementwise on two vectors)
  - `&`: Logical AND
  - `|`: Logical OR
  - `xor(x,y)`: Logical EXCLUSIVE OR

```
x <- c(NA, T, F, T, F)
y <- c(T, T, F, F, NA)
x
#> [1] NA TRUE FALSE TRUE FALSE
y
#> [1] TRUE TRUE FALSE FALSE NA
x & y
#> [1] NA TRUE FALSE FALSE FALSE
x | y
#> [1] TRUE TRUE FALSE TRUE NA
xor(x,y)
#> [1] NA FALSE FALSE TRUE NA
```
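Returning to the weather example, one possible set of answers is sketched below. To keep the output readable it assumes only five days rather than 365, and the values of `temp` and `precip` are invented for illustration.

```r
# Hypothetical data: five days instead of 365, values made up
temp   <- c(8, 17, 12, 20, 5)    # degrees Celsius
precip <- c(7,  0,  0,  3, 0)    # mm

temp < 10 & precip > 5           # cold AND wet
#> [1] TRUE FALSE FALSE FALSE FALSE
temp > 15 | precip == 0          # warm OR dry (or both)
#> [1] FALSE TRUE TRUE TRUE TRUE
xor(temp > 15, precip == 0)      # warm OR dry, but not both
#> [1] FALSE FALSE TRUE TRUE TRUE
```

No parentheses are needed around the comparisons because `<`, `>`, and `==` bind more tightly than `&` and `|`.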

## Mathematical Operators

These operate on `numeric` or `complex` mode data and return the same:

```
x + y # addition
x - y # subtraction
x * y # multiplication
x / y # division
x ^ y # exponentiation
x %% y # modulo (remainder): 10 %% 3 = 1
x %/% y # integer division: 10 %/% 3 = 3
```
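As a small worked example of `%/%` and `%%` acting elementwise, the sketch below splits some made-up durations in minutes into whole hours and leftover minutes. The scalar `60` is recycled across the whole vector.

```r
minutes <- c(45, 90, 135, 200)   # hypothetical durations

hours     <- minutes %/% 60      # whole hours
remainder <- minutes %% 60       # minutes left over

hours
#> [1] 0 1 2 3
remainder
#> [1] 45 30 15 20

# The two pieces reassemble the original values
identical(hours * 60 + remainder, minutes)
#> [1] TRUE
```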

### Grouping Parts of Expressions

Parentheses are good for ensuring that parts of complex expressions are evaluated in the right order.

But, in case you want to appear like a real code jock and don’t want to use parentheses, learn the rules of precedence.

### Precedence of Operators We Have Seen

From highest to lowest:

```
^ # exponentiation (right to left)
- + # unary minus and plus
: # sequence operator
* / # multiply, divide
+ - # (binary) add, subtract
< > <= >= == != # ordering and comparison
! # negation
& # and
| # or
-> # rightwards assignment
<- # assignment (right to left)
= # assignment (right to left)
```

Higher precedence operators “stick” more tightly to their arguments. So, for example:

```
x<-3
y<-2
-x * y # this is like (-x) * y
#> [1] -6
-x ^ y # this is like -(x ^ y)
#> [1] -9
```
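Two more consequences of the precedence table are worth trying out. The sketch below uses hypothetical vectors `p` and `q`; it shows that unary minus binds more tightly than `:`, and that `!` binds more loosely than the comparison operators.

```r
# Unary minus binds more tightly than the colon ...
-1:3            # parsed as (-1):3
#> [1] -1 0 1 2 3
-(1:3)
#> [1] -1 -2 -3

# ... while ! binds more loosely than the comparison operators
p <- c(1, 2, 3)
q <- c(1, 5, 3)
!p == q         # parsed as !(p == q), not (!p) == q
#> [1] FALSE TRUE FALSE
```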

### One very important precedence rule

Notice that `:` has higher precedence than the binary `+`, `-`, `*`, or `/`.

Thus:

```
1:5*3 # this is (1:5)*3
#> [1] 3 6 9 12 15
1:(5*3) # this is 1:15
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
```

Or, if you want the sequence of numbers from 0 to n-1, be careful:

```
n <- 5
0:n-1 # wrong
#> [1] -1 0 1 2 3 4
0:(n-1) # right
#> [1] 0 1 2 3 4
```
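Even the parenthesized form has an edge case: when `n` is 0, `0:(n-1)` counts *down* from 0 to -1 instead of returning nothing. One commonly used alternative, sketched here, is `seq_len()`, which returns an empty vector when asked for zero elements.

```r
n <- 0

0:(n - 1)        # counts down from 0 to -1: probably not what you wanted
#> [1] 0 -1

seq_len(n) - 1   # empty, as expected for "no elements"
#> numeric(0)
```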

### Built In Help On Functions and Operators

Recall that `?function_name` returns help (if available) for the function with name `function_name`:

```
# examples:
?c
?sum
?mean
```

Built-in help on topics we have discussed today can be found at `?Syntax`, `?Logic`, `?Comparison`, `?Arithmetic`.

Also, all material here is covered in parts of sections 1 through 3 in intro.pdf available on CRAN.

## Indexing

There are times when we want to access one or just a few elements from a vector. We’ve already seen an example of extracting a single element, for example:

```
x <- c("devon", "alicia", "cassie")
x[2] # this extracts the second element of x
#> [1] "alicia"
```

Vectors in R are *base-1* subscripted, i.e., elements are subscripted “1, 2, 3, …” instead of “0, 1, 2, …”.

### Overview: 4 Ways To Extract from a Vector

Single square brackets are the *indexing* operators. There are four (common) ways of using the indexing operators. They differ by putting different things inside of the square brackets:

- A vector of *positive* indices: `x[c(1,6,4)]`
- A vector of *negative* indices: `x[-c(1,6,4)]`
- A *logical vector* of the appropriate length: `x[c(T,F,F,T,T)]`
- A *character vector* of *names*: `x[c("Sept10","Sept24")]` *if the vector has a* `names` *attribute*.

Number four should not make sense to you yet!
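For the curious, here is a brief preview of that fourth form. It is only a sketch: the vector `rainfall`, its names, and its values are all invented for illustration.

```r
# Hypothetical named vector: names and values are made up
rainfall <- c(Sept03 = 0, Sept10 = 12, Sept17 = 3, Sept24 = 8)

names(rainfall)
#> [1] "Sept03" "Sept10" "Sept17" "Sept24"

rainfall[c("Sept10", "Sept24")]   # index by name instead of position
#> Sept10 Sept24 
#>     12      8 
```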

### Indexing With Positive Integers

A vector of positive integers extracts the corresponding elements, *in the same order* and *as many times* as the indices are listed in the vector:

```
x <- c(5,4,7,8)
x[c(4,4,4,2,2,1,3,2)] # returns a vector of length 8!
#> [1] 8 8 8 4 4 5 7 4
```

If an index exceeds the *length* of the vector, it returns an `NA` for that element and gives *no warning* of this:

```
x <- c(5,4,7,8)
x[c(4,1,3,5)] # the 4th element of the returned vector is NA
#> [1] 8 5 7 NA
```
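Because the index vector can be any expression that yields positive integers, a few handy idioms fall out directly. The sketch below builds the indices with `length()`, `:`, and `rep()`.

```r
x <- c(5, 4, 7, 8)

x[length(x):1]          # reverse the vector
#> [1] 8 7 4 5
x[c(1, length(x))]      # first and last elements
#> [1] 5 8
x[rep(2, 3)]            # the same element three times
#> [1] 4 4 4
```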

### Indexing With Negative Integers

- A vector of negative integers says, “extract everything *except* these indices.”
  - The order of the remaining elements is preserved.
  - Multiple instances of the same negative integer have the same effect as a single one.
  - Negative integers exceeding the length of the vector are just ignored.

```
x <- c(5,4,7,8) # here is our vector...
x[-2]
#> [1] 5 7 8
x[-c(2,4)]
#> [1] 5 7
x[-c(2,2,2,2,4,4,4,4)]
#> [1] 5 7
x[-c(2,4,5,10,18)]
#> [1] 5 7
```

You *cannot* mix positive and negative indices!
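Mixing the two is an error, not a warning. The sketch below wraps the offending call in `tryCatch()` so that a script can keep running; the message string returned by the handler is our own wording, not R's.

```r
x <- c(5, 4, 7, 8)

# The handler replaces R's error with a message of our own choosing
result <- tryCatch(
  x[c(-1, 2)],
  error = function(e) "error: cannot mix positive and negative indices"
)
result
#> [1] "error: cannot mix positive and negative indices"
```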

### Indexing with Logical Vectors

- You can supply a logical vector that is “parallel” to the vector you want to extract from. Any element where a `TRUE` occurs in the index vector gets returned. Order of elements is preserved and elements can’t get replicated.

```
x <- c(5,4,7,8)
x[c(FALSE, TRUE, TRUE, FALSE)]
#> [1] 4 7
```

- If the index vector is shorter than the vector being indexed, the index vector is *recycled*:

```
x <- c(5,4,7,8)
x[c(FALSE, TRUE)]
#> [1] 4 8
```
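Logical indexing becomes really useful when the logical vector comes from a comparison on the vector itself. The following sketch ties together comparison operators, logical operators, and recycling; the name `keep` is just illustrative.

```r
x <- c(5, 4, 7, 8)

keep <- x > 4            # a logical vector parallel to x
keep
#> [1] TRUE FALSE TRUE TRUE
x[keep]
#> [1] 5 7 8

x[x > 4 & x %% 2 == 1]   # combine conditions with &
#> [1] 5 7

x[c(TRUE, FALSE)]        # recycling: every other element, starting with the first
#> [1] 5 7
```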

### Empty Subscript Indexing

- Here is a quirky feature that you should get to know well, as it will help to understand matrix and data.frame subscripting.
- If you apply an empty indexing operator `[]` to a vector, then it returns everything in the vector. Observe:

```
x <- c(5,4,7,8)
x[]
#> [1] 5 4 7 8
x
#> [1] 5 4 7 8
```

- “When you give R nothing it gives you everything in return!”

## The Replacement form of Indexing

- Also called the *assignment* form. Allows you to change specified elements of a vector while leaving the others untouched (*except for mode changes due to coercion!*). This usually takes some getting used to, but you will use it all over in R. So get comfortable with it!

```
x <- c(5,4,7,8)
y <- x
x[c(1,3)] <- 0
x
#> [1] 0 4 0 8
x <- y
x[c(T,F,T,F)] <- 1
x
#> [1] 1 4 1 8
x <- y
x[-c(1,3)] <- NA
x
#> [1] 5 NA 7 NA
x <- y
x[c(1,3)] <- c("a","c") # coercion of remaining elements
x
#> [1] "a" "4" "c" "8"
x <- y
x[c(3,1,2)] <- c("boing1", "boing2", "boing3") # note ordering
x
#> [1] "boing2" "boing3" "boing1" "8"
x <- y
x[c(3,1,3,2,2,2)] <- c("boing1", "boing2", "boing3") # repeated indices: the last assignment wins
x
#> [1] "boing2" "boing3" "boing3" "8"
```

The vector that is being assigned gets *recycled* as need be to match the length of the (extracted part of the) vector being indexed and assigned to.
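A very common use of the replacement form pairs it with a logical index. In this sketch (the vector `readings` and its values are hypothetical), the single value `0` is recycled to fill every position where the condition is `TRUE`.

```r
readings <- c(3, -1, 7, -4, 0)   # hypothetical measurements

readings[readings < 0] <- 0      # overwrite only the negative entries
readings
#> [1] 3 0 7 0 0
```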

### Assignment Beyond the length of the Vector

This is allowable when using the replacement form. Intermediate elements are set to `NA`:

```
x <- c(5,4,7,8)
length(x)
#> [1] 4
x[10] <- 12
x
#> [1] 5 4 7 8 NA NA NA NA NA 12
length(x)
#> [1] 10
```

Those NA’s don’t get overwritten by recycling. Recycling only occurs to match the length of the vector returned by the indexing operation:

```
x <- c(5,4,7,8)
x[11:19] <- c(-1,0,1)
```
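Inspecting the pieces of the result makes the point concrete. The sketch below repeats the assignment and then looks at the gap and the newly assigned positions separately.

```r
x <- c(5, 4, 7, 8)
x[11:19] <- c(-1, 0, 1)   # 9 slots, 3 values: recycled three times

length(x)
#> [1] 19
x[5:10]                   # the gap is filled with NA, not with recycled values
#> [1] NA NA NA NA NA NA
x[11:19]
#> [1] -1 0 1 -1 0 1 -1 0 1
```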
