5.4 Managing Lists

A list is an R structure that allows you to combine elements of different types and lengths, including lists embedded within a list. Many statistical outputs are returned as lists as well; therefore, it’s critical to understand how to work with them. In this section I will guide you through the basics of managing lists, including creating lists, adding on to lists, adding attributes to lists, subsetting lists, and applying functions to lists.

5.4.1 Creating Lists

To create a list we can use the list() function. Note how each of the four list items is of a different class (integer, character, logical, and numeric) and how the items vary in length.

l <- list(1:3, "a", c(TRUE, FALSE, TRUE), c(2.5, 4.2))
str(l)
List of 4
 $ : int [1:3] 1 2 3
 $ : chr "a"
 $ : logi [1:3] TRUE FALSE TRUE
 $ : num [1:2] 2.5 4.2

# a list containing a list
l <- list(1:3, list(letters[1:5], c(TRUE, FALSE, TRUE)))
str(l)
List of 2
 $ : int [1:3] 1 2 3
 $ :List of 2
  ..$ : chr [1:5] "a" "b" "c" "d" ...
  ..$ : logi [1:3] TRUE FALSE TRUE
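
As a quick check on the point that list items keep their own class and length, the following sketch inspects each item directly. The name `mixed` is illustrative and does not appear elsewhere in this section.

```r
# a hypothetical list mirroring the first example in this section
mixed <- list(1:3, "a", c(TRUE, FALSE, TRUE), c(2.5, 4.2))

# class of each item, returned as a character vector
item_classes <- vapply(mixed, class, character(1))
item_classes
# [1] "integer"   "character" "logical"   "numeric"

# length of each item, returned as an integer vector
item_lengths <- lengths(mixed)
item_lengths
# [1] 3 1 3 2
```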

5.4.2 Adding on to Lists

To add components to a list we can use the list() and append() functions. We can illustrate with the following list.

l1 <- list(1:3, "a", c(TRUE, FALSE, TRUE))
str(l1)
List of 3
 $ : int [1:3] 1 2 3
 $ : chr "a"
 $ : logi [1:3] TRUE FALSE TRUE

If we add the new elements with list(), it will create a list of two components: component 1 will be a nested list containing the original list, and component 2 will be the new elements.

l2 <- list(l1, c(2.5, 4.2))
str(l2)
List of 2
 $ :List of 3
  ..$ : int [1:3] 1 2 3
  ..$ : chr "a"
  ..$ : logi [1:3] TRUE FALSE TRUE
 $ : num [1:2] 2.5 4.2

To simply add a 4th list component without creating nested lists we use the append() function:

l3 <- append(l1, list(c(2.5, 4.2)))
str(l3)
List of 4
 $ : int [1:3] 1 2 3
 $ : chr "a"
 $ : logi [1:3] TRUE FALSE TRUE
 $ : num [1:2] 2.5 4.2
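
One detail worth checking when using append() is whether the new values are wrapped in list(). The sketch below, which uses illustrative names (`base_list`, `wrapped`, `unwrapped`, `inserted`), compares the two cases and also shows append()'s `after` argument for inserting at a position other than the end.

```r
base_list <- list(1:3, "a", c(TRUE, FALSE, TRUE))

# wrapped in list(): the vector is added as ONE new component
wrapped <- append(base_list, list(c(2.5, 4.2)))
length(wrapped)
# [1] 4

# not wrapped: each value becomes its own component
unwrapped <- append(base_list, c(2.5, 4.2))
length(unwrapped)
# [1] 5

# `after = 1` inserts the new component right after the first one
inserted <- append(base_list, list(c(2.5, 4.2)), after = 1)
inserted[[2]]
# [1] 2.5 4.2
```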

Alternatively, we can add a new list component by using the $ operator and naming the new item:

l3$item4 <- "new list item"
str(l3)
List of 5
 $      : int [1:3] 1 2 3
 $      : chr "a"
 $      : logi [1:3] TRUE FALSE TRUE
 $      : num [1:2] 2.5 4.2
 $ item4: chr "new list item"
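
The same kind of assignment can also be written with double brackets, which is handy when the new item's name is stored in a variable. This is a minimal sketch; `items` and `new_name` are hypothetical names used only for illustration.

```r
items <- list(1:3, "a")

# add a named component with [[ ]]; equivalent to items$item3 <- ...
items[["item3"]] <- c(TRUE, FALSE)

# the name can also come from a variable
new_name <- "item4"
items[[new_name]] <- "new list item"

# the first two items were never named, so their names are empty strings
names(items)
# [1] ""      ""      "item3" "item4"
```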

To add individual elements to a specific list component we need to introduce some subsetting. We’ll continue with our original l1 list:

str(l1)
List of 3
 $ : int [1:3] 1 2 3
 $ : chr "a"
 $ : logi [1:3] TRUE FALSE TRUE

To add values to a list item, subset for that specific list item and then use the c() function to append the new elements to it:

l1[[1]] <- c(l1[[1]], 4:6)

str(l1)
List of 3
 $ : int [1:6] 1 2 3 4 5 6
 $ : chr "a"
 $ : logi [1:3] TRUE FALSE TRUE

l1[[2]] <- c(l1[[2]], c("adding", "to a", "list"))

str(l1)
List of 3
 $ : int [1:6] 1 2 3 4 5 6
 $ : chr [1:4] "a" "adding" "to a" "list"
 $ : logi [1:3] TRUE FALSE TRUE
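
Because each list item here is an atomic vector, c() follows R's usual coercion rules when the appended values are of a different type. The sketch below uses a hypothetical list `vals` to show an item changing class after a character value is appended.

```r
vals <- list(1:3, "a")

# appending integers keeps the item an integer vector
vals[[1]] <- c(vals[[1]], 4:6)
class(vals[[1]])
# [1] "integer"

# appending a character value coerces the whole item to character
vals[[1]] <- c(vals[[1]], "seven")
class(vals[[1]])
# [1] "character"
```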

5.4.3 Adding Attributes to Lists

The attributes that you can add to lists include names, general comments, and specific list item comments. Currently, our l1 list has no attributes:

attributes(l1)
NULL

We can add names to lists in two ways. First, we can use names() to assign names to list items in a pre-existing list. Second, we can add names to a list when we are creating a list.

# adding names to a pre-existing list
names(l1) <- c("item1", "item2", "item3")

str(l1)
List of 3
 $ item1: int [1:6] 1 2 3 4 5 6
 $ item2: chr [1:4] "a" "adding" "to a" "list"
 $ item3: logi [1:3] TRUE FALSE TRUE

attributes(l1)
$names
[1] "item1" "item2" "item3"

# adding names when creating lists
l2 <- list(item1 = 1:3, item2 = letters[1:5], item3 = c(T, F, T, T))

str(l2)
List of 3
 $ item1: int [1:3] 1 2 3
 $ item2: chr [1:5] "a" "b" "c" "d" ...
 $ item3: logi [1:4] TRUE FALSE TRUE TRUE

attributes(l2)
$names
[1] "item1" "item2" "item3"
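
Two related conveniences may be useful here: setNames() attaches names in a single expression, and indexing into names() renames one item without retyping the rest. The sketch assumes a small illustrative list called `named`.

```r
# setNames() returns the list with names attached in one step
named <- setNames(list(1:3, letters[1:5]), c("item1", "item2"))

# rename a single item by indexing into names()
names(named)[2] <- "letters"

names(named)
# [1] "item1"   "letters"
```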

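We can also add comments to lists. As previously mentioned, comments act as a note to the user without changing how the object behaves. With lists, we can add a general comment to the list using comment(), and we can use attr() to attach a note about a specific list item. Note that attr(l2, "item2") stores the note as an attribute of the list l2 under the name "item2"; it does not modify the item itself.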

# adding a general comment to list l2 with comment()
comment(l2) <- "This is a comment on a list"

str(l2)
List of 3
 $ item1: int [1:3] 1 2 3
 $ item2: chr [1:5] "a" "b" "c" "d" ...
 $ item3: logi [1:4] TRUE FALSE TRUE TRUE
 - attr(*, "comment")= chr "This is a comment on a list"

attributes(l2)
$names
[1] "item1" "item2" "item3"

$comment
[1] "This is a comment on a list"

# adding a note about a specific list item with attr()
attr(l2, "item2") <- "Comment for item2"

str(l2)
List of 3
 $ item1: int [1:3] 1 2 3
 $ item2: chr [1:5] "a" "b" "c" "d" ...
 $ item3: logi [1:4] TRUE FALSE TRUE TRUE
 - attr(*, "comment")= chr "This is a comment on a list"
 - attr(*, "item2")= chr "Comment for item2"

attributes(l2)
$names
[1] "item1" "item2" "item3"

$comment
[1] "This is a comment on a list"

$item2
[1] "Comment for item2"
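
If the goal is for the comment to travel with the item rather than with the enclosing list, one option is to set the comment on the subsetted item. This is a sketch under that assumption, using a hypothetical list `noted`.

```r
noted <- list(item1 = 1:3, item2 = letters[1:5])

# attach the comment to the item, not to the enclosing list
comment(noted$item2) <- "Comment for item2"

comment(noted$item2)
# [1] "Comment for item2"

# the list itself still carries only its names
names(attributes(noted))
# [1] "names"
```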

5.4.4 Subsetting Lists

“If list x is a train carrying objects, then x[[5]] is the object in car 5; x[4:6] is a train of cars 4-6.” - Twitter

To subset lists we can use the single bracket [ ], double bracket [[ ]], and dollar sign $ operators. Each operator serves a specific purpose, and they can be combined in different ways to achieve the following subsetting objectives:

Subset list and preserve output as a list
Subset list and simplify output
Subset list to get elements out of a list
Subset list with a nested list

5.4.4.1 Subset list and preserve output as a list

To extract one or more list items while preserving¹² the output in list format use the [ ] operator:

# extract first list item
l2[1]
$item1
[1] 1 2 3

# same as above but using the item's name
l2["item1"]
$item1
[1] 1 2 3

# extract multiple list items
l2[c(1,3)]
$item1
[1] 1 2 3

$item3
[1]  TRUE FALSE  TRUE  TRUE

# same as above but using the items' names
l2[c("item1", "item3")]
$item1
[1] 1 2 3

$item3
[1]  TRUE FALSE  TRUE  TRUE
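
The single bracket operator also accepts negative and logical indices, and in every case the result remains a list. A brief sketch, using a hypothetical list `trains`:

```r
trains <- list(item1 = 1:3, item2 = letters[1:5], item3 = c(TRUE, FALSE))

# single brackets return a list, even for one item
is.list(trains[1])
# [1] TRUE

# negative indices drop items
names(trains[-1])
# [1] "item2" "item3"

# logical vectors keep the TRUE positions
names(trains[c(TRUE, FALSE, TRUE)])
# [1] "item1" "item3"
```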

5.4.4.2 Subset list and simplify output

To extract one or more list items while simplifying¹³ the output use the [[ ]] or $ operator:

# extract first list item and simplify to a vector
l2[[1]]
[1] 1 2 3

# same as above but using the item's name
l2[["item1"]]
[1] 1 2 3

# same as above but using the `$` operator
l2$item1
[1] 1 2 3

One thing that differentiates the [[ operator from the $ is that the [[ operator can be used with computed indices. The $ operator can only be used with literal names.
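
To make the distinction concrete, the following sketch stores an item name in a variable and tries both operators. The names `scores` and `wanted` are illustrative only.

```r
scores <- list(item1 = 1:3, item2 = letters[1:5])

# the name is held in a variable, so it is only known at run time
wanted <- "item2"

scores[[wanted]]
# [1] "a" "b" "c" "d" "e"

# `$` looks for an item literally named "wanted" and finds none
scores$wanted
# NULL

# computed indices make it possible to loop over item names
for (nm in names(scores)) {
  cat(nm, "has", length(scores[[nm]]), "elements\n")
}
# item1 has 3 elements
# item2 has 5 elements
```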

5.4.4.3 Subset list to get elements out of a list

To extract individual elements out of a specific list item combine the [[ (or $) operator with the [ operator:

# extract third element from the second list item
l2[[2]][3]
[1] "c"

# same as above but using the item's name
l2[["item2"]][3]
[1] "c"

# same as above but using the `$` operator
l2$item2[3]
[1] "c"

5.4.4.4 Subset list with a nested list

If you have nested lists you can expand the ideas above to extract items and elements. We’ll use the following list l3 which has a nested list in item 2.

l3 <- list(item1 = 1:3, 
           item2 = list(item2a = letters[1:5], 
                        item3b = c(T, F, T, T)))
str(l3)
List of 2
 $ item1: int [1:3] 1 2 3
 $ item2:List of 2
  ..$ item2a: chr [1:5] "a" "b" "c" "d" ...
  ..$ item3b: logi [1:4] TRUE FALSE TRUE TRUE

If the goal is to subset l3 to extract the nested list item item2a from item2, we can do this in multiple ways.

# preserve the output as a list
l3[[2]][1]
$item2a
[1] "a" "b" "c" "d" "e"

# same as above but simplify the output
l3[[2]][[1]]
[1] "a" "b" "c" "d" "e"

# same as above with names
l3[["item2"]][["item2a"]]
[1] "a" "b" "c" "d" "e"

# same as above with `$` operator
l3$item2$item2a
[1] "a" "b" "c" "d" "e"

# extract individual element from a nested list item
l3[[2]][[1]][3]
[1] "c"
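
For lists, the double bracket operator also accepts a vector of indices, which it applies recursively. This offers a compact alternative to chaining brackets. The sketch below uses a hypothetical list `nested` with the same shape as the example above.

```r
nested <- list(item1 = 1:3,
               item2 = list(item2a = letters[1:5],
                            item2b = c(TRUE, FALSE)))

# c(2, 1) means: take item 2, then item 1 within it
by_position <- nested[[c(2, 1)]]
by_position
# [1] "a" "b" "c" "d" "e"

# the same idea works with names
by_name <- nested[[c("item2", "item2a")]]

identical(by_position, nested[[2]][[1]])
# [1] TRUE
```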

5.4.5 Applying functions to lists

5.4.5.1 The `lapply()` function

The lapply() function does the following simple series of operations:

it loops over a list, iterating over each element in that list
it applies a function to each element of the list (a function that you specify)
and returns a list (the l is for “list”).

The syntax for lapply() is as follows, where:

X is the list
FUN is the function to be applied
... is for any other arguments to be passed to the function

# syntax of lapply function
lapply(X, FUN, ...)

To provide examples we’ll generate a list of four items:

data <- list(item1 = 1:4, 
             item2 = rnorm(10), 
             item3 = rnorm(20, 1), 
             item4 = rnorm(100, 5))

# get the mean of each list item 
lapply(data, mean)
$item1
[1] 2.5

$item2
[1] -0.06161948

$item3
[1] 0.6029422

$item4
[1] 4.816777
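
Because item2 through item4 are drawn with rnorm(), the means above will differ from run to run unless a seed is set. The `...` argument can be illustrated with a small deterministic list instead; `readings` is a hypothetical name, and na.rm is an argument of mean() that lapply() simply passes along.

```r
readings <- list(a = c(1, NA, 3), b = c(4, 5, 6))

# without na.rm the missing value propagates
lapply(readings, mean)$a
# [1] NA

# na.rm = TRUE is passed through `...` to mean()
means <- lapply(readings, mean, na.rm = TRUE)
str(means)
# List of 2
#  $ a: num 2
#  $ b: num 5
```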

The above provides a simple example where each list item is simply a vector of numeric values. However, consider the case where you have a list that contains data frames and you would like to loop through each list item and apply a function to each data frame. In this case we can embed an apply function within an lapply function.

For example, the following creates a list from R’s built-in beaver data sets. The lapply function loops through each of the two list items and uses apply to calculate the mean of each column in both list items. Note that I wrap the apply function in round to provide easier-to-read output.

# list of R's built in beaver data
beaver_data <- list(beaver1 = beaver1, 
                    beaver2 = beaver2)

# get the mean of each list item 
lapply(beaver_data, function(x) round(apply(x, 2, mean), 2))
$beaver1
    day    time    temp   activ 
 346.20 1312.02   36.86    0.05 

$beaver2
    day    time    temp   activ 
 307.13 1446.20   37.60    0.62
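
As an alternative sketch, base R's colMeans() computes column means directly and can stand in for apply(x, 2, mean) when every column is numeric, as is the case for the beaver data. The names `via_apply` and `via_colmeans` are illustrative.

```r
beaver_data <- list(beaver1 = beaver1,
                    beaver2 = beaver2)

# column means via apply(), as in the example above
via_apply <- lapply(beaver_data, function(x) round(apply(x, 2, mean), 2))

# column means via colMeans()
via_colmeans <- lapply(beaver_data, function(x) round(colMeans(x), 2))

# the two approaches are expected to agree
all.equal(via_apply, via_colmeans)
```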

5.4.5.2 The `sapply()` function

The sapply() function behaves similarly to lapply(); the only real difference is in the return value. sapply() will try to simplify the result of lapply() if possible. Essentially, sapply() calls lapply() on its input and then applies the following algorithm:

If the result is a list where every element is length 1, then a vector is returned.
If the result is a list where every element is a vector of the same length (> 1), a matrix is returned.
If neither of the above simplifications can be performed, then a list is returned.
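
The three cases can be sketched with small illustrative inputs; the variable names `totals`, `ranges`, and `kept` are assumptions made for this example.

```r
# every result has length 1: simplified to a named vector (a = 6, b = 15)
totals <- sapply(list(a = 1:3, b = 4:6), sum)
is.list(totals)
# [1] FALSE

# every result has the same length (> 1): simplified to a matrix
ranges <- sapply(list(a = 1:3, b = 4:6), range)
dim(ranges)
# [1] 2 2

# results differ in length: returned as a list
kept <- sapply(list(a = 1:3, b = 1:5), function(v) v[v > 2])
is.list(kept)
# [1] TRUE
```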

To illustrate the differences we can use the previous example using a list with the beaver data and compare the sapply and lapply outputs:

# list of R's built in beaver data
beaver_data <- list(beaver1 = beaver1, 
                    beaver2 = beaver2)

# get the mean of each list item and return as a list
lapply(beaver_data, function(x) round(apply(x, 2, mean), 2))
$beaver1
    day    time    temp   activ 
 346.20 1312.02   36.86    0.05 

$beaver2
    day    time    temp   activ 
 307.13 1446.20   37.60    0.62 

# get the mean of each list item and simplify the output
sapply(beaver_data, function(x) round(apply(x, 2, mean), 2))

      beaver1 beaver2
day    346.20  307.13
time  1312.02 1446.20
temp    36.86   37.60
activ    0.05    0.62
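
Since the shape of sapply()'s result depends on what the function returns, base R also provides vapply(), which takes a template for each result and raises an error if a result does not match it. The following is a minimal sketch rather than a recommendation from the text above; `mean_temp` is an illustrative name.

```r
beaver_data <- list(beaver1 = beaver1,
                    beaver2 = beaver2)

# numeric(1) declares that each result must be a single number
mean_temp <- vapply(beaver_data, function(x) mean(x$temp), numeric(1))

round(mean_temp, 2)
# beaver1 beaver2 
#   36.86   37.60 
```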

It’s important to understand the difference between simplifying and preserving subsetting. Simplifying subsets returns the simplest possible data structure that can represent the output. Preserving subsets keeps the structure of the output the same as the input. See Hadley Wickham’s section on Simplifying vs. Preserving Subsetting to learn more.