5.4 Managing Lists
A list is an R structure that allows you to combine elements of different types and lengths, including other lists embedded within the list. Many statistical outputs are also provided as lists; therefore, it's critical to understand how to work with them. In this section I will guide you through the basics of managing lists, including creating lists, adding on to lists, adding attributes, subsetting, and applying functions to lists.
5.4.1 Creating Lists
To create a list we can use the list() function. Note how the four list items are of different classes (integer, character, logical, and numeric) and of differing lengths.
l <- list(1:3, "a", c(TRUE, FALSE, TRUE), c(2.5, 4.2))
str(l)
List of 4
$ : int [1:3] 1 2 3
$ : chr "a"
$ : logi [1:3] TRUE FALSE TRUE
$ : num [1:2] 2.5 4.2
# a list containing a list
l <- list(1:3, list(letters[1:5], c(TRUE, FALSE, TRUE)))
str(l)
List of 2
$ : int [1:3] 1 2 3
$ :List of 2
..$ : chr [1:5] "a" "b" "c" "d" ...
..$ : logi [1:3] TRUE FALSE TRUE
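As a quick check on the claim that list items can differ in class and length, the sketch below rebuilds the first list and inspects each item. The variable name items is only illustrative, and sapply() is covered later in this section.

```r
# rebuild the four-item list from the first example
items <- list(1:3, "a", c(TRUE, FALSE, TRUE), c(2.5, 4.2))

# class of each list item
sapply(items, class)
# [1] "integer"   "character" "logical"   "numeric"

# length of each list item
lengths(items)
# [1] 3 1 3 2
```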
5.4.2 Adding on to Lists
To add additional list components to a list we can leverage the list() and append() functions. We can illustrate with the following list.
l1 <- list(1:3, "a", c(TRUE, FALSE, TRUE))
str(l1)
List of 3
$ : int [1:3] 1 2 3
$ : chr "a"
$ : logi [1:3] TRUE FALSE TRUE
If we add the new elements with list(), it will create a list of two components: component 1 will be a nested list containing the original list, and component 2 will be the new elements added:
l2 <- list(l1, c(2.5, 4.2))
str(l2)
List of 2
$ :List of 3
..$ : int [1:3] 1 2 3
..$ : chr "a"
..$ : logi [1:3] TRUE FALSE TRUE
$ : num [1:2] 2.5 4.2
To simply add a 4th list component without creating nested lists we use the append() function:
l3 <- append(l1, list(c(2.5, 4.2)))
str(l3)
List of 4
$ : int [1:3] 1 2 3
$ : chr "a"
$ : logi [1:3] TRUE FALSE TRUE
$ : num [1:2] 2.5 4.2
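Two details of append() are easy to miss, so here is a minimal sketch of both: the after argument controls where the new component is inserted, and the list() wrapper around the new values is what keeps them together as a single component. The names l4 and l5 are illustrative only.

```r
l1 <- list(1:3, "a", c(TRUE, FALSE, TRUE))

# insert the new component after the first item instead of at the end
l4 <- append(l1, list(c(2.5, 4.2)), after = 1)
str(l4)

# without the list() wrapper each value becomes its own component
l5 <- append(l1, c(2.5, 4.2))
length(l5)
# [1] 5
```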
Alternatively, we can add a new list component by using the $ operator and naming the new item:
l3$item4 <- "new list item"
str(l3)
List of 5
$ : int [1:3] 1 2 3
$ : chr "a"
$ : logi [1:3] TRUE FALSE TRUE
$ : num [1:2] 2.5 4.2
$ item4: chr "new list item"
To add individual elements to a specific list component we need to introduce some subsetting. Continuing with our original l1 list, subset for the specific list item and then use the c() function to add the additional elements to that item.
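The code for this step is not shown here, so the following is a sketch of what it might look like. The values are chosen to match the str(l1) output shown in Section 5.4.3, where item 1 holds the integers 1 to 6 and item 2 holds four character strings.

```r
l1 <- list(1:3, "a", c(TRUE, FALSE, TRUE))

# add three more integers to the first list item
l1[[1]] <- c(l1[[1]], 4:6)

# add three more character strings to the second list item
l1[[2]] <- c(l1[[2]], c("adding", "to a", "list"))

str(l1)
```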
5.4.3 Adding Attributes to Lists
The attributes that you can add to lists include names, general comments, and specific list item comments. Currently, our l1 list has no attributes.
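A quick way to confirm this is to call attributes() on the list. The sketch below rebuilds l1 so that it runs on its own; a list created without names has no attributes at all.

```r
l1 <- list(1:3, "a", c(TRUE, FALSE, TRUE))

# no names or other attributes have been set yet
attributes(l1)
# NULL
```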
We can add names to lists in two ways. First, we can use names() to assign names to list items in a pre-existing list. Second, we can add names to a list when we are creating it.
# adding names to a pre-existing list
names(l1) <- c("item1", "item2", "item3")
str(l1)
List of 3
$ item1: int [1:6] 1 2 3 4 5 6
$ item2: chr [1:4] "a" "adding" "to a" "list"
$ item3: logi [1:3] TRUE FALSE TRUE
attributes(l1)
$names
[1] "item1" "item2" "item3"
# adding names when creating lists
l2 <- list(item1 = 1:3, item2 = letters[1:5], item3 = c(T, F, T, T))
str(l2)
List of 3
$ item1: int [1:3] 1 2 3
$ item2: chr [1:5] "a" "b" "c" "d" ...
$ item3: logi [1:4] TRUE FALSE TRUE TRUE
attributes(l2)
$names
[1] "item1" "item2" "item3"
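We can also add comments to lists. As previously mentioned, comments act as a note to the user without changing how the object behaves. With lists, we can add a general comment to the list using comment(), and we can also record a comment about a specific list item with attr(). Note that attr(l2, "item2") stores the note as an attribute of the list l2 under the name "item2"; it is not attached to the item itself.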
# adding a general comment to list l2 with comment()
comment(l2) <- "This is a comment on a list"
str(l2)
List of 3
$ item1: int [1:3] 1 2 3
$ item2: chr [1:5] "a" "b" "c" "d" ...
$ item3: logi [1:4] TRUE FALSE TRUE TRUE
- attr(*, "comment")= chr "This is a comment on a list"
attributes(l2)
$names
[1] "item1" "item2" "item3"
$comment
[1] "This is a comment on a list"
# adding a comment to a specific list item with attr()
attr(l2, "item2") <- "Comment for item2"
str(l2)
List of 3
$ item1: int [1:3] 1 2 3
$ item2: chr [1:5] "a" "b" "c" "d" ...
$ item3: logi [1:4] TRUE FALSE TRUE TRUE
- attr(*, "comment")= chr "This is a comment on a list"
- attr(*, "item2")= chr "Comment for item2"
attributes(l2)
$names
[1] "item1" "item2" "item3"
$comment
[1] "This is a comment on a list"
$item2
[1] "Comment for item2"
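If the note should travel with the item rather than sit on the enclosing list, one option is to set the attribute on the item directly. This is a sketch of that alternative, not the approach used above; the attribute name "note" is an arbitrary choice.

```r
l2 <- list(item1 = 1:3, item2 = letters[1:5], item3 = c(TRUE, FALSE, TRUE, TRUE))

# attach the note to the item itself rather than to the enclosing list
attr(l2$item2, "note") <- "Comment for item2"

# the enclosing list still only carries its names
names(attributes(l2))
# [1] "names"

# the note is stored on the item
attributes(l2$item2)
# $note
# [1] "Comment for item2"
```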
5.4.4 Subsetting Lists
“If list x is a train carrying objects, then x[[5]] is the object in car 5; x[4:6] is a train of cars 4-6.” (via Twitter)
To subset lists we can use the single bracket [ ], double bracket [[ ]], and dollar sign $ operators. Each approach serves a specific purpose, and they can be combined in different ways to achieve the following subsetting objectives:
- Subset list and preserve output as a list
- Subset list and simplify output
- Subset list to get elements out of a list
- Subset list with a nested list
5.4.4.1 Subset list and preserve output as a list
To extract one or more list items while preserving the output in list format use the [ ] operator:
# extract first list item
l2[1]
$item1
[1] 1 2 3
# same as above but using the item's name
l2["item1"]
$item1
[1] 1 2 3
# extract multiple list items
l2[c(1,3)]
$item1
[1] 1 2 3
$item3
[1] TRUE FALSE TRUE TRUE
# same as above but using the items' names
l2[c("item1", "item3")]
$item1
[1] 1 2 3
$item3
[1] TRUE FALSE TRUE TRUE
5.4.4.2 Subset list and simplify output
To extract one or more list items while simplifying the output use the [[ ]] or $ operator:
# extract first list item and simplify to a vector
l2[[1]]
[1] 1 2 3
# same as above but using the item's name
l2[["item1"]]
[1] 1 2 3
# same as above but using the `$` operator
l2$item1
[1] 1 2 3
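The difference between preserving and simplifying is easiest to see by checking the class of each result. A minimal sketch using the same l2 list:

```r
l2 <- list(item1 = 1:3, item2 = letters[1:5], item3 = c(TRUE, FALSE, TRUE, TRUE))

# single brackets return a list of length 1
class(l2[1])
# [1] "list"

# double brackets return the item itself
class(l2[[1]])
# [1] "integer"
```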
One thing that differentiates the [[ operator from the $ operator is that [[ can be used with computed indices, whereas $ can only be used with literal names.
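To make the distinction concrete, here is a small sketch in which the item name is stored in a variable. The variable name target is hypothetical.

```r
l2 <- list(item1 = 1:3, item2 = letters[1:5], item3 = c(TRUE, FALSE, TRUE, TRUE))

# store the name of the item we want in a variable
target <- "item2"

# [[ evaluates the variable and uses its value
l2[[target]]
# [1] "a" "b" "c" "d" "e"

# $ looks for an item literally named "target", which does not exist
l2$target
# NULL
```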
5.4.4.3 Subset list to get elements out of a list
To extract individual elements out of a specific list item, combine the [[ (or $) operator with the [ operator. The first operator pulls the item out of the list and the second selects elements from it.
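The code for this step is not shown here, so the following is a sketch of the pattern using the l2 list from above.

```r
l2 <- list(item1 = 1:3, item2 = letters[1:5], item3 = c(TRUE, FALSE, TRUE, TRUE))

# third element of the first list item
l2[[1]][3]
# [1] 3

# same as above but using the item's name
l2[["item1"]][3]
# [1] 3

# first two elements of item2 using the $ operator
l2$item2[1:2]
# [1] "a" "b"
```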
5.4.4.4 Subset list with a nested list
If you have nested lists you can expand the ideas above to extract items and elements. We’ll use the following list l3, which has a nested list in item 2.
l3 <- list(item1 = 1:3,
           item2 = list(item2a = letters[1:5],
                        item3b = c(T, F, T, T)))
str(l3)
List of 2
$ item1: int [1:3] 1 2 3
$ item2:List of 2
..$ item2a: chr [1:5] "a" "b" "c" "d" ...
..$ item3b: logi [1:4] TRUE FALSE TRUE TRUE
If the goal is to subset l3 to extract the nested list item item2a from item2, we can do this in multiple ways.
# preserve the output as a list
l3[[2]][1]
$item2a
[1] "a" "b" "c" "d" "e"
# same as above but simplify the output
l3[[2]][[1]]
[1] "a" "b" "c" "d" "e"
# same as above with names
l3[["item2"]][["item2a"]]
[1] "a" "b" "c" "d" "e"
# same as above with `$` operator
l3$item2$item2a
[1] "a" "b" "c" "d" "e"
# extract individual element from a nested list item
l3[[2]][[1]][3]
[1] "c"
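As a further option not covered above, the [[ operator also accepts a vector of indices or names, which it applies recursively through the nesting. A minimal sketch with the same l3 list:

```r
l3 <- list(item1 = 1:3,
           item2 = list(item2a = letters[1:5],
                        item3b = c(TRUE, FALSE, TRUE, TRUE)))

# a vector inside [[ ]] is applied recursively: item 2, then its item 1
l3[[c(2, 1)]]
# [1] "a" "b" "c" "d" "e"

# the same with names
l3[[c("item2", "item2a")]]
# [1] "a" "b" "c" "d" "e"
```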
5.4.5 Applying functions to lists
5.4.5.1 The lapply() function
The lapply() function does the following simple series of operations:
- it loops over a list, iterating over each element in that list
- it applies a function to each element of the list (a function that you specify)
- and returns a list (the l is for “list”).
The syntax is lapply(X, FUN, ...), where:
- X is the list
- FUN is the function to be applied
- ... is for any other arguments to be passed to the function
To provide examples we’ll generate a list of four items:
data <- list(item1 = 1:4,
             item2 = rnorm(10),
             item3 = rnorm(20, 1),
             item4 = rnorm(100, 5))
# get the mean of each list item
lapply(data, mean)
$item1
[1] 2.5
$item2
[1] -0.06161948
$item3
[1] 0.6029422
$item4
[1] 4.816777
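The ... argument in the syntax above can be illustrated with a short sketch: anything supplied after FUN is forwarded to it on every call. The seed value is arbitrary and is only set so that the sketch is reproducible; because the data are random draws, no output is shown.

```r
set.seed(123)  # arbitrary seed, assumed here only for reproducibility
data <- list(item1 = 1:4,
             item2 = rnorm(10),
             item3 = rnorm(20, 1),
             item4 = rnorm(100, 5))

# probs is passed through ... to quantile() for every list item
quartiles <- lapply(data, quantile, probs = c(0.25, 0.75))
str(quartiles)

# FUN can also be an anonymous function, here the spread of each item
lapply(data, function(x) max(x) - min(x))
```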
The above provides a simple example where each list item is simply a vector of numeric values. However, consider the case where you have a list that contains data frames and you would like to loop through each list item and apply a function to each data frame. In this case we can embed an apply function within an lapply function.
For example, we can create a list of R’s built-in beaver data sets. The lapply function loops through each of the two list items and uses apply to calculate the mean of the columns in each. Note that I wrap the apply function with round to provide easier-to-read output. The code and its output are shown in the comparison with sapply() in the next subsection.
5.4.5.2 The sapply() function
The sapply() function behaves similarly to lapply(); the only real difference is in the return value. sapply() will try to simplify the result of lapply() if possible. Essentially, sapply() calls lapply() on its input and then applies the following algorithm:
- If the result is a list where every element is length 1, then a vector is returned
- If the result is a list where every element is a vector of the same length (> 1), a matrix is returned.
- If neither of the above simplifications can be performed then a list is returned
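The three rules above can be seen with a small list. This is a minimal sketch; the list x and the functions applied to it are chosen only to trigger each case.

```r
x <- list(a = 1:3, b = 4:6)

# every result has length 1, so a named vector is returned
sapply(x, mean)
# a b 
# 2 5 

# every result has the same length (> 1), so a matrix is returned
sapply(x, range)
#      a b
# [1,] 1 4
# [2,] 3 6

# results differ in length, so a list is returned
sapply(x, function(v) v[v > 2])
```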
To illustrate the differences we can use a list with the beaver data and compare the sapply and lapply outputs:
# list of R's built in beaver data
beaver_data <- list(beaver1 = beaver1,
                    beaver2 = beaver2)
# get the mean of each list item and return as a list
lapply(beaver_data, function(x) round(apply(x, 2, mean), 2))
$beaver1
day time temp activ
346.20 1312.02 36.86 0.05
$beaver2
day time temp activ
307.13 1446.20 37.60 0.62
# get the mean of each list item and simplify the output
sapply(beaver_data, function(x) round(apply(x, 2, mean), 2))
      beaver1 beaver2
day    346.20  307.13
time  1312.02 1446.20
temp    36.86   37.60
activ    0.05    0.62
It's important to understand the difference between simplifying and preserving subsetting. Simplifying subsets returns the simplest possible data structure that can represent the output. Preserving subsets keeps the structure of the output the same as the input. See Hadley Wickham’s section on Simplifying vs. Preserving Subsetting to learn more.