4.3 Dealing with Factors

Factors represent categorical data and can be unordered or ordered. One can think of a factor as an integer vector in which each integer has a label. In fact, factors are built on top of integer vectors using two attributes: the class, “factor”, which makes them behave differently from regular integer vectors, and the levels, which define the set of allowed values. Factors are important in statistical modeling and are treated specially by modeling functions such as lm() and glm(). This section provides the basics of managing categorical data as factors.
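The integer-plus-labels structure described above can be inspected directly in base R. The following is a minimal sketch; the vector name sizes and its values are made up for illustration.

```r
# A small, made-up categorical vector (illustrative data only)
sizes <- factor(c("small", "large", "medium", "small"))

typeof(sizes)       # underlying storage type: "integer"
class(sizes)        # the class attribute: "factor"
levels(sizes)       # the levels attribute, sorted alphabetically by default
as.integer(sizes)   # integer codes that index into levels(sizes)

# An ordered factor with an explicit level order
sizes_ord <- factor(sizes,
                    levels = c("small", "medium", "large"),
                    ordered = TRUE)

# Ordered factors support comparisons against a level
smaller_than_large <- sizes_ord < "large"
```

Because the default levels are sorted alphabetically, the integer codes here follow "large", "medium", "small" rather than the natural size order, which is why the ordered version sets the levels explicitly.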

4.3.2 Ordering, Revaluing, & Dropping Factor Levels

We can easily order, revalue, and drop factor levels as the following illustrates.

4.3.2.2 Revalue Levels

To recode factor levels I usually use the revalue() function from the plyr package.
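As a minimal sketch, assuming the plyr package is installed, a recode might look like the following. The factor gender and its labels are hypothetical and chosen only for illustration; revalue() takes a named character vector whose names are the old levels and whose values are the new ones.

```r
# A made-up factor with terse level labels (illustrative data only)
gender <- factor(c("m", "f", "f", "m"))

# Map old levels (names) to new levels (values);
# plyr:: calls revalue() without a prior library(plyr)
recoded <- plyr::revalue(gender, c("m" = "Male", "f" = "Female"))

levels(recoded)
```

The original factor is left unchanged; revalue() returns a new factor, so assign the result if you want to keep it.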

Using the :: notation allows you to call the revalue() function without first attaching the plyr package with library().