---
output:
  bookdown::html_document2:
    fig_caption: yes
editor_options:
  chunk_output_type: console
---
```{r echo = FALSE, cache = FALSE}
# This block needs cache=FALSE to set fig.width and fig.height, and have those
# persist across cached builds.
source("utils.R", local = TRUE)
# knitr::opts_chunk$set(fig.width=3.5, fig.height=3.5)
```
Using Colors in Plots {#CHAPTER-COLORS}
=====================
In ggplot2's implementation of the grammar of graphics, `color` is an aesthetic, just like *x* position, *y* position, and `size`. If color is just another aesthetic, why does it deserve its own chapter? The reason is that color is a more complicated aesthetic than the others. Instead of simply moving geoms left and right or making them larger and smaller, when you use color, there are many degrees of freedom and many more choices to make. What palette should you use for discrete values? Should you use a gradient with several different hues? How do you choose colors that can be interpreted accurately by those with color-vision deficiencies? In this chapter, I'll address these issues.
Setting the Colors of Objects {#RECIPE-COLORS-SETTING}
-----------------------------
### Problem
You want to set the color of some geoms in your graph.
### Solution
In the call to the geom, set the values of `colour` or `fill` (Figure \@ref(fig:FIG-COLORS-SETTING)):
(ref:cap-FIG-COLORS-SETTING) Setting `fill` and `colour` (left); Setting `colour` for points (right)
```{r FIG-COLORS-SETTING, fig.show="hold", fig.cap="(ref:cap-FIG-COLORS-SETTING)", message=FALSE, fig.width=4, fig.height=3.5}
library(MASS) # Load MASS for the birthwt data set
ggplot(birthwt, aes(x = bwt)) +
  geom_histogram(fill = "red", colour = "black")
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(colour = "red")
```
### Discussion
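In ggplot2, there's an important difference between *setting* and *mapping* aesthetic properties. Setting gives an aesthetic a fixed value outside of `aes()`, while mapping ties it to a variable in the data inside of `aes()`. In the preceding example, we set the color of the objects to "red".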
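
The distinction is easiest to see side by side. The following is a minimal sketch (the object names `p_set` and `p_map` are only for illustration): when `"red"` is placed inside of `aes()`, it is treated as a data value rather than as a color, so the color scale decides what is actually drawn.

```{r}
library(ggplot2)

# Setting: the color is a constant, given outside aes()
p_set <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(colour = "red")

# Mapping: inside aes(), "red" is treated as a data value with a single level,
# and the color scale chooses the actual color
p_map <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(colour = "red"))

# Inspect the colors that each layer will actually draw
unique(layer_data(p_set)$colour)
unique(layer_data(p_map)$colour)
```
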
Generally speaking, `colour` controls the color of lines and of the outlines of polygons, while `fill` controls the color of the fill area of polygons. However, point shapes are sometimes a little different. For most point shapes, the color of the entire point is controlled by `colour`, not `fill`. The exception is the point shapes (21–25) that have both a fill and an outline.
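
As a brief sketch of that exception, shape 21 is a circle with a separate outline and interior, so both aesthetics take effect. The colors, the size, and the name `p_filled` are arbitrary choices for illustration:

```{r}
library(ggplot2)

# Shape 21 has an outline (colour) and an interior (fill)
p_filled <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(shape = 21, size = 3, colour = "black", fill = "red")
p_filled
```
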
You can use `colour` or `color` interchangeably with ggplot2. In this book, I've used `colour`, in keeping with the form used in the official ggplot2 documentation.
### See Also
For more information about point shapes, see Recipe \@ref(RECIPE-LINE-GRAPH-POINT-APPEARANCE).
See Recipe \@ref(RECIPE-COLORS-PALETTE-DISCRETE-MANUAL) for more on specifying colors.
Representing Variables with Colors {#RECIPE-COLORS-MAPPING}
---------------------------
### Problem
You want to use a variable (column from a data frame) to control the color of geoms.
### Solution
In the call to the geom, inside of `aes()`, set the value of `colour` or `fill` to the name of one of the columns in the data (Figure \@ref(fig:FIG-COLORS-MAPPING)):
```{r eval=FALSE}
library(gcookbook) # Load gcookbook for the cabbage_exp data set
# These both have the same effect
ggplot(cabbage_exp, aes(x = Date, y = Weight, fill = Cultivar)) +
  geom_col(colour = "black", position = "dodge")
ggplot(cabbage_exp, aes(x = Date, y = Weight)) +
  geom_col(aes(fill = Cultivar), colour = "black", position = "dodge")
# These both have the same effect
ggplot(mtcars, aes(x = wt, y = mpg, colour = cyl)) +
  geom_point()
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(colour = cyl))
```
(ref:cap-FIG-COLORS-MAPPING) Mapping a variable to `fill` (left); Mapping a variable to `colour` for points (right)
```{r FIG-COLORS-MAPPING, echo=FALSE, fig.show="hold", message=FALSE, fig.cap="(ref:cap-FIG-COLORS-MAPPING)", fig.width=4, fig.height=3.5}
library(gcookbook) # Load gcookbook for the cabbage_exp data set
ggplot(cabbage_exp, aes(x = Date, y = Weight, fill = Cultivar)) +
  geom_col(colour = "black", position = "dodge")
ggplot(mtcars, aes(x = wt, y = mpg, colour = cyl)) +
  geom_point()
```
When the mapping is specified in `ggplot()`, it is used as the default mapping, which is inherited by all the geoms. Within a geom, the default mappings can be overridden.
### Discussion
In the `cabbage_exp` example, the variable `Cultivar` is mapped to `fill`. The `Cultivar` column in `cabbage_exp` is a factor, so ggplot treats it as a categorical variable. You can check the type using `str()`:
```{r}
str(cabbage_exp)
```
In the `mtcars` example, `cyl` is numeric, so it is treated as a continuous variable. Because of this, even though the actual values of `cyl` include only 4, 6, and 8, the legend has entries for the intermediate values 5 and 7. To make ggplot treat `cyl` as a categorical variable, you can convert it to a factor in the call to `ggplot()` (Figure \@ref(fig:FIG-COLORS-MAPPING-FACTOR), left), or you can modify the data so that the column is a character vector or factor (Figure \@ref(fig:FIG-COLORS-MAPPING-FACTOR), right):
(ref:cap-FIG-COLORS-MAPPING-FACTOR) Converting `cyl` to a factor, within the call to `ggplot()` (left); By modifying the data frame (right)
```{r FIG-COLORS-MAPPING-FACTOR, fig.show="hold", fig.cap="(ref:cap-FIG-COLORS-MAPPING-FACTOR)", fig.width=4, fig.height=3.5}
# Convert to factor in call to ggplot()
ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point()
# Another method: Convert to factor in the data
library(dplyr)
mtcars_mod <- mtcars %>%
  mutate(cyl = as.factor(cyl)) # Convert cyl to a factor
ggplot(mtcars_mod, aes(x = wt, y = mpg, colour = cyl)) +
  geom_point()
```
### See Also
You may also want to change the colors that are used in the scale. For continuous data, see Recipe \@ref(RECIPE-COLORS-PALETTE-CONTINUOUS). For discrete data, see Recipe \@ref(RECIPE-COLORS-PALETTE-DISCRETE) and Recipe \@ref(RECIPE-COLORS-PALETTE-DISCRETE-MANUAL).
Using a Colorblind-Friendly Palette {#RECIPE-COLORS-PALETTE-DISCRETE-COLORBLIND}
-----------------------------------
### Problem
You want to select a color palette that can also be distinguished by colorblind viewers.
### Solution
Use the color scales in the viridis package.
The viridis package contains a set of beautiful color scales that are each designed to span as wide a palette as possible, making it easier to see differences in your data. These scales are also designed to be perceptually uniform, printable in grey scale, and easier to read by those with colorblindness.
Here is an example from the introduction page to viridis (https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html; Figure \@ref(fig:FIG-COLORS-PALETTE-VIRIDIS)):
```{r FIG-COLORS-PALETTE-VIRIDIS, warning=FALSE, echo=FALSE, fig.cap="Example of viridis color palette", fig.height=5, fig.width=5}
ggplot(data.frame(x = rnorm(10000), y = rnorm(10000)), aes(x = x, y = y)) +
  geom_hex() +
  coord_fixed() +
  scale_fill_viridis_c() +
  theme_bw()
```
The viridis color scales work with both continuous and discrete data. If your data is continuous, add `scale_fill_viridis_c()` to your plot. If your data is discrete, use `scale_fill_viridis_d()` instead, as in Figure \@ref(fig:FIG-COLORS-PALETTE-DISCRETE-COLORBLIND-VIRIDIS):
```{r FIG-COLORS-PALETTE-DISCRETE-COLORBLIND-VIRIDIS, fig.show="hold", fig.cap="A plot with the colorblind-friendly viridis palette"}
library(gcookbook) # Load gcookbook for the uspopage data set
# Create the base plot
uspopage_plot <- ggplot(uspopage, aes(x = Year, y = Thousands, fill = AgeGroup)) +
  geom_area()
# Add the viridis color scale
uspopage_plot +
  scale_fill_viridis_d()
```
### Discussion
About 8 percent of males and 0.5 percent of females have some form of color-vision deficiency, so there's a good chance that someone in your audience will be among them. There are many different forms of color blindness; the palettes mentioned in this book are designed to let people with any of the most common forms distinguish the colors. (Monochromacy, or total colorblindness, is rare. Those who have it can only see differences in brightness.)
The viridis color scales have been included in ggplot2 since version 3.0.0. Other palettes are also friendly to viewers with color blindness, such as those in the cetcolor package (see below).
### See Also
To see more on the different viridis palettes, see `?scales::viridis_pal`.
The cetcolor scales: <https://github.com/coatless/cetcolor>.
The Color Oracle program (<http://colororacle.org>) can simulate how things on your screen appear to someone with color vision deficiency, but keep in mind that the simulation isn't perfect. In my informal testing, I viewed an image with simulated red-green deficiency, and I could distinguish the colors just fine -- but others with actual red-green deficiency viewed the same image and couldn't tell the colors apart!
Using a Different Palette for a Discrete Variable {#RECIPE-COLORS-PALETTE-DISCRETE}
-------------------------------------------------
### Problem
You want to use different colors for a discrete mapped variable.
### Solution
Use one of the scales listed in Table \@ref(tab:TABLE-DISCRETE-FILL-AND-COLOR-SCALES).
Table: (\#tab:TABLE-DISCRETE-FILL-AND-COLOR-SCALES) Discrete fill and color scales

+--------------------------+----------------------------+-----------------------+
| Fill scale               | Color scale                | Description           |
+==========================+============================+=======================+
| `scale_fill_discrete()`  | `scale_colour_discrete()`  | Colors evenly spaced  |
|                          |                            | around the color      |
|                          |                            | wheel (same as `hue`) |
+--------------------------+----------------------------+-----------------------+
| `scale_fill_hue()`       | `scale_colour_hue()`       | Colors evenly spaced  |
|                          |                            | around the color      |
|                          |                            | wheel (same as        |
|                          |                            | `discrete`)           |
+--------------------------+----------------------------+-----------------------+
| `scale_fill_grey()`      | `scale_colour_grey()`      | Greyscale palette     |
+--------------------------+----------------------------+-----------------------+
| `scale_fill_viridis_d()` | `scale_colour_viridis_d()` | Viridis palettes      |
+--------------------------+----------------------------+-----------------------+
| `scale_fill_brewer()`    | `scale_colour_brewer()`    | ColorBrewer palettes  |
+--------------------------+----------------------------+-----------------------+
| `scale_fill_manual()`    | `scale_colour_manual()`    | Manually specified    |
|                          |                            | colors                |
+--------------------------+----------------------------+-----------------------+

In the example here we'll use the default palette (hue), a viridis palette, and a ColorBrewer palette (Figure \@ref(fig:FIG-COLORS-PALETTE-DISCRETE)):
```{r FIG-COLORS-PALETTE-DISCRETE, message=FALSE, fig.show="hold", fig.cap="Default palette (using hue; top); A viridis palette (middle); A ColorBrewer palette (bottom)"}
library(gcookbook) # Load gcookbook for the uspopage data set
library(viridis) # Load viridis for the viridis palette
# Create the base plot
uspopage_plot <- ggplot(uspopage, aes(x = Year, y = Thousands, fill = AgeGroup)) +
  geom_area()
# These three specifications all have the same effect
uspopage_plot
# uspopage_plot + scale_fill_discrete()
# uspopage_plot + scale_fill_hue()
# Viridis palette
uspopage_plot +
  scale_fill_viridis(discrete = TRUE)
# ColorBrewer palette
uspopage_plot +
  scale_fill_brewer()
```
### Discussion
Changing a palette is a modification of the color (or fill) scale: it involves a change in the mapping from numeric or categorical values to aesthetic attributes. There are two types of scales that use colors: *fill* scales and *color* scales.
With `scale_fill_hue()`, the colors are taken from around the color wheel in the HCL (hue-chroma-lightness) color space. The default lightness value is 65 on a scale from 0–100. This is good for filled areas, but it's a bit light for points and lines. To make the colors darker for points and lines, as in Figure \@ref(fig:FIG-COLORS-PALETTE-LIGHTNESS) (right), set the value of `l` (luminance/lightness):
```{r FIG-COLORS-PALETTE-LIGHTNESS, fig.show="hold", fig.cap="Points with default lightness (left); With lightness set to 45 (right)", fig.width=4, fig.height=3.5}
# Create the base scatter plot
hw_splot <- ggplot(heightweight, aes(x = ageYear, y = heightIn, colour = sex)) +
  geom_point()
# Default lightness = 65
hw_splot
# Slightly darker, set lightness = 45
hw_splot +
  scale_colour_hue(l = 45)
```
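
To make the idea of evenly spaced hues concrete, the sketch below builds a similar palette with base R's `hcl()`. The helper name `hue_sketch`, the starting hue, and the chroma value are assumptions chosen for illustration; they approximate the behavior described above rather than document what `scale_colour_hue()` does internally.

```{r}
# Hypothetical helper: n colors with evenly spaced hues at a fixed
# chroma and lightness in HCL space
hue_sketch <- function(n, l = 65, c = 100) {
  # n + 1 hues over a full turn; drop the last, which coincides with the first
  hues <- seq(15, 375, length.out = n + 1)[seq_len(n)]
  hcl(h = hues, c = c, l = l)
}

light <- hue_sketch(2)          # lightness 65
dark  <- hue_sketch(2, l = 45)  # same hues, darker
light
dark
```
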
The viridis package provides a number of color scales that make it easy to see differences across your data. See Recipe \@ref(RECIPE-COLORS-PALETTE-DISCRETE-COLORBLIND) for more details and examples.
The RColorBrewer package provides the ColorBrewer palettes. You can generate a graphic showing all of them, as shown in Figure \@ref(fig:FIG-COLORS-PALETTE-BREWER):
```{r FIG-COLORS-PALETTE-BREWER, echo=FALSE, fig.cap="All the ColorBrewer palettes", fig.height=8, fig.width=8}
library(RColorBrewer)
par(mar = c(0, 3, 0, 0))
display.brewer.all()
```
```{r, eval=FALSE}
library(RColorBrewer)
display.brewer.all()
```
The ColorBrewer palettes can be selected by name. For example, this will use the "Oranges" palette (Figure \@ref(fig:FIG-COLORS-PALETTE-BREWER-NAME)):
```{r FIG-COLORS-PALETTE-BREWER-NAME, fig.cap="Using a named ColorBrewer palette", fig.width=4, fig.height=3.5}
hw_splot +
  scale_colour_brewer(palette = "Oranges") +
  theme_bw()
```
You can also use a palette of greys. This is useful for print when the output is in black and white. The default is to start at 0.2 and end at 0.8, on a scale from 0 (black) to 1 (white), but you can change the range, as shown in Figure \@ref(fig:FIG-COLORS-PALETTE-GREY).
```{r FIG-COLORS-PALETTE-GREY, fig.show="hold", fig.cap="Using the default grey palette (left); A different grey palette (right)", fig.width=4, fig.height=3.5}
hw_splot +
  scale_colour_grey()
# Reverse the direction and use a different range of greys
hw_splot +
  scale_colour_grey(start = 0.7, end = 0)
```
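
The `start` and `end` values are grey levels between 0 and 1. As a rough illustration in base R, `grey.colors()` from grDevices accepts arguments with the same names and returns the corresponding hex codes. Treat this as a way to preview a range of greys; that ggplot2 produces exactly the same values is an assumption here.

```{r}
# Two greys spanning the range 0.2 (dark) to 0.8 (light)
greys_default <- grey.colors(2, start = 0.2, end = 0.8)
greys_default

# A reversed range that ends at black
greys_reversed <- grey.colors(2, start = 0.7, end = 0)
greys_reversed
```
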
### See Also
See Recipe \@ref(RECIPE-LEGEND-REVERSE) for more information about reversing the legend.
To select colors manually, see Recipe \@ref(RECIPE-COLORS-PALETTE-DISCRETE-MANUAL).
For more about viridis, see <https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html>.
For more about ColorBrewer, see <http://colorbrewer2.org>.
Using a Manually Defined Palette for a Discrete Variable {#RECIPE-COLORS-PALETTE-DISCRETE-MANUAL}
--------------------------------------------------------
### Problem
You want to specify your own colors for a discrete mapped variable.
### Solution
In the example here, we'll manually define colors by specifying values with `scale_colour_manual()` (Figure \@ref(fig:FIG-COLORS-PALETTE-DISCRETE-MANUAL)). The colors can be named, or they can be specified with RGB values:
```{r FIG-COLORS-PALETTE-DISCRETE-MANUAL, fig.show="hold", fig.cap="Scatter plot with named colors (top left); With slightly different RGB colors (top right); With colors from the viridis color scale (bottom)", fig.width=4, fig.height=3.5}
library(gcookbook) # Load gcookbook for the heightweight data set
# Create the base plot
hw_plot <- ggplot(heightweight, aes(x = ageYear, y = heightIn, colour = sex)) +
  geom_point()
# Using color names
hw_plot +
  scale_colour_manual(values = c("red", "blue"))
# Using RGB values
hw_plot +
  scale_colour_manual(values = c("#CC6666", "#7777DD"))
# Using RGB values based on the viridis color scale
hw_plot +
  scale_colour_manual(values = c("#440154FF", "#FDE725FF")) +
  theme_bw()
```
For fill scales, use `scale_fill_manual()` instead.
### Discussion
The order of the items in the values vector matches the order of the factor levels for the discrete scale. In the preceding example, the order of sex is f, then m, so the first item in values goes with f and the second goes with m. Here's how to see the order of factor levels:
```{r}
levels(heightweight$sex)
```
If the variable is a character vector, not a factor, it will automatically be converted to a factor, and by default the levels will appear in alphabetical order.
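
A quick base R check of that default ordering, using a made-up vector for illustration:

```{r}
# A character vector in which "m" appears first
sex_chr <- c("m", "f", "m", "f")

# factor() sorts the levels, so "f" comes before "m"
levels(factor(sex_chr))
```
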
It's possible to specify the colors in a different order by using a named vector:
```{r eval=FALSE}
hw_plot +
  scale_colour_manual(values = c(m = "blue", f = "red"))
```
There is a large set of named colors in R, which you can see by running `colors()`. Some basic color names are useful: "white", "black", "grey80", "red", "blue", "darkred", and so on. There are many other named colors, but their names are generally not very informative (I certainly have no idea what "thistle3" and "seashell" look like), so it's often easier to use numeric RGB values for specifying colors.
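
If you do want to know what a named color looks like, base R can translate the name into its red, green, and blue components. A short sketch:

```{r}
# How many named colors are there, and what are the first few?
length(colors())
head(colors())

# Red, green, and blue components (0-255) of two less obvious names
named_rgb <- col2rgb(c("thistle3", "seashell"))
named_rgb
```
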
RGB colors are specified as six-digit hexadecimal (base-16) numbers of the form `#RRGGBB`. In hexadecimal, the digits go from 0 to 9, and then continue with A (10 in base 10) to F (15 in base 10). Each color is represented by two digits and can range from 00 to FF (255 in base 10). So, for example, the color `#FF0099` has a value of 255 for red, 0 for green, and 153 for blue, resulting in a shade of magenta. The hexadecimal numbers for each color channel often repeat the same digit because it makes them a little easier to read, and because the precise value of the second digit has a relatively insignificant effect on appearance.
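
The arithmetic can be checked in base R. This sketch converts the example color to its decimal channel values and back again:

```{r}
# Split "#FF0099" into decimal red, green, and blue values
channels <- col2rgb("#FF0099")
channels

# Convert a single pair of hex digits by hand: 9 * 16 + 9 = 153
blue_value <- strtoi("99", base = 16L)
blue_value

# Rebuild the hex string from the decimal values
rebuilt <- rgb(255, 0, 153, maxColorValue = 255)
rebuilt
```
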
Here are some rules of thumb for specifying and adjusting RGB colors:
* In general, higher numbers are brighter and lower numbers are darker.
* To get a shade of grey, set all the channels to the same value.
* The opposites of RGB are CMY: Cyan, Magenta, and Yellow. Higher values for the red channel make it more red, and lower values make it more cyan. The same is true for the pairs green and magenta, and blue and yellow.
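
These rules of thumb can be tried out numerically. In the sketch below, the helper `complement()` is a made-up name, not part of any package; it subtracts each channel from 255 to obtain the opposite color.

```{r}
# Equal channel values give a shade of grey
mid_grey <- rgb(128, 128, 128, maxColorValue = 255)
mid_grey

# Hypothetical helper: invert each channel to get the opposite color
complement <- function(color) {
  ch <- 255 - col2rgb(color)
  rgb(ch["red", ], ch["green", ], ch["blue", ], maxColorValue = 255)
}

complement("#FF0000")  # the opposite of red is cyan
complement("#0000FF")  # the opposite of blue is yellow
```
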
You may want to manually select colors based on the color scales in the viridis package, as described in Recipe \@ref(RECIPE-COLORS-PALETTE-DISCRETE). You can do so by calling `viridis()` and passing it the number of discrete categories you have. This will generate the RGB hexadecimal values. You can similarly generate the RGB values for the other color scales in the viridis package: "magma", "plasma", "inferno", and "cividis".
```{r}
library(viridis)
viridis(2) ## Specifying 2 discrete categories with the viridis color scale
inferno(5) ## Specifying 5 discrete categories with the inferno color scale
```
### See Also
A chart of RGB color codes: <http://html-color-codes.com>.
Using a Manually Defined Palette for a Continuous Variable {#RECIPE-COLORS-PALETTE-CONTINUOUS}
----------------------------------------------------------
### Problem
You want to use different colors for a continuous variable.
### Solution
In the example here, we'll specify the colors for a continuous variable using various gradient scales (Figure \@ref(fig:FIG-COLORS-PALETTE-CONTINUOUS)). The colors can be named, or they can be specified with RGB values:
(ref:cap-FIG-COLORS-PALETTE-CONTINUOUS) Clockwise from top left: default colors, two-color gradient (black and white) with `scale_colour_gradient()`, three-color gradient with midpoint with `scale_colour_gradient2()`, four-color gradient with `scale_colour_gradientn()`
```{r FIG-COLORS-PALETTE-CONTINUOUS, fig.show="hold", message=FALSE, fig.cap="(ref:cap-FIG-COLORS-PALETTE-CONTINUOUS)", fig.width=4, fig.height=3.5}
library(gcookbook) # Load gcookbook for the heightweight data set
# Create the base plot
hw_plot <- ggplot(heightweight, aes(x = ageYear, y = heightIn, colour = weightLb)) +
  geom_point(size = 3)
hw_plot
# A gradient with a white midpoint
library(scales)
hw_plot +
  scale_colour_gradient2(
    low = muted("red"),
    mid = "white",
    high = muted("blue"),
    midpoint = 110
  )
# With a gradient between two colors (black and white)
hw_plot +
  scale_colour_gradient(low = "black", high = "white")
# A gradient of n colors
hw_plot +
  scale_colour_gradientn(colours = c("darkred", "orange", "yellow", "white"))
```
For fill scales, use `scale_fill_xxx()` versions instead, where `xxx` is one of `gradient`, `gradient2`, or `gradientn`.
### Discussion
Mapping continuous values to a color scale requires a continuously changing palette of colors. Table \@ref(tab:TABLE-CONTINUOUS-COLOR-SCALES) lists the continuous color and fill scales.
Table: (\#tab:TABLE-CONTINUOUS-COLOR-SCALES) Continuous fill and color scales

+--------------------------+----------------------------+-----------------------+
| Fill scale               | Color scale                | Description           |
+==========================+============================+=======================+
| `scale_fill_gradient()`  | `scale_colour_gradient()`  | Two-color gradient    |
+--------------------------+----------------------------+-----------------------+
| `scale_fill_gradient2()` | `scale_colour_gradient2()` | Gradient with a       |
|                          |                            | middle color and two  |
|                          |                            | colors that diverge   |
|                          |                            | from it               |
+--------------------------+----------------------------+-----------------------+
| `scale_fill_gradientn()` | `scale_colour_gradientn()` | Gradient with *n*     |
|                          |                            | colors, equally       |
|                          |                            | spaced                |
+--------------------------+----------------------------+-----------------------+
| `scale_fill_viridis_c()` | `scale_colour_viridis_c()` | Viridis palettes      |
+--------------------------+----------------------------+-----------------------+

Notice that we used the `muted()` function in the examples. This is a function from the scales package that returns an RGB value that is a less-saturated version of the color chosen.
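
To see what `muted()` does to a color, you can print its result next to the original. This sketch assumes the scales package is installed; the variable names are only for illustration.

```{r}
library(scales)

# The hex codes of the muted versions
muted_cols <- c(muted("red"), muted("blue"))
muted_cols

# For comparison, the fully saturated originals
original_cols <- rgb(t(col2rgb(c("red", "blue"))), maxColorValue = 255)
original_cols
```
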
### See Also
If you want to use a discrete (categorical) scale instead of a continuous one, you can recode your data into categorical values. See Recipe \@ref(RECIPE-DATAPREP-RECODE-CONTINUOUS).
Coloring a Shaded Region Based on Value {#RECIPE-COLORS-AREA-VALUE}
---------------------------------------
### Problem
You want to set the color of a shaded region based on the *y* value.
### Solution
Add a column that categorizes the *y* values, then map that column to fill. In this example, we’ll first categorize the values as positive or negative:
```{r}
library(gcookbook) # Load gcookbook for the climate data set
library(dplyr)
climate_mod <- climate %>%
  filter(Source == "Berkeley") %>%
  mutate(valence = if_else(Anomaly10y >= 0, "pos", "neg"))
climate_mod
```
Once we've categorized the values as positive or negative, we can make the plot, mapping valence to the fill color, as shown in Figure \@ref(fig:FIG-COLORS-AREA-VALUE):
```{r FIG-COLORS-AREA-VALUE, fig.cap="Mapping valence to fill color; notice the red area under the zero line around 1950", fig.width=10}
ggplot(climate_mod, aes(x = Year, y = Anomaly10y)) +
  geom_area(aes(fill = valence)) +
  geom_line() +
  geom_hline(yintercept = 0)
```
### Discussion
If you look closely at the figure, you'll notice that there are some stray shaded areas near the zero line. This is because each of the two colored areas is a single polygon bounded by the data points, and the data points are not actually at zero. To solve this problem, we can interpolate the data to 1,000 points by using `approx()`:
```{r}
# approx() returns a list with x and y vectors
interp <- approx(climate_mod$Year, climate_mod$Anomaly10y, n = 1000)
# Put in a data frame and recalculate valence
cbi <- data.frame(Year = interp$x, Anomaly10y = interp$y) %>%
  mutate(valence = if_else(Anomaly10y >= 0, "pos", "neg"))
```
It would be more precise (and more complicated) to interpolate exactly where the line crosses zero, but `approx()` works fine for the purposes here.
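
For reference, here is a minimal sketch of that more precise approach, using linear interpolation between neighboring points. The helper `add_zero_crossings()` is hypothetical and is shown on toy data rather than the climate data. It only inserts the crossing points; it does not handle the remaining step of making each crossing belong to both the positive and the negative region.

```{r}
# Hypothetical helper: insert a point at every place where the line
# segment between two neighboring observations crosses y = 0
add_zero_crossings <- function(x, y) {
  n <- length(x)
  # Index i marks a segment (i, i + 1) whose endpoints have opposite signs
  i <- which(y[-n] * y[-1] < 0)
  # Linear interpolation: solve y_i + t * (y_{i+1} - y_i) = 0 for t
  t <- y[i] / (y[i] - y[i + 1])
  x0 <- x[i] + t * (x[i + 1] - x[i])
  out <- data.frame(x = c(x, x0), y = c(y, rep(0, length(x0))))
  # Put the new points in their place along the x axis
  out[order(out$x), ]
}

toy <- add_zero_crossings(x = c(0, 1, 2, 3), y = c(-1, 1, 2, -2))
toy
```
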
Now we can plot the interpolated data (Figure \@ref(fig:FIG-COLORS-AREA-VALUE-INTERPOLATED)). This time we'll make a few adjustments -- we'll make the shaded regions partially transparent, change the colors, remove the legend, and remove the padding on the left and right sides:
```{r FIG-COLORS-AREA-VALUE-INTERPOLATED, fig.cap="Shaded regions with interpolated data", fig.width=10}
ggplot(cbi, aes(x = Year, y = Anomaly10y)) +
  geom_area(aes(fill = valence), alpha = .4) +
  geom_line() +
  geom_hline(yintercept = 0) +
  scale_fill_manual(values = c("#CCEEFF", "#FFDDDD"), guide = "none") +
  scale_x_continuous(expand = c(0, 0))
```