class: center, middle, inverse, title-slide

# Tips for effective data visualization
## ENST 222: Environmental Data Science
### https://enst222.github.io/website
---
layout: true

<div class="my-footer">
<span>
<a href="https://enst222.github.io/website" target="_blank">ENST 222: Environmental Data Science</a>
</span>
</div>

---
class: middle

# Designing effective visualizations

---

## Keep it simple

.pull-left-narrow[
<img src="img/pie-3d.jpg" width="100%" style="display: block; margin: auto;" />
]

.pull-right-wide[
<img src="14-effective-dataviz_files/figure-html/pie-to-bar-1.png" width="100%" style="display: block; margin: auto;" />
]

---

## Use color to draw attention

.pull-left[
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-2-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-3-1.png" width="100%" style="display: block; margin: auto;" />
]

---

## Tell a story

<img src="img/time-series-story.png" width="80%" style="display: block; margin: auto;" />

.footnote[
Credit: Angela Zoss and Eric Monson, Duke DVS
]

---
class: middle

# Principles for effective visualizations

---

## Principles for effective visualizations

- Order matters
- Put long categories on the y-axis
- Keep scales consistent
- Select meaningful colors
- Use meaningful and nonredundant labels

---

## Data

In September 2019, a YouGov survey asked 1,639 adults in Great Britain the following question:

.pull-left[
> In hindsight, do you think Britain was right/wrong to vote to leave EU?
>
>- Right to leave
>- Wrong to leave
>- Don't know
]

.pull-right[
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-6-1.png" width="100%" style="display: block; margin: auto;" />
]

.footnote[
Source: [YouGov Survey Results](https://d25d2506sfb94s.cloudfront.net/cumulus_uploads/document/x0msmggx08/YouGov%20-%20Brexit%20and%202019%20election.pdf), retrieved Oct 7, 2019
]

---
class: middle

# Order matters

---

## Alphabetical order is rarely ideal

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-7-1.png" width="60%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
```r
ggplot(brexit, aes(x = opinion)) +
  geom_bar()
```
]
]

---

## Order by frequency

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-8-1.png" width="60%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
`fct_infreq`: Reorder factor levels by frequency when using a bar plot to visualize counts.

Note: we're only mapping data to the x-axis here. `geom_bar()` then calculates the count for each x-axis category.

```r
*ggplot(brexit, aes(x = fct_infreq(opinion))) +
  geom_bar()
```
]
]

---
class: middle

### This only works when you're mapping one variable to the x-axis.

### If you are mapping data values to the y-axis...

```r
brexit %>%
  group_by(opinion) %>%
  summarize(total = n())
```

```
## # A tibble: 3 × 2
##   opinion    total
##   <chr>      <int>
## 1 Don't know   188
## 2 Right        664
## 3 Wrong        787
```

---

### 1) coerce to factor and reorder using other `forcats` functions.

`fct_rev()` works here because reversing the alphabetical order of `opinion` happens to sort the bars by count, but there are [many other options](https://forcats.tidyverse.org/reference/index.html) to accommodate your data.

```r
brexit %>%
  group_by(opinion) %>%
  summarize(total = n()) %>%
* mutate(opinion = as.factor(opinion),
*        opinion = fct_rev(opinion)) %>%
  ggplot(aes(x = opinion, y = total)) +
  geom_bar(stat = "identity")
```

---

### 1) coerce to factor and reorder using other `forcats` functions.
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-11-1.png" width="60%" style="display: block; margin: auto;" />

---

### 2) reorder based on value of the y-axis variable

.panelset[
.panel[.panel-name[Code]
```r
brexit %>%
  group_by(opinion, region) %>%
  summarize(total = n()) %>%
* ggplot(aes(x = reorder(opinion, -total), y = total)) +
  geom_bar(stat = "identity")
```
]
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-13-1.png" width="60%" style="display: block; margin: auto;" />
]
]

---

## Don't forget to clean up labels

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-14-1.png" width="60%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
```r
ggplot(brexit, aes(x = opinion)) +
  geom_bar() +
* labs(
*   x = "Opinion",
*   y = "Count"
* )
```
]
]

---

## Alphabetical order (the default) is rarely ideal

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-15-1.png" width="60%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
```r
ggplot(brexit, aes(x = region)) +
  geom_bar()
```
]
]

---

### Instead, manually set a more logical level order when that makes more sense than ordering by value

.panelset[
.panel[.panel-name[Code]
`fct_relevel`: Reorder factor levels using a custom order

```r
brexit <- brexit %>%
  mutate(
*   region = fct_relevel(
      region,
      "london", "rest_of_south", "midlands_wales", "north", "scot"
    )
  )
```
]
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-16-1.png" width="60%" style="display: block; margin: auto;" />
]
]

---

## Recode categorical variable values

.panelset[
.panel[.panel-name[Recode]
`fct_recode`: Change factor levels by hand

```r
brexit <- brexit %>%
  mutate(
*   region = fct_recode(
      region,
      London = "london",
      `Rest of South` = "rest_of_south",
      `Midlands / Wales` = "midlands_wales",
      North = "north",
      Scotland = "scot"
    )
  )
```
]
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/recode-plot-1.png" width="60%" style="display: block; margin: auto;" />
]
]

---
class: middle

# Put long categories on the y-axis

---

## Long categories can be hard to read

<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-17-1.png" width="60%" style="display: block; margin: auto;" />

---

## Move them to the y-axis

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-18-1.png" width="60%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
```r
*ggplot(brexit, aes(y = region)) +
  geom_bar()
```
]
]

---

## And reverse the order of levels

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-19-1.png" width="60%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
`fct_rev`: Reverse order of factor levels

```r
*ggplot(brexit, aes(y = fct_rev(region))) +
  geom_bar()
```
]
]

---

## Clean up labels

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-20-1.png" width="60%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
```r
ggplot(brexit, aes(y = fct_rev(region))) +
  geom_bar() +
* labs(
*   x = "Count",
*   y = "Region"
* )
```
]
]

---
class: middle

# Pick a purpose

---

## Segmented bar plots can be hard to read

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-21-1.png" width="60%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
```r
*ggplot(brexit, aes(y = region, fill = opinion)) +
  geom_bar()
```
]
]

---

## Use facets

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-22-1.png" width="90%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
```r
ggplot(brexit, aes(y = opinion, fill = region)) +
  geom_bar() +
* facet_wrap(~region, nrow = 1)
```
]
]

---

## Avoid redundancy?
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-23-1.png" width="90%" style="display: block; margin: auto;" />

---

## Redundancy can help tell a story

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-24-1.png" width="90%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
```r
ggplot(brexit, aes(y = opinion, fill = opinion)) +
  geom_bar() +
  facet_wrap(~region, nrow = 1)
```
]
]

---

## Be selective with redundancy

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-25-1.png" width="90%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
```r
ggplot(brexit, aes(y = opinion, fill = opinion)) +
  geom_bar() +
  facet_wrap(~region, nrow = 1) +
* guides(fill = "none")
```
]
]

---

## Use informative labels

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-26-1.png" width="90%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
```r
ggplot(brexit, aes(y = opinion, fill = opinion)) +
  geom_bar() +
  facet_wrap(~region, nrow = 1) +
  guides(fill = "none") +
  labs(
*   title = "Was Britain right/wrong to vote to leave EU?",
    x = NULL, y = NULL
  )
```
]
]

---

## Provide a bit more info

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-27-1.png" width="90%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
```r
ggplot(brexit, aes(y = opinion, fill = opinion)) +
  geom_bar() +
  facet_wrap(~region, nrow = 1) +
  guides(fill = "none") +
  labs(
    title = "Was Britain right/wrong to vote to leave EU?",
*   subtitle = "YouGov Survey Results, 2-3 September 2019",
*   caption = "Source: https://d25d2506sfb94s.cloudfront.net/cumulus_uploads/document/x0msmggx08/YouGov%20-%20Brexit%20and%202019%20election.pdf",
    x = NULL, y = NULL
  )
```
]
]

---

## Let's do better

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-28-1.png" width="90%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
```r
ggplot(brexit, aes(y = opinion, fill = opinion)) +
  geom_bar() +
  facet_wrap(~region, nrow = 1) +
  guides(fill = "none") +
  labs(
    title = "Was Britain right/wrong to vote to leave EU?",
    subtitle = "YouGov Survey Results, 2-3 September 2019",
*   caption = "Source: bit.ly/2lCJZVg",
    x = NULL, y = NULL
  )
```
]
]

---

## Fix up facet labels

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-29-1.png" width="90%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
```r
ggplot(brexit, aes(y = opinion, fill = opinion)) +
  geom_bar() +
  facet_wrap(~region, nrow = 1,
*   labeller = label_wrap_gen(width = 12)
  ) +
  guides(fill = "none") +
  labs(
    title = "Was Britain right/wrong to vote to leave EU?",
    subtitle = "YouGov Survey Results, 2-3 September 2019",
    caption = "Source: bit.ly/2lCJZVg",
    x = NULL, y = NULL
  )
```
]
]

---
class: middle

# Select meaningful colors

---

## Rainbow colors are not always the right choice

<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-30-1.png" width="90%" style="display: block; margin: auto;" />

---

## Manually choose colors when needed

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-31-1.png" width="90%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
```r
ggplot(brexit, aes(y = opinion, fill = opinion)) +
  geom_bar() +
  facet_wrap(~region, nrow = 1, labeller = label_wrap_gen(width = 12)) +
  guides(fill = "none") +
  labs(title = "Was Britain right/wrong to vote to leave EU?",
       subtitle = "YouGov Survey Results, 2-3 September 2019",
       caption = "Source: bit.ly/2lCJZVg",
       x = NULL, y = NULL) +
* scale_fill_manual(values = c(
*   "Wrong" = "red",
*   "Right" = "green",
*   "Don't know" = "gray"
* ))
```
]
]
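---

## Named colors, sketched on toy data

A minimal, self-contained sketch of the manual color mapping above. It assumes a toy summary table, `brexit_counts` (a hypothetical name), built from the three counts shown earlier in these slides rather than the full `brexit` data. Naming each entry in `values` ties a color to a level by name; an unnamed vector would instead be matched to the levels in order.

```r
library(ggplot2)

# Toy summary table (assumption): counts copied from the earlier tibble output
brexit_counts <- data.frame(
  opinion = c("Don't know", "Right", "Wrong"),
  total   = c(188, 664, 787)
)

# Named vector: each name must match a value of `opinion` exactly
opinion_colors <- c(
  "Wrong"      = "red",
  "Right"      = "green",
  "Don't know" = "gray"
)

# Guard against typos: every opinion needs a color, and vice versa
stopifnot(setequal(names(opinion_colors), brexit_counts$opinion))

p <- ggplot(brexit_counts, aes(y = opinion, x = total, fill = opinion)) +
  geom_col() +                                   # bar length = precomputed count
  scale_fill_manual(values = opinion_colors) +
  guides(fill = "none")                          # the y-axis already labels the bars

p
```

The `stopifnot()` check is optional, but it catches a misspelled level before the plot silently drops a fill color.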
---

## Choosing better colors

.center[.large[
[colorbrewer2.org](https://colorbrewer2.org/)
]]

<img src="img/color-brewer.png" width="60%" style="display: block; margin: auto;" />

---

## Use better colors

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-33-1.png" width="90%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
```r
ggplot(brexit, aes(y = opinion, fill = opinion)) +
  geom_bar() +
  facet_wrap(~region, nrow = 1, labeller = label_wrap_gen(width = 12)) +
  guides(fill = "none") +
  labs(title = "Was Britain right/wrong to vote to leave EU?",
       subtitle = "YouGov Survey Results, 2-3 September 2019",
       caption = "Source: bit.ly/2lCJZVg",
       x = NULL, y = NULL) +
  scale_fill_manual(values = c(
*   "Wrong" = "#ef8a62",
*   "Right" = "#67a9cf",
*   "Don't know" = "gray"
  ))
```
]
]

---

## There are many color palette packages out there!

[Comprehensive list of R color palette packages](https://github.com/EmilHvitfeldt/r-color-palettes)

Use each package's help files and online examples to learn how to apply its palettes in your plots.
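---

## Applying a ready-made palette: a sketch

As one possible starting point, ggplot2 itself ships scales for the ColorBrewer palettes, so no extra package is needed to try one. The sketch below assumes the same toy table of counts from earlier in these slides (`brexit_counts` is a hypothetical name), and the choice of the `"Set2"` palette is arbitrary.

```r
library(ggplot2)

# Toy summary table (assumption): counts copied from the earlier tibble output
brexit_counts <- data.frame(
  opinion = c("Don't know", "Right", "Wrong"),
  total   = c(188, 664, 787)
)

p_brewer <- ggplot(brexit_counts, aes(y = opinion, x = total, fill = opinion)) +
  geom_col() +
  scale_fill_brewer(palette = "Set2") +  # qualitative ColorBrewer palette
  guides(fill = "none") +
  labs(x = "Count", y = NULL) +
  theme_minimal()

p_brewer
```

Palette packages usually follow the same pattern: swap `scale_fill_brewer()` for the `scale_fill_*()` function the package documents.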
---

## Select theme

.panelset[
.panel[.panel-name[Plot]
<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-34-1.png" width="90%" style="display: block; margin: auto;" />
]
.panel[.panel-name[Code]
```r
ggplot(brexit, aes(y = opinion, fill = opinion)) +
  geom_bar() +
  facet_wrap(~region, nrow = 1, labeller = label_wrap_gen(width = 12)) +
  guides(fill = "none") +
  labs(title = "Was Britain right/wrong to vote to leave EU?",
       subtitle = "YouGov Survey Results, 2-3 September 2019",
       caption = "Source: bit.ly/2lCJZVg",
       x = NULL, y = NULL) +
  scale_fill_manual(values = c("Wrong" = "#ef8a62",
                               "Right" = "#67a9cf",
                               "Don't know" = "gray")) +
* theme_minimal()
```
]
]

---

## `ggplot2` has several built-in themes

```r
ggplot(brexit, aes(x = fct_infreq(opinion))) +
  geom_bar() +
  labs(x = "", y = "Count") +
* theme_bw()
```

<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-35-1.png" width="60%" style="display: block; margin: auto;" />

---

## `ggplot2` has several built-in themes

```r
ggplot(brexit, aes(x = fct_infreq(opinion))) +
  geom_bar() +
  labs(x = "", y = "Count") +
* theme_linedraw()
```

<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-36-1.png" width="60%" style="display: block; margin: auto;" />

---

## `ggplot2` has several built-in themes

```r
ggplot(brexit, aes(x = fct_infreq(opinion))) +
  geom_bar() +
  labs(x = "", y = "Count") +
* theme_light()
```

<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-37-1.png" width="60%" style="display: block; margin: auto;" />

---

## `ggplot2` has several built-in themes

```r
ggplot(brexit, aes(x = fct_infreq(opinion))) +
  geom_bar() +
  labs(x = "", y = "Count") +
* theme_dark()
```

<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-38-1.png" width="60%" style="display: block; margin: auto;" />

---

## `ggplot2` has several built-in themes

```r
ggplot(brexit, aes(x = fct_infreq(opinion))) +
  geom_bar() +
  labs(x = "", y = "Count") +
* theme_minimal()
```

<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-39-1.png" width="60%" style="display: block; margin: auto;" />

---

## `ggplot2` has several built-in themes

```r
ggplot(brexit, aes(x = fct_infreq(opinion))) +
  geom_bar() +
  labs(x = "", y = "Count") +
* theme_classic()
```

<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-40-1.png" width="60%" style="display: block; margin: auto;" />

---

## `ggplot2` has several built-in themes

```r
ggplot(brexit, aes(x = fct_infreq(opinion))) +
  geom_bar() +
  labs(x = "", y = "Count") +
* theme_void()
```

<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-41-1.png" width="60%" style="display: block; margin: auto;" />

---

## `ggplot2` has several built-in themes

```r
ggplot(brexit, aes(x = fct_infreq(opinion))) +
  geom_bar() +
  labs(x = "", y = "Count") +
* theme_test()
```

<img src="14-effective-dataviz_files/figure-html/unnamed-chunk-42-1.png" width="60%" style="display: block; margin: auto;" />

---

## There are many `ggplot2` theme packages out there as well

[Themes to spice up visualizations with ggplot2](https://towardsdatascience.com/themes-to-spice-up-visualizations-with-ggplot2-3e275038dafa)

---
class: inverse

## ae-10-ugly-viz-challenge

### [code share](https://codeshare.io/WdR33l)

### [score sheet](https://docs.google.com/spreadsheets/d/1VJj5mH2h3QwvilC3gMoQNVhHtZqj8qdGQ2ApFgQArmk/edit?usp=sharing)

- [data viz checklist](https://datainnovationproject.org/wp-content/uploads/2017/04/2_Data-Visualization-Checklist_May2014-2-1.pdf)
- [modify ggplot theme](https://ggplot2.tidyverse.org/reference/theme.html)