For this week's Tidy Tuesday challenge focused on unions, I opted to craft a dumbbell chart that contrasts the average wages of union members and non-union workers based on their educational background from 2013 to 2022.
Dumbbell charts are particularly effective when you want to emphasize changes or differences between two data points while keeping the visualization simple and easy to understand. They are commonly used in fields such as economics, finance, healthcare, and data analysis to illustrate trends, improvements, or comparisons.
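As a minimal sketch of the idea, the following builds a dumbbell chart from a small made-up data frame with plain ggplot2: a segment connects the two values and a point marks each end. The column names `group`, `before`, and `after` and all values are invented for illustration and are not part of the wages data. The `linewidth` argument assumes ggplot2 3.4 or later.

```r
library(ggplot2)

# Toy data (invented values): two measurements per group
toy <- data.frame(
  group  = c("A", "B", "C"),
  before = c(10, 14, 9),
  after  = c(13, 15, 12)
)

# Segment = the "bar" of the dumbbell, points = the two "weights"
p_toy <- ggplot(toy, aes(y = group)) +
  geom_segment(aes(x = before, xend = after, yend = group),
               colour = "lightgrey", linewidth = 2) +
  geom_point(aes(x = before), colour = "#E69F00", size = 4) +
  geom_point(aes(x = after), colour = "#56B4E9", size = 4) +
  labs(x = "Value", y = NULL) +
  theme_minimal()

p_toy
```

The chart in this post follows the same structure, with rectangles in place of the segments.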
Loading packages, data, fonts and defining colors
# Loading packages
library(tidyverse)
library(tidytuesdayR)
library(showtext)
library(glue)
library(ggtext)
library(ggchicklet)

# Loading and filtering data
wages <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-09-05/wages.csv')

# Filter the data
df <- wages %>% filter(facet == "demographics: college or more" & year > 2012)
df2 <- wages %>% filter(facet == "demographics: less than college" & year > 2012)

# Loading fonts and colors
font_add_google("Poppins", "poppins")
font_add('fa-reg', 'c:/Users/info/OneDrive/Dokumente/fonts/Font Awesome 6 Free-Regular-400.otf')
font_add('fa-brands', 'c:/Users/info/OneDrive/Dokumente/fonts/Font Awesome 6 Brands-Regular-400.otf')
font_add('fa-solid', 'c:/Users/info/OneDrive/Dokumente/fonts/Font Awesome 6 Free-Solid-900.otf')
showtext_auto()

bg <- "white"
col1 <- thematic::okabe_ito()[1]
col2 <- thematic::okabe_ito()[2]
col3 <- thematic::okabe_ito()[4]
col4 <- thematic::okabe_ito()[3]
grey1 <- "lightgrey"
grey2 <- "darkgrey"
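The Font Awesome paths above point at files on the author's machine, so `font_add()` will fail elsewhere unless they are adjusted. One possible guard, sketched here with a hypothetical helper `add_font_if_present()` (not part of showtext or sysfonts) and a placeholder `fonts` directory, is to check that the file exists before registering it:

```r
# Hypothetical helper: register a font only if the file is present
add_font_if_present <- function(family, path) {
  if (!file.exists(path)) {
    message("Font file not found, skipping '", family, "': ", path)
    return(FALSE)
  }
  # font_add() comes from sysfonts, which showtext attaches
  sysfonts::font_add(family, path)
  TRUE
}

# Placeholder directory (an assumption); point it at a local Font Awesome copy
fa_dir <- "fonts"
ok <- add_font_if_present(
  "fa-solid",
  file.path(fa_dir, "Font Awesome 6 Free-Solid-900.otf")
)
```

If `ok` is `FALSE`, the icon spans in the caption would have no font to draw from, so the caption is best built without them in that case.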
Text generation
# text creation
twitter <- glue("<span style='color:{col4};font-family:fa-brands;'></span>")
mastodon <- glue("<span style='color:{col4};font-family:fa-brands;'></span>")
link <- glue("<span style='color:{col4};font-family:fa-solid;'></span>")
data <- glue("<span style='color:{col4};font-family:fa-solid;'></span>")
quote <- glue("<span style='color:{col4};font-family:fa-solid;'></span>")
space <- glue("<span style='color:{bg}'>-</span>")
space2 <- glue("<span style='color:{bg}'>--</span>") # can't believe I'm doing this

union <- glue("<span style='color:{col1}'><b>Union Members</b></span>")
nonunion <- glue("<span style='color:{col2}'><b>Non Union Workers</b></span>")
wage <- glue("<span style='color:{col3}'><b>overall wage</b></span>")
less <- glue("<span style='color:{grey2}'><b>less than college</b></span>")
more <- glue("<span style='color:{grey1}'><b>college or more</b></span>")

t <- glue("<b>Wages of {union} and {nonunion} by Educational Background<br>from 2013 to 2022</b>")
s <- glue("Educational Background: {less} | {more} ({wage})")

# glue() concatenates its unnamed arguments, so the caption is split into pieces
cap <- glue(
  "{twitter}{space2}@web_design_fh{space2} {space2}{mastodon}{space2}@frankhaenel @fosstodon.org{space2} {space2}",
  "{link}{space}{space2}www.frankhaenel.de<br> {data}{space2}Union{space}Membership,{space}Coverage,{space}and{space}Earnings{space}from{space}the{space}CPS{space}by{space}",
  "Barry{space}Hirsch{space}(Georgia{space}State{space}University),David{space}Macpherson{space}(Trinity{space}University),{space}and{space}William{space}Even{space}(Miami{space}University)<br> {quote}",
  "{space2}Macpherson,{space}David{space}A.{space}and{space}Hirsch,{space}Barry{space}T..{space}2023.{space}",
  "“{space}Five{space}decades{space}of{space}CPS{space}wages,{space}methods,{space}and{space}union-nonunion{space}wage{space}gaps{space}at{space}Unionstats.com.”<br>",
  "{space2}{space2}Industrial{space}Relations:{space}A{space}Journal{space}of{space}Economy{space}and{space}Society{space}00:{space}1–9."
)
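The pattern used for the title and subtitle can be shown in isolation: `glue()` fills the `{placeholders}` with HTML spans, and `ggtext::element_markdown()` renders that HTML in the plot. The sketch below uses invented labels and toy data, and writes the two colors out as literals in place of `col1` and `col2` (an assumption made to avoid the thematic dependency).

```r
library(ggplot2)
library(glue)
library(ggtext)

# Colours written out as literals (assumed stand-ins for col1 and col2)
col_a <- "#E69F00"
col_b <- "#56B4E9"

# glue() fills the {placeholders} from variables in scope
lab_a <- glue("<span style='color:{col_a}'><b>Group A</b></span>")
lab_b <- glue("<span style='color:{col_b}'><b>Group B</b></span>")
title_txt <- glue("Values of {lab_a} and {lab_b}")

toy <- data.frame(x = c(1, 2), y = c(1, 2))

p_title <- ggplot(toy, aes(x, y)) +
  geom_point(colour = c(col_a, col_b), size = 4) +
  labs(title = title_txt) +
  theme_minimal() +
  # element_markdown() is what makes the HTML in the title render
  theme(plot.title = element_markdown(size = 14))

p_title
```

Coloring the words in the title this way lets the title double as a legend, which is why the plot itself needs no color guide.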
Plot
# Define bar_height
bar_height <- 0.2

# Create the plot
ggplot(data = df) +
  ggchicklet:::geom_rrect(
    aes(
      xmin = union_wage,
      xmax = nonunion_wage,
      ymin = year - bar_height,
      ymax = year + bar_height
    ),
    color = grey1, fill = grey1,
    # Use relative npc unit (values between 0 and 1)
    # This ensures that radius is not too large for your canvas
    r = unit(0, 'npc')
  ) +
  ggchicklet:::geom_rrect(
    data = df2,
    aes(
      xmin = union_wage,
      xmax = nonunion_wage,
      ymin = year - bar_height,
      ymax = year + bar_height
    ),
    color = grey2, fill = grey2,
    # Use relative npc unit (values between 0 and 1)
    # This ensures that radius is not too large for your canvas
    r = unit(0, 'npc')
  ) +
  geom_point(data = df, aes(x = union_wage, y = year), color = col1, size = 6) +
  geom_point(data = df2, aes(x = union_wage, y = year), color = col1, size = 6) +
  geom_point(data = df, aes(x = nonunion_wage, y = year), color = col2, size = 6) +
  geom_point(data = df2, aes(x = nonunion_wage, y = year), color = col2, size = 6) +
  geom_point(data = df, aes(x = wage, y = year), color = col3, size = 3) +
  geom_point(data = df2, aes(x = wage, y = year), color = col3, size = 3) +
  labs(title = t,
       subtitle = s,
       caption = cap,
       x = "Mean hourly earnings in dollars",
       y = "Year") +
  theme_minimal() +
  theme(plot.margin = margin(10, 10, 10, 10),
        plot.title = element_markdown(size = 18, hjust = 0, lineheight = 1.3, family = "poppins"),
        plot.subtitle = element_markdown(size = 15, hjust = 0, lineheight = 1.3, family = "poppins"),
        plot.caption = element_markdown(size = 9, hjust = 0, lineheight = 1.3, color = grey2, family = "poppins"),
        axis.title = element_markdown(size = 8, color = grey2, family = "poppins"),
        axis.text = element_markdown(size = 8, color = grey2, family = "poppins")) +
  ylim(2012, 2023)
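Because both bar layers set `r = unit(0, 'npc')`, the corners are not actually rounded, so ggplot2's exported `geom_rect()` should draw the same shape without reaching into ggchicklet's internals via `:::`. The sketch below is one possible variant, not the code used for the chart: it keeps both education groups in a single toy data frame (same column names as wages.csv, values invented) and maps the bar color to `facet`, which halves the number of layers.

```r
library(ggplot2)

# Toy rows using the same column names as wages.csv (values are invented)
toy_wages <- data.frame(
  year          = c(2021, 2022, 2021, 2022),
  facet         = rep(c("demographics: college or more",
                        "demographics: less than college"), each = 2),
  union_wage    = c(40, 41, 25, 26),
  nonunion_wage = c(42, 44, 20, 21)
)

bar_height <- 0.2

p_rect <- ggplot(toy_wages) +
  # One layer for both groups; fill and outline colour follow `facet`
  geom_rect(aes(xmin = union_wage, xmax = nonunion_wage,
                ymin = year - bar_height, ymax = year + bar_height,
                fill = facet, colour = facet)) +
  geom_point(aes(x = union_wage, y = year), colour = "#E69F00", size = 6) +
  geom_point(aes(x = nonunion_wage, y = year), colour = "#56B4E9", size = 6) +
  # Unnamed values are matched to the facet levels in alphabetical order
  scale_fill_manual(values = c("lightgrey", "darkgrey"),
                    aesthetics = c("fill", "colour"), guide = "none") +
  theme_minimal()

p_rect
```

If rounded corners are wanted later, the `geom_rrect()` layers with a non-zero radius remain the way to get them.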
R Code Documentation
Introduction
This section documents the R code used to create the dumbbell chart.
Code Overview
The code creates a dumbbell chart that compares the mean hourly earnings of union members and non-union workers by educational background from 2013 to 2022. It uses 'ggplot2' for plotting, 'ggchicklet' for the connecting bars, 'glue' and 'ggtext' for the HTML-styled title, subtitle, and caption, and 'showtext' for font handling.
Data Source
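The code reads the data with 'readr::read_csv' from a CSV file hosted in the Tidy Tuesday repository on GitHub (the 2023-09-05 dataset). The file contains wage information for union members and non-union workers; the code uses its 'year', 'facet', 'wage', 'union_wage', and 'nonunion_wage' columns.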
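A quick check that the columns the plotting code relies on are present can catch a changed file early. To stay runnable offline, this sketch reads a two-row inline CSV with invented values in place of the GitHub URL; `I()` for literal data assumes readr 2.0 or later, and `wages_demo` is a name made up for the example.

```r
library(readr)

# Inline stand-in for the hosted file (values invented); I() marks literal data
csv_text <- I("year,facet,wage,union_wage,nonunion_wage
2021,demographics: college or more,41.5,40,42
2022,demographics: less than college,21.5,26,21")

wages_demo <- read_csv(csv_text, show_col_types = FALSE)

# Columns the plotting code refers to
needed <- c("year", "facet", "wage", "union_wage", "nonunion_wage")
missing_cols <- setdiff(needed, names(wages_demo))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}
```

The same check works on the real data by replacing `csv_text` with the URL.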
Code Components
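- Loading Libraries: The code starts by loading the required R libraries: 'tidyverse', 'tidytuesdayR', 'showtext', 'glue', 'ggtext', and 'ggchicklet'. The 'thematic' package must also be installed, because the colors come from 'thematic::okabe_ito()'.
- Loading Fonts and Colors: The Poppins font is loaded from Google Fonts and three Font Awesome fonts are loaded from local files; the file paths are specific to the author's machine and need to be adjusted. Color variables are defined from the Okabe-Ito palette, plus two greys.
- Data Filtering: The code filters the data into two data frames, 'df' (facet "demographics: college or more") and 'df2' (facet "demographics: less than college"), keeping only years after 2012.
- Title and Subtitle Creation: The title, subtitle, and caption are built as strings with 'glue', embedding HTML spans that set colors and fonts.
- Plot Creation: The 'ggplot' function is used to create the plot: for each year a bar spans the gap between the union and non-union wage, large points mark the two mean hourly earnings, and a smaller point marks the overall wage.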
- Styling: The code applies styles and themes to the plot, including font families, colors, and margins.
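The filtering step, and the quantity each bar encodes, can be sketched on a toy tibble. The rows below reuse the column names from the wages data but the values are invented, and `wage_gap` is a column added only for this illustration; it does not appear in the original code.

```r
library(dplyr)

# Toy rows (invented values) with the same column names as the wages data
toy_wages <- tibble(
  year          = c(2012, 2021, 2022, 2022),
  facet         = c("demographics: college or more",
                    "demographics: college or more",
                    "demographics: college or more",
                    "demographics: less than college"),
  union_wage    = c(35, 40, 41, 26),
  nonunion_wage = c(36, 42, 44, 21)
)

college <- toy_wages %>%
  # Same conditions as the df filter: one facet, years after 2012
  filter(facet == "demographics: college or more", year > 2012) %>%
  # Signed gap: positive when union members earn more on average
  mutate(wage_gap = union_wage - nonunion_wage)

college$wage_gap
```

The absolute value of `wage_gap` corresponds to the length of the bar drawn for that year.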
Output
The output of the code is a dumbbell chart that compares union and non-union wages from 2013 to 2022 for two educational backgrounds: college or more, and less than college.