##### GRAPHING DATA I
# Here we will use helicopter data to make boxplots (which compare categories well) and scatter plots for continuous predictors.

# Start RStudio and set your working directory. From the top menu: Session... Set Working Directory... Choose Directory. Below I assume you use the Desktop for today's in-class work.
# Do this: File... New File... R Script. Copy all this text into that Source window and save it (it will automatically be a ____.R file).
# Import Dataset: in the Environment tab of the upper right window, click on Import Dataset... From Text (readr)... and in the top box (File/URL) copy-paste this link:
# https://sciences.ucf.edu/biology/d4lab/wp-content/uploads/sites/23/2021/09/helicopter-data.csv
# Then click Update (at the upper right). Below you can change the name of the input data set (I assume you call it "data"). Click in the Code Preview to make changes show. Click Import (bottom right of window) when you're ready. You should see the data in the upper left window.
# You have now set the working directory, saved this script, and imported a renamed data set into R.
#### NOTE: In future classes we assume you can do all the above steps.

# Boxplots - To compare medians and quartiles of groups
# If you look at ID you'll notice it is numeric. If ID were alphanumeric (A, B, etc.), R would automatically make a boxplot based on categories of ID. Because the IDs are numbers, R treats ID as a continuous variable and would make a scatterplot instead. But ID is actually a category. Let's fix that:
data$fID <- factor(data$ID) # this converts the numeric variable into a factor (i.e., category)
# Notice that we made a new column in data, keeping the original "ID" column as well.
# Attach data: this tells R you are working only with this data set (you may later load multiple data sets at once). Type:
attach(data) # in the Console and hit Enter.
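
# Side sketch: why the factor conversion above matters. This uses a small made-up
# data frame, not the helicopter data. The name "toy" and all of its values are
# hypothetical and only mimic the ID and Time columns for illustration.
toy <- data.frame(ID = rep(1:3, each = 4),
                  Time = c(1.2, 1.4, 1.3, 1.5,
                           2.1, 2.0, 2.3, 2.2,
                           1.7, 1.8, 1.6, 1.9))
class(toy$ID)                  # numeric IDs are stored as numbers
toy$fID <- factor(toy$ID)      # new categorical column; the original ID is kept
levels(toy$fID)                # one level per design
plot(Time ~ ID, data = toy)    # numeric x: base R draws a scatterplot
plot(Time ~ fID, data = toy)   # factor x: base R draws one boxplot per level
# The data = toy argument keeps this sketch separate from the attached data set.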

# Now enter:
boxplot(Time ~ fID) # to draw one boxplot per level of design (ID)
# Now add axis labels by typing:
boxplot(Time ~ fID, xlab='Helicopter Treatments', ylab='Flight Times (sec)')
# What can you infer about the helicopter experiment already?

# Let's now use ggplot to make fancier boxplots. Type:
library(ggplot2) # this assumes you have already installed ggplot2 on your computer
# And make a boxplot using ggplot:
ggplot(data, aes(x=fID, y=Time)) +
  geom_boxplot(aes(fill = fID)) +
  theme_classic()
# Now click on the Help tab (above the plot), enter geom_boxplot in the search box, and hit Enter to see details of this command. Notice all the choices you could use? Let's try a few:
ggplot(data, aes(x=fID, y=Time)) +
  geom_boxplot(aes(fill = fID), notch=TRUE, show.legend=FALSE) +
  theme_classic()
# Notice how notches could look cool but don't always work? Do colors help? Are axis labels big enough to print in a journal column?

# Let's try a nice general formatting package:
install.packages('cowplot') # if not already installed
library(cowplot) # this package helps make ggplots more publication-ready
# Now re-run the ggplot command above but substitute theme_cowplot() for the former theme_classic(). Hint: click your cursor into the Console and simply use the up and down arrows on the keyboard to scroll through prior commands.
# Better fonts? What if you want fonts embiggened? And what about some other titles, etc.?
ggplot(data, aes(x=fID, y=Time)) +
  geom_boxplot(aes(fill = fID), show.legend=FALSE) +
  theme_cowplot() +
  theme(axis.text = element_text(size = 20)) + # Too big? Change the font size here
  labs(x = 'Paper Helicopter Design IDs',
       y = 'Hang Time (sec)',
       title = 'This is a silly data set for graphing practice',
       subtitle = "I can't wait to have my own data",
       caption = 'patent pending 2023 (just kidding)')
# Too much? Too silly? Change any code lines above as you wish to make it better.
# And what about those colors? Are they color-blind friendly?

# Try this:
install.packages('viridis') # if not already installed
library(viridis) # the default color spectrum avoids red to be color-blind friendly
# For more info see https://www.thinkingondata.com/something-about-viridis-library/
# Now let's use that palette:
ggplot(data, aes(x=fID, y=Time)) +
  geom_boxplot(aes(fill = fID), show.legend=FALSE) +
  theme_cowplot() +
  scale_fill_viridis(discrete=TRUE, option='D') + # NOTICE the discrete bit here?
  theme(axis.text = element_text(size = 12)) +
  labs(x = 'Paper Helicopter Design IDs',
       y = 'Hang Time (parsecs)',
       title = 'All copters were faster than the Millennium Falcon',
       subtitle = 'What if I just showed my advisor These Data?',
       caption = "Yeah. That'll work")
# What if you want darker colors to the right? Add direction = -1 inside scale_fill_viridis( )
# One last viridis trick. We used the default ('D') option above. For other spectra, replace D in option='D' with another option:
#   'magma'   (or 'A')
#   'inferno' (or 'B')
#   'plasma'  (or 'C')
#   'viridis' (or 'D')
#   'cividis' (or 'E')
#   'rocket'  (or 'F')
#   'mako'    (or 'G')
#   'turbo'   (or 'H')

# Are some of those color palettes too extreme? Lots of other palettes exist, such as RColorBrewer:
install.packages('RColorBrewer')
library(RColorBrewer)
# Or the Wes Anderson palettes (!): for more info see https://github.com/karthik/wesanderson
install.packages('wesanderson')

# Bottom Line: There are lots of graphing options.
# Job #1 = Ensure your graphs efficiently tell the data story. Remember Edward Tufte's argument.
# Job #2 = Make it visually appealing. Think of 1st impressions.
# All this will take a buncha code fussing, so as with all R code, annotating, saving, and organizing code files will help you save time later.
# Explore!
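
# One place to start exploring: a minimal sketch of the RColorBrewer route mentioned
# above, assuming ggplot2 and RColorBrewer are installed. The data frame "toy" and
# its values are made up for illustration (they are not the helicopter data), and
# 'Dark2' is just one palette picked as an example, not a recommendation.
library(ggplot2)
library(RColorBrewer)
display.brewer.all(colorblindFriendly = TRUE) # show only the palettes flagged as color-blind friendly
toy <- data.frame(fID = factor(rep(1:4, each = 5)),
                  Time = rep(c(1.5, 2.0, 2.5, 1.8), each = 5) +
                         rep(c(-0.2, -0.1, 0, 0.1, 0.2), times = 4))
p <- ggplot(toy, aes(x = fID, y = Time)) +
  geom_boxplot(aes(fill = fID), show.legend = FALSE) +
  theme_classic() +
  scale_fill_brewer(palette = 'Dark2') # ggplot2's built-in scale for Brewer palettes
p
# Brewer palettes hold a limited number of colors (8 for 'Dark2'), so a factor with
# more levels than that would need a different palette.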