1 Syllabus
1.1 Special Topics: Data analysis using R
Date: 06 January 2023
Instructor: Dr. Robert Leaf
Office: GCRL Oceanography 119
Office Hours: Make a time to see me in-person or online.
Email: robert.leaf@usm.edu
Phone: 2-4296
Course Meeting Day and Time: T, Th, 4:00 to 5:15 PM
Course Meeting Location: GCRL’s Caylor Computer Lab
1.2 Course Description and Objectives
This course examines the fundamental concepts and techniques for programming in the R statistical programming language. I am convinced that data analysis, data manipulation, data visualization, and reproducible research necessitate command of quantitative tools. Although there are many specialized and general programming languages, the R programming language offers exceptional utility for analysis and is used widely in academia, industry, and by federal and state scientific groups. The demand for skilled data analysis practitioners is growing rapidly, and this course prepares you to tackle real-world data analysis challenges.
The primary components of the course:
- Introduce the basics of R programming
The course will introduce standard programming concepts, in particular code modularisation, writing and using functions, and code re-usability. We will focus on understanding software engineering concepts such as project build and code testing. Participants will establish a working knowledge of R, RStudio, and relevant packages.
- Review aspects of project organization
A typical data analysis project involves many components, each including several data files and different scripts with code. Keeping these files organized can be challenging and requires a suite of analytical tools.
- Perform operations on vectors and understand how to use advanced functions
Learn how to wrangle, analyze, and visualize data using base R operations and specialized packages (e.g. tidyverse and ggplot2).
- Promote a reproducible research workflow
Finally, we will examine how to write R Markdown documents, which permit you to combine text and code in a single document for efficient data presentation.
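As a rough illustration of the first and third components above (writing a reusable function, operating on whole vectors, and testing code), here is a minimal base R sketch. The function name `standard_error` and the fish length values are hypothetical and chosen only for illustration; they are not taken from the course material.

```r
# A small, reusable function: standard error of the mean.
# (Hypothetical helper, written for illustration only.)
standard_error <- function(x, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]   # drop missing values if requested
  sd(x) / sqrt(length(x))
}

# Hypothetical measurements in millimetres, with one missing value.
lengths_mm <- c(212, 198, NA, 240, 225)

# Vectorized operations act on every element at once, without a loop.
lengths_cm <- lengths_mm / 10    # unit conversion
is_large   <- lengths_mm > 220   # logical vector; NA stays NA

mean(lengths_cm, na.rm = TRUE)
standard_error(lengths_mm)

# A minimal code test: stopifnot() raises an error if the check fails.
stopifnot(isTRUE(all.equal(standard_error(c(1, 2, 3)), 1 / sqrt(3))))
```

Saving a function like this in its own script and loading it with `source()` is one simple way to reuse code across the scripts of a project.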
1.3 At the conclusion of this course:
Students will be able to recognize problems that can be solved using statistical programming and reproducible research approaches. The skills of sharing, automation, and organization make research more reproducible. By practicing and reinforcing the use of quantitative tools, participants will be better able to gain insights that would otherwise remain hidden.
1.4 Course Materials
R for Data Science by G. Grolemund and H. Wickham (https://r4ds.had.co.nz/). This is R4DS in the syllabus.
bookdown: Authoring Books and Technical Documents with R Markdown by Y. Xie (https://bookdown.org/yihui/bookdown/). This is BD in the syllabus.
Tufte, E. R. (2001). The Visual Display of Quantitative Information. Cheshire, Connecticut: Graphics Press. This is Tufte in the syllabus.
1.5 Course Scheduling
Class Number | Day | Assignments | Reading |
---|---|---|---|
1 | Thursday, January 19, 2023 | Syllabus, RStudio and R, R Packages, Useful Shortcuts | Leaf 01, 02, 03, R4DS 08 |
2 | Tuesday, January 24, 2023 | Leaf Lab at Bays and Bayous Jan. 24 and 25. | |
3 | Thursday, January 26, 2023 | Data input and output | Leaf 04 |
4 | Tuesday, January 31, 2023 | Data Classes | Leaf 05 |
5 | Thursday, February 2, 2023 | Leaf at Southern Division of AFS Meeting | |
6 | Tuesday, February 7, 2023 | Working in R | Leaf 06 |
7 | Thursday, February 9, 2023 | Indexing and Logical Operators | Leaf 08 |
8 | Tuesday, February 14, 2023 | Loops | Leaf 09 |
9 | Thursday, February 16, 2023 | Leaf Lab at MS Chapter of the AFS Annual Meeting | |
10 | Tuesday, February 21, 2023 | Mardi Gras Holiday | |
11 | Thursday, February 23, 2023 | Loops (more loops) | Leaf 09 |
12 | Tuesday, February 28, 2023 | R for Data Science, Script Anatomy and Organization, Assignment 01 Due | R4DS 01, 02 |
13 | Thursday, March 2, 2023 | Data Transformation | R4DS 05 |
14 | Tuesday, March 7, 2023 | Data Transformation | R4DS 05 |
15 | Thursday, March 9, 2023 | Pipes | R4DS 18 |
16 | Tuesday, March 14, 2023 | USM Spring Break | |
17 | Thursday, March 16, 2023 | USM Spring Break | |
18 | Tuesday, March 21, 2023 | Data Wrangling and Tibbles | R4DS 09, 10 |
19 | Thursday, March 23, 2023 | Tidy Data | R4DS 12 |
20 | Tuesday, March 28, 2023 | Relational Data | R4DS 13 |
21 | Thursday, March 30, 2023 | Factors | R4DS 15 |
22 | Tuesday, April 4, 2023 | Dates and Times | R4DS 16 |
23 | Thursday, April 6, 2023 | Pipes, Assignment 02 Due | R4DS 19 |
24 | Tuesday, April 11, 2023 | Graphical Display | Tufte |
25 | Thursday, April 13, 2023 | ggplot2 I | R4DS 03 |
26 | Tuesday, April 18, 2023 | ggplot2 II | R4DS 03 |
27 | Thursday, April 20, 2023 | ggplot2 III | R4DS 28 |
28 | Tuesday, April 25, 2023 | Rmarkdown I, Assignment 03 Due | R4DS 27, 29, 30 |
29 | Thursday, April 27, 2023 | Rmarkdown II | R4DS 27, 29, 30 |
30 | Tuesday, May 2, 2023 | Preliminary Project Presentations I | |
31 | Thursday, May 4, 2023 | Bookdown | BD 01 to BD 02 |
32 | May 8 to May 11, 2023 | Final Project Presentation - Date and Time TBD, Assignment 04 Due | |
1.6 Course Workload Statement
Students are expected to invest considerable time outside of class in learning the material for this course. The expectation of the University of Southern Mississippi is that students should spend approximately 2 to 3 hours outside of class each week for every hour in class working on reading, assignments, studying, and other work for the course. Time management is thus critical for student success. All students should assess their personal circumstances and talk with their advisors about the appropriate number of credit hours to take each term. Resources for academic support can be found at https://www.usm.edu/success.
1.7 Course Evaluation
Percentage | Letter Grade |
---|---|
93-100 | A |
90-92 | A- |
86-89 | B+ |
83-85 | B |
80-82 | B- |
76-79 | C+ |
73-75 | C |
70-72 | C- |
66-69 | D+ |
63-65 | D |
60-62 | D- |
< 60 | F |
1.8 Assignment Policy and Procedures
All assigned work (Assignments and Project) will be due at the beginning of class on its assigned due date. You will submit your code to me by email as .r files, and I will check that the code runs properly, grade the assignment, and provide feedback within five business days. Late work will not be given full credit.
To receive full credit, all code must run on my machine and return all required components of the assignment. You may turn in any assignment as many times as necessary to ensure that you receive credit.
1.9 Grading scale
Evaluation type | Number | Points per item | Total points |
---|---|---|---|
Assignments | 4 | 10 | 40 |
Preliminary Project Presentations | 1 | 20 | 20 |
Final Project | 1 | 10 | 10 |
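The two tables (points per component here, and the percentage bands under Course Evaluation) can be combined into a small calculation. The sketch below is only an illustration: it assumes the total is the sum of the rows listed above, and that percentages are rounded to the nearest whole number before the letter grade is looked up. The syllabus does not state a rounding rule, and the student scores are hypothetical.

```r
# Points available per component, taken from the grading table above.
points_possible <- c(assignments = 4 * 10,
                     preliminary_presentation = 1 * 20,
                     final_project = 1 * 10)

# Lower bound of each letter grade band, from the Course Evaluation table.
lower_bounds    <- c(60, 63, 66, 70, 73, 76, 80, 83, 86, 90, 93)
letters_by_band <- c("F", "D-", "D", "D+", "C-", "C", "C+",
                     "B-", "B", "B+", "A-", "A")

# Convert points earned to a letter grade.
# Assumption (not stated in the syllabus): round to a whole percent first.
letter_grade <- function(points_earned, total = sum(points_possible)) {
  percentage <- round(100 * points_earned / total)
  # findInterval() counts how many lower bounds are <= percentage.
  letters_by_band[findInterval(percentage, lower_bounds) + 1]
}

# Hypothetical student scores, for illustration only.
earned <- c(assignments = 36, preliminary_presentation = 17, final_project = 9)
letter_grade(sum(earned))   # 62 of 70 points is about 89 percent: "B+"
```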
1.10 Content of this online material
The material presented on this site is derived from several online and published sources. These sources are not cited explicitly; instead, the presented material is intended to be used alongside the following books (on reserve in the library).
Crawley, M. J. (2013). The R book. New York: Wiley. ISBN: 9781118448908 1118448901 9781118448946 1118448944 9781118448960 1118448960
Teetor, P. (2011). R cookbook. Beijing: O’Reilly. ISBN: 9780596809157 0596809158
Tufte, E. R. (2001). The Visual Display of Quantitative Information. Cheshire, Connecticut: Graphics Press. ISBN: 0-9613921-4-2
Wickham, Hadley (2014). Advanced R. Routledge. ISBN-13: 978-1466586963
Wickham, Hadley and Grolemund, Garrett (2017). R for Data Science. O’Reilly Media. ISBN-13: 978-1491910399
Wickham, Hadley (2016). ggplot2: Elegant Graphics for Data Analysis (Use R). Springer. ISBN-13: 978-3319242750