Instructor:
Class Meetings:
Office Hours:
This time is purposefully scheduled for you to drop in and ask any questions about the course. Please feel free to stop by. If the time doesn’t work for you, message me and we can try to arrange something.
Mentor Information:
This course has a mentor to help you navigate the material. Our course mentor will assist us in class and host 1-2 mentor sessions each week. Mentor sessions may review course content, provide practice problems, practice interview skills, or provide homework help.
Mentors Info:
Rhys Howell (howellrh@grinnell.edu)
Mentor Sessions: Noyce 2401 Thursdays 8-9pm
Course Description:
Welcome to the Spring 2026 section of Grinnell College’s STA-230. This course introduces core topics in data science using R programming, including getting and cleaning data, data management, exploratory data analysis, reproducible research, and data visualization. This course incorporates case studies from multiple disciplines and emphasizes the importance of properly communicating statistical ideas. Prerequisite: MAT-209 or STA-209. Suggested: CSC-151 or computer programming experience.
Texts:
There are no required texts; some free texts may be recommended throughout the semester.
There may also be readings or links from other sources which will be provided as necessary.
After completing this course, students will be able to do the following:
The current plan for the course is as follows:
You will have the freedom to choose your partner for the first lab; afterward, lab partners may be assigned. During labs it is essential that you and your partner work together, making certain that each of you understands your work equally well.
Most labs will begin with a brief “preamble” section that we will go through together as a class. The purpose of this section is to introduce the topic of the lab and ensure a smooth start to each class meeting.
This class will employ a mastery-based grading system. All components will be graded on a Satisfactory (S) or Not Satisfactory (NS) scale.
We will have in-class labs at almost every class meeting. In general these will be due at 10pm on the Sunday following the week in which they were released. In order to receive a Satisfactory:
We will have approximately weekly homework assignments due on Sundays at 10pm. In order to receive a Satisfactory:
There will be 4 midterms:
There will be no final exam held for the class.
This course will rely on the ideas of specifications grading and mastery grading. These systems, inspired by adult learning theory, are designed to create a “low-threat” learning environment where:
Note: I reserve the right to update requirements for grades as circumstances dictate over the course of the semester (e.g. if the number of assignments or labs changes).
Letter grades for the entire course will be assigned according to the bundles in the table below. You will receive the grade corresponding to the highest bundle for which you meet all the requirements. All bundles list minimum amounts; you may exceed the requirements for a bundle and still qualify for it. All numbers in the table are the minimum number of Satisfactory grades achieved.
| Grade | Labs (11 Possible) | Homework (8 Possible) | Midterms (4 Possible) |
|---|---|---|---|
| C | 7 | 5 | 2 |
| B | 9 | 6 | 3 |
| A | 11 | 8 | 4 |
D: 2 requirements of a C are met.
F: 0-1 requirements of a C are met.
Half letter grades (C+, B+): all of the lower tier (C/B) requirements met, one of the higher tier (B/A) requirements met.
Half letter grades (B-, A-): all of the lower tier (C/B) requirements met, two of the higher tier (B/A) requirements met.
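The bundle rules above can be read as a short procedure. The sketch below is one interpretation of them, written in Python purely for illustration (the course itself uses R). The names `BUNDLES`, `requirements_met`, and `base_letter_grade` are hypothetical, the thresholds are copied from the table, and attendance penalties and tokens are deliberately left out. It is not the official grade calculation.

```python
# Minimum numbers of Satisfactory marks per bundle, copied from the table.
# All names in this sketch are hypothetical and for illustration only.
BUNDLES = {
    "C": {"labs": 7, "homework": 5, "midterms": 2},
    "B": {"labs": 9, "homework": 6, "midterms": 3},
    "A": {"labs": 11, "homework": 8, "midterms": 4},
}


def requirements_met(counts, bundle):
    """Return how many of the three requirements of `bundle` are satisfied."""
    return sum(counts[name] >= needed for name, needed in BUNDLES[bundle].items())


def base_letter_grade(labs, homework, midterms):
    """Map counts of Satisfactory marks to a letter grade.

    Assumption: this ignores attendance penalties and token use, and it
    follows one reading of the D, F, and half-letter rules in the syllabus.
    """
    counts = {"labs": labs, "homework": homework, "midterms": midterms}

    c_met = requirements_met(counts, "C")
    if c_met <= 1:
        return "F"   # 0-1 requirements of a C are met
    if c_met == 2:
        return "D"   # exactly 2 requirements of a C are met

    # All C requirements are met; see how many B requirements are also met.
    b_met = requirements_met(counts, "B")
    if b_met < 3:
        return ["C", "C+", "B-"][b_met]

    # All B requirements are met; see how many A requirements are also met.
    a_met = requirements_met(counts, "A")
    return ["B", "B+", "A-", "A"][a_met]
```

For example, under this reading a student with 9 Satisfactory labs, 5 homework assignments, and 2 midterms meets every C requirement and one B requirement, so `base_letter_grade(9, 5, 2)` gives `"C+"`.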
One of the fundamental principles behind this grading scheme is that you will have opportunities to re-try assignments if they do not initially earn a Satisfactory grade. My goal in using this scheme is to reduce the stress that accompanies typical grading rubrics and give you permission to make mistakes. Ultimately, my goal is for each student to learn as much as possible, and I would be very happy to have every student earn an A.
Because this course involves a large amount of group work, absences impact not only yourself but also your classmates. As such, class attendance is absolutely necessary. You are permitted two missed classes this semester without penalty. Every absence following this will result in a 1/3 drop in letter grade (e.g., B+ to B). Being more than 5 minutes late will count as an unexcused absence. Exceptions are granted for sports and extracurricular activities (speak with me in advance).
In the event of an unplanned absence (e.g., illness), please let me know as soon as possible if you will miss class, ideally at least 30 minutes before the start of class. You may be removed from a group to work on your own if I have already arranged groups for that day. If you are absent, you are still responsible for submitting all homework and labs on time.
Tokens reflect that life inevitably rears its ugly head in
some fashion and ruins our best-laid plans. You begin the course with
3 tokens. Tokens may be used for:
There may be opportunities to earn more tokens as the semester progresses by reading select research papers and answering a short quiz. You will also earn one token for each mentor session you attend.
You are welcome to email me at any time, though I rarely check my email on evenings or weekends. I’m generally happy to answer questions this way, but all R code troubleshooting must be done in class, during office hours, or during mentor sessions. Please start your email subject line with “STA230”. If for some reason I do not respond to your email in a timely manner (within 24 hours during the week), please send a follow-up email.
Software is increasingly an essential component of statistics and
will play a role in this course. We will primarily use R,
an open-source statistical software program.
You are welcome to use your own personal laptop, or a Grinnell College laptop, during the course. R is freely available, and you can download it and its UI companion, R Studio, here (note: R must be downloaded and installed before R Studio):
R from http://www.r-project.org/
R Studio from http://www.rstudio.com/
You may also work on a classroom computer, all of which will have R and R Studio pre-installed.
Finally, Grinnell hosts an online version of R Studio that you may use while on campus internet: https://rstudio.grinnell.edu/
R for Data Science? (From Prof. Will Rebelsky)
If you’ve spent any time reading about data science online you’ll undoubtedly have noticed the prominence of the Python programming language. Indeed, research from Cal State University found Python was the most popular data science language in private industry, being mentioned in 42% of data scientist job postings. However, R, which was mentioned in 20% of job postings, is not far behind and offers a few advantages when approaching data science from a statistical perspective (hence this course having the STA prefix).
Both R and Python provide plenty of functions for data manipulation. However, because R was created by academic statisticians, it offers very strong data visualization and statistical modeling packages. On the other hand, Python is a general-purpose programming language that excels in production, deployment, and machine learning. Regardless of each language’s strengths and weaknesses, as an introductory course our focus is on the fundamental skills and thought processes used in data science, which can be developed regardless of the tools used (and the tools will change over time anyway).
You can expect to spend 12 hours per week on this course, including all in-class and out-of-class time. This number is based on the Grinnell Guidelines for credit-hours. Some weeks will require more time and some significantly less. If you find that you are frequently spending significantly more than 9 hours working on material for this course outside of class each week, please let me know.
Please do not cheat. You do not need to cheat to do well in this class. The policy at Grinnell College removes ALL discretion that I may exercise; any work that is suspected of violating the academic honesty policy will be submitted to the Committee on Academic Standing.
In virtually all cases, unless otherwise specified, the use of generative AI is strictly prohibited.
At Grinnell College you are part of a conversation among scholars, professors, and students, one that helps sustain both the intellectual community here and the larger world of thinkers, researchers, and writers. The tests you take, the research you do, the writing you submit: all these are ways you participate in this conversation.
The College presumes that your work for any course is your own contribution to that scholarly conversation, and it expects you to take responsibility for that contribution. That is, you should strive to present ideas and data fairly and accurately, indicate what is your own work, and acknowledge what you have derived from others. This care permits other members of the community to trace the evolution of ideas and check claims for accuracy.
Failure to live up to this expectation constitutes academic dishonesty. Academic dishonesty is misrepresenting someone else’s intellectual effort as your own. Within the context of a course, it also can include misrepresenting your own work as produced for that class when in fact it was produced for some other purpose. A complete list of dishonest behaviors, as defined by Grinnell College, can be found here.
This syllabus is based on material taken from a variety of professors at Grinnell including, but not limited to, Professors (William) Rebelsky, Miller, and Nolte. Course content and organization are heavily based on previous courses by Profs. Miller and Rebelsky.