Syllabus
The Boring But Important Details
Course Name and Number
- CS 307 - Modeling and Learning in Data Science
- Section: MLD
Location and Time
The Fall 2024 version of the course is a blend of online and in-person*.
- Lecture: Asynchronous Online, ClassTranscribe
- Discussion: Friday, 9:30 AM - 10:45 AM, 1320 Digital Computer Laboratory
Course Staff
Please refer to the course staff by their given names. For example, your instructor is named Dave1. If you refer to the staff as “Professor” or “TA,” we might refer to you as “student,” which seems odd.
Instructor
Teaching Assistants
- Veda Kailasam [Email] [Office Hours]
- Xuying Ning [Email] [Office Hours]
- Aniket Vashishtha [Email] [Office Hours]
- Jiaru Zou [Email] [Office Hours]
Learning Objectives
After this course, students are expected to be able to:
- Identify supervised (regression and classification) and unsupervised (clustering) learning problems and their subtasks.
- Understand the bias-variance tradeoff and its relationship to model complexity, overfitting, and generalization.
- Validate and select machine learning models and their parameters using techniques such as cross-validation.
- Prepare and process data for use with machine learning methods.
- Formulate practical, real-world problems as machine learning problems.
- Evaluate effectiveness of machine learning methods when used as a tool for data analysis or as a component of a system.
- Implement simple machine learning methods from scratch using Python’s numpy.
- Apply machine learning methods to real data using frameworks such as Python’s scikit-learn and pytorch.
Course Content
Course Description
Course Catalog: Introduction to the use of classical approaches in data modeling and machine learning in the context of solving data-centric problems. A broad coverage of fundamental models is presented, including linear models, unsupervised learning, supervised learning, and deep learning. A significant emphasis is placed on the application of the models in Python and the interpretability of the results.
The above description is based on the Illinois Course Catalog. This version of the course may deviate slightly from this description. The course website will provide an overview of the course content and schedule.
Topics
Tentative subjects include:
- Basics: Supervised and Unsupervised Learning, Parametric versus Nonparametric Methods, Bias-Variance Trade-Off, Cross-Validation, No Free Lunch, Model Selection and Evaluation
- Regression: Linear Regression, Decision Trees, KNN
- Classification: Logistic Regression, Decision Trees, KNN, LDA, QDA, Naive Bayes
- Extensions: Regularization (Ridge, Lasso, Elastic Net), Ensemble Learning (Bagging, Boosting, Random Forests)
- Unsupervised: PCA, K-Means Clustering, Hierarchical Clustering, Mixture Models, EM Algorithm
Towards the end of the semester, we will use any remaining and available time to introduce neural networks and deep learning.
Textbooks
There is no required textbook for CS 307. Instead, course content will be distributed through a combination of lectures, notes, and additional (freely available) resources.
Prerequisites
The stated prerequisite for CS 307 is STAT 207 and a linear algebra course, preferably one of MATH 225, MATH 227, MATH 257, MATH 415, MATH 416, or ASRM 406. Students will be expected to have experience with probability, statistics, and Python programming as taught in STAT 107 and STAT 207. Comparable experiences may be acceptable, but do consider speaking with an advisor if you find yourself in that situation.
Attendance
Discussion section attendance is encouraged but not explicitly required. Attendance does not directly affect your grade.
Assessments
CS 307 will use three types of assessments: homework, labs, and quizzes.
With the exception of quizzes, all course assignments are due at 11:59 PM, Central (Urbana) time, on the listed due date.
- Homework is due on Mondays.
- Labs are due on Wednesdays.
All assessment deadlines are listed on the landing page of this website.
Homework
Throughout the semester, there will be a total of ten homework assignments, administered through PrairieLearn.
Additional information and instructions can be found on the homework policy page of the course website.
Labs
There will be a total of ten labs throughout the semester, submitted through a combination of PrairieLearn and Canvas. Each lab will have two components, a model and a report. Additional information and instructions can be found on the lab policy page of the course website.
Quizzes
There will be three quizzes throughout the semester taken at the Computer-Based Testing Facility (CBTF). Additional information (including dates and times) and instructions can be found on the quiz policy page of the course website.
Course Communication
We will use several forms of communication for this course. The website will be the one-stop-shop for all course information. Course announcements will be sent via email. Be sure you are regularly checking your @illinois.edu email account2.
If you would like to communicate with the course staff, our preferred methods of communication, in order, are:
- Office Hours
- Discussion Forum (Ed)
Email should largely be reserved for private matters. As much as possible, we would appreciate you asking questions about the course where we can respond so that other students benefit from your questions! It’s cliche to say, but if you have a question, someone else is probably thinking it!
Office Hours
The current-week office hours schedule can be found on the home page.
Except where specified otherwise, office hours are held in the basement of Siebel Center.
Office hours are by far our preferred forum for discussing individual, specific questions. In office hours, our response time will be literally instant. Also, since we are both present in the same physical location (or together on Zoom), follow-up is both expected and easy. Using asynchronous forms of communication such as the discussion forum or email will have a slower response rate and a much lower communication bandwidth. In other words, please come to office hours!
Office hours will be a rather informal meeting. As such, if course staff and a student are engaged in casual conversation not directly related to a pressing matter in CS 307, like a homework question, please just jump into the conversation and interrupt! If office hours are “busy” the instructor may institute an informal queuing system, but the hope is to keep office hours more relaxed and informal.
If you would like to schedule a private meeting outside of regular office hours, please send an email suggesting two possible times, on two different days.3 We have a preference for time-slots directly adjacent to current office hours. Please also indicate a brief agenda for the meeting. Requests to schedule a meeting at a time less than 24 hours in the future are unlikely to be granted. Requests without an agenda will be denied. Please provide some sense of what you would like to discuss.
Discussion Forum
This course will use Ed as our discussion forum.
Please register your account with your University email.4
The course staff will attempt to check Ed at least once a day during the week, so you can often expect a response within 24 hours, except on weekends. If you need a quicker response, you should consider office hours as an alternative.
Private posts have been disabled. Any private matters should be discussed over email where your identity is known. Some anonymous posting is disabled. You may post anonymously to your classmates, but not the course staff. The course staff will know the identity of all posters.
Additional Ed policy can be found in a pinned post on Ed.
Email Policy
CS 307 will follow a strict email policy. Instead of email, consider using the discussion forum! Any quick, non-private communication should take place there.
If you’d like to email the instructor or course staff, consider the following:
- Is your question about course administration? If so, have you read the syllabus? If your question is easily answered in the syllabus, we will either refer you to the syllabus, or ignore your email.
- Is your question about part of an assignment? First and foremost: You should ask it in office hours. After that, consider the discussion board. As a last resort, use email, but there is a good chance you will be re-directed to the discussion board.
If you choose to send an email, you must adhere to the following three rules. If you do not, your email will be considered less important than emails that follow the rules, and response time will be slower.
- All email must originate from an @illinois.edu email address.5
- Your subject line must begin with exactly the following: [CS 307]
- After the above, put a single space, followed by a useful but short description of your message.
Some examples:
## good
[CS 307] Grade feedback question
## bad
## improper format
## non-descriptive subject
[cs307] hi
## bad
## improper format
[CS307] Grade feedback question
## bad
## improper format
## subject too long
## information found in syllabus or website
[CS 307]when is the quiz and what is covered on the quiz?
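If you want to sanity-check a subject line before hitting send, the format rules can be sketched in a few lines of Python. This is an illustration only, not a tool the course staff uses. The names `subject_ok` and `sender_ok` are hypothetical, and so is the `max_len` cutoff: the syllabus asks for a "short" description but does not define a number.

```python
import re

# "[CS 307]" exactly, then a single space, then a description that
# starts with a non-space character.
SUBJECT_PATTERN = re.compile(r"\[CS 307\] \S.*")


def subject_ok(subject, max_len=60):
    """Check the subject format. max_len is an assumed cutoff for 'short'."""
    return bool(SUBJECT_PATTERN.fullmatch(subject)) and len(subject) <= max_len


def sender_ok(sender):
    """Check that the sending address is an @illinois.edu address."""
    return sender.strip().lower().endswith("@illinois.edu")


print(subject_ok("[CS 307] Grade feedback question"))  # True
print(subject_ok("[CS307] Grade feedback question"))   # False
print(subject_ok("[cs307] hi"))                        # False
```

Passing this check only covers the format. Whether the description is actually useful is still up to you.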
If your email is sent between 9:00 AM Monday and 11:59 PM Thursday, and you follow the above directions, we will try our best to respond within 24 hours. Questions about an assessment sent the same day the assessment is due will likely not receive a response before the assessment is due. Plan accordingly.
Code Discussion
If your question is technical in nature, there are several steps you can take to ensure a speedy response on Ed.
First and foremost, you should ask “Google”6 before you ask the course staff. Take the error message you obtained, or a brief description of your issue, and search it with Google. The ability to solve problems this way is an extremely valuable skill, possibly one of the most important you should learn (but are not taught) during your academic career. Make a legitimate effort to solve the problem on your own. You won’t always be able to, and if you can’t, post on Ed or stop by office hours.
If you need to ask the course staff, include the following in your discussion forum post:
- All code that is required to re-create the error.
- Staff should be able to run your code, without any modification, and obtain the same error or output.
- The exact error message received.
In this course, for everything except quizzes, we greatly prefer over-sharing to under-sharing code. We would rather everyone learn from others’ “mistakes” than have everyone experience the same issues over and over again. However, if you simply try to copy and paste other students’ code to get through the homework, you will likely fail the quizzes. The course staff reserves the right to change this policy if we feel it is being abused.
Generative AI
CS 307 does not prohibit the use of generative AI tools such as OpenAI’s ChatGPT or Microsoft Copilot. You are free to use these tools more or less without restriction.
However, you should be aware that you will not have access to these tools during quizzes in the CBTF. It would be wise to keep this information in mind if you choose to use generative AI to assist with homework and labs. As a broad heuristic, it would be a bad idea to simply give homework questions as a prompt in an attempt to obtain an answer without doing work yourself. A reasonable approach would be to use generative AI to help explain concepts and code that you do not understand. But don’t forget: these systems have a tendency to hallucinate.
Course Staff Emails
Role | Name | Email |
---|---|---|
Instructor | David Dalpiaz | dalpiaz2@illinois.edu |
Teaching Assistant | Veda Kailasam | vedak2@illinois.edu |
Teaching Assistant | Xuying Ning | xuyingn2@illinois.edu |
Teaching Assistant | Aniket Vashishtha | aniketv2@illinois.edu |
Teaching Assistant | Jiaru Zou | jiaruz2@illinois.edu |
Course Technology
Use of Python is required to complete the course. Visual Studio Code will be our supported IDE, but alternative tools may be used as a substitute.
Learning Management
A mixture of Canvas, Ed, PrairieLearn, PrairieTest, and ClassTranscribe will be used for Learning Management.
- Ed - Discussion Forum
- PrairieLearn - Homework, Lab (Models), MPs, Quizzes
- PrairieTest - Quiz Scheduling
- Canvas - Lab (Reports)
- ClassTranscribe - Lecture Video
Grading
Assessment Weights
Assessment | Percentage |
---|---|
Homework | 25 |
Lab (Models) | 20 |
Lab (Reports) | 10 |
Quiz 01 | 15* |
Quiz 02 | 15* |
Quiz 03 | 15* |
Subscores for each category are the average of the assignments for that category. All assignments are equally weighted within a category. Grade information for each individual assignment can be found on the platform used to submit the assignment.7
Because of the provided buffer points and the favorable quiz weighting, overall course percentages will not be rounded.
Quiz Weights
The weights for the three quizzes will start at 15% for each, but will be adaptively adjusted based on potential improvements that you make throughout the course. All quizzes will be cumulative.
The following three rules will govern the changes to your quiz weights:
- If Quiz 02 > Quiz 01, 5 percentage points shift from Quiz 01 to Quiz 02.
- If Quiz 03 > Quiz 01, 5 percentage points shift from Quiz 01 to Quiz 03.
- If Quiz 03 > Quiz 02, 5 percentage points shift from Quiz 02 to Quiz 03.
In the most extreme case, where a student makes consistent improvements, the resulting quiz weights would be:
- Quiz 01: 5%
- Quiz 02: 10%
- Quiz 03: 30%
Fill in the blanks and run the following code to calculate your quiz weights.
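What follows is a minimal sketch of the three rules, not the course's official calculator. The function name `quiz_weights` and its `shifts` parameter are assumptions made for illustration. One caveat: applied literally, three 5-point shifts give a consistently improving student weights of 5, 15, and 25. The 5/10/30 split listed above would correspond to a 10-point shift in the third rule. The text does not settle which is intended, so the shift sizes are left as a parameter.

```python
def quiz_weights(q1, q2, q3, base=15.0, shifts=(5.0, 5.0, 5.0)):
    """Return the adjusted (Quiz 01, Quiz 02, Quiz 03) weights in percentage points.

    shifts holds the points moved by rule 1, rule 2, and rule 3, in that order.
    The defaults follow the rules exactly as written (assumption: 5 points each).
    """
    w1 = w2 = w3 = base
    shift_1_to_2, shift_1_to_3, shift_2_to_3 = shifts

    if q2 > q1:  # Rule 1: Quiz 02 beat Quiz 01
        w1 -= shift_1_to_2
        w2 += shift_1_to_2
    if q3 > q1:  # Rule 2: Quiz 03 beat Quiz 01
        w1 -= shift_1_to_3
        w3 += shift_1_to_3
    if q3 > q2:  # Rule 3: Quiz 03 beat Quiz 02
        w2 -= shift_2_to_3
        w3 += shift_2_to_3

    return w1, w2, w3


# Fill in your own quiz percentages here.
my_scores = (70, 80, 90)

print(quiz_weights(*my_scores))                     # (5.0, 15.0, 25.0)
print(quiz_weights(*my_scores, shifts=(5, 5, 10)))  # (5.0, 10.0, 30.0)
```

Whatever the scores, the three weights always sum to 45, so the quizzes never take weight away from homework or labs.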
Buffer Points
Except for quizzes, assignments that are autograded (homework and lab models) allow you to obtain what we call buffer points. Buffer points can help your grade, but are not extra credit. So how do buffer points work?
Suppose there were only four homework assignments and a student obtained scores of:
[100, 100, 105, 105]
These average to 102.5, but with buffer points, subscores cannot exceed 100, so their Homework subscore used for final grade calculations would be 100. Alternatively, suppose their scores were:
[95, 95, 105, 105]
Here, these average to 100, so their subscore for Homework would be 100.
This demonstrates that buffer points can help you get to a subscore of 100 if you lose points on some assignments of the same type, but cannot push a subscore past 100 to make up for points lost on assignments of a different type.
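The arithmetic in the two examples above can be sketched as an average followed by a cap. This is an illustration of the stated rule, not the grading code itself. The function name `category_subscore` is hypothetical.

```python
def category_subscore(scores, cap=100.0):
    """Average equally weighted assignment scores, then cap the result at 100."""
    if not scores:
        raise ValueError("need at least one score")
    average = sum(scores) / len(scores)
    return min(average, cap)


print(category_subscore([100, 100, 105, 105]))  # 100.0 (average 102.5, capped)
print(category_subscore([95, 95, 105, 105]))    # 100.0 (average is exactly 100)
print(category_subscore([80, 90, 105, 105]))    # 95.0
```

The cap is applied once to the category average, not to each assignment, which is why a 105 can offset a 95 within the same category.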
Specific details for obtaining buffer points can be found in the relevant homework and lab policy documents.
Grading Scale
| A | B | C | D |
---|---|---|---|---|
Plus | 99 | 87 | 77 | 67 |
Neutral | 93 | 83 | 73 | 63 |
Minus | 90 | 80 | 70 | 60 |
The instructor reserves the right to lower, but not raise, letter grade cutoffs. However, this policy should not create an expectation that this will happen. Asking for a change in cutoffs will make any change in cutoffs less likely. Grading in the course is not competitive. There is nothing (other than some statistical realities) that would prevent the entire class from receiving a grade of A.
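As a rough sketch, the table above can be read as a lookup from a course percentage to a letter grade. Two assumptions are baked in and are not stated in the table: each cutoff is treated as an inclusive lower bound, and anything below 60 is treated as an F. The name `letter_grade` is hypothetical. No rounding is applied, matching the policy that overall percentages are not rounded.

```python
# Cutoffs from the grading scale table, highest first.
CUTOFFS = [
    (99, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def letter_grade(percentage):
    """Map an unrounded course percentage to a letter grade."""
    for cutoff, letter in CUTOFFS:
        if percentage >= cutoff:  # assumption: cutoffs are inclusive
            return letter
    return "F"  # assumption: the table lists no grade below 60


print(letter_grade(93))     # A
print(letter_grade(92.99))  # A-
print(letter_grade(59.9))   # F
```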
Final letter grades will be posted to Canvas as soon as is reasonably possible.
Grade Calculator
When using this calculator, enter all grades as a percentage, as a number between 0 and 100.
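A minimal stand-in for the calculator, assuming it simply combines category subscores using the assessment weights table. The function name `course_percentage` and its argument names are hypothetical. Subscores should already have the buffer-point cap applied, and `quiz_weights` should hold whatever adjusted weights apply to you (15 each by default).

```python
def course_percentage(homework, lab_models, lab_reports, quizzes,
                      quiz_weights=(15, 15, 15)):
    """Weighted course percentage from category subscores, each in [0, 100].

    quizzes is a tuple of the three quiz percentages, in order.
    """
    subscores = [homework, lab_models, lab_reports, *quizzes]
    if any(not 0 <= s <= 100 for s in subscores):
        raise ValueError("enter every grade as a number between 0 and 100")

    # Homework 25, Lab (Models) 20, Lab (Reports) 10, from the weights table.
    total = 25 * homework + 20 * lab_models + 10 * lab_reports
    total += sum(w * q for w, q in zip(quiz_weights, quizzes))
    return total / 100


print(course_percentage(90, 80, 70, (60, 70, 80)))               # 77.0
print(course_percentage(90, 80, 70, (60, 70, 80), (5, 15, 25)))  # 79.0
```

The second call shows the effect of the adaptive quiz weights: the same scores are worth two more points overall once weight moves toward the later, stronger quizzes.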
Grade Disputes
If you feel an assignment was graded incorrectly, you have one week from the date you received a grade for the assignment to discuss it with the instructor. Do not bring grade disputes to any other course staff such as teaching assistants. Teaching assistants do not have authority to modify grades.
You may not simply ask for a re-grade, but instead must justify to the instructor why the grading was done incorrectly. By disputing any grading, you agree to allow the instructor to review the entire assessment in question for other errors missed during grading. Requests must be sent via email.9 Grade disputes over trivial points will likely be met with frustration.10
After one week, grading is final except for exceptional circumstances.
Academic Integrity
The official University of Illinois policy related to academic integrity can be found in Article 1, Part 4 of the Student Code. Section 1-402 in particular outlines behavior which is considered an infraction of academic integrity. These sections of the Student Code will be upheld in CS 307. Any violations will be dealt with in a swift, fair, and strict manner. In short, do not cheat; it is not worth the risk. You are more likely to get caught than you believe. If you think you may be operating in a grey area, you most likely are.
Additional Information
Safety
The university values your safety. Please review the Run-Hide-Fight documentation provided by the Division of Public Safety.
Disability Accommodations
To obtain disability-related academic adjustments or auxiliary aids, students with disabilities must contact the course instructor and the Disability Resources and Educational Services (DRES) as soon as possible. To contact DRES, you may visit 1207 S. Oak St., Champaign, call 217-333-4603, email disability@illinois.edu or go to the DRES website.
To ensure appropriate accommodation is provided in a timely manner, please provide your Letter of Accommodation during the first week of class. Letters received after a relevant assessment has been administered will likely cause logistical issues that could result in an inability to accommodate.
The Extended Syllabus
For some thoughts on teaching philosophy, some explanation of policies, and some general tips for success, please see The Extended Syllabus.
Changes
The instructor reserves the right to make any changes he considers academically advisable. Such changes, if any, will be announced. Please note that it is your responsibility to keep track of the course proceedings.
Footnotes
1. David is listed in the syllabus because that is an official name on record with the University, but Dave is preferred.
2. If you aren’t already, you should get into the habit of checking your University email at least once a day.
3. A total of four suggested times.
4. Accounts registered with an email other than an @illinois.edu account will be removed.
5. Depending on the situation, failure to follow this rule may make a response impossible.
6. “Google” here refers to any search engine, and now, in the year 2024, a generative AI could also be used.
7. PrairieLearn for everything except for Lab Reports which are submitted to Canvas.
8. Or, use the calculator below.
9. Failure to follow the email policy will result in your request being denied.
10. A grade on a single assignment is not reflective of your overall grade in the course. The generous buffer points should more than make up for a single point deduction on a single assignment.