---
layout: page
title: Syllabus
weight: 0
---
<section class="main-container page-head">
<div class="main">
<!-- <p class="head-subtitle link"><a href="syllabus_2020.pdf" target="_blank">PDF version</a></p>-->
</div>
</section>
<section class="main-container text">
<div class="main">
<h2 class="title">Course Description</h2>
<p>CS 181 provides a broad and rigorous introduction to machine
learning, probabilistic reasoning and decision making in uncertain
environments. We will discuss the motivations behind common machine
learning algorithms, and the properties that determine whether or not
they will work well for a particular task. You will derive the
mathematical underpinnings for many common methods, as well as apply
machine learning to challenges with real data. In doing so, our goal
is that you gain a strong conceptual understanding of machine learning
methods that can empower you to pursue future
theoretical and practical directions. </p>
<h3>Where CS181 fits with other ML/AI courses</h3>
<p> The goal of CS 181 is to combine mathematical derivation and coding assignments
to provide a strong and rigorous conceptual grounding in
machine learning (e.g. being able to reason about how different
methods should behave in different circumstances). Students
interested primarily in theory may prefer Stat195 and other learning
theory offerings. Students interested primarily in practice may
prefer CS109a and other data science offerings. Students
interested in a more advanced, optimization-based orientation
may prefer CS 183. Students looking for specialized topics may prefer CS28x and other graduate
seminars.</p>
<h3>Prerequisites</h3>
<p>The material is aimed at an <em>advanced
undergraduate level</em>. Students should be comfortable with writing
non-trivial programs (e.g., CS 51, CS 61, or equivalent). All staff-provided
code will be in Python. Students should also have a background in
probability theory (e.g., STAT 110 or equivalent), and
familiarity
with calculus and linear algebra
(e.g., AM 22a or Math 21ab, or equivalent).
</p>
<p>
Motivated students without all of these prerequisites may also be able to
fill in gaps in their knowledge. Part I of
<a href="https://mml-book.com" target="_blank">Math for Machine Learning</a>
is a useful resource for mathematical
background (specifically Sections 2.1-2.6; 3.1-3.5; 4.1-4.2; 5.1-5.6;
6.3).
This year we are also planning additional homework zero
style material as well as additional sections throughout the semester
to help with mathematical background.
</p>
<h2 class="title">Course Logistics</h2>
<h3>Lecture, Section, Office Hours</h3>
<p>
<strong>Team</strong>
The CS181 team consists of two course instructors,
Finale Doshi Velez and David Parkes, as well as a large
staff of TFs led by two
co-head TFs. We are all dedicated to helping you learn the
fundamentals of machine learning.
</p>
<p>
<strong>Lectures</strong>
Lectures will be used to introduce new content as well as explore the
content through conceptual questions. They will be given over
Zoom and also recorded, and involve both slides and iPad-based
discussion.
We plan two sessions each week, in the designated class time. The
instructors will endeavor to hang around after class to answer
questions. Students are encouraged to attend live
so that they can ask questions, including over chat.
<em>We recognize that not everyone is comfortable with using their
video camera during class. But we strongly encourage this and would
like 'video on' to be a norm.</em><br>
Attending live lecture is an expectation of CS 181 students. You are expected to attend at least 7 live lectures (~1/3 of lectures) with your camera turned on, unless your circumstances don’t permit. If you would like to request an attendance exemption, see <a href="https://edstem.org/us/courses/3955/discussion/223627">this Ed post</a> for instructions.<br>
<!--During lecture, I may remind the class
about upcoming deadlines, clarify points in the homework, and respond
to questions about upcoming assignments and midterms. Not all of
these interactions may make it to the class announcements. Thus, if
you miss a lecture, we strongly recommend asking friends about
anything that was mentioned.-->
</p>
<p>
<strong>Sections</strong>
Sections will employ a flipped classroom format, in which students
will work on questions that will be good preparation for
homework and the midterms. The teaching staff will introduce the
questions, assist students in solving them, and wrap up with the
solutions. These solutions will be posted.
The section cycle “restarts” each Monday, when a new section begins. Each week’s section covers the previous week's Tuesday and Thursday lectures. For example, the sections from 3/1 to 3/4 cover content from the lectures on Tuesday, 2/23 and Thursday, 2/25.
<!-- <em>The staff may post
additional practice questions or pointers to other practice
resources. We do not guarantee solutions for these additional
resources. </em>-->
</p>
<!-- <p>
While sections attendance is optional, attendance will be taken and
strong participation is one way we may choose to decide letter grades
for students who are near a boundary. Section is also a great place
to find study partners!
</p>-->
<p>
<strong>Office Hours</strong>
We will be holding a lot of office hours on Zoom. Please make use
of these office hours! We plan to structure them with
different break-out rooms per problem set question or
topic of interest.
<!--In addition to getting questions answered by the
staff, office hours are also a great place to find study partners.-->
</p>
<p><strong>Zoom Policy</strong>
We strongly prefer you participate on Zoom with your camera turned on, unless your circumstances don’t permit.</p>
<p><strong>Tablet</strong>
We will expect students to have access to a tablet, to help with
communication with staff and each other during office hours and sections, and have notified
the Office of Undergraduate Education about this so that they can be
prepared to help if you do not have access to one.</p>
<h3>Materials and Resources</h3>
<p>
<strong>Textbook</strong>
There is no official textbook for the course. There is a set
of course notes available
<a href="https://github.com/harvard-ml-courses/cs181-textbook"
target="_blank">here</a>. We should emphasize, though, that these are due to the awesome effort of a past
CS181 student who decided to create a course textbook as an
(unusually ambitious!) senior
thesis. There may still be some bugs, and if you find any
please be a good
citizen and put in a pull request.
</p>
<p>
<strong>Course Website</strong>
The course web site will be used for posting
section notes and links to assignments, and includes pointers to other
resources we'll use, including Ed and GradeScope.
</p>
<p>
<strong>GradeScope</strong>
GradeScope will be used for submitting assignments and
posting grades.
</p>
<p>
<strong>Ed</strong>
Most communications with the course staff should go via <a href="https://edstem.org/us/join/6mkCz8">Ed</a>
rather than email. In particular, the Ed site for the course will be used for three purposes:
<ul>
<li>
<strong>Content</strong> questions are technical questions posted to
the teaching staff and other students. (Please keep in mind collaboration policies when
asking about code or solutions.) <!-- The course staff will <em>not</em>
be responsible for immediate responses but will answer when
possible; technical questions to TFs should be brought to office
hours (or to section when appropriate).-->
</li>
<li>
<strong>Clarification</strong> questions are posts
to the teaching staff about logistical
details (Is there really class on XYZ holiday or is that a mistake?)
or questions about homework phrasing or typos (Should question 1a of
the homework be asking for the integral of x, not y?). We will
make every effort to respond to these questions as quickly as
possible. Tag these questions as "clarification."
</li>
<li>
<strong>Private Message</strong> These may include procedural things such
as requests for additional time on midterms, additional late days,
regrades; you may also have other concerns that you wish to share.
We ask you to send those as <em>private</em> messages on Ed with
the <em>appropriate tag</em>: Regrade, Extension, Special Midterm, and
Other. We will be using these tags to make sure that the right
people get your request. Most such procedural
requests should come this way.
</li>
</ul>
</p>
<p>
Ed is not a formally secure, private, or
confidential form of communication, and what you send may be seen by
the entire course staff. If you have a sensitive concern,
<em> please also directly email the
two co-instructors</em>.
</p>
<h2 class="title">Requirements and Grading</h2>
<p>The main grading components of the course are the
<strong>six homework assignments</strong>
(<strong>10% each</strong>), <strong>one
practical</strong> (<strong>10%</strong>), and <strong>two midterms</strong> (<strong>15% each</strong>, in March and April).
Participation in section, office hours, Ed, and lecture may be used to bump
up the grade of a student who ends up near a
letter-grade boundary. Similarly, any bonus component of the
course, such as an exceptionally creative practical solution,
will only be a factor for students on grade boundaries.</p>
<p>
<strong>Grading errors</strong> If you believe there has been a grading error,
submit a regrade request through GradeScope. However, please note that a) we will regrade the
entire assignment, which may result in your total grade going up or down, and b) we will
only allow one regrade request per problem set. Regrade requests are due 1 week after grades are released.
</p>
<h3 id="homework">Homeworks</h3>
<p>The homework assignments help you practice the core
concepts that we cover in the course. They involve
components that are theoretical and conceptual and also require some
programming. Homework solutions must be submitted in LaTeX and will
be returned with grades and solutions. Due to the volume of the
grading, it may not always be possible for the staff to provide
detailed feedback. It is your responsibility to look at the
solutions, identify gaps, and come to office hours to fill
in those gaps. We also have one "practical" assignment, which can be
done with one other student and is more
open-ended in nature. You will be asked to explore
different machine learning algorithms on a particular data
set, with a passing grade for beating some baselines and
bonuses for an especially creative or successful
approach. </p><br>
<p> <strong>Collaboration Policy</strong>
You may work with others, but your write-up must be entirely written
by yourself in your own words. You may help each other debug code,
but again, the code must be written by you. Include the names of
anyone you worked with in your write-up. <!-- We encourage you to spend
time thinking about and understanding the homework on your own before
collaborating with others to practice for the midterms.--> <em>It is
an honor code violation to copy parts of another person's assignment
or jointly type up an assignment.</em>
You can make use of textbooks and online sources to help in answering
questions, but you must cite your sources (and you should be ready to
explain your answer to a member of the teaching staff.) <em>It is
an honor code violation to look up solutions to the
specific questions that we ask from the internet
or other sources (e.g. friends from previous years).</em>
</p>
<p> <strong>Late Days Policy</strong> Homework should be submitted electronically
on the due date, via the Gradescope course website. This
is a strict deadline, enforced by the site, so submit early enough
that you don't accidentally discover that your local clock is slow.
You have <strong>six late days</strong> that can be used for homework
assignments. Up to <strong>two late days</strong> can be used on any assignment.
Start early and plan ahead! At their discretion, the staff will give 50% credit to assignments turned in
past their late days. <em> It is almost always in
your interest to turn in partial or late homework rather than not
turning in any homework at all. It is an honor code violation to
look at the solutions if you haven't yet turned in your
assignment.</em>
</p>
<p> <strong>Sickness (and other Life Events) Policy</strong>
In general, we expect you to use your late days when you are
sick.
<!--
The whole purpose of a general late day policy is to reduce burden on
the staff (we don't have to adjudicate what is sick enough, what are
valid reasons e.g. travel, family events, etc. for an extension) as
well as allow you some privacy around those decisions.
-->
At the same time, we understand that
sometimes life throws a set of circumstances at you that impact your
performance in the course, and all the more so given
the current global pandemic and working and studying conditions.
Should this become a problem for you, please
let the two co-instructors know, via email, so that we can help determine a plan to navigate a
tough situation.
If you find that you have used up all your late days, for
example, and have
further illness, then please reach out to us. Most likely, we will
ask you to start a correspondence with your resident dean to verify their
support of your extra needs (we would not need a
doctor's note, but rather your resident dean would provide
us with what we need to know to appropriately adjust).</p>
<h3>Midterms</h3>
<p>
Midterms are a chance to demonstrate what you have
learned. Midterms will be closed-book, timed (1hr 20 mins), and proctored via Zoom breakout rooms.
We’ll have two time windows to handle different time zones and load balance.
<!--
Section problems, homework, and concept questions are all great
starting points for study. You will be allowed to bring in one sheet
of 8.5 by 11 paper, front and back, as notes and the staff
may,-->
To the extent possible, we will also provide
you with what we think you need to be able to answer the questions
without needing to memorize too many things. <em>It is an honor code
violation to communicate with anyone about the midterm while
you take the midterm, and to communicate in any way with other
students. You should also be careful not to share information about the midterm with any students
who need to take a midterm at a different time.</em><br>
<strong>Illness</strong>
If you have an acute illness at the time of a
midterm, then you must let the co-instructors know in advance
of the midterm and get a doctor's note and
send it to us as soon as possible.
We will likely also follow up
with your resident dean and determine the best way to handle
the situation.
</p>
<h2 class="title">Philosophy</h2>
<p>
The goal of the course is to instill a strong technical background
for you to robustly, successfully, and responsibly apply machine learning in the world. Thus, in
addition to the derivations and the practical components, each class will
include some illustrations and discussion of real world
applications of machine learning. There will also be a lecture
and part of an assignment
that is devoted to the
ethical implications of machine learning as part of
the Embedded EthiCS program.
</p>
<p>
Given the increasing use of machine learning systems,
the users and developers of these systems must hold themselves to high
professional and ethical standards. One can cause real harm by
pursuing a good cause via poor engineering choices. Quoting one of
our favorite superheroes: with great power (to run any kind of
analysis) comes great responsibility (to do it properly)!
</p>
<p>
Relatedly, we expect all participants in this course (co-instructors,
teaching staff, and students) to be committed to an open, professional, and
inclusive environment. We want everyone to be comfortable
in the course and empowered to learn. These qualities take
cultivation and effort.<!-- I will start with the premise that we're all
decent people trying our best and expect you to do the same. --> We
welcome constructive feedback on improving the course
environment and want you to reach out to the two
co-instructors, or members of the teaching staff, with any
concerns.
</p>
</div>
</section>