
Syllabus

aronwc edited this page Oct 3, 2013 · 38 revisions

Overview

  • Course: CS595: Machine Learning and Social Media
  • Instructor: Dr. Aron Culotta
  • Meetings: T/R 1:50-3:05 p.m. Stuart 204
  • E-mail: aculotta at iit.edu
  • Phone: 312-567-5261
  • Office Hours: T/R 10:00 a.m. - 11:00 a.m.
  • Office: Stuart Hall 229B
Description: This seminar will explore the latest research developing machine learning methods to analyze online social media. Topics include: sentiment classification, information extraction, clustering, topic modeling, information diffusion, and social network analysis. Emphasis will be placed on the application of this technology to areas such as public health, crisis response, politics, and marketing.

Learning Objectives

By the end of the course, students will:

  • Understand the core technical problems in applying machine learning algorithms to social media data.
  • Understand the state-of-the-art approaches to these problems.
  • Have applied knowledge of this field to a novel, substantial course project.

Grading

  • 200 points - Paper summaries (10 @ 20 points each)
  • 100 points - Paper presentations (1 @ 100 points each)
  • 200 points - Project (1 @ 200 points each)
  • 500 total points
Percent   Grade
100-90    A
89-80     B
79-70     C
69-60     D
< 60      E
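
As a rough illustration of how the point totals map onto the letter scale, here is a minimal Python sketch. The point values and cutoffs come from the lists above; the function name `letter_grade` and the category keys are hypothetical, and the sketch assumes fractional percentages are not rounded up (the syllabus does not say how borderline scores are handled).

```python
# Maximum points per category, taken from the Grading list.
MAX_POINTS = {"summaries": 200, "presentation": 100, "project": 200}

# Lower bound (in percent) for each passing letter grade.
CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def letter_grade(earned):
    """Return (percent, letter) for a dict of points earned per category.

    Assumption: percentages are compared against the cutoffs without
    rounding, so 89.9 maps to "B".
    """
    total = sum(MAX_POINTS.values())  # 500 points
    percent = 100 * sum(earned.values()) / total
    for cutoff, letter in CUTOFFS:
        if percent >= cutoff:
            return percent, letter
    return percent, "E"


if __name__ == "__main__":
    # 180 + 85 + 170 = 435 of 500 points.
    print(letter_grade({"summaries": 180, "presentation": 85, "project": 170}))
```

For example, 435 of 500 points is 87 percent, which falls in the B range.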

Assignments

Paper Summaries

You will write 10 paper summaries (one per week). On the first day of class, the instructor will assign which papers you will write summaries for. Summaries are due the night before the paper is discussed. To complete these assignments, you should

  1. Find the paper to read from the Schedule (e.g., schulz13multi)
  2. Create a new file in the paper directory with your IIT email name (e.g., aculotta.md)
  3. Add your summary and click "Commit Changes"
I've included an example summary here. Each summary should contain the following:
  • Overview: Write a short paragraph summarizing the content of the paper.
  • Algorithm: Describe in more detail the primary algorithm proposed or applied in the paper.
  • Hypothesis: List the hypotheses the authors test in the paper (note that these are not always explicitly stated).
  • Data: Describe the data used in the experiments.
  • Experiments: Briefly describe how the experiments are organized.
  • Results: Describe the results and their significance.
  • Assumptions: List some of the important assumptions the authors make in their work.
  • Synthesis: Are there claims you disagree with? What would you do differently? What would you do next?
  • Related Papers: List 2-3 papers that are most similar to this paper. For each, briefly list how this paper is different.
Do not simply copy phrases from the paper! It should be evident that you have attempted to digest the content, so write the summary in your own words, filtered through the lens of your understanding.
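
If you prefer to draft your summary locally before committing it, a skeleton file can save some typing. The following is only a sketch: the section names come from the list above, but the `## ` heading style, the `TODO` placeholders, and the helper names `summary_skeleton` and `write_skeleton` are assumptions, not a required format. It also assumes you have a local copy of the paper directory.

```python
from pathlib import Path

# Section names from the summary requirements above.
SECTIONS = [
    "Overview", "Algorithm", "Hypothesis", "Data", "Experiments",
    "Results", "Assumptions", "Synthesis", "Related Papers",
]


def summary_skeleton(sections=SECTIONS):
    """Return Markdown text with one placeholder heading per section.

    Assumption: second-level Markdown headings; adjust to match the
    example summary if it uses a different layout.
    """
    return "\n\n".join(f"## {name}\n\nTODO" for name in sections) + "\n"


def write_skeleton(paper_dir, email_name):
    """Write <paper_dir>/<email_name>.md and return its path."""
    path = Path(paper_dir) / f"{email_name}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(summary_skeleton(), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Directory and file names follow the examples given in the steps above.
    print(write_skeleton("schulz13multi", "aculotta"))
```

Replace each `TODO` with your own text before you commit the file.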

Paper Presentation

For a subset of the papers that you write summaries for, you will also present your summary to the class and lead the discussion. The presentation should be a more detailed version of the summary. In addition to the components above, the presentation should contain discussion questions for the class.

Presentations should include some visual aid (PowerPoint or something similar). As the discussion leader, you should also prepare by reading some of the papers related to the discussion paper, to provide context.

Your presentations will be graded on:

  • How well did you summarize the material?
  • Did you understand the main points and conclusions of the paper?
  • Did you understand why the system works the way it does?
  • Did you bring in background information or related papers?
  • Did you ask questions that demonstrate an understanding of the content?
  • Did you engage the other students?
  • Did you enhance the students' understanding of the material?

Project

Teams of students will complete a final project that applies some ideas from the class. Teams can consist of at most two students. Here are some guidelines:

  • Your project should be hosted on GitHub. If you are on a team, pick one member's account.
  • Add a link to your project's GitHub repository here.
  • Add a descriptive README at the root of your project.
  • You may use off-the-shelf analysis tools in your work; no need to start from scratch. I've listed some examples under Software.
  • Here are some projects from a CMU course that are good examples to help you come up with ideas. I also recommend browsing the Data page.
The 200 points are broken down as follows:
  • 50 points - Presentation: Is the presentation clear, well-organized, and thorough?
  • 50 points - Code: Can I reproduce your results by running your code? Is the code well-written, debugged, and documented?
  • 50 points - Report: Follow a format similar to that of the papers we read for class. Your report should be 5-6 pages, including all references and figures. Are the main algorithms, hypotheses, and assumptions clearly stated? Are the comparisons with related work sound? Sections include:
    • Introduction: What did you do and why? What are the research questions of your analysis? What is your hypothesis?
    • Data: What data did you collect and how?
    • Methods: What did you do with the data, precisely?
    • Experiments: These should answer your research questions and test your hypotheses.
    • Related work: How have others approached this problem? What makes your approach different?
    • Conclusions and Future Work: What should we have learned from reading your paper? What's left to do?
  • 50 points - Scientific rigor: Are your claims supported by the experimental results? Have you attempted to rule out all other reasonable competing hypotheses? Are the experiments soundly developed and executed?