Project Assignments

Anna Sapienza edited this page Oct 13, 2021 · 7 revisions

Project Assignments Overview

The point of the Project Assignments is to try out the skills you've learned in the course on your own dataset. We've been working on understanding networks and natural language processing, so the idea is to find a dataset to analyze that will let you show off what you can do.

This year, you should find a dataset from a Wiki source, but remember to work with something that interests you - that way the project will be much more fun to work on.

Here are some examples of Wiki datasets that might be fun to work with

You will be working together in groups just as for the first two assignments.

Project Assignment A

The first part of the final project is a 3-minute movie, which should explain the central idea/concept that you will investigate in your final project. You're making the movie so that the TAs, Anna, and I can give you feedback, and so that other groups can 'steal' your ideas (and you can steal ideas from them). The movie must contain the following:

  • An explanation of the central idea behind your final project (What is the idea? Why is it interesting? Which datasets did you need to explore the idea? How did you download them?)
  • An outline of the elements you'll need to reach your goal, and the implementation plan.
  • A walk-through of your preliminary data-analysis, addressing
    • What is the total size of your data? (MB, number of rows, number of variables, etc.)
    • What is the network you will be analyzing? (Number of nodes? Number of links? Degree distributions? What are the node attributes? Etc.)
    • What is the text you will be analyzing?
    • How will you tie the two together?

But other than that, there are no constraints on the video format. And we do appreciate funny/inventive/beautiful movies, although the academic content is most important. Note that we'll display the movie to the entire class.

I've put some example videos here for your viewing pleasure https://github.com/suneman/socialgraphs2016/wiki/Videos

Handing in the assignment: Simply upload your video to YouTube (the higher the resolution the better) and submit the link on Peergrade.

Note that since Project Assignment A now requires significant data-work, you have 2 weeks to create the video presentation.
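The preliminary network numbers asked for above (number of nodes, number of links, degree distribution) can be computed in a few lines. The following is only a minimal sketch using the Python standard library: the edge list is a made-up toy example, `basic_network_stats` is a hypothetical helper name, and links are assumed to be undirected. Your own network may well be directed or weighted, and a dedicated network library may serve you better.

```python
from collections import Counter

# Hypothetical toy edge list; in a real project the edges come from your Wiki data.
edges = [
    ("Page A", "Page B"),
    ("Page A", "Page C"),
    ("Page B", "Page C"),
    ("Page C", "Page D"),
    ("Page B", "Page A"),  # duplicate of the first link in the other direction
]


def basic_network_stats(edge_list):
    """Return node count, link count and degree distribution.

    Assumes an undirected network; duplicate links and self-loops are dropped.
    Nodes without any link never appear in an edge list, so they are not counted.
    """
    unique_links = {frozenset(edge) for edge in edge_list if edge[0] != edge[1]}

    # Degree of a node = number of distinct links it takes part in.
    degree = Counter()
    for link in unique_links:
        for node in link:
            degree[node] += 1

    # Degree distribution: how many nodes have degree k.
    distribution = Counter(degree.values())

    return {
        "nodes": len(degree),
        "links": len(unique_links),
        "degree_distribution": dict(sorted(distribution.items())),
    }


stats = basic_network_stats(edges)
print(stats)
```

The same dictionary can go straight into the video or the notebook as a small summary table.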

Project Assignment B

The deliverables for the final project will be:

  • A website. The website should contain your analysis and tell the story about the data that you're interested in getting across. The website should not be technical, but rather aim at using visualization and explanation to get your insights across to a non-scientific reader.
  • An explainer Notebook. The Notebook should contain all the behind-the-scenes stuff: details on the dataset, why you've selected this particular dataset, explanations of your choices regarding network analysis, etc. You should link to the notebook from the site.

The idea is that you can create much more complex, fun and interactive analysis (and visualizations) online. So the website is a way for you to present your work so that everyone can understand it, including dynamic visualizations, interactive analysis, etc. that would not work on a piece of paper. (Also, it'll hopefully be something cool you can show your friends <-- sorry, I know I'm a nerd).

More about the website

This part of the assignment is quite free. The main point of the website is to present your idea/analyses to the world in a way that showcases your use of what you've learned in class. It can be as simple as an old-fashioned static web page, or as complicated as you want it to be. Let your creativity run wild (but keep in mind that this is not a coding class - we care mostly about content and analysis).

The website should be self-contained and tell the story of your dataset without the need for the Explainer Notebook (the purpose of the notebook is to provide additional details for interested readers). Here are some requirements:

  • The page should say clearly what the dataset is and give the reader some idea of its most important properties (kind of Project Assignment A-style).
  • The page should contain your network and text analysis (that's the main part).
  • There should be download options for data sets (so the user can play around).
  • You must link to the Explainer Notebook (more details below) that explains the details of your analysis (including all of the machine learning, the model selection, etc). You can achieve this with a link to an IPython notebook displayed on nbviewer.

For hosting, I recommend using your DTU website or GitHub Pages.

More on the explainer notebook

The notebook should contain your analysis and code. Please structure it into the following 5 sections:

  1. Motivation
    • What is your dataset?
    • Why did you choose this/these particular dataset(s)?
    • What was your goal for the end user's experience?
  2. Basic stats. Let's understand the dataset better
    • Write about your choices in data cleaning and preprocessing.
    • Write a short section that discusses the dataset stats (here you can recycle the work you did for Project Assignment A).
  3. Tools, theory and analysis. Describe the process of theory to insight
    • Talk about how you've worked with text, including regular expressions, unicode, etc.
    • Describe which network science tools and data analysis strategies you've used, how those network science measures work, and why the tools you've chosen are right for the problem you're solving.
    • How did you use the tools to understand your dataset?
  4. Discussion. Think critically about your creation
    • What went well?
    • What is still missing? What could be improved? Why?
  5. Contributions. Who did what?
    • You should write (just briefly) which group member was mainly responsible for which elements of the assignment. (I want you guys to understand every part of the assignment, but usually someone takes the lead role on certain portions of the work. That's what you should explain.)
    • It is not OK simply to write "All group members contributed equally".
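As an illustration of the kind of text handling meant under "regular expressions, unicode" above, here is a minimal sketch that pulls link targets out of wiki markup and normalizes their unicode form. It assumes MediaWiki-style links of the form `[[Target]]`, `[[Target|label]]` or `[[Target#Section|label]]`; the pattern `WIKILINK`, the helper `extract_links` and the sample text are all made up for this example, and your Wiki source may use different markup. Note that namespace links such as `[[File:...]]` or `[[Category:...]]` are not filtered out here.

```python
import re
import unicodedata

# Assumed MediaWiki-style link syntax. Captures only the target page title:
# an optional "#Section" part and an optional "|label" part are skipped.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def extract_links(wikitext):
    """Return the link targets found in wikitext, in order of appearance."""
    links = []
    for match in WIKILINK.finditer(wikitext):
        # NFC normalization makes "e" + combining diaeresis equal to "ë",
        # so the same page title always maps to the same node name.
        target = unicodedata.normalize("NFC", match.group(1)).strip()
        if target:
            links.append(target)
    return links


# Made-up sample; the first title is written with combining characters on purpose.
sample = (
    "[[Zoe\u0308 Saldan\u0303a|Zoe]] appears in "
    "[[Avatar (2009 film)#Cast|Avatar]] and [[Star Trek]]."
)
links = extract_links(sample)
print(links)
```

Each extracted target can then serve as a node name, which is one way of tying the text and the network together.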

Some additional notes:

Make sure that you use references when they're needed and follow academic standards.

I envision Part 3 (Tools, theory and analysis) as the central part of the assignment, where you basically go through the steps in the analysis. So the structure of this part would be something like:

  • Explain the overall idea
  • Analysis step 1
    1. explain what you're interested in
    2. explain the tool
    3. apply the tool
    4. discuss the outcome
  • Analysis step 2
    1. explain what you're interested in
    2. explain the tool
    3. apply the tool
    4. discuss the outcome
  • Analysis step 3
    • ... and so on until the analysis is done