Commit

Update group-project-page.md
gcohenfr authored Sep 7, 2023
1 parent 69f55d9 commit d953564
Showing 1 changed file with 63 additions and 42 deletions.
105 changes: 63 additions & 42 deletions docs/group-project-page.md
@@ -8,9 +8,11 @@ For this project, you will need to formulate a research question of yo
and then identify and use a dataset to answer the question. We list some suggested datasets below; however, we encourage you to use another dataset that interests you. If you are unsure whether your dataset is adequate, please reach out to a member of the
teaching team.

### Deliverable 1: Team Contract
**Important Note:** Although this is a group project, some related assignments will be submitted individually. You can (and are encouraged to) discuss it with your group members. However, *every* student will submit their own assignment and will receive an individual grade.

A group contract is a document to help you formalize the expectations you have for your group members and what they can expect of you. It will help you think about what you need from each other to work effectively as a team! You will create and agree on this contract as a team. **Each member should "sign" (you can just type out your name) at the bottom of the submission**. At a minimum, your group contract must address the following:
### Deliverable 1: Team Contract (1% weight)

A **group contract** is a document to help you formalize the expectations you have for your group members and what they can expect of you. It will help you think about what you need from each other to work effectively as a team! You will create and agree on this contract as a team. **Each member should "sign" (you can just type out your name) at the bottom of the submission**. At a minimum, your group contract must address the following:

#### Goals

@@ -28,61 +30,53 @@ What rules can we agree on to help us meet our goals and expectations?

How will we address non-performance regarding these goals, expectations, policies and procedures?

### Deliverable 2: Project Proposal
#### Data:

Each group is expected to prepare a written proposal within 500 words
(about 1 page) with a research question of interest, and the dataset they plan to use.
The proposal should be done in a Jupyter notebook, and
then submitted both as an `.html` file (`File` → `Download As` → `HTML`) and
an `.ipynb` file that is reproducible (i.e. works and runs without any additional files).
Many data analyses start with a question or questions we are interested in. Then we find or collect data to answer the question(s). Decide with your team members on a topic of interest you want to work on and identify a dataset to work with. Include a brief description of the dataset (2 or 3 sentences) and a link or reference on how to access the dataset. Note that in the next individual assignment you will be asked to fully describe the dataset.

Only one member of your team needs to submit. You must submit **two files**:

### Individual assignments

Although this is a group project, the following 3 assignments related to a dataset chosen by the group will be submitted individually. You can (and are encouraged to) discuss them with your group members. However, you don't need to come up with a common solution: *every* student has to submit their own assignment and will receive an individual grade.

All assignments must be done in a Jupyter notebook, and then submitted both as an `.html` file (`File` → `Download As` → `HTML`) and an `.ipynb` file that is reproducible (i.e. works and runs without any additional files).

#### Assignment 1: Data and Question(s)

*This is an individual assignment*. Every student needs to write and submit their own assignment. You must submit **two files** in Canvas:

- the source Jupyter notebook (`.ipynb` file)
- the rendered final document (`.html` file)

Each proposal should include the following sections:
#### Data:

- Title
- Introduction
- Exploratory Data Analysis
- Methods: Plan
- References
In the contract, you and your group identified a dataset to work with. In this assignment, provide a full description of the chosen dataset.

#### Introduction
Note that the selected dataset will probably contain more variables than you need. In fact, exploring how the different variables in the dataset affect your model may be a crucial part of the project. Regardless of which variables you plan to use, provide a full descriptive summary of the dataset. This is *not* an Exploratory Data Analysis, just a characterization of the data. Include information such as the number of observations, the number of variables, and the name and type of each variable. You may want to use a table or bullet points to describe the variables in the dataset.
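
As an illustration of the kind of characterization described above, here is a minimal sketch in base R. It assumes the built-in `mtcars` data frame as a stand-in for your group's dataset; the names `dat` and `var_summary` are hypothetical and only used for this example.

```r
# Stand-in data: `mtcars` ships with base R. Replace it with your group's dataset.
dat <- mtcars

n_obs  <- nrow(dat)  # number of observations (rows)
n_vars <- ncol(dat)  # number of variables (columns)

# One row per variable: its name, its type, and how many values are missing.
var_summary <- data.frame(
  variable  = names(dat),
  type      = vapply(dat, function(x) class(x)[1], character(1)),
  n_missing = vapply(dat, function(x) sum(is.na(x)), integer(1)),
  row.names = NULL
)

print(c(observations = n_obs, variables = n_vars))
print(var_summary)
```

A table like `var_summary` is only a starting point; you would still add a short written description of what each variable measures and how the data were collected.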

Begin by providing some relevant background information on the topic so
that someone unfamiliar with it will be prepared to understand the rest
of your proposal.
Include a brief description of the dataset indicating how the data has been collected or where it comes from.

Clearly state the question you will try to answer with your project.
Your question should involve one random variable of interest (the response) and one or more explanatory variables.
Of the response variable, explain whether your project is focused on prediction, inference, or both.
#### Question:

Identify and describe the dataset that will be used to answer the
question. Remember, this dataset is allowed to contain more variables
than you need; in fact, exploring how the different variables in the dataset affect your model
is a crucial part of the project.
Clearly state the question you will try to answer using the selected dataset. Your question should involve one random variable of interest (the response) and one or more explanatory variables. Describe clearly how the data will help you address the question of interest. Explain whether your question is focused on prediction, inference, or both.

Also, be sure to frame your question/objectives in terms of what is
already known in the literature. Be sure to include at least two
scientific publications that can help frame your study (you will need to
include these in the References section).
It is fine to have the same question as other group members. However, you don't need to agree on a unique common question for the group project. In fact, usually many questions can be answered with the same dataset. Regardless of how many questions are proposed within each group, *each team member* needs to state and justify a question of interest.

#### Preliminary Results
#### Assignment 2: Exploratory Data Analysis and Visualization

In this section, you will:
In this assignment, you will:

- Demonstrate that the dataset can be read from the web into R.
- Clean and wrangle your data into a tidy format.
- Plot the relevant raw data, tailoring your plot in a way that addresses your question.
    - Make sure to explore the association of the explanatory variables with the response.
    - Your Exploratory Data Analysis (EDA) must be comprehensive, with high-quality plots.
- Include any summary tables that are relevant to your analysis.
- Propose a visualization that you consider relevant to address your question or to explore the data.
    - Propose a high-quality plot or a set of plots of the same kind (e.g., histograms of different variables).
    - Explain why you consider this plot relevant to address your question or to explore the data.
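
The steps in the list above can be sketched as follows. This is only an illustration in base R, assuming the built-in `mtcars` data as a stand-in; the commented-out URL is a placeholder, and the choice of `mpg` as the response and `wt`, `hp`, `disp` as explanatory variables is hypothetical.

```r
# With real data you would read the file from the web, for example:
# dat <- read.csv("https://example.com/your-dataset.csv")  # placeholder URL
dat <- mtcars  # stand-in so that this sketch runs without extra files

response    <- "mpg"                  # hypothetical response variable
explanatory <- c("wt", "hp", "disp")  # hypothetical explanatory variables

# Numeric summary: correlation of each explanatory variable with the response.
assoc <- vapply(
  explanatory,
  function(v) cor(dat[[v]], dat[[response]]),
  numeric(1)
)
print(round(assoc, 2))

# A set of plots of the same kind: one scatterplot per explanatory variable.
op <- par(mfrow = c(1, length(explanatory)))
for (v in explanatory) {
  plot(dat[[v]], dat[[response]],
       xlab = v, ylab = response,
       main = paste(response, "vs", v))
}
par(op)  # restore the previous plotting layout
```

Whatever plot you choose, the written justification matters as much as the figure: state what the plot shows about the relationship between the variables and why that is relevant to your question.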

Be sure not to print output that takes up a lot of screen space.
**Note**: this visualization does not have to illustrate the results of a methodology. Instead, you are exploring which variables are relevant, potential problems that you anticipate encountering, groups in the observations, etc.

It is fine to have the same idea as other group members. However, you don't need to agree on a unique common visualization for the group project. In fact, usually the exploratory data analysis will have many different visualizations! Regardless of how many plots are proposed within each group, *each team member* needs to propose one visualization and justify their choice.

#### Methods: Plan
#### Assignment 3: Methods and Plan

The previous sections will carry over to your final report (you’ll be
allowed to improve them based on feedback you get). Begin this *Methods*
@@ -147,13 +141,40 @@ Only one member of your team needs to submit. You must submit **two files**:

#### Introduction

The instructions for this section are the same as in your proposal. Just be
sure to improve this section by incorporating feedback, and changing
things based on your own improved understanding of the project (now that
more time has passed since the proposal).
Just be sure to improve this section by incorporating feedback, and changing
things based on your own improved understanding of the project.

Begin by providing some relevant background information on the topic so
that someone unfamiliar with it will be prepared to understand the rest
of your proposal.

Identify and describe the dataset that will be used to answer the
question. Remember, this dataset is allowed to contain more variables
than you need; in fact, exploring how the different variables in the dataset affect your model
is a crucial part of the project.

Also, be sure to frame your question/objectives in terms of what is
already known in the literature. Be sure to include at least two
scientific publications that can help frame your study (you will need to
include these in the References section).

#### Methods and Results

#### Preliminary Results

In this section, you will:

- Demonstrate that the dataset can be read from the web into R.
- Clean and wrangle your data into a tidy format.
- Plot the relevant raw data, tailoring your plot in a way that addresses your question.
    - Make sure to explore the association of the explanatory variables with the response.
    - Your Exploratory Data Analysis (EDA) must be comprehensive, with high-quality plots.
- Include any summary tables that are relevant to your analysis.

Be sure not to print output that takes up a lot of screen space.



Here is where you’ll include your work from the “Preliminary Results” in
your proposal, along with the additional results you planned to conduct,
as indicated in the “Methods: Plan” section of your proposal. Be sure to