diff --git a/docs/group-project-page.md b/docs/group-project-page.md index cdbd386..9b4ccbf 100644 --- a/docs/group-project-page.md +++ b/docs/group-project-page.md @@ -8,9 +8,11 @@ For this project, you will need to formulate formulate a research question of yo and then identify and use a dataset to answer to the question. We list some suggested data sets below; however, we encourage you to use another data set that interests you. If you are unsure whether your dataset is adequate, please reach out to a member of the teaching team. -### Deliverable 1: Team Contract +**Important Note:** Although this is a group project, some related assignments will be submitted individually. You can (and are encouraged to) discuss them with your group members. However, *every* student will submit their own assignment and will receive an individual grade. -A group contract is a document to help you formalize the expectations you have for your group members and what they can expect of you. It will help you think about what you need from each other to work effectively as a team! You will create and agree on this contract as a team. **Each member should "sign" (you can just type out your name) at the bottom of the submission**. At a minimum, your group contract must address the following: +### Deliverable 1: Team Contract (1% weight) + +A **group contract** is a document to help you formalize the expectations you have for your group members and what they can expect of you. It will help you think about what you need from each other to work effectively as a team! You will create and agree on this contract as a team. **Each member should "sign" (you can just type out your name) at the bottom of the submission**. At a minimum, your group contract must address the following: #### Goals @@ -28,61 +30,53 @@ What rules can we agree on to help us meet our goals and expectations? How will we address non-performance regarding these goals, expectations, policies and procedures?
-### Deliverable 2: Project Proposal +#### Data: -Each group is expected to prepare a written proposal within 500 words -(about 1 page) with a research question of interest, and the dataset they plan to use. -The proposal should be done in a Jupyter notebook, and -then submitted both as an `.html` file (`File` → `Download As` → `HTML`) and -an `.ipynb` file that is reproducible (i.e. works and runs without any additional files). +Many data analyses start with a question or questions we are interested in. Then we find or collect data to answer the question(s). Decide with your team members a topic of interest you want to work on and identify a dataset to work with. Include a brief description of the dataset (2 or 3 sentences) and a link or reference on how to access the dataset. Note that in the next individual assignment you will be asked to fully describe the dataset. -Only one member of your team needs to submit. You must submit **two files**: + +### Individual assignments + +Although this is a group project, the following 3 assignments related to a dataset chosen by the group will be submitted individually. You can (and are encouraged to) discuss them with your group members. However, you don't need to come up with a common solution and *every* student has to submit their own assignment and will receive an individual grade. + +All assignments must be done in a Jupyter notebook, and then submitted both as an `.html` file (`File` → `Download As` → `HTML`) and an `.ipynb` file that is reproducible (i.e. works and runs without any additional files). + +#### Assignment 1: Data and Question(s) + +*This is an individual assignment*. Every student needs to write and submit their own assignment.
You must submit **two files** in Canvas: - the source Jupyter notebook (`.ipynb` file) - the rendered final document (`.html` file) -Each proposal should include the following sections: +#### Data: -- Title -- Introduction -- Exploratory Data Analysis -- Methods: Plan -- References +In the contract you identified a dataset to work on with your group. In this assignment, provide a full description of the dataset chosen. -#### Introduction +Note that the selected dataset will probably contain more variables than you need. In fact, exploring how the different variables in the dataset affect your model may be a crucial part of the project. Regardless of which variables you plan to use, provide a full descriptive summary of the dataset. This is *not* an Exploratory Data Analysis, just a characterization of the data. Include information such as: number of observations, number of variables, name and type of variables, etc. You may want to use a table or bullet points to describe the variables in the dataset. -Begin by providing some relevant background information on the topic so -that someone unfamiliar with it will be prepared to understand the rest -of your proposal. +Include a brief description of the dataset indicating how the data has been collected or where it comes from. -Clearly state the question you will try to answer with your project. -Your question should involve one random variable of interest (the response) and one or more explanatory variables. -Of the response variable, explain whether your project is focused on prediction, inference, or both. +#### Question: -Identify and describe the dataset that will be used to answer the -question. Remember, this dataset is allowed to contain more variables -than you need, in fact, exploring how the different variables in the dataset affect your model, -is a crucial part of the project. +Clearly state the question you will try to answer using the selected dataset.
Your question should involve one random variable of interest (the response) and one or more explanatory variables. Describe clearly how the data will help you address the question of interest. Explain whether your question is focused on prediction, inference, or both. -Also, be sure to frame your question/objectives in terms of what is -already known in the literature. Be sure to include at least two -scientific publications that can help frame your study (you will need to -include these in the References section). +It is fine to have the same question as other group members. However, you don't need to agree on a unique common question for the group project. In fact, usually many questions can be answered with the same dataset. Regardless of how many questions are proposed within each group, *each team member* needs to state and justify a question of interest. -#### Preliminary Results +#### Assignment 2: Exploratory Data Analysis and Visualization -In this section, you will: +In this assignment, you will: - Demonstrate that the dataset can be read from the web into R. - Clean and wrangle your data into a tidy format. -- Plot the relevant raw data, tailoring your plot in a way that addresses your question. - - make sure to explore the association of the explanatory variables with the response. - - your Exploratory Data Analysis (EDA) must be comprehensive with high quality plots. -- Any summary tables that is relevant to your analysis. +- Propose a visualization that you consider relevant to address your question or to explore the data. + - propose a high quality plot or set of plots of the same kind (e.g., histograms of different variables) + - explain why you consider this plot relevant to address your question or to explore the data -Be sure to not print output that takes up a lot of screen space. +**Note**: this visualization does not have to illustrate the results of a methodology.
Instead, you are exploring which variables are relevant, potential problems that you anticipate encountering, groups in the observations, etc. + +It is fine to have the same idea as other group members. However, you don't need to agree on a unique common visualization for the group project. In fact, usually the exploratory data analysis will have many different visualizations! Regardless of how many plots are proposed within each group, *each team member* needs to propose one visualization and justify their choice. -#### Methods: Plan +#### Assignment 3: Methods and Plan The previous sections will carry over to your final report (you’ll be allowed to improve them based on feedback you get). Begin this *Methods* @@ -147,13 +141,40 @@ Only one member of your team needs to submit. You must submit **two files**: #### Introduction -The instructions for this section are the same in your proposal. Just be -sure to improve this section by incorporating feedback, and changing -things based on your own improved understanding of the project (now that -more time has passed since the proposal). +Just be sure to improve this section by incorporating feedback, and changing +things based on your own improved understanding of the project. + +Begin by providing some relevant background information on the topic so +that someone unfamiliar with it will be prepared to understand the rest +of your proposal. + +Identify and describe the dataset that will be used to answer the +question. Remember, this dataset is allowed to contain more variables +than you need, in fact, exploring how the different variables in the dataset affect your model, +is a crucial part of the project. + +Also, be sure to frame your question/objectives in terms of what is +already known in the literature. Be sure to include at least two +scientific publications that can help frame your study (you will need to +include these in the References section).
#### Methods and Results +#### Preliminary Results + +In this section, you will: + +- Demonstrate that the dataset can be read from the web into R. +- Clean and wrangle your data into a tidy format. +- Plot the relevant raw data, tailoring your plot in a way that addresses your question. + - make sure to explore the association of the explanatory variables with the response. + - your Exploratory Data Analysis (EDA) must be comprehensive with high quality plots. +- Any summary tables that are relevant to your analysis. + +Be sure to not print output that takes up a lot of screen space. + + + Here is where you’ll include your work from the “Preliminary Results” in your proposal, along with the additional results you planned to conduct, as indicated in the “Methods: Plan” section of your proposal. Be sure to