Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
docs: draft of project, taking ideas from the issue. Closes #5
1 parent f11bfc6, commit 7c25991
Showing 1 changed file with 56 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
# Group project assignment {#sec-assignment} | ||
|
||
To maximize how much you learn and retain, you will, as a group, apply | ||
what you learn in the course to create a reproducible project. This | ||
project ... | ||
|
||
During the last session of the course you will work on this assignment. | ||
In the last \~20 minutes of this session, the lead instructor will ... | ||
and re-generate your report to check that it is reproducible. | ||
|
||
## Specific tasks | ||
|
||
You will be collaborating as a team using ... to manage your group | ||
assignment. We will set up the project with ... for you so you can | ||
quickly start collaborating on the project. | ||
|
||
Your specific tasks are: | ||
|
||
Sequence of steps for project: | ||
|
||
- Starting point: | ||
- Learning how to identify which file storage formats (e.g. CSV or | ||
SAS dataset) are present and knowing how to convert those files | ||
into more efficient formats (like Parquet or a SQL database) | ||
- Give them a few server environment types, and the same data but | ||
with different starting formats. And then they figure out the | ||
next steps based on that information | ||
- Multiple datasets, each big enough to prevent working with them | ||
the normal way (1 GB or larger?) | ||
- Explaining why the original data format might not be ideal and then | ||
converting the data into a more efficient format | ||
- Identify what the desired sample is for the dataset, only select and | ||
filter data they need for analysis | ||
- Split the data into smaller chunks to prototype code (running code | ||
on all the data later) | ||
- Run basic analysis (descriptive statistics)... Not modeling | ||
- Implement some code to run with parallel processing | ||
- Identifying in which format the data or items can be downloaded, and | ||
converting them to that format | ||
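
The sequence of steps above can be illustrated with a minimal sketch. The course itself appears to be R-based (it mentions targets), so this Python version only shows the general workflow. It is not the course's intended tooling. It assumes a hypothetical CSV file with `id`, `cohort`, and `value` columns and uses SQLite as a stand-in for "a SQL database". All function, table, and column names are made up for illustration. Threads are used to keep the example short; CPU-heavy work would more likely use separate processes.

```python
import csv
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from contextlib import closing


def csv_to_sqlite(csv_path, db_path, table):
    """Stream rows from a CSV file into a new SQLite table."""
    with open(csv_path, newline="") as f, closing(sqlite3.connect(db_path)) as con:
        reader = csv.reader(f)
        header = next(reader)  # first row holds the column names
        columns = ", ".join(f'"{name}"' for name in header)
        marks = ", ".join("?" for _ in header)
        with con:  # commits the transaction if no error is raised
            con.execute(f'CREATE TABLE "{table}" ({columns})')
            con.executemany(f'INSERT INTO "{table}" VALUES ({marks})', reader)


def select_sample(db_path, table, cohort):
    """Select only the column and rows needed for the analysis."""
    # Assumes the hypothetical "cohort" and "value" columns exist.
    query = f'SELECT CAST(value AS REAL) FROM "{table}" WHERE cohort = ?'
    with closing(sqlite3.connect(db_path)) as con:
        return [row[0] for row in con.execute(query, (cohort,))]


def split_into_chunks(values, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def summarise_chunk(chunk):
    """Return the count and sum of one chunk."""
    return len(chunk), sum(chunk)


def parallel_mean(values, chunk_size=2, workers=2):
    """Compute the mean by summarising chunks in parallel, then combining.

    Assumes `values` is not empty.
    """
    chunks = split_into_chunks(values, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(summarise_chunk, chunks))
    n = sum(count for count, _ in partials)
    total = sum(subtotal for _, subtotal in partials)
    return total / n
```

Prototyping on a single chunk first (for example `summarise_chunk(split_into_chunks(values, 1000)[0])`) lets a group check the logic quickly before running `parallel_mean` on all of the selected data.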
|
||
Assumptions: | ||
|
||
- Assume they have taken the intermediate course (need to know | ||
functionals and function-based workflows), and either have read or | ||
taken the advanced course or are familiar enough with targets | ||
|
||
## Quick "checklist" for a good project | ||
|
||
## Expectations for the project | ||
|
||
What we expect you to do for the group project: | ||
|
||
What we don't expect: | ||
|
||
Essentially, the group project is a way to reinforce what you learned | ||
during the course, but in a more relaxed and collaborative setting. |