User_Dataset Management
When you create a challenge, all of its datasets (and code) are stored in "My Datasets" automatically. You can also upload data manually and reference it in your competition.
- Quick start
- Creating datasets
- Removing datasets
- Modifying datasets
- Downloading datasets
- Using datasets in YAML file
- Switching datasets with the editor
Quick start
When you are logged into CodaLab, go to My Competitions > My Datasets > Create Dataset and fill out the form.
You can upload different types of "data" (or code):
- Public data: Will be made visible for download.
- Input data: Will be made available to code submissions (but not be downloadable).
- Reference data: Will be made available to the scoring program (but not visible to participants nor to their code).
- Ingestion program: Organizer-provided code that will be run when code submissions are made.
- Scoring program: Organizer-provided code rating the predictions against the solutions.
- Starting kit: Organizer-provided kit, which may include sample code and sample submissions.
Upload your file DataName.zip. After the upload, you should see your new dataset in the data table. The KEY shown there can be used to refer to the dataset from the YAML file, for example:
public_data: dac49905-dda0-4857-922a-02ca957ec8fd
You can also use the editor. In “Competitions I’m running”, find your competition and click “Edit”.
Find the menus for Input Data, Reference Data, Public Data, Ingestion Program, Scoring Program, and Starting Kit. Select the right dataset in each, and don’t forget to SAVE YOUR CHANGES.
Creating datasets
- Log in
- Click on "My Datasets" in the top right near your username
- Click the "Create dataset" button in the top left of the content area
- Fill in all of the relevant information
- Click the "Upload" button
- If everything was successful, the dataset should appear in the list of datasets
Removing datasets
- Log in
- Click on "My Datasets" in the top right near your username
- Find the dataset you want to delete and click the "DEL" button to its right
- Confirm that you want to delete it. If the dataset is already in use in a competition, you should see a warning.
Downloading datasets
- Log in
- Click on "My Datasets" in the top right near your username
- Click the "Download" button to the right of the dataset you want to download
Using datasets in YAML file
For public_data, input_data, reference_data, ingestion_program, or scoring_program, replace the file name with the UUID of the dataset. For example:
phases:
  0:
    phasenumber: 0
    reference_data: af5e8c26-73b0-485a-a8b9-a572dd88d828
    scoring_program: 21a8f881-2e53-4c71-b841-47e78e0b4040
    input_data: few2f881-2rp3-4221-b121-j4kj3934jt42
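A mistyped key is easy to miss in a phase block. As a minimal sketch, assuming the competition YAML has already been loaded into a plain Python dict, the following check flags dataset references that are not well-formed UUIDs. The helper names are hypothetical and not part of CodaLab, and the check only looks at the format of each key, not at whether the dataset exists in "My Datasets".

```python
import uuid

# Field names as they appear in the competition YAML.
DATASET_FIELDS = (
    "public_data",
    "input_data",
    "reference_data",
    "ingestion_program",
    "scoring_program",
)


def is_well_formed_key(value):
    """Return True if value is a canonical, hyphenated UUID string."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False


def malformed_dataset_refs(phases):
    """List (phase, field, value) triples whose value is not a well-formed key."""
    problems = []
    for phase_id, phase in phases.items():
        for field in DATASET_FIELDS:
            if field in phase and not is_well_formed_key(phase[field]):
                problems.append((phase_id, field, phase[field]))
    return problems


# The phase block from the YAML example, written as a Python dict.
example_phases = {
    0: {
        "phasenumber": 0,
        "reference_data": "af5e8c26-73b0-485a-a8b9-a572dd88d828",
        "scoring_program": "21a8f881-2e53-4c71-b841-47e78e0b4040",
        "input_data": "few2f881-2rp3-4221-b121-j4kj3934jt42",
    }
}
```

Calling malformed_dataset_refs(example_phases) reports the input_data entry, because that sample value is a placeholder containing non-hexadecimal characters. A key copied from the data table should pass.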
Warning: the block below is the old way of representing public data. Use public_data instead of this:
datasets:
  1:
    name: Training Data
    key: 21a8f881-2e53-4c71-b841-47e78e0b4040
    description: Training data
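For an existing competition that still uses the old block, the change can be sketched in Python. This is an illustration under stated assumptions, not a CodaLab tool: it assumes the YAML is already loaded into a dict, that the old datasets block sits inside the phase mapping, and that it holds exactly one entry whose key becomes public_data. The function name is hypothetical, and the old name and description fields are simply dropped.

```python
def migrate_legacy_datasets(phase):
    """Return a copy of phase with the old `datasets` block replaced by `public_data`.

    Assumptions (not confirmed by CodaLab): the old block lives in the
    phase mapping and contains exactly one entry.
    """
    legacy = phase.get("datasets")
    if legacy is None:
        # Nothing to migrate; return an independent shallow copy.
        return dict(phase)
    if "public_data" in phase:
        raise ValueError("phase already defines public_data")
    if len(legacy) != 1:
        # With several entries it is unclear which one is the public data.
        raise ValueError("expected exactly one entry in the old datasets block")
    (entry,) = legacy.values()
    migrated = {name: value for name, value in phase.items() if name != "datasets"}
    migrated["public_data"] = entry["key"]
    return migrated


old_phase = {
    "phasenumber": 0,
    "datasets": {
        1: {
            "name": "Training Data",
            "key": "21a8f881-2e53-4c71-b841-47e78e0b4040",
            "description": "Training data",
        }
    },
}
new_phase = migrate_legacy_datasets(old_phase)
```

The function raises instead of guessing when the old block has more than one entry, since the mapping to a single public_data key is then ambiguous.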
Switching datasets with the editor
- Log in
- Go to "My Codalab"
- Go to "Competitions I'm running" tab
- Find the competition you want to edit and click "Edit"
- Scroll to the phase you want to modify and select a new public_data, input_data, reference_data, ingestion_program, or scoring_program from the datasets you have uploaded
- Click "Save" to complete
NOTE: You cannot change dataset entries by uploading a new competition.yaml. This is a limitation of the current Competition Edit Form.