It's pretty tricky to keep all the language tracks in good working order, and it's not uncommon that people discover typos in the README or the test suites, inconsistencies between the README and the tests, missing edge cases, factual errors, logical errors, and confusing ambiguities.
We welcome contributions of all sorts and sizes, from reporting issues to submitting patches, to hanging out in the support chat to help people get unstuck.
We are grateful for any help in making Exercism better!
This guide covers several common scenarios pertaining to improving the language tracks themselves. There are other guides about contributing to other parts of the Exercism ecosystem.
- The Command-Line Client
- The Website
- The Exercises API (used by both the command-line client and the website)
- We Will Gladly Help You Help Us
- Code of Conduct
- Overview
- Updating an Exercise Test Suite
- Tweaking a README
- Porting an Exercise to Another Language Track
- Implementing a Completely New Exercise
- Improving Consistency By Extracting Shared Test Data
- Writing a New Test Suite Generator
- Starting a New Track
- Maintaining a Track
- Useful Tidbits
It can be confusing and intimidating to figure out how to fix even a tiny thing in a small project, much less a sprawling 50-repository beast like Exercism.
We'll do everything we can to help you get started.
The two best ways to get help are to
- jump into the support chat.
- open a GitHub issue.
We are happy to help out with all sorts of things, including figuring out the whole git and pull request thing.
Don't be shy, we're a friendly bunch!
If you have questions that you're not comfortable asking out in the open, email Katrina at [email protected].
Help us keep Exercism welcoming. Please read and abide by the Code of Conduct.
Each language track is implemented in its own repository. This has several benefits:
- It's easier to get started contributing, since you don't need to wade through setup instructions for 20 different languages.
- There's less noise for people who are maintaining a language track, since they won't be seeing pull requests and issues about languages they're not maintaining.
- Build tools can be tailored to each language.
- Continuous integration runs more quickly, since it only needs to install a single language environment, and run the tests for one single track.
We use the following terminology:
- Language - A programming language.
- Track - A collection of exercises in a programming language.
- Problem - A generic problem description.
- Exercise - A language-specific implementation of a problem description.
We've given each language track an ID, which is a URL-friendly version of the language name. For example, C++ has the ID cpp. This ID is used throughout the Exercism ecosystem.
Each language-specific repository can be found under the Exercism GitHub organization, named with the track ID, prefixed with x.
https://github.com/exercism/x{TRACK_ID}
For example, the C++ repository is exercism/xcpp.
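As a small illustration of the naming scheme, the repository URL can be derived from the track ID. This is only a sketch; track_repo_url is a hypothetical helper name and is not part of any Exercism tool.

```shell
# Derive the GitHub URL of a track repository from its track ID.
# track_repo_url is a hypothetical helper, used here only for illustration.
track_repo_url() {
  printf 'https://github.com/exercism/x%s\n' "$1"
}

track_repo_url cpp   # prints https://github.com/exercism/xcpp
```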
Many languages implement an exercise based on the same generic problem description. So you might have a "leap year" exercise in Haskell, JavaScript, Go, Ruby, and Python. The basic premise will be the same, but each language will tailor the exercise to fit the idioms and idiosyncrasies of that language.
We try to keep the generic descriptions generic--we avoid implementation-specific examples, and try not to be too prescriptive about suggesting how a problem might be solved.
The README of each exercise is pieced together from various bits and pieces of this shared metadata, along with some information that is custom to the language track in question.
Some of the problems also have a JSON file containing shared test cases. These are used to hand-craft a test suite generator, allowing us to quickly regenerate test suites when edge cases or errors are discovered.
The generic problem descriptions live in this repository (exercism/x-common).
Once you find the correct repository, you can fork it and then clone it.
The README in each individual language track repository covers how to set up the development environment for that language.
Often all you need is a running language environment along with the relevant testing library.
If the test suite was generated, then editing the solution will require a couple of extra steps. This is covered in detail in a separate section of this guide.
The test suite is usually named with test or Test in the filename, though some language tracks have other conventions (e.g. spec is fairly common, and sometimes it's just a matter of a different file extension).
If you're unsure where to make the change, ask us, and we'll help you figure it out.
Once you've updated the test suite, there are a few things you'll want to check.
- Make sure the reference solution is still passing.
- If the exercise is versioned, and the change will mean that existing solutions on the site will not pass the new tests, then increment the version number, both in the test and in the reference solution.
- Run the full, track-level test suite, if available. If the track has a way to automatically run all the tests against their reference solutions, it will be documented in the README.
- Run configlet, the track-level linter.
You can also take a look at the .travis.yml file to see what the continuous integration system will do to verify the track.
Take a look at our pull request guidelines. You don't need to get it perfect the first time around; we'll work with you to get the patch merged.
Some language tracks are experimenting with generating test suites from shared test data. This is because various interesting edge cases are discovered as people discuss solutions, but these edge cases are usually then only added to a single language track. By standardising the inputs and outputs, it becomes easier to improve the exercises across all the languages.
There are two possible scenarios, described below.
- You want to change or add inputs or outputs.
- You want to change something about the test suite itself.
Once you've made the change, then follow the instructions about verifying your change and submitting a patch as described above, in the section about updating an exercise test suite.
If you want to add a new test or change some inputs or outputs, then the change needs to be made in the exercism/x-common repository, not directly to the test suite itself.
Find the JSON file for the problem in question. For example, if you want to change the Clock problem, then look for clock.json.
Submit a pull request with the change.
When that pull request has been merged, then the various languages that implement that problem will need to have their test suites regenerated. Track maintainers can do this, though we're always happy if you want to submit a patch with the regenerated test suite.
The instructions for regenerating a test suite should be described in the README of the language-specific repository.
Follow the guidelines for setting up a development environment, verifying the change, and submitting a pull request, as described in the main section about updating an exercise test suite.
If you are not changing inputs/outputs, but rather the structure of the test suite, then that change will need to be made within the generator itself. It lives in the language-specific repository along with the exercises, and the process for regenerating the exercise should be described in the README of the repository.
Follow the guidelines for setting up a development environment, verifying the change, and submitting a pull request, as described in the main section about updating an exercise test suite.
The Exercism exercise README treads a very fine line between useful ambiguity and confusing vagueness. Because the README is the same whether you're solving the problem in C++ or in Lua, the problem description needs to be high-level enough to allow for the syntactic, semantic, and philosophical differences in the various languages.
In other words: no specific references to syntax or data structures of a specific language can be used to further clarify a problem.
However, within this purposeful ambiguity might lie some opportunities for making an exercise description more clear. Typical issues to be attentive to:
- poorly worded sentences
- outdated information
- incorrect directives
- typos
Each language's test suite provides the precise specification for the exercise, which allows the user to view the problem from a perspective that is interesting and idiomatic for that specific language.
In addition, there's some language-specific content that gets woven into the README, usually a quick reminder about how to run the tests, and where to find more documentation.
Each generic problem is identified by a slug. For example, the problem Crypto Square is crypto-square. There are two metadata files for each problem:
- <slug>.md, which contains the generic problem description which makes up the bulk of the README, and
- <slug>.yml, which contains a short one-line description of the problem as well as other metadata, such as the source that inspired the problem in the first place.
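For illustration, here is a sketch of how a slug and its metadata file names relate. The slugify helper is hypothetical and assumes that a slug is simply the lowercased problem name with spaces replaced by hyphens. That assumption matches the Crypto Square example, but it is not a documented rule.

```shell
# Hypothetical helper: lowercase the name and turn runs of spaces into hyphens.
slugify() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -s ' ' '-'
}

slug=$(slugify "Crypto Square")

# The two metadata files for the problem, named after the slug.
for extension in md yml; do
  echo "$slug.$extension"
done
```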
There aren't any rules about what a good Exercism problem README looks like. If in doubt, open up a GitHub issue describing your suggestion.
Once you've made your change, submit a pull request.
Each language track may optionally contain a SETUP.md file in the root of the repository. This file should contain helpful, generic information about solving an Exercism problem in the target language.
The contents of the SETUP.md file get included in the README.md that gets delivered along with the test suite and any supporting files when a user runs the exercism fetch command from their terminal.
It would also be useful to explain in a generic way how to run the tests. Remember that this file will be included with all the problems, so it gets confusing if we refer to specific problems or files.
If a language track has specific expectations, these should also be documented here.
To get a list of all the exercises that can be ported to a track, go to the URL http://exercism.io/languages/:track_id/contribute.
For example, here is the list of exercises that have not yet been implemented for the Ruby track: http://exercism.io/languages/ruby/contribute
Each unimplemented exercise links to existing implementations of the exercise in other language tracks, so that people can use those example solutions and test suites as inspiration.
We are also extracting canonical inputs and outputs for a given exercise and storing them in JSON format in the x-common repository. We've accomplished this on a few exercises, but there are many more to do.
Although this page is now implemented, you can still get this information from the raw data served by the API endpoint http://x.exercism.io/v3/tracks/:track_id/todo.
For example, here's the list of exercises that have not yet been implemented in the Elm track: http://x.exercism.io/v3/tracks/elm/todo
It can be pretty unwieldy to read the JSON directly. To make it easier, install a browser extension that formats the JSON nicely, or copy/paste the response body into http://jsonlint.com/ and click "validate JSON", which not only validates it, but pretty-prints it.
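If you prefer the command line, one option is to pretty-print the response with Python's built-in json.tool module. This sketch assumes python3 is installed. The sample document below is a placeholder and does not reflect the real shape of the API response, and pretty_json is a hypothetical helper name.

```shell
# pretty_json is a hypothetical helper: it reads JSON on stdin and prints it indented.
pretty_json() {
  python3 -m json.tool
}

# Placeholder JSON, standing in for a real response body.
printf '%s\n' '{"example_key": ["example-value-1", "example-value-2"]}' | pretty_json

# Against the live endpoint this would be something like:
#   curl -s http://x.exercism.io/v3/tracks/elm/todo | pretty_json
```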
The description of the problem can be found in the x-common repository, in a file named after the problem slug: <slug>.md.
When you decide to implement an exercise
- check that there are no open pull requests for the same exercise
- open a "work in progress" (WIP) pull request
The way to open a WIP pull request even if you haven't done any work yet is:
- Fork and clone the repository
- Check out a branch for your exercise
- Add an empty commit (replace <slug> with the actual name of the exercise):
git commit --allow-empty -m "dibs: I will implement exercise <slug>"
- Push the new branch to your repository, and open a pull request against that branch.
Once you have added the actual exercise, then you can rebase your branch onto the upstream master, which will make the WIP commit go away.
The exercise should consist of, at minimum:
- A test suite
- A reference solution that passes the test (see the reference solution notes later in this guide)
Each language track might have additional requirements; check the README in the repository for the track.
Once you've created an exercise, you'll probably want to provide feedback to people who submit solutions to it. By default you only get access to exercises you've submitted a solution for.
You can fetch the problem directly using the CLI:
$ exercism fetch <track_id> <slug>
Go ahead and submit the reference solution that you wrote when creating the problem. Remember to archive it if you don't want other people to comment on it.
A problem must have a unique slug. This slug is used as
- the directory name within each language-specific repository
- the basename for the metadata files (in this repository)
- the identifier for the exercise in config.json
In exercism/x-common:
- Create <slug>.md and <slug>.yml.
- Bonus: <slug>.json with inputs/outputs for the test suite.
- Submit a pull request.
In the language track repository:
- Do the same as when porting an exercise. Reference the PR in x-common if it hasn't been merged yet; the track PR must not be merged until the exercism/x-common PR is merged.
TODO: elaborate.
TODO: elaborate.
If you're interested in adding problems for a language that we don't yet have, email Katrina and she'll set up a new repo for that language.
Then you can fork and clone per usual.
In order to launch, the track needs:
- At least 10 problems implemented.
- A handful of people who can check in regularly and provide feedback on solutions.
- Documentation in docs/ for how to get started / run the tests
A description of what is required for docs/ can be found in the x-api CONTRIBUTING guide.
Once that is in place, the repository needs to be added as a submodule to exercism/x-api, and the "active" key in config.json must be flipped to true.
We don't deploy x-api automatically, so it will go live the next time the submodules are updated (daily, for the most part).
For a track that is set as "active": false in the config.json, exercism fetch will not automatically pull down problems. You can still test the language by fetching problems directly, e.g.:
exercism fetch cpp bob
This will allow you to do some dry-run tests of fetching exercises, double checking the instructions for each problem and submitting the problem solution for peer review.
It is recommended that you configure a Travis continuous integration build with your language track to verify that your example problem solutions satisfy the tests provided for each problem.
You can include advice and helpful links for your language track in the SETUP.md file.
Maintaining a language track generally consists of:
- Reviewing/merging pull requests.
- Discussing improvements in the exercises.
- Implementing or porting new exercises.
- Improving the development tooling (e.g. implementing continuous integration).
- Language-specific support.
- Adding/improving language-specific documentation.
Ideally a track will have several maintainers, for two reasons:
- more lively
- spread the workload
More Lively
We've noticed that as soon as there are at least two people maintaining the same track we get rich discussions about quality and idioms. There's a lot more activity, and it's a lot more fun.
Spread the Workload
We don't want to burn people out, and it's really nice to be able to go on vacation or get busy at work without worrying too much about a growing backlog of unanswered issues and unreviewed and unmerged pull requests.
Caveat
There's a small chance that when more people are involved there's a bit of diffusion of responsibility (worth googling and reading about if you haven't heard the term before).
In general:
- Avoid merging your own pull requests (but it's fine if it's really simple).
- If the change is significant, get a second opinion.
- If it's insignificant or simple or uncontroversial, go ahead and merge.
- If nobody else responds within a certain amount of time, go ahead and merge it anyway, if you feel like it's good enough (we can always fix things later).
Many maintainers have mentioned that they like to get a second pair of eyes even for simple fixes, because it's so easy for typos and really silly things to slip in.
Even for simple fixes (documentation, typos) branches let others see what's going on in the repository. If it's insignificant, go ahead and merge it yourself.
Sometimes it's just silly to create a branch. In that case, go ahead and put it in master, unless there's a track-level policy about not doing that.
When you start working on an issue, claim it (either assign it to yourself or just add a comment that you're taking it).
If you have a big list of similar, related things, it's fine to create a single issue with a todo list, and people can claim individual things in the comment thread.
The tracks should implement the exercise idiomatically in the language at hand, without veering too far from the README as described (does expanding the exercise introduce new ideas or just add more work? Is this better off treated as a new, separate exercise?).
If there are interesting corner cases, then these should be added to the README; they help make the discussions better.
Exercises should not enforce a single way to solve the problem, if possible. The more interesting exercises allow several approaches, and create rich opportunities for discussing trade-offs when people submit their solutions.
Don't be afraid to 'forego' exercises that don't make sense in the language, or that are not particularly interesting.
Here are a few bits and pieces that are referenced from some of the scenarios in this guide.
- Put the name of the exercise in the subject line of the commit. E.g. hamming: Add test case for strands of unequal length
- Don't submit unrelated changes in the same pull request.
- If you had a bit of churn in the process of getting the change right, squash your commits.
- If you had to refactor in order to add your change, then we'd love to see two commits: First the refactoring, then the added behavior. It's fine to put this in the same pull request, unless the refactoring is huge and would make it hard to review both at the same time.
Once you've submitted a pull request, one or more of the track maintainers will review it. Some tracks are less active and might not have someone checking in every day. If you don't get a response within a couple of days, feel free to ping us in the support chat.
It's only when we get a bunch of people having conversations about the solutions that we really discover what makes a problem interesting, and in what way it can be improved.
Some changes to the test suites will invalidate existing solutions that people have submitted.
We think this is totally fine; however, sometimes people start leaving feedback saying that the solution doesn't pass the tests. This is technically true, but since the solution passed the tests at the time it was written, it's generally more useful to just discuss the code as it is, rather than enforce strict adherence to the most recent version of the tests.
Some language tracks have implemented a simple, manual versioning system to help avoid unnecessary discussions about failing the current test suites.
If the exercise is versioned, then the test suite will probably have a book-keeping type test at the very bottom that asserts against a value in the reference solution. If the change you're making is backwards-incompatible, then please increment the version in both the test suite and the reference solution.
TODO: expand on notes below.
TODO
TODO
TODO (boilerplate, header files, etc)
The reference solution is named something with example or Example in the path.
The solution does not need to be particularly great code; it is only used to verify that the exercise is coherent.
If you change the test suite, then make sure the reference solution is fixed to pass the updated tests.
Each language track has a config.json file. Important keys are:
- problems - actively served via exercism fetch
- deprecated - implemented, but aren't served anymore
- foregone - will not be implemented in the track
- ignored - these directories do not contain problems
The configlet tool uses those categories to ensure that
- all the problems are implemented,
- deprecated problems are not actively served as problems, and
- foregone problems are not implemented.
In addition, it will complain about problems that are implemented but are not listed in the config under the problems key. This is where the ignored key is useful. Ignored directories don't get flagged as unimplemented problems.
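As an illustration, a config.json using only keys named in this guide might look like the sketch below. All slugs and directory names are hypothetical, and a real track config contains additional keys that are not covered here.

```shell
# Write a hypothetical config.json into a scratch directory.
# Only keys named in this guide are shown; the values are made up for illustration.
track_dir=$(mktemp -d)
cat > "$track_dir/config.json" <<'EOF'
{
  "active": false,
  "problems": ["leap", "hamming", "clock"],
  "deprecated": ["retired-exercise"],
  "foregone": ["unsuitable-exercise"],
  "ignored": ["bin", "docs"]
}
EOF

cat "$track_dir/config.json"
```

With a file like this, a directory named docs would not be reported as an unlisted problem, because it appears under the ignored key.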
A problem might be foregone for a number of reasons, typically because it's a bad exercise for the language.
The config.json also has an optional test_pattern key. This is a regex that test filenames will match. If test files contain /test/, then this key can be deleted.
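To make the matching concrete, here is a sketch that checks file names against a pattern using grep -E. The pattern and file names are hypothetical, and the exact regex dialect and matching rules used by Exercism's tooling are not specified in this guide, so treat this only as an approximation.

```shell
# is_test_file PATTERN FILENAME
# Succeeds when FILENAME matches PATTERN (an extended regular expression).
# Hypothetical helper; Exercism's own matching may differ in detail.
is_test_file() {
  printf '%s\n' "$2" | grep -Eq -- "$1"
}

test_pattern='_spec\.rb$'   # hypothetical value for a track that uses "spec" files

for file in bob_spec.rb bob.rb example.rb; do
  if is_test_file "$test_pattern" "$file"; then
    echo "$file: test file"
  else
    echo "$file: not a test file"
  fi
done
```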
If the config.json file is incomplete or broken, a lot of other things break.
To make things easier we made a small tool to help verify the config:
https://github.com/exercism/configlet#configlet
You can download the latest release from the releases page in the configlet repo, or you can use the bin/fetch-configlet command from the root of the language track repository, which will make a guess at what operating system and architecture you have and attempt to download the right one.
Verify the config by calling bin/configlet . (notice the dot). This says: check the config of the language track that is stored right here.
If you're concerned that you haven't done it right, don't worry. Submit your pull request, and we'll help you get the details sorted out.
We recommend forking the project first, and then cloning the fork.
git clone git@github.com:<YOUR_USERNAME>/<REPO_NAME>.git
This will give you a remote repository named origin that points to your own copy of the project.
In addition to this, we recommend that you add the original repository as a secondary remote.
git remote add upstream https://github.com/exercism/<REPO_NAME>.git
The names origin and upstream are pretty standard, so it's worth getting used to the names in your own work.
When working on your fork, it tends to make things easier if you never commit directly to the master branch.
The basic workflow becomes:
- check out the master branch
- pull from upstream to make sure everything is up to date
- push to origin so you have the latest code in your fork
- check out a branch
- make the changes, commit them
- rebase onto the upstream master (and resolve any conflicts)
- push your branch up to origin
- submit a pull request
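The workflow above can be sketched as a sequence of git commands. To keep the example self-contained, local bare repositories stand in for the two GitHub remotes; in practice origin would be your fork and upstream the exercism repository. The branch name, file name, and identity are placeholders.

```shell
# Sandbox setup: bare repositories play the roles of upstream and origin.
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/upstream.git"
git init -q --bare "$sandbox/origin.git"
git init -q "$sandbox/clone"
cd "$sandbox/clone"
git symbolic-ref HEAD refs/heads/master        # make sure the first branch is "master"
git config user.name "Example Contributor"     # placeholder identity
git config user.email "contributor@example.com"
git remote add upstream "$sandbox/upstream.git"
git remote add origin "$sandbox/origin.git"
git commit -q --allow-empty -m "Initial commit"
git push -q upstream master                    # give upstream some history

# The workflow itself.
git checkout -q master                         # check out the master branch
git pull -q --ff-only upstream master          # make sure everything is up to date
git push -q origin master                      # bring your fork up to date
git checkout -q -b hamming-unequal-strands     # check out a branch
echo "placeholder change" > notes.txt          # make the changes...
git add notes.txt
git commit -q -m "hamming: Add test case for strands of unequal length"
git fetch -q upstream
git rebase -q upstream/master                  # rebase onto the upstream master
git push -q origin hamming-unequal-strands     # push your branch up to origin
# The final step, submitting the pull request, happens on GitHub.
```

Using --ff-only on the pull is a choice made for this sketch: it refuses to create a merge commit on master, which fits the advice to keep master identical to upstream.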
If you're asked to tweak your work, you can keep pushing it to the branch, and it automatically updates the pull request with the new commits.
Commit messages are communication and documentation. They're a log of more than just what happened; they're about why it was done.
The longer a project is active, the more people involved, the larger the codebase, the more important it is to have good commit messages.
There's an excellent lightning talk by Ryan Geary called Do Your Commit Messages Suck?. It's funny and enlightening, and you should watch it.
Tim Pope wrote an article that has very clear guidelines about how to write excellent commit messages. Please read it.
Imagine that you're submitting a new problem called "spinning-wheel" to the Ruby track.
Here's a fairly typical set of commits that you might end up making:
433a0e1 added spinning wheel tests
1f7d4aa pass spinning wheel
cf21737 oops
be4c410 rename example file
bb89a77 update config
All of these commits are about a single thing: adding a new problem. They should be a single commit. They don't have to start out that way (life is messy), but once you're done, you should squash everything into one commit, and rename it cohesively:
f4314e5 add spinning wheel problem
If you've already made changes on your master so that it has diverged from the upstream, you can reset it.
First create a backup of your branch so that you can find any changes. Just in case.
git checkout master
git checkout -b backup
git checkout master
Next, fetch the most recent changes from the upstream repository and reset master to it.
git fetch upstream
git reset --hard upstream/master
If you do a git log at this point you'll see that you have exactly the commits that are in the upstream repository. Your commits aren't gone, they're just not in master anymore.
To put this on your GitHub fork, you'll probably need to use the --force flag:
git push --force origin master
Squashing commits into a single commit is particularly useful when the change happened in lots of little (sometimes confusing) steps, but it really is one change.
There are a number of ways to accomplish this, and many people like to use an interactive rebase, but it can be tricky if you haven't set git up to open your favorite editor.
An easier way to do this is to un-commit everything, putting it back into the staging area, and then committing it again.
Using the example from above, we have 5 commits that should be squashed into one.
433a0e1 added spinning wheel tests
1f7d4aa pass spinning wheel
cf21737 oops
be4c410 rename example file
bb89a77 update config
To un-commit them, use the following incantation:
$ git reset --soft HEAD^^^^^
Notice that there are 5 carets, one for each commit that you want to un-commit. You could also say:
$ git reset --soft HEAD~5
If you do a git status now, you'll see all the changed files, and they're staged and ready to commit. If you do a git log, you'll see that the most recent commit is from someone else.
Next, commit, as usual:
$ git commit -m "Add spinning wheel problem"
Now if you do a git status you may get a warning saying that your origin and your branch have diverged. This is completely normal, since the origin has 5 commits and you have 1 (different) one.
The next step is to force push this up to your branch, which will automatically update the pull request, replacing the old commits with the new one.
$ git push --force origin spinning-wheel
If you're completely new to git, there are a number of resources that can help get you feeling more comfortable with it.
If you've been using git for a while, but it feels like repeating magical incantations (while praying that nothing goes wrong), then you may find these helpful:
- Git for Ages 4 and Up - video of a presentation/demo
- Think Like a Git
- Explain Git with D3 - interactive site
- Pro Git - "The Book" for learning Git in detail
- Git Branching Tutorial - interactive tutorial, very visual
You'll often be asked to rebase your branch before we merge a pull request, as Exercism likes to keep a linear project commit history. This is accomplished with git rebase. It takes the commits on the current branch and replays them on top of the branch that you're rebasing onto.
For example, rebasing the current branch on top of upstream/master:
git rebase upstream/master
Project commit history:
                        -- current branch --
                       /
--- master branch ----
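To see the effect, here is a sketch in a throwaway repository, where a local master branch stands in for upstream/master. The file names, commit messages, and identity are placeholders.

```shell
# Throwaway repository; a local "master" plays the role of upstream/master.
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"
git symbolic-ref HEAD refs/heads/master        # make sure the first branch is "master"
git config user.name "Example Contributor"     # placeholder identity
git config user.email "contributor@example.com"

echo "one" > a.txt && git add a.txt && git commit -q -m "master: first commit"

# Start a branch and commit some work on it.
git checkout -q -b spinning-wheel
echo "wheel" > b.txt && git add b.txt && git commit -q -m "Add spinning wheel problem"

# Meanwhile, master moves ahead.
git checkout -q master
echo "two" > c.txt && git add c.txt && git commit -q -m "master: second commit"

# Replay the branch's commit on top of the new master.
git checkout -q spinning-wheel
git rebase -q master

# The branch commit now comes after both master commits, with no merge commit.
git log --format=%s
```

In a real clone, the last steps would be git fetch upstream followed by git rebase upstream/master on your branch.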
TODO: add more sections:
- how to rebase commits in a branch
- how to merge something locally (for example when there are conflicts, or if you want to fix a small thing without nagging the contributor about it)