Overview

Background

Focstest lets you check your homework for Olin's Foundations of Computer Science (FoCS) course automatically.

The name is a portmanteau of "doctest" and FoCS, because the FoCS homework problems are presented similar to Python's doctests, with the OCaml REPL input and expected output:

Code a function linear of type float -> float -> float -> float -> float -> float where linear a b x y c takes the value c which should live in the interval [a, b] and rescales it so that it lives at the "corresponding" place in the interval [x, y]. This is called linear scaling. So if c is the midpoint of [a, b], then linear a b x y c should return the midpoint of [x, y]; if c is a quarter of the way from a to b, then linear a b x y c is a quarter of the way from x to y; and so on.

If a and b are the same, raise an exception using failwith. (Why?)

Sample output:
# linear 0. 1. 50. 100. 0.;;
- : float = 50.
# linear 0. 1. 50. 100. 1.;;
- : float = 100.
# linear 50. 100. 0. 1. 75.;;
- : float = 0.5
# linear 50. 100. 0. 1. 80.;;
- : float = 0.6
# linear 50. 100. 0. 1. 110.;;
Exception: Failure "value out of range".

Function stubs for each question are provided in an OCaml file:

(* Question 1 *)

let linear (a:float) (b:float) (x:float) (y:float) (c:float):float =
  failwith "Not implemented"

The students complete the functions to meet the requirements and tests from the webpage:

(* Question 1 *)

let linear (a:float) (b:float) (x:float) (y:float) (c:float):float =
  if c < a || c > b then failwith "value out of range" else
  (* provide a better error than Division_by_zero *)
  if a = b then failwith "input range of zero; a and b are equal" else

  let ab_range = b -. a
  and xy_range = y -. x
  and relative_c = c -. a
  in
  relative_c /. ab_range *. xy_range +. x

Focstest parses the sample outputs from the webpage and runs your in-progress homework solutions against them, comparing the outputs and checking for errors.

It began as a script I made while a FoCS student, and later as a course assistant. At the time I didn't know how to write "real" programs in OCaml that interfaced with libraries and IO, so I reached for Python because I was most comfortable with it. Unfortunately, using another language to run the OCaml REPL presents some challenges that are discussed more in Running the Tests.

Structure

At a high-level, there are three phases of work handled by three functions:

Fetching the webpage: get_html
Parsing the tests: parse_html_tests
Running the tests: run_test (and the class PrintingTestRunner)

The interactions between the student, Focstest, FoCS website, and the OCaml interpreter are diagrammed below.

Sequence diagram explaining interaction between user, Focstest, and the OCaml binary

Fetching the Webpage

First, the webpage that contains the tests and questions must be fetched. Each homework's url looks like https://rpucella.net/courses/focs-sp22/homeworks/1/index.html and the template OCaml files look like homework1.ml.

To save the user from passing the long url for the homework for each invocation of focstest, the homework number is parsed from the OCaml file and used to construct the url. The user can specify a url in cases where the parsing fails or naming pattern differs.

To cut down on repeated network requests, the url's contents are saved to a temporary cache directory with a default maximum age of 30 minutes. If the user does not have a network connection, and a cached version of any age exists, the cached version is used.

This caching logic is implemented in the fetch_url function and described visually below:

Flowchart explaining the caching logic

Alternatively, a local HTML file can be specified and used directly.

Parsing the Tests

While doctest needs to parse tests out of arbitrary Python docstrings, the FoCS tests are already separated into <pre><code></code></pre> HTML elements. The hard work of extracting the code block elements is done with the beautifulsoup library. To parse the test inputs, the code block text is passed to a multi-line regex that matches on the REPL prompt # and the expression terminator ;;. The lines of text following each parsed input are parsed as the expected output of the tests. Depending on the test, the expected output may be the value and type signature of an expression, a raised exception, or printed output.

Running the Tests

For each test, the OCaml interpreter binary (ocaml) is executed and passed a statement to load the homework file (#use "homework1.ml";;) with the test input, e.g. 1 + 2;;. After waiting for the process to finish with a suitable timeout, the output is split on the interpreter prompt (# ) to separate the output of the REPL startup, file loading, and the test itself (e.g. - : int = 3).

The file and test outputs are scanned for exceptions or errors.

If loading the homework file returns errors, the test results may be invalid, so an OcamlFileError exception is raised that stops the program and alerts the user to the issue.

Errors returned from test input may mean several things:

the test checks that an exception is raised
the test demonstrates a syntax or type error
there is an error in code called by the test
the test calls an incomplete function stub that raises a "Not implemented" exception.

In some of those cases the output of the test is valid, so the parsed error is returned as a value alongside the test output, to be inspected later.

The actual loop of test running is done with a class called BaseTestSuiteRunner, which handles running multiple groups of tests ("suites") and tracking their results. This class is extended by PrintingTestRunner, which handles printing the colored output of tests and suites to the terminal.

Interfacing with the OCaml binary

The current design runs the ocaml interpreter binary for each test, loading the homework file before running the test input. This method works consistently, isolates any problems with parsing output to an individual test case, and meets a variety of goals:

runs cross-platform (students commonly use Linux, Windows, and MacOS)
detects errors from the loaded homework file
evaluate with a timeout to catch infinite loops and long-running tests

On the other hand, using a new OCaml environment for each test has two big downsides:

Increased run time (it can take 50 to 200 milliseconds to load an unimplemented homework file), running 80 empty tests can take 10 seconds.
Tests that depend on a constant or function defined in a previous test will fail.

I tried to mitigate the first problem by letting the user select which suites to run, and skipping suites when the first test raises a "Not implemented" error. Running several processes at once could further improve the speed, but hasn't been attempted.

I don't have a mitigation for the second issue, but the provided tests are rarely structured that way.

From what I've researched, there are two potential solutions that still meet most of the original goals:

Using asyncio

Python's asyncio.subprocess module allows cross-platform reading/writing of a sub-process's stdout/stdin with a timeout. This would mean that a single OCaml process could be created, with each test output read as it's evaluated instead of at the end.

The subprocess library does technically allow reading and writing directly to stdin/stdout, but doing so correctly to avoid pipe blocking and allow for reading with a timeout would be complicated.

As an example, the implementation of Popen.communicate is nontrivial and has different implementations for Windows and Unix, whereas asyncio's Process.communicate is pretty straightforward.

Porting to OCaml

Since originally writing Focstest, I've learned more about writing OCaml programs. There are ways to interface directly with the interpreter, but it would still involve some text parsing of the output and wouldn't allow adding timeouts from what I understand. The ocaml branch has a minimal example

Provide feedback

Saved searches

Use saved searches to filter your results more quickly