This document describes the data model for the problem instances.
A problem instance, or a scenario, is a JSON file containing the following top-level elements:
- label
- hash
- service_intentions (a fancy name for 'train including all its requirements')
- routes
- resources
- parameters
Let's go through them. You may use the small sample scenario as a concrete example. This looks like this: It contains just 2 trains to be scheduled, 2 routes (one for each train), and 13 resources.
This is just a human-readable identifier for the instance. It is of no concern otherwise.
A machine-readable identifier for the instance. This hash must be referenced when submitting a solution for this instance. See Output Data Model.
This is the list of trains (or, in fancy speech, a "service") to be scheduled. Each individual service_intention describes a specific train. In particular, it specifies which route a train can take (see below) and all requirements that the scheduler needs to observe when planning this particular train. In the example, we have two service_intentions:
An individual service_intention looks like this:
In general, it consists of
- id: identifier for the train, or "service"
- route: a reference to the route graph, see below for details on the routes
- section_requirements: a list of individual section_requirements. This is where the actual requirements for this train(service) are specified. Before we look at these (subsection section_requirements below), it is helpful to first discuss the model for the routes.
Our example above has three section_requirements for train 111. We will study the meaning of the section requirements in detail below, but for the moment you may think of them as follows:
- The first one specifies that basically the train starts in 'A0 and must not depart before 08:20:00
- The second one says that for some intermediate stop in 'B' (this will match to a certain position in the route of this train), a minimum stopping time of 3 minutes must be observed and the train must not leave 'B' before 08:30:00
- Finally, the third one requires that the train arrive in 'C' no later than 08:50:00
Recall from the Quick Introduction that a route is modeled as a directed acyclic graph (DAG). It describes the possible ways for the train to move through the railway network. The nodes in the graph are the events that occur along the way, such as "arrival at station X", "releasing resource Y", etc. A directed arc in the route graph is called a route_section. For a route_section, we call the node at its tail the entry event and the node at its head the exit event from this route_section. The route_sections have associated with them a list of resource occupations. These are the resources that a train on this route will occupy, while it is traveling on this arc, i.e. in the timespan between the entry event and the exit event. Each arc also has a minimum running time, which gives the minimum duration between entry and exit event. There is no maximum duration, by the way. A train may spend as much time on a section as it likes. Of course, it keeps using the resources of this route section during this time.
As an example, we again look at our sample scenario. The following images illustrate route 111 in that scenario (although route 113 is identical):
In a minute, we explain how to derive a directed graph from the data in the JSON. But I give you the picture first. The DAG for route 111 actually looks as follows: Nodes are the events and arcs are the individual route_sections. The red labels are route_alternative_markers (see the explanation on route paths just below), the black labels are the sequence_number of the individual route_section.
Note: There is also a helper script that constructs a networkx graph for each service intention in a problem instance. You are free to use it for the challenge if you find it helpful.
Modeling a graph in a JSON structure in a human-readabl way is not exactly a joy. We have taken the approach that paths (i.e. linear subgraphs) in the graph can be grouped into so-called route paths. Then these route_paths are "glued together" at the appropriate places. We see that route 111 in our sample scenario is made up of 5 route_paths:
Each route_path is a path in the route graph. In the following image, all route_paths are encircled for clarity. For example, the green path illustrates the route_path with id=1. It consists of 7 route_sections. Similarly, the orange one corresponds to route_path with id=2. It consists only of one route_section.
Answer: By labelling events with the same route_alternative_marker_label.
For example, consider again the same route_paths with id 1 and 2: They both specify a marker label 'M1' for the exit_event from the first route section. These are highlighted in the following snapshot:
Consequently, those events are "merged together" in the resulting graph:
Note:
- For consistency, route_section 4 of route_path 1 also lists the same marker label 'M1' for its entry_event. This makes sense, because this represents the same event as the exit_event from section 1. In other words, all three events are merged into the same node in the graph.
- There are many ways to cut the graph into segments of linear paths. In the end, all that matters is the resulting DAG. The route paths are only an aid for a human editing the file manually.
- "Officially", the route_alternative_markers are modeled as lists. However, you can be assured that in all problem instances for this challenge, this list will have length at most one. In other words, there is never more than one label.
A specialty of the route graph are the section markers that may be associated to certain arcs (route_sections). These provide the link to the section requirements in the service intentions. We will discuss them in more detail below. In this sample DAG, there are the following section_markers (in blue):
In the JSON, you find them here:
In the formal data model, a route has
- an id
- a list of route_paths, which themselves are a list of route_sections. These are the interesting objects. They are built as follows
As an example, let's look at the route_section with sequence_number 5, located in our sample route 111 of the example. It looks like this:
We explain the meaning of the individual fields:
Field | Format | Description |
---|---|---|
sequence_number | integer | an ordering number. The train passes over the route_sections in this order. This is necessary because the JSON specification does not guarantee that the sequence in the file is preserved when deserializing. |
penalty | non-negative float | used in the objective function for the timetable. If a train uses this route_section, this penalty accrues. This field is optional. If it is not present, this is equivalent to penalty = 0. |
route_alternative_marker_at_entry | text | a label for the entry event into this route_section. Route sections from other route_paths with the same label are "glued" together, i.e. become the same node in the route graph. See examples above |
route_alternative_marker_at_exit | text | dito for the exit event from this route_section |
starting_point and ending_point | text | used in visualisations of the timetable. It has no meaning otherwise. But note that each route_section begins where the last one ends, which, if you think about it, kinda makes sense... |
minimum_running_time | ISO duration | minimum time (duration) the train must spend on this route_section |
resource_occupations | List of resource_occupation | see below |
section_markers | List of text | labels that mark this route_section as a potential section to fulfil a section_requirement that has any of these as section_marker. Note: In all our problem instances, each route_section has at most one section_marker, i.e. the list has length at most one. |
Field | Format | Description |
---|---|---|
resource | text | a reference to the id of the resource that is occupied |
occupation_direction | text | a description of the direction in which the resource is occupied. This field is only relevant for resources that allow "following" trains, which does not occur in the problem instances for this chalenge. You may ignore this field. See also the description of resources below. |
Now we we can talk about the section_requirement [element of a service_intention]
With the understanding of the routes, the meaning of the section requirements in a service intention now makes more sense.
Each section_requirement references a section_marker. This means that "this requirement must be satisfied on any route_section that carries this label as a section_marker". The section requirement can specify four types of requirements:
- that a certain time window be respected for the entry event by setting entry_earliest and/or entry_latest accordingly
- same thing for the exit event by setting exit_earliest and/or exit_latest
- a minimum stopping time be observed on this route section by setting min_stopping_time. This stopping time will be in addition to the minimum_running_time
- the connections to other trains to be observed
If this is a bit much to digest, don't worry. We will see in the planning rules the formal rules that you need to observe when constructing solutions. For now, let's just look at some examples before defining the formal data model:
- service_intention 111 in our example has three section_requirements:
- The first of these is for section_marker 'A'. We said above that this sort of means that "departure from station A" must not occur before 08:20:00. What it really means is: the entry_event into this route_section must not occur before 08:20:00
- The second is for section_marker 'B' and requires that on the corresponding route_section
- The third is for section_marker 'C' and requires that the "arrival at that station" should be no later than 08:40:00
- We give an example for a connection below.
Finally section_requirement can specify relative factors entry_delay_weight and exit_delay_weight. These weights are used in the calculation of the objective function.
Summarizing: The formal model for a section_requirement is as follows
Field | Format | Description |
---|---|---|
sequence_number | integer | Needed because JSON deserialization may not preserve order Important Note: If you look at the route graph for service intention 111 in the example above, you see that however you travel through the graph (from a source to a sink node) you will always pass section_markers A, then B and finally C. This is exactly the same sequence as the section_requirements in the service_intention. We guarantee that this is always the case for all our problem instances, see Axiom 3 below!. That is, the route graphs are such that you will always automatically pass all required section_markers and you will do so in the correct sequence. |
type (typ) | text | a text field describing what this requirement is meant to represent, such as start of a train, a scheduled stop, etc. Has no effect on processing. You may ignore it. |
min_stopping_time | ISO duration | see text above |
entry_earliest and/or entry_latest | HH:MM[:SS] formatted time-of-day | see text above |
exit_earliest and/or exit_latest | HH:MM[:SS] formatted time-of-day | see text above |
entry_delay_weight and exit_delay_weight | non-negative float | used to calculate total delay penalties in the objective function |
connections | list of connections, see below | see below |
Connections are directed. They point from a train that gives a connection to another train that accepts the connection. In our model, a connection is listed under the train that gives it.
Here is an example: It is from problem instance 02 (direct link to the line) and specifies a connection from the service_intention '18013' onto service_intention '18224' on section_marker 'WAE_Halt'. A minimum connection time of 2 minutes and 30s must be observed.
The model is as follows:
Field | Format | Description |
---|---|---|
id | text | technical id. Irrelevant during processing |
onto_service_intention | text | reference to the service_intention that accepts the connection |
onto_section_marker | text | reference to a section marker. Specifies which route_sections in the onto_service_intention are candidates to fulfil the connection |
min_connection_time | ISO duration | minimum duration required between arrival and departure event. See Business Rules for details. |
Back to the top-level of our sample scenario:
Resources are used to model which parts of the track infrastructure are used by a train while on a certain route_section. A resource that is used is modeled as a resource_occupation on the route_section, see above. Resource occupations always begin at the entry into a route_section and end at the exit from a route_section.
Remark: Very often, however, the same resource may be occupied again on the next route_section. This corresponds to the fact that trains typically need a wave of several "green signals" ahead of them, where each signal protects the block section to the next one. Each such block section may be considered a resource in our model.
In general, we use two kinds of resources to model different behaviour and level-of-detail:
- blocking resources which means the resource must be released by a train before another train can start occupying them
- following resources, which allow two trains to occupy them concurrently, as long as
- they are separated at entry and exit by at least the following_separation
- the order of the trains is the same at entry and exit, i.e. they do not overtake each other while using this resource.
However, in the problem instances provided for this Challenge, you will only find resources of blocking type. To participate in this challenge, it is therefore sufficient for you to consider only blocking logic.
The model for a resource is as follows:
Field | Format | Description |
---|---|---|
id | text | unique identifier for the resource. This is referenced in the resource_occupations |
release_time | ISO duration | describes how much time must pass between release of a resource by one train and the following occupation by the next train. See Business Rules for details. |
following_allowed | bool | flag whether the resource is of following type (true) or of blocking type (false). As mentioned, all resources in all the provided problem instances have this field set to false |
Can be used to set global or solver-specific guidelines for the instance. Do not change these, as it would change the hash of the instance. They are of no concern otherwise.
We guarantee that all sample files and problem instances used in this challenge have the following properties. Let's call them 'axioms'
A service_intention always has a entry_earliest time set for its first section_requirement and has an exit_latest time set for its last _section_requirement.
For example, here is the sample instance:
There is a one-to-one correspondence between service_intentions and routes and moreover, they use the same id.
For example, the sample instance has two service_intentions, 111 and 113, and two routes with the same id:
The route graph for a service_intention (see the detailed description for an in-depth discussion) has the following properties
- It is a directed acyclic graph
- A path from a source node (i.e. from a node without incoming arcs) to a sink node (a node without outgoing arcs) always visits all required section_markers and does so in the correct order (given by the service_intention)
- In addition, such a path will always start with a section_marker for the first section_requirement and end with a section_marker for the last section_requirement of the service_intention
The sequence_number of a route_section is unique among all route_sections in a route. For example, route 111 in the sample instance has 5 route_paths, each containing a varying number of route_sections. These sequence_numbers are globally unique in a route. It will never happen that two route_pahts contain route_sections with the same sequence_number.
As a consequence, the pattern <route.id>#<route_section.sequence_number> is a key for a route_section globally over the whole problem instance.