rklancer edited this page Sep 28, 2012 · 24 revisions

Problems arising from lack of a unified model schema

(TL;DR: When you look closely, there are a lot of inconsistencies in the dark recesses of the MD2D data model that are just waiting to bite us in the behind. The unified model schema outlined in the next section could help.)

Here are some related issues with the MD2D model object implemented in modeler.js and md2d.js:

  • There are set and get methods for top-level properties of the model such as temperature_control, but for atom properties there is no getAtomProperties method to match setAtomProperties.

  • The serialization of atom properties uses upper-case keys such as X and Y, but setAtomProperties uses different, lower-case keys such as x and y for the same properties. The same keys should be used in both cases, for ease of use and so that more code can be shared.

  • Atoms are just one kind of "physics object" that we need to serialize or edit the properties of -- some others are obstacles, elements, and radial bonds -- but there are no setter/getter methods such as setObstacleProperties for these other kinds of object.

  • The serialization and deserialization paths for the properties of atoms, obstacles, elements, and radial bonds are convoluted and pointlessly different for each type of object -- compare obstacle deserialization and atom deserialization.

  • The "tick history" saves and restores per-atom properties but it does not save and restore other types of properties such as per-obstacle properties or toplevel properties like temperature_control. This can be seen by visiting http://lab.dev.concord.org/examples/interactives/interactives.html#interactives/gas-laws-page-4.json, allowing the model to run so that the obstacle moves to the right, stopping the model, seeking to the beginning by typing model.seek(0) in the console, and then clicking the play button. The atoms' positions change back to their starting points but the obstacle's position does not. (Note that using the reset button is not sufficient to demonstrate the problem because it reloads all model properties from the serialized JSON.)

  • There is meta-information about atom properties specifying, for example, which properties are "saveable" -- i.e., which properties are transient and which properties need to be serialized in order to accurately save the state of the model. However, there is nowhere to store similar meta-information about other objects such as radial bonds.

  • Adding a new atom property to the model requires providing meta-information about the property in many different places -- see fac22a5

  • Although there is a unified way to notify an observer that a toplevel property such as temperature_control has changed, there is no way to notify an observer that the potential energy has changed. We rely instead on observing tick events, but this is not reliable because the potential energy (or something else, such as the x-position of obstacle #2) can change as a result of user action while the model is stopped.

  • The effective default values of certain properties are defined in different ways, in different places, in the modeler and engine: e.g., here, here, and here

  • The MML parser needs access to the default values of certain properties in order to construct a correct model JSON file, but it currently has no consistent way, and in some cases no way at all, to get this information from modeler.js.

Proposed design

Summary

At the top of modeler.js, define a schema object that contains the names of the top-level properties of the model (such as height), a set of metadata about each property, and the name of each type of object contained in the model (such as obstacles and atoms). Recursively use the same, or a very similar, format for describing each property of each object type (such as the charge property of atoms).

Refer to the metadata defined in this schema throughout modeler.js when, for example, allocating storage arrays, storing tick-history items, serializing, deserializing, issuing "change" events to listeners, or pushing data back and forth from the underlying computational engine.

Sketch of information needed in schema

Toplevel (properties: width, chargeShading, viscosity, ...)

  • Should this property be serialized?
  • What is the default value of this property?
  • Is this property read-only, or read-write? (This would help us use the model's standard getter and observer-notification methods for calculated properties such as the potential energy.)
  • Should this property be passed to the engine when the engine is constructed?
  • Should the updated value of this property be passed to the engine when the property changes?
  • Should changing this property trigger a model-state recalculation? (Consider: changing the gravitationalField potential instantaneously changes the total energy.)
  • Should this property be persisted in the tick history?
  • Is this property mostly view related? (This is a hint for developers and for use by a controller that configures a view; the model doesn't construct a view or maintain a reference to it.)

Of course, we should be able to define a custom setter for each property in order to do the right thing when it changes.
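
As a rough illustration of the questions above, each toplevel property could get one schema entry with one flag per question. Every field name below (serialize, readOnly, passToEngineOnInit, and so on) is hypothetical, as are the default values and the potentialEnergy entry; only recalculateState echoes a name used later on this page. This is a sketch of the shape of the data, not existing modeler.js code.

```javascript
// Hypothetical schema entries; field names and default values are
// assumptions made for this sketch, not taken from modeler.js.
var toplevelSchema = {
  width: {
    defaultValue: 10,
    serialize: true,
    readOnly: false,
    passToEngineOnInit: true,
    updateEngineOnChange: false,
    recalculateState: false,
    saveInTickHistory: false,
    viewRelated: false
  },
  chargeShading: {
    defaultValue: false,
    serialize: true,
    readOnly: false,
    passToEngineOnInit: false,
    updateEngineOnChange: false,
    recalculateState: false,
    saveInTickHistory: false,
    viewRelated: true
  },
  // A calculated property: readable and observable, never set or saved.
  potentialEnergy: {
    defaultValue: null,
    serialize: false,
    readOnly: true,
    passToEngineOnInit: false,
    updateEngineOnChange: false,
    recalculateState: false,
    saveInTickHistory: false,
    viewRelated: false
  }
};

// Names of all properties for which the given flag is true.
function namesWhere(schema, flag) {
  return Object.keys(schema).filter(function (name) {
    return schema[name][flag] === true;
  });
}

// Default values for every writable property.
function defaultProperties(schema) {
  var props = {};
  Object.keys(schema).forEach(function (name) {
    if (!schema[name].readOnly) {
      props[name] = schema[name].defaultValue;
    }
  });
  return props;
}

var serializedNames = namesWhere(toplevelSchema, "serialize");
var defaults = defaultProperties(toplevelSchema);
```

With entries like these, the serializer, the tick history, and the MML parser could all ask the same object the same questions instead of each keeping its own list.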

Atoms (properties: x, charge, marked...)

Same as the above, plus:

  • When serializing this property, should the entire array of values be removed if all values are the default value?
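
A minimal sketch of how that extra flag might be consulted during serialization. The flag name omitIfAllDefault, the other field names, and the assumption that atom data is held as one array per property are all made up for this example.

```javascript
// Hypothetical per-atom schema entries; names and defaults are assumptions.
var atomSchemaSketch = {
  x:      { defaultValue: 0, serialize: true,  omitIfAllDefault: false },
  charge: { defaultValue: 0, serialize: true,  omitIfAllDefault: true },
  marked: { defaultValue: 0, serialize: true,  omitIfAllDefault: true },
  ax:     { defaultValue: 0, serialize: false, omitIfAllDefault: false }
};

// True if every element of `values` equals `defaultValue`.
function allDefault(values, defaultValue) {
  for (var i = 0; i < values.length; i++) {
    if (values[i] !== defaultValue) {
      return false;
    }
  }
  return true;
}

// `atoms` maps property name -> array of per-atom values.
function serializeAtoms(atoms, schema) {
  var out = {};
  Object.keys(schema).forEach(function (name) {
    var meta = schema[name];
    if (!meta.serialize) {
      return;  // transient property, never saved
    }
    if (meta.omitIfAllDefault && allDefault(atoms[name], meta.defaultValue)) {
      return;  // whole array is default, so leave it out of the JSON
    }
    // Copy into a plain array so typed arrays also serialize cleanly.
    out[name] = Array.prototype.slice.call(atoms[name]);
  });
  return out;
}

var savedAtoms = serializeAtoms(
  { x: [1, 2], charge: [0, 0], marked: [0, 1], ax: [0.5, 0.1] },
  atomSchemaSketch
);
// savedAtoms keeps x and marked; charge is all-default and ax is transient.
```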

Examples of interpretation of some of this information:

  • Per-atom properties such as ax and px (acceleration and momentum) make sense as read-only properties because they are completely determined by the atom positions, and by the velocity and mass of the atom, respectively.

  • Should this property be passed to the engine? The engine should continue to use an array of properties indexed by atom, because access to x[j] is faster than access to atoms[j][X] when you consider that the atoms[j] dereference can't meaningfully be cached in inner loops. However, the modeler should operate on a transposed array of atoms indexed by property -- what we have so far called the results array. This means we should optimize what we pass back and forth between model and engine.

  • Should this property be persisted in the tick history? (For example, we might want to mark atoms in such a way that marks added now persist when the history is scrubbed backwards. That should be a policy choice we can flip simply by changing a value in the schema; and we might even allow specific models to override that policy choice by somehow overriding the default schema.)
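
The tick-history policy in the last bullet could be sketched as follows. The flag name saveInTickHistory, the helper names saveTick and seekTo, and the one-array-per-property layout are assumptions for this example only.

```javascript
// Hypothetical policy: positions are rewound, marks are not.
var historySchema = {
  x:      { saveInTickHistory: true },
  marked: { saveInTickHistory: false }
};

// Append a snapshot containing only the properties the schema says to save.
function saveTick(history, atoms, schema) {
  var snapshot = {};
  Object.keys(schema).forEach(function (name) {
    if (schema[name].saveInTickHistory) {
      snapshot[name] = Array.prototype.slice.call(atoms[name]);  // copy
    }
  });
  history.push(snapshot);
}

// Restore the saved properties in place; unsaved properties are untouched.
function seekTo(history, atoms, tick) {
  var snapshot = history[tick];
  Object.keys(snapshot).forEach(function (name) {
    for (var i = 0; i < snapshot[name].length; i++) {
      atoms[name][i] = snapshot[name][i];
    }
  });
}

var atomState = { x: [0, 1], marked: [0, 0] };
var tickHistory = [];

saveTick(tickHistory, atomState, historySchema);
atomState.x[0] = 5;        // the model runs forward
atomState.marked[1] = 1;   // the user marks an atom
seekTo(tickHistory, atomState, 0);
// x is back at its tick-0 value; the mark added afterwards is still set.
```

Flipping saveInTickHistory to true for marked would make marks rewind with the history, without touching saveTick or seekTo.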

Obstacles, Radial Bonds, Elements

All of the above information is needed. In addition, the following might be useful:

  • Is this property an atom index? (It might or might not be useful to explicitly represent this information.)

Textboxes

Although textboxes is represented in the model JSON file as an array of individual objects, just like atoms and radialBonds, the array is really meant to be passed wholesale to a view, which interprets it. Therefore it might make sense to define textboxes simply as a toplevel passthrough property, which the model serializes and deserializes as-is without attempting to infer anything about its contents.

How the schema might be used

Deserialization could be done by looping over the list of declared object types and then the serialized arrays of object properties, while repeatedly calling setObjectProperties(index, objectType, { properties });. Note that this can be done in a GC-friendly way by reusing the { properties } object in the loop, rather than constructing a new throwaway object for each call. (Serialization can be done in a similar way, consulting the serializable value from the schema to determine which properties to serialize.)
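
A minimal sketch of that loop, assuming the serialized form of each object type is a map from property name to an array of per-object values. The setObjectProperties below is only a stand-in that records values in a plain object, and the store variable is invented for the example.

```javascript
// Stand-in for the proposed setter. It copies values out of `properties`
// immediately, which is what makes reusing that object safe.
var store = {};
function setObjectProperties(index, objectType, properties) {
  var byProperty = store[objectType] || (store[objectType] = {});
  Object.keys(properties).forEach(function (name) {
    var values = byProperty[name] || (byProperty[name] = []);
    values[index] = properties[name];
  });
}

// Loop over the declared object types, then over each object's index.
function deserialize(json, objectTypes) {
  objectTypes.forEach(function (type) {
    var serialized = json[type];
    if (!serialized) {
      return;  // this type is absent from the file
    }
    var names = Object.keys(serialized);
    if (names.length === 0) {
      return;
    }
    var count = serialized[names[0]].length;
    var props = {};  // allocated once per type and reused for every object
    for (var i = 0; i < count; i++) {
      for (var p = 0; p < names.length; p++) {
        props[names[p]] = serialized[names[p]][i];
      }
      setObjectProperties(i, type, props);
    }
  });
}

deserialize(
  { atoms: { x: [1, 2], y: [3, 4] }, obstacles: { x: [0.5] } },
  ["atoms", "obstacles", "radialBonds"]
);
```

The reuse only works if the setter never keeps a reference to the properties object it was handed.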

Remove all serialization/deserialization code and default value handling from the engine. This can be handled by the model object.

Have the engine publish a simplified schema containing just a list of the object types it knows about (atoms, radial bonds, etc.) and the properties it assigns to each one. Include the data type ('float32', etc.) of each property. modeler.js can consult this list, and accessor functions built into the engine, in order to access the engine's representation of the properties the modeler needs to access:

// in engine:
objects: [
  { name: "atoms",
    properties: [
      { name: "x", type: "float32" },
      { name: "y", ... },
      //...
    ]
  },
  //...
]

// in modeler:

atoms = engine.get('atoms');
x = atoms[engine.index.x];
// x is now an array of x-values of atoms, indexed by atom
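
For concreteness, here is a toy engine that behaves the way the fragment above assumes. It is not the real md2d.js engine: makeToyEngine, the typed-array lookup table, and the flat index map (which only works while a single object type is declared) are all assumptions of this sketch.

```javascript
// Toy engine for illustration only.
function makeToyEngine(count) {
  var schema = {
    objects: [
      { name: "atoms",
        properties: [
          { name: "x", type: "float32" },
          { name: "y", type: "float32" }
        ]
      }
    ]
  };

  // Map the schema's type names onto typed-array constructors.
  var arrayTypes = { float32: Float32Array };
  var storage = {};
  var index = {};

  schema.objects.forEach(function (objectType) {
    // One typed array per property, each indexed by object (here, by atom).
    storage[objectType.name] = objectType.properties.map(function (prop, i) {
      index[prop.name] = i;
      return new arrayTypes[prop.type](count);
    });
  });

  return {
    schema: schema,
    index: index,
    get: function (name) { return storage[name]; }
  };
}

var engine = makeToyEngine(3);
var atoms = engine.get("atoms");
var x = atoms[engine.index.x];
x[0] = 1.5;  // writes straight into the engine's storage
```

Because get returns the engine's own arrays rather than copies, the modeler can read and write engine state without an extra copy on every tick.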

When a property is changed by a setter (outside of a model "tick"), check the recalculateState value from the schema in order to determine whether to recalculate the model's thermodynamic properties. Use a setter so that observers of these properties are appropriately notified.
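
A sketch of such a setter, under the assumption that each schema entry carries a recalculateState flag. The makeModelSketch factory, the observe method, and the recalculation counter (a placeholder for recomputing energies and similar quantities) are invented for this example and do not describe the existing model API.

```javascript
// Hypothetical schema entries; defaults are assumptions.
var setterSchema = {
  gravitationalField: { defaultValue: 0,     recalculateState: true },
  chargeShading:      { defaultValue: false, recalculateState: false }
};

function makeModelSketch(schema) {
  var values = {};
  var listeners = {};
  var recalculations = 0;

  Object.keys(schema).forEach(function (name) {
    values[name] = schema[name].defaultValue;
  });

  return {
    get: function (name) { return values[name]; },

    // Register a callback that fires whenever `name` is set.
    observe: function (name, callback) {
      (listeners[name] || (listeners[name] = [])).push(callback);
    },

    set: function (name, value) {
      values[name] = value;
      if (schema[name].recalculateState) {
        recalculations++;  // placeholder for the real state recalculation
      }
      (listeners[name] || []).forEach(function (callback) {
        callback(value);
      });
    },

    recalculationCount: function () { return recalculations; }
  };
}

var sketchModel = makeModelSketch(setterSchema);
var seenValues = [];
sketchModel.observe("gravitationalField", function (value) {
  seenValues.push(value);
});

sketchModel.set("gravitationalField", 9.8);  // notifies and recalculates
sketchModel.set("chargeShading", true);      // no recalculation needed
```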

Make the schema available globally so that the MML parser can access it. (Note that this means being able to require the modeler from the MML parser, which usually runs as a command-line script. However, this should be made possible by the require.js refactoring.)

Speed and memory considerations

Inheritance considerations
