polyglot-emf

Making the Eclipse Modeling Framework (EMF) polyglot.

Overview

The Eclipse Modeling Framework (although it'd be equally valid to call it the "Ed Merks Framework") is an Eclipse project written in Java that provides lower-level capabilities and facilities that help with implementing modeling languages, environments, and tools.

An annotated diagram of EMF's main capabilities/facilities

EMF's main capabilities/facilities are:

  1. A runtime for the JVM to manage models in memory. Objects in any EMF model (or more accurately: in an EMF Resource, an instance of EResource) are instances of EObject, each meta typed by an instance of EClass. The runtime includes a "Command" subsystem to manipulate EMF models using "deltas" (∂s).

  2. Serialization to, and deserialization from, XML Metadata Interchange (XMI).

  3. A meta meta model, called Ecore. Ecore is described by Ed Merks as "[...] the de facto reference implementation of OMG's EMOF (Essential Meta-Object Facility)". Basically, Ecore is just expressive enough to meta model "anything".

  4. A generator from Ecore to Java classes sub typing EObject.

Some features are:

  • An Ecore model (which specifies a meta model) is itself an EMF model, so items 3 and 4 re-use items 1 and 2.

    Diagram: relations between EMF and Ecore models

  • An EMF model can be either "dynamic" (using DynamicEObjectImpl) or "static" (meaning that each concept/EClass is reified through an implementing sub type of EObject). In Java, the idioms for using EMF dynamically versus using EMF statically differ quite a bit: the dynamic idiom hinges completely on the reflective part of the API of the EMF runtime.
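
To make the difference between the two idioms concrete, here is a minimal Python sketch. It uses neither EMF nor PyEcore: `EClassSketch`, `DynamicObject`, `e_get`, `e_set` and `Book` are hypothetical stand-ins for an EClass, a reflective runtime object, its reflective accessors, and a generated class, respectively.

```python
class EClassSketch:
    """Hypothetical stand-in for an EClass: a name plus its feature names."""

    def __init__(self, name, features):
        self.name = name
        self.features = set(features)


class DynamicObject:
    """Dynamic idiom: one generic class, all access goes through reflection."""

    def __init__(self, eclass):
        self.eclass = eclass
        self._values = {}

    def _check(self, feature):
        if feature not in self.eclass.features:
            raise AttributeError(f"{self.eclass.name} has no feature {feature!r}")

    def e_set(self, feature, value):
        self._check(feature)
        self._values[feature] = value

    def e_get(self, feature):
        self._check(feature)
        return self._values.get(feature)


BOOK = EClassSketch("Book", ["title", "pages"])


class Book(DynamicObject):
    """Static idiom: a (normally generated) sub type with named accessors."""

    def __init__(self):
        super().__init__(BOOK)

    @property
    def title(self):
        return self.e_get("title")

    @title.setter
    def title(self, value):
        self.e_set("title", value)


# Dynamic use: feature names are plain strings, checked only at runtime.
dynamic_book = DynamicObject(BOOK)
dynamic_book.e_set("title", "Dune")

# Static use: the feature is a named member of the class.
static_book = Book()
static_book.title = "Dune"
```

In this sketch the static class is only a thin layer over the reflective one, so both idioms end up storing the same data.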

Motivation

EMF is a JVM-only framework. This is a pity because EMF works extremely well (as proven by it being used as "middleware" in numerous modeling tools and environments), and is the only reference implementation of EMOF that I know of.

This Git repository is intended as a place to collaborate on proposals to make EMF polyglot. It'd be very useful to be able to use capabilities/facilities from the list above in, and across, various languages.

(added by Federico Tomassetti, with some modification:) This could open the possibility to create an ecosystem richer than the EMF ecosystem, with a variety of interoperable tools so that, based on common model and meta model formats, we could:

  • Have systems for storage and collaboration (Modelix)
  • Have systems for parsing, based on ANTLR (a subset of the features of Xtext, maybe something similar to textX)
  • Perhaps a way to interact with textual editors involving the Language Server Protocol
  • A way to plug in web editors like WebEditKit and ProjectIt
  • Interaction with MPS in various forms
  • Work with multiple code generators supporting this format
  • Have the possibility of building different stages of these systems in different languages such as Kotlin, Java, Python, TypeScript, JavaScript, C#

Ports of (parts of) EMF to JavaScript and Python are available, but:

  • It's unclear to what extent they are true ports, because EMF has no specification other than its implementation and the documentation for that.
  • EMF has an internal test suite, but this is coupled to Java/the JVM pretty tightly.

Links

For the EMF project itself:

For relevant OMG standards:

For re-implementations or "inspired by" implementations (some in other languages):

  • ecore.js (JavaScript).
  • PyEcore (Python), and its documentation.
  • JSOI (JVM (mostly)). This project is interesting because (as I understood it from Horacio) it provides an alternative serialization for EMF Resources (models) to JSON that's type-based, rather than containment-based. The serialized EObjects are organized by type, as lists, instead of as a tree (or trees) by containment. Also see this IEEE article by Horacio about it.
  • emfjson-jackson, a Maven module that implements (de-)serialization of EMF Resources (from and) to JSON. The JSON format is described here.
  • CrossEcore - for Java, TypeScript, C# (.NET) and Swift
  • For C++:

Specifically about Ecore:

Specifically on XMI:

  • Two sources for an XML Schema for the XMI format: OMG, EMF GitHub repo. These don't match exactly: how do they differ in effect?

Specifically about JSON Schema:

Software and frameworks using EMF:

Various:

Use cases

Contributed by Federico Tomassetti:

Parsing and processing (short term)

In the very short term, we need to combine a parser written in Kotlin with a processing stage written in Python. The parser is written using ANTLR, and we then translate the parse tree to an AST implemented using Kotlin data classes and the Kolasu framework. To use the parser from a Python program, we are currently thinking of simply invoking the parser, making it output JSON, and loading that JSON from Python.

Now, we can derive the meta model of our AST by examining the Kotlin data classes, either through reflection or by parsing the Kotlin code. Once we have this meta model, we could serialize it as XMI, or as a transposition of XMI to JSON. We could then load the meta model in Python and generate classes from it. We could potentially do that using PyEcore, if I understood correctly. Ideally, we could also evolve PyEcore to use Python data classes, but this is not strictly necessary. To enable this scenario, we would just need a mechanism that generates the meta model in XMI or JSON-XMI from our Kotlin data classes.
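
As a rough illustration of deriving a meta model through reflection, the sketch below does the analogous thing for Python data classes (the use case above concerns Kotlin data classes and Kolasu, which are not shown here). The AST classes and the layout of the resulting dictionary are assumptions made up for this example; they do not follow any existing "XMI in JSON" format.

```python
import dataclasses
import json
import typing
from dataclasses import dataclass
from typing import List


# Hypothetical AST classes, standing in for the Kotlin data classes.
@dataclass
class Expression:
    text: str


@dataclass
class Assignment:
    target: str
    value: Expression


@dataclass
class Program:
    statements: List[Assignment]


def describe_type(tp):
    """Describe a field type as a type name plus a multiplicity flag."""
    if typing.get_origin(tp) is list:
        (item_type,) = typing.get_args(tp)
        return {"type": item_type.__name__, "many": True}
    return {"type": tp.__name__, "many": False}


def derive_meta_model(*classes):
    """Build a plain-dict meta model by reflecting over the data classes."""
    return {
        "classes": [
            {
                "name": cls.__name__,
                "features": [
                    {"name": field.name, **describe_type(field.type)}
                    for field in dataclasses.fields(cls)
                ],
            }
            for cls in classes
        ]
    }


meta_model = derive_meta_model(Expression, Assignment, Program)
meta_model_json = json.dumps(meta_model, indent=2)
```

A real implementation would also have to distinguish attributes from containments and references, which this sketch leaves out.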

We would then also need to translate the actual AST instances (the model). At the moment, we serialize them to JSON and deserialize that JSON on the Python side. This JSON could be based on JSON-XMI instead of our own format.
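
The round trip for AST instances could look roughly like the following sketch. The node classes, the `"type"` discriminator key and the `NODE_TYPES` registry are assumptions made up for this example; they reflect neither our current JSON format nor JSON-XMI.

```python
import dataclasses
import json
from dataclasses import dataclass
from typing import List


# Hypothetical AST node classes.
@dataclass
class Literal:
    value: int


@dataclass
class Sum:
    operands: List[Literal]


# Registry used to find the class for a serialized type name.
NODE_TYPES = {cls.__name__: cls for cls in (Literal, Sum)}


def to_jsonable(node):
    """Turn an AST into dicts/lists, recording each node's type by name."""
    if dataclasses.is_dataclass(node) and not isinstance(node, type):
        data = {"type": type(node).__name__}
        for field in dataclasses.fields(node):
            data[field.name] = to_jsonable(getattr(node, field.name))
        return data
    if isinstance(node, list):
        return [to_jsonable(item) for item in node]
    return node


def from_jsonable(data):
    """Rebuild AST nodes from dicts that carry a "type" discriminator."""
    if isinstance(data, dict):
        cls = NODE_TYPES[data["type"]]
        kwargs = {k: from_jsonable(v) for k, v in data.items() if k != "type"}
        return cls(**kwargs)
    if isinstance(data, list):
        return [from_jsonable(item) for item in data]
    return data


tree = Sum(operands=[Literal(1), Literal(2)])
payload = json.dumps(to_jsonable(tree))       # what the parser side would emit
restored = from_jsonable(json.loads(payload)) # what the Python side would load
```

Note that this sketch reserves the key `"type"`, so it cannot handle a node class that has a field with that name.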

Accessing Modelix from different languages (medium term)

We have APIs to work with Modelix from Kotlin (and Java). However, it makes sense to work with models stored in Modelix from all sorts of other languages, in particular from TypeScript. At the moment, we can work with Modelix only through a dynamic API. For example, if we have a concept Car, we do not have a class Car: we just use the class Node and set properties by specifying their names (e.g., “plate” or “year” or “color”). We do not have a class with methods such as “getPlate” or “setColor”.

It could be useful to generate those classes. If we were exposing the meta model in some common format, like XMI, we may be able to reuse existing code generators, and then combine them with a runtime that's “Modelix-aware”.
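
A minimal sketch of that idea, in Python: a typed class is generated from a concept description and layered on top of a dynamic node. This is not the Modelix API; `Node`, `set_property`, `get_property` and `generate_class` are hypothetical names, and the concept description (a name plus a list of property names) stands in for a meta model loaded from a common format.

```python
class Node:
    """Hypothetical dynamic node: a concept name plus properties by name."""

    def __init__(self, concept):
        self.concept = concept
        self._properties = {}

    def set_property(self, name, value):
        self._properties[name] = value

    def get_property(self, name):
        return self._properties.get(name)


def make_property(name):
    """Create a typed-looking accessor that delegates to the dynamic API."""

    def getter(self):
        return self.get_property(name)

    def setter(self, value):
        self.set_property(name, value)

    return property(getter, setter)


def generate_class(concept, property_names):
    """Generate a Node sub class with one accessor per declared property."""

    def init(self):
        Node.__init__(self, concept)

    members = {"__init__": init}
    for name in property_names:
        members[name] = make_property(name)
    return type(concept, (Node,), members)


# The concept description would come from the shared meta model.
Car = generate_class("Car", ["plate", "year", "color"])

car = Car()
car.plate = "AB-123-CD"
car.color = "red"
```

Because the generated class still delegates to the dynamic node, code using `car.plate` and code using `car.get_property("plate")` can work on the same object.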

Proposals

Some proposals for separate work packages:

  1. Describe a JSON Schema for "XMI in JSON", based on/extracted from ecore.js, emfjson-jackson and PyEcore. This would be useful as JSON Schema is a standard that's increasingly supported by tools, and standards/frameworks/specifications such as OpenAPI (formerly Swagger).
    • If ecore.js, emfjson-jackson and PyEcore differ: find a middle ground, and try to advocate/establish that through PRs?
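
To give an impression of what such a JSON Schema might look like, here is a heavily simplified sketch, written as a Python dictionary. Everything in it is an assumption: the `eClass` key is only loosely modeled on the kind of type discriminator that JSON serializations of EMF models tend to use, the example URIs are made up, and the schema has not been checked against what ecore.js, emfjson-jackson or PyEcore actually produce. The `looks_like_*` functions are a hand-written check that mirrors the schema; they are not a JSON Schema validator.

```python
import json

# Assumed shape: every object names its EClass; other keys hold feature values.
XMI_IN_JSON_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$ref": "#/$defs/eObject",
    "$defs": {
        "eObject": {
            "type": "object",
            "required": ["eClass"],
            "properties": {"eClass": {"type": "string"}},
            "additionalProperties": {"$ref": "#/$defs/value"},
        },
        "value": {
            "anyOf": [
                {"type": ["string", "number", "boolean", "null"]},
                {"type": "array", "items": {"$ref": "#/$defs/value"}},
                {"$ref": "#/$defs/eObject"},
            ]
        },
    },
}


def looks_like_value(value):
    """Mirror of the "value" definition: primitive, list, or nested object."""
    if value is None or isinstance(value, (str, int, float, bool)):
        return True
    if isinstance(value, list):
        return all(looks_like_value(item) for item in value)
    return looks_like_eobject(value)


def looks_like_eobject(value):
    """Mirror of the "eObject" definition: a dict with a string eClass."""
    if not isinstance(value, dict) or not isinstance(value.get("eClass"), str):
        return False
    return all(looks_like_value(v) for k, v in value.items() if k != "eClass")


# Made-up example document.
sample = {
    "eClass": "http://example.org/library#//Book",
    "title": "Dune",
    "authors": [
        {"eClass": "http://example.org/library#//Author", "name": "Frank Herbert"}
    ],
}

schema_text = json.dumps(XMI_IN_JSON_SCHEMA, indent=2)
```

A schema per meta model (with the allowed features spelled out for each EClass) would be stricter than this generic one.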

Repo structure

  • mps/ holds a JetBrains MPS project. This project holds a model with some JSON Schemas, which are authored using a JSON Schema language implementation that can be found here. Clone this repository, open MPS on mps-open-source/mps-open-source, and build the entire project. After that, you can open mps/ as a project in a new MPS window, and open roots in the EMF model in the schemas module. Building that module results in JSON schemas in mps/solutions/schemas/source_gen/EMF/.
  • ideas.md holds a collection of ideas/TODOs.