librarian/model

The Librarian model contains the definitions of the core concepts that are needed to describe software libraries and control flow graphs (CFGs) that call parts of those libraries. The model consists of three layers:

Concepts are the fundamental building block of the model. A concept describes any entity that can be found in a library or a CFG, e.g. a function, a datatype or the call to a function.
Paradigms combine a collection of concepts that are relevant for the paradigm with an optional collection of concept instances that can always be found in the paradigm. Examples of paradigms are object-orientation, functional programming or logic programming.
Ecosystems are a special type of paradigm. They represent a particular combination of paradigms, syntax and tooling that make up a language ecosystem. Python, for example, has concepts from the object-oriented and functional paradigms, represents those concepts via the syntax of the Python language and allows executing CFGs expressed in that syntax via the Python interpreter. All the information required to represent and execute a program in a given ecosystem is encoded in the definition of an ecosystem model.

Librarian comes with a collection of general-purpose concepts that can be found across many paradigms and ecosystems. It also comes with basic paradigm definitions for object-oriented and functional programming. Lastly, it provides an ecosystem for Python.

The following sections first describe how to define such components and then describe the builtin components in detail.

1. Definition of Model Components

The model offers three macros to define concepts, paradigms and ecosystems: defconcept, defparadigm and defecosystem. They can be found in the librarian.model.syntax namespace.

1.1. Defining Concepts

(defconcept name [optional vector of parent concepts ...]
  :attributes { ... A datascript schema for the concept ... }
  :preprocess { ... A map from attributes to preprocessor functions ... }
  :postprocess A postprocessor function
  :spec A Clojure spec to validate concept instances)

Defines a new concept with name name in the current namespace. The concept is described by a sequence of key-value pairs. A concept description can contain the following pairs, all of which are optional:

:attributes: A map describing datascript attributes, i.e. a database schema for the concept.
:preprocess: A map from attributes to preprocessor functions for those attributes. Useful to mirror attribute values.
:postprocess: A function that takes a datascript database and the id of an instance of this concept. The function returns a datascript transaction that should be executed as part of the transaction that adds the given concept. Useful to compute derived attributes for concept instances.
:spec: A Clojure spec that should be used to check the validity of supposed instances of this concept.

In addition to the key-value pairs the concept description can be preceded by a vector of concepts that the newly defined concept should inherit from.

Example:

(require '[librarian.model.syntax :refer [defconcept]]
         '[clojure.spec.alpha :as s]
         '[librarian.helpers.spec :as hs]
         '[my.other.concepts :refer [parent-concept1 other-concept]])

(defconcept parent-concept2) ; Useless, but allowed.

(defconcept my-concept [parent-concept1 parent-concept2]
  :attributes {::x {:db/doc "A test attribute."}
               ::y {:db/valueType :db.type/ref
                    :db/doc "A reference to another concept."}}
  :postprocess (fn [db id] [:db/add id ::x 42]) ; All instances get x=42 auto-assigned.
  :spec ::my-concept)

(s/def ::my-concept (hs/entity-keys :req [::x ::y]))
(s/def ::x int?)
(s/def ::y (hs/instance? other-concept))

It is strongly recommended that all concept attributes are fully-qualified keywords (::attribute instead of :attribute) to prevent accidental collisions with other concepts. It is also recommended to create a separate namespace for each concept (unlike the example, where parent-concept2 and my-concept are defined in a single namespace).

1.2. Defining Paradigms

(defparadigm name [optional vector of parent paradigms ...]
  :concepts { ... A map of concept aliases to concepts ... }
  :builtins [ ... A collection of concept instance descriptions ... ])

Defines a new paradigm with name name in the current namespace. The name is followed by an optional vector of paradigms that should be included in the new paradigm. Then a sequence of key-value pairs follows:

:concepts: A map of unnamespaced concept alias keywords to concepts.
:builtins: A collection of builtin concept instances in the defined paradigm. The predefined instances should be created via librarian.model.syntax/instanciate. Builtins are intended to define things like the global Object class in Java.

Example:

(require '[librarian.model.syntax :refer [defparadigm instanciate]]
         '[my.concepts :refer [my-concept]]
         '[my.other.concepts :refer [other-concept]]
         '[my.other.paradigms :refer [parent-paradigm]])

(defparadigm my-paradigm [parent-paradigm]
  :concepts {:my-concept my-concept
             :other-concept other-concept}
  :builtins [(instanciate my-concept
               :y (instanciate other-concept
                    :foo "bar"))])

In this example a paradigm with two concepts is created. Instances of both concepts are defined as builtins. See the docstring of the instanciate function for details on how it works.

Note: All instances described via instanciate will be processed via the corresponding concept's :preprocess and :postprocess functions as well as validated via :spec to prevent the addition of inconsistent or invalid builtins. The same processing and validation steps are also performed by the scraper.

The concept aliases :my-concept and :other-concept assign a shorthand name for each concept that is relevant in a particular paradigm. Aliases simplify the specification of scraper configurations and initial generator states. Without aliases, each concept would have to be referenced using its fully qualified name, e.g. :my.concepts/my-concept. Using aliases one can write :my-concept instead. Another purpose of aliases is to simplify the attribute syntax:

; Assuming my-concept has properties x, y; other-concept has property z.
; The full attribute names would be:
:my.concepts/x :my.concepts/y :my.other.concepts/z
; If my-concept extends other-concept, instances have all three attributes.
; Attribute aliases simplify referring to those attributes:
:my-concept/x :my-concept/y :my-concept/z
; Thus one does not need to know where an attribute was originally defined to refer to it.

Lastly aliases are also useful to refer to a concept via multiple names, e.g. a namespace concept could be aliased to :package in a Java environment and to :module in a Python environment.
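
The alias lookup described above can be pictured with a small, self-contained sketch in plain Clojure. It is not Librarian's implementation: the `concepts` registry, its `:home`, `:attrs` and `:parents` keys and the `resolve-attr` function are hypothetical names chosen only to illustrate how an aliased attribute could be mapped back to the namespace that defines it.

```clojure
;; Hypothetical, simplified registry (not Librarian's real data layout):
;; each aliased concept records its defining namespace, the attributes it
;; defines itself, and the aliases of its parent concepts.
(def concepts
  {:my-concept    {:home "my.concepts"       :attrs #{"x" "y"} :parents [:other-concept]}
   :other-concept {:home "my.other.concepts" :attrs #{"z"}     :parents []}})

(defn resolve-attr
  "Resolves an aliased attribute such as :my-concept/z to the fully-qualified
  attribute of the concept that defines it, searching parents depth-first.
  Returns nil if no concept in the hierarchy defines the attribute."
  [concepts aliased-attr]
  (let [attr-name (name aliased-attr)
        start     (keyword (namespace aliased-attr))]
    (letfn [(search [alias]
              (when-let [{:keys [home attrs parents]} (get concepts alias)]
                (if (contains? attrs attr-name)
                  (keyword home attr-name)
                  (some search parents))))]
      (search start))))

(resolve-attr concepts :my-concept/x) ; => :my.concepts/x
(resolve-attr concepts :my-concept/z) ; => :my.other.concepts/z
```

In this sketch the caller only needs the alias `:my-concept`; which namespace actually defines `z` is looked up in the hierarchy.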

1.3. Defining Ecosystems

(defecosystem name [optional vector of parent paradigms or ecosystems ...]
  :concepts { ... A map of concept aliases to concepts ... }
  :builtins [ ... A collection of concept instance descriptions ... ]
  :generate A generator function
  :executor An executor factory)

Defines a new ecosystem with name name in the current namespace. Similar to defparadigm but accepts additional key-value pair types:

:generate: A function that takes a metadata map and a database containing an executable CFG and that returns a snippet of executable code for the ecosystem.
:executor: Similar to the generator defined above but returns a function that executes the code snippet and returns the result of the execution instead of simply returning the code snippet string.

Ecosystems are essentially paradigms with support for some specific syntax via the :generate and :executor functions. They typically also have a much more extensive set of :builtins than paradigms.
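
The relationship between a generator and an executor factory can be sketched in plain Clojure. This is only an illustration under strong assumptions: a real generator receives a datascript database containing the CFG, whereas this sketch uses a plain vector of call descriptions, and the names `generate-python`, `make-executor`, `run-code` and the `:header` metadata key are all hypothetical.

```clojure
(require '[clojure.string :as string])

(defn generate-python
  "Hypothetical generator: takes a metadata map and a simplified CFG (an
  ordered vector of call descriptions) and returns Python source as a string."
  [metadata calls]
  (string/join "\n"
    (concat
      ;; Optional header line taken from the (assumed) :header metadata key.
      (when-let [header (:header metadata)] [header])
      ;; One assignment statement per call description.
      (for [{:keys [result callable args]} calls]
        (str result " = " callable "(" (string/join ", " args) ")")))))

(defn make-executor
  "Hypothetical executor factory: generates the same code as the generator,
  but returns a function that runs it via the supplied `run-code` function
  and returns the result instead of the code string."
  [run-code metadata calls]
  (let [code (generate-python metadata calls)]
    (fn [] (run-code code))))

(generate-python {:header "# generated"}
                 [{:result "a" :callable "int" :args ["\"42\""]}])
;; => "# generated\na = int(\"42\")"
```

Passing `run-code` as an argument keeps the sketch independent of any particular interpreter; an actual ecosystem would decide for itself how the generated code is executed.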

2. Builtin Model Components

The Librarian model comes with a collection of builtin components that are described next.

2.1. Builtin General-Purpose Concepts

Concepts that are useful across multiple programming paradigms and languages are defined in the librarian.model.concepts namespace. The tables below give an overview of those concepts, listing the attributes and parent concepts of each.

Legend:

Derived attributes will be automatically computed via a preprocessor or postprocessor and should not be manually provided.
Indexed attributes allow a fast reverse lookup of the entities having a given attribute value. By default only the forward direction is indexed.
Unique attributes are indexed and also guarantee that a reverse lookup will find at most one entity for any given value.
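
In datascript schema terms, the legend roughly corresponds to the schema keys shown in the following sketch. The attribute names under `:example/...` are made up, and whether Librarian uses `:db.unique/identity` or `:db.unique/value` for its unique attributes is an assumption, not something stated above.

```clojure
;; Hypothetical :attributes map as it could be passed to defconcept.
;; Only the :db/... keys are standard datascript schema keys.
(def example-attributes
  {:example/name   {:db/index true
                    :db/doc "Indexed: entities can be looked up by name."}
   :example/id     {:db/unique :db.unique/identity ; assumption: identity, not value
                    :db/doc "Unique: at most one entity per id value."}
   :example/member {:db/valueType   :db.type/ref
                    :db/cardinality :db.cardinality/many
                    :db/index       true
                    :db/doc "Multiple (0..n) references to other entities."}})
```

Derived attributes have no schema key of their own; per the legend they are filled in by a preprocessor or postprocessor.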

Names & Positions

| Concept | Attribute | Type / Cardinality / Index | Description |
| --- | --- | --- | --- |
| `named` | | | A named entity. |
| | `name` | String, indexed | Name of the entity. |
| `namespace` extends `named` | | | A namespace. |
| | `id` | Derived from `name`, unique | Unique id of the namespace. |
| | `member` | Ref to `namespaced`, multiple (0..n), indexed | A member of the namespace. |
| `namespaced` extends `named` | | | A namespace member. |
| | `id` | Derived vector, unique | Fully-qualified name of the member: `[namespace-name member-name]` |
| `positionable` | | | Something with an ordinal position. |
| | `position` | Integer, optional (0..1) | Ordinal position of the entity. |

Datatypes

| Concept | Attribute | Type / Cardinality / Index | Description |
| --- | --- | --- | --- |
| `datatype` | | | A datatype. |
| | `datatype` | Ref to `datatype`, multiple (0..n), indexed | A supertype of the datatype. |
| `basetype` extends `named`, `datatype` | | | A basic type (like `int` or `boolean`). |
| | `id` | Derived from `name`, unique | Unique id of the basetype. |
| `semantic-type` extends `positionable`, `datatype` | | | A datatype representing all values that have a certain semantic. Semantic types are fuzzy since their semantic is described via natural language. They can be ordered via `position` if the semantic `value`s for a `key` represent some sequence, e.g. a sequence of paragraphs. |
| | `key` | String, optional (0..1) | A context for the semantic `value`, e.g. "description" or "unit" |
| | `value` | String | A string describing the semantics of the type. |
| `role-type` extends `datatype` | | | Role types represent the set of values that can take a certain role. The role type with id `:dataset` for example could represent all training dataset arrays. While role types describe some kind of semantic, similar to `semantic-type`, they are not fuzzy and are assumed to have a clearly defined meaning. |
| | `id` | Keyword, unique | Unique id of the role type. |
| `typed` | | | A concept with datatypes. The datatype of an entity with multiple types is the union type of those types. |
| | `datatype` | Ref to `datatype`, multiple (0..n), indexed | A datatype of the concept. |

Callables

| Concept | Attribute | Type / Cardinality / Index | Description |
| --- | --- | --- | --- |
| `callable` extends `typed` | | | Represents something that can be called with parameters and returns results. It is typed so that semantic and role information can be attached to it. |
| | `parameter` | Ref to `parameter`, multiple (0..n), indexed | A parameter of the callable. |
| | `result` | Ref to `result`, multiple (0..n), indexed | A returned result of the callable. |
| `io-container` extends `named`, `typed`, `positionable` | | | Represents an input or output (parameter or result) of a callable. |
| | | | No attributes. |
| `parameter` extends `io-container`, `data-receiver` | | | Represents a parameter of a callable. |
| | `optional` | Boolean, optional (default: `false`) | Denotes whether this parameter is optional. |
| `result` extends `io-container`, `data-receiver` | | | Represents a returned result of a callable. |
| | | | No attributes. |

Control Flow Graph Nodes

| Concept | Attribute | Type / Cardinality / Index | Description |
| --- | --- | --- | --- |
| `call` extends `typed` | | | Represents a call to some `callable`. |
| | `callable` | Ref to `callable` | The callable of this call. |
| | `parameter` | Ref to `call-parameter`, multiple (0..n), indexed | A parameter of this call. |
| | `result` | Ref to `call-result`, multiple (0..n), indexed | A result of this call. |
| `data-receivable` | | | Something that can be received by a `data-receiver`. |
| | | | No attributes. |
| `data-receiver` extends `data-receivable` | | | A concept that can receive a value from some `data-receivable`. A receiver either has or receives some value to which it can optionally also get some additional semantic information from the outside. |
| | `receives` | Ref to `data-receivable`, multiple (0..n), indexed | A receivable from which this receiver gets its value and thus has to be able to accept the datatype of the received value. |
| | `receives-semantic` | Ref to `data-receivable`, multiple (0..n), indexed | A receivable from which this receiver gets the `semantic-type`s of the value it holds. |
| `call-parameter` extends `typed`, `positionable`, `data-receiver` | | | Represents a parameter of a `call`. |
| | `parameter` | Ref to `parameter` | The `parameter` for which this `call-parameter` provides a value. |
| `call-result` extends `typed`, `positionable`, `data-receiver` | | | Represents a result of a `call`. |
| | `result` | Ref to `result` | The `result` that provides the value for this `call-result`. |
| `constant` extends `typed`, `datatype`, `data-receivable` | | | Represents a constant value that can be received by `call-parameters`. The constant concept is implemented as a typed datatype, where a constant is its own instance. This was done to be able to represent enum types as disjunctions of constants (disjunctions are however not yet supported). |
| | `value` | String or integer or boolean | The value of the constant. |
| `snippet` | | | Represents a code snippet/template as a partial CFG. A snippet is a concept that points to the CFG nodes that make up its partial CFG. |
| | `value` | Ref to a CFG-node or any concept with a truthy `:placeholder` attribute | A control-flow concept that is part of the snippet. |
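
To illustrate how the control flow concepts fit together, the following sketch models a single call whose parameter receives a constant. It uses plain Clojure maps instead of a datascript database, and the `:kind` key, the entity ids and the `received-constants` function are hypothetical; only the attribute names `callable`, `parameter`, `result`, `receives` and `value` are taken from the concept descriptions above.

```clojure
;; Hypothetical entity maps keyed by id. Librarian stores such entities in a
;; datascript database; the map layout here is for illustration only.
(def entities
  {1 {:kind :callable       :parameter [2] :result [3]}
   2 {:kind :parameter      :name "x" :position 0}
   3 {:kind :result         :position 0}
   4 {:kind :constant       :value 42}
   5 {:kind :call           :callable 1 :parameter [6] :result [7]}
   6 {:kind :call-parameter :parameter 2 :receives [4]}
   7 {:kind :call-result    :result 3}})

(defn received-constants
  "Follows :receives edges transitively from the entity with the given id and
  returns the values of all constants reached. Assumes the graph is acyclic."
  [entities id]
  (let [{:keys [kind value receives]} (get entities id)]
    (if (= kind :constant)
      [value]
      (vec (mapcat #(received-constants entities %) receives)))))

(received-constants entities 6) ; => [42]
(received-constants entities 7) ; => []
```

The call-parameter (id 6) points to the `parameter` it fills and, via `receives`, to the constant that supplies its value; the call-result (id 7) receives nothing in this sketch.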

2.2. Builtin Paradigms

The model comes with three builtin paradigms.

2.2.1. The `common` paradigm

A universal paradigm of concepts that are common in many paradigms.

Concept Aliases:

:named: named
:namespace: namespace
:namespaced: namespaced
:datatype: datatype
:basetype: basetype
:semantic-type: semantic-type
:role-type: role-type
:typed: typed
:callable: callable
:io-container: io-container
:parameter: parameter
:result: result
:call: call
:data-receiver: data-receiver
:call-parameter: call-parameter
:call-result: call-result
:constant: constant
:snippet: snippet

No additional concepts or builtin instances are defined.

2.2.2. The `functional` paradigm (extends `common`)

A paradigm for functional languages.

Additional Concept Aliases:

:function: function

Functional Concepts

| Concept | Attribute | Type / Cardinality / Index | Description |
| --- | --- | --- | --- |
| `function` extends `namespaced`, `callable` | | | A function. |
| | | | No attributes. |

No builtin instances are defined.

2.2.3. The `oo` paradigm (extends `common`)

A paradigm for object-oriented languages.

Additional Concept Aliases:

:class: class
:constructor: constructor
:method: method

OOP Concepts

| Concept | Attribute | Type / Cardinality / Index | Description |
| --- | --- | --- | --- |
| `class` extends `typed`, `namespaced`, `datatype` | | | A class. |
| | `constructor` | Ref to `constructor`, multiple (1..n), indexed | Constructor of the class. |
| | `method` | Ref to `method`, multiple (0..n), indexed | Method of the class. |
| `constructor` extends `callable` | | | A constructor of a class. |
| | | | No attributes. |
| `method` extends `named`, `callable` | | | A method of a class. |
| | | | No attributes. |

No builtin instances are defined.

2.3. Builtin Ecosystems

The model provides its builtin ecosystems via the librarian.model.core/ecosystems map:

{:python python}

Every ecosystem has a keyword alias with which it can be referenced in scraper configuration files.

Currently only an ecosystem for Python (:python) is provided.

2.3.1. The `python` ecosystem (extends `functional`, `oo`)

An ecosystem for Python.

Additional Concept Aliases:

:class: python/class (overrides class)
:constructor: python/constructor (overrides constructor)
:basetype: python/basetype (overrides basetype)

Python Concepts

| Concept | Attribute | Type / Cardinality / Index | Description |
| --- | --- | --- | --- |
| `python/class` extends `class` | | | A Python class. Like `class` but can only have a single constructor and automatically recognizes methods named `__init__` as its constructor. |
| | | | No attributes. |
| `python/constructor` extends `constructor` | | | Like `constructor` but with a unique reference to its class. |
| | `class` | Derived ref to `python/class`, unique | A reference to the constructor's class. In Python this uniquely identifies a constructor. |
| `python/basetype` extends `basetype` | | | Like `basetype` but only allows the Python basetype names: "object", "int", "float", "complex", "string", "boolean". |
| | | | No attributes. |

Builtin Instances:

basetype instances: int, float, complex, string and boolean, which all extend object.
Typecasting functions:
- str(x): object -> string
- int(x): object -> int
- float(x): object -> float

Other Python builtins can be added when needed.
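
The builtin basetype hierarchy and typecasting functions listed above can be pictured with the following plain Clojure sketch. It does not show how Librarian defines these builtins; the `supertypes` and `typecasts` maps and the functions `subtype?` and `cast-accepts?` are hypothetical and only mirror the listed names and signatures.

```clojure
;; Supertype relation of the Python basetypes as listed above:
;; every basetype except object extends object.
(def supertypes
  {"object"  #{}
   "int"     #{"object"}
   "float"   #{"object"}
   "complex" #{"object"}
   "string"  #{"object"}
   "boolean" #{"object"}})

;; The listed typecasting functions with their parameter and result types.
(def typecasts
  {"str"   {:parameter "object" :result "string"}
   "int"   {:parameter "object" :result "int"}
   "float" {:parameter "object" :result "float"}})

(defn subtype?
  "True if type t equals super or transitively extends it."
  [supertypes t super]
  (or (= t super)
      (boolean (some #(subtype? supertypes % super) (get supertypes t)))))

(defn cast-accepts?
  "True if the typecasting function fname accepts an argument of arg-type."
  [fname arg-type]
  (subtype? supertypes arg-type (get-in typecasts [fname :parameter])))

(subtype? supertypes "int" "object") ; => true
(subtype? supertypes "object" "int") ; => false
(cast-accepts? "int" "string")       ; => true
```

Because all three typecasting functions take an object parameter, every basetype is an acceptable argument in this sketch.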
