-
Notifications
You must be signed in to change notification settings - Fork 143
The Language Specification
This is a draft for the 0.4
version of the Kind language specification, it's made to be a "formal" informal specification that is the basis for a real world implementation but the compiler can make some slight modifications to it in order to make it more performant.
This document is influenced by the Ferrocene Specification, The Ada Reference Manual, the Rust language reference and the Haskell 2010 Reference guide.
- This document specifies the syntax and the semantics of programs written in the programming language Kind and it's implemented as a whole with some modifications by the
kindc
compiler that is on hosted on GitHub under theHIgherOrderCO
organization. - This document does not specifies how programs are compiled to any particular backend, how they're interpreted in this specific backend, how things are translated to any specific IR, just how they're equivalent to some other constructions of the language.
- This document does not specifies how external effects interacts with the underlying machine in order to make it independent of a machine and just an abstract specification.
- Introduction
- Chapter 1. General
- 1.1. Scope
- 1.2. Structure
- Chapter 2. Lexical Grammar
- 2.1. Notation Conventions
- 2.2. Character Set
- 2.3. Grammar
- Chapter 3. Items syntax and semantics
- 3.1. Module
- 3.1.1 Shebang
- 3.1.2 Use aliases
- 3.2. Declarations
- 3.2.1. Arguments
- 3.2.1.1 Strictness, Erasability and Hidability
- 3.2.1. Functions Rules and Signatures
- Docstrings
- Public and private
- Default parameters
- Absurd Rule
- 3.2.2. Types and Records
- 3.2.3. Attributes
- 3.2.4. Commands
- 3.3.5. Eval and Print
- 3.3.6. Widget
- 3.2.1. Arguments
- 3.3. Literals
- 3.3.1 Type Literals
- 3.3.2 Numeric Literals
- 3.4. Expressions
- 3.3.3 Function calls
- 3.3.4 Named arguments
- 3.3.5 Lambda Injection
- 3.3.6 Lightweight lambdas
- 3.3.7 Pi type
- 3.3.8 Sigma type
- 3.3.9 Let binding
- 3.3.10 Type annotation
- 3.3.11 Binary operations
- 3.3.12 Substitutions
- 3.3.13 Field accessor
- 3.3.14 Syntax sugars
- 3.3.14.1 Open expression
- 3.3.14.2 Match expression
- 3.3.14.3 If expression
- 3.3.14.4 Do expression
- 3.3.14.4.1 Return statement
- 3.3.14.4.2 Ask statement
- 3.3.14.5 List expression
- 3.3.14.6 Pair expression
- 3.3.14.7 Sigma expression
- 3.3.3 Function calls
- 3.5 Patterns
- 3.5.1 Constructor pattern
- 3.5.2 Variable pattern
- 3.5.3 Absurd pattern
- 3.5.4 List pattern
- 3.5.5 Pair pattern
- 3.5.6 Named patterns
- 3.5.7 Literal Patterns
- 3.5.8 Rest patterns
- 3.5.9 Wildcards
- 3.1. Module
- Chapter 4. Types
- 4.1. Built-in Types
- 4.2. Function Types
- 4.3. Inductive Types
- 4.4. Records
- 4.5. Type aliases
- 4.6. Error Type
- Chapter 5. Patterns
- 5.1. Refutability
- 5.2. Pattern Matching and Match statement
- Chapter 6. Entities and Resolution
- 6.1. Name resolution
- 6.2. Crates and Packages
- 6.3. Namespaces and Visibility
- Chapter 8. Compile time constants
- 8.1. LOC, FILE, COL, LINE, VERSION
In this chapter, we describe the lexical structure of the Kind language.
The notation used in the lexical grammar is the following:
Syntax | Description |
---|---|
pattern? | Optional |
pattern* | Zero or more repetitions |
pattern+ | One or more repetitions |
pat1 | pat2 | Choice |
pat1-pat2 | Pat1 excluding Pat2 |
terminal |
Terminal |
num..num2 | Char range from num to num2 |
In the next section, we will ignore all of the whitespaces but we will handle semicolons and linebreaks explicitly due to how Kind expressions work in practice. All of the rules are expressed in a BNF-like language with rules in the form:
non_terminal -> alt1
| alt2
| ...
The program text of a Kind program is written using the Unicode character set. A whitespace string matches the following rule:
hor_tab -> `0x09` ; Horizontal Tabulation
ver_tab -> `0x0B` ; Vertical Tabulation
new_line -> `0x0A` ; New line
form_feed -> `0x0A` ; Form Feed
car_ret -> `0x0D` ; Cariage Return
space -> `0x20` ; Space
next_line -> `0x85` ; Next Line
white_space -> ( hor_tab
| ver_tab
| new_line
| form_feed
| car_ret
| space
| next_line
)+
An ascii character matches the following rule.
char -> 0x00..0x7F
We have a distiction between lower and upper cased identifiers in the language. Upper cased identifiers are used for globally available functions and constructors and lower cased identifiers for variables.
The lexical analysis process should use the "maximum munch", the principle that the longest chain of characters should be consumed to determine a single token.
lower -> 'a'-'z'
lower_id -> (lower | '_') id_tail
upper -> 'A'-'Z'
upper_id -> upper id_tail
id_tail -> (lower | upper | '_' | digit)
any_id -> lower_id | upper_id
All of the numbers can be suffixed with some specific type because Kind does not have something like a Num
type class so it's discovered sintatically.
digit -> '0'..'9'
hex_digit -> '0'..'9' | 'A' .. 'F' | 'a' .. 'f'
bin_digit -> '0'..'1'
oct_digit -> '0'..'7'
number -> `0` ('x' | 'X') hex_digit+
| `0` ('b' | 'B') bin_digit+
| `0` ('o' | 'O') oct_digit+
| digit+ `u60`?
number_u60 -> number `u60`?
number_u120 -> number `u120`
number_nat -> number `n`
number_f64 -> '-' dec_float
dec_float -> decimal+ ('.' decimal+ exp? | exp)
exp_float -> ('e' | 'E') ('+' | '-')? digit+
escaped_end -> `n` | `r` | `t` | `0` | `\` | `'` | `"`
| (`x` | 'X') hex_digit hex_digit
| (`u` | `U`) hex_digit hex_digit hex_digit hex_digit
escaped_char -> `\` escaped_end
| char
string_lit -> `"` escaped_char* `"`
char_lit -> `'` escaped_char `'`
All of these operators are useful to make binary operations between elements of type U60
.
operator -> `+` | `-` | `*` | `/` | `%` | `&`
| `|` | `^` | `>>` | `<<` | `>` | `<`
| `<=` | `>=` | `!` | `!=`
Keywords and the help
construction.
help -> `?` lower_id
keywords -> `return` | `ask` | `with`
soft_keywords -> `let` | `do` | `if` | `else` | `match`
| `type` | `record` | `constructor` | `as`
| `use`
Line and block comments.
block_part -> (char - (`/*` | `*/`))*
block_comment -> `/*` (block_comment | block_part) `*/`
comment -> `//` (char - (`\n` | EOF))
Separators.
ponctuation -> `.` | `,` | `<` | `>` | `@` | `~` | `#`
delimiter -> `(` | `)` | `[` | `]` | `{` | `}`
semi -> (';' | '\n')+
_ -> semi?
This chapter describes all of the available syntax in the Kind language and it's static and dynamic semantics.
A Kind program is defined as a set of modules that are ways to reuse declarations and to control the namespace.
A module is a container for zero or more declarations, it has a name that delimites which are the names that can be declared inside that module and how other modules can find that one.
The names inside of a module are limited because it enforces the transparency of dependency when trying to find some module by the name and avoid weird injections.
The bash-shebang is always stripped from the top of the file if present.
A set of namespace aliases can be present at the top of a module and they are useful for local synonyms.
alias_decl <- `use` _ qualified_id `as` _ qualified_id