Parser design

The CodeChat Editor uses the Pest parser, a Rust implementation of a parsing expression grammar (or PEG). The purpose of the parser from a CodeChat Editor perspective is to classify a source file into code blocks and doc blocks. To accomplish this goal, grammar files (.pest) are divided into:

A shared grammar (shared.pest), which contains basic definitions applicable to all languages;
A language-specific grammar, which builds on these shared definitions by providing necessary language-specific customizations.

In particular, a language-specific grammar must provide:

The definition of a doc_block; for most languages, doc_block = _{ inline_comment | block_comment }. However, languages which lack an inline comment (such as CSS) or a block comment (such as Python) would contain only the appropriate comment type.
Inline comment definitions:
- Opening inline delimiter(s) supported by the language. Three inline comment delimiters must be defined for a language. For C, this is:
```
inline_comment_delims  = _{ inline_comment_delim_0 }
inline_comment_delim_0 =  { "//" }
inline_comment_delim_1 =  { unused }
inline_comment_delim_2 =  { unused }
```
- A token which defines characters in the body of on an inline comment. For Python, this is:
```
inline_comment_char = { not_newline }
```

Block comment definitions: provide opening and closing delimiter definitions. For C, this is:

block_comment                 =  { block_comment_0 }
block_comment_opening_delim_0 =  { "/*" }
block_comment_opening_delim_1 =  { unused }
block_comment_opening_delim_2 =  { unused }
block_comment_closing_delim_0 =  { "*/" }
block_comment_closing_delim_1 =  { unused }
block_comment_closing_delim_2 =  { unused }

code_line_token, a token used to recognize tokens in a code line.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

parser_design.md

parser_design.md

Parser design

Files

parser_design.md

Latest commit

History

parser_design.md

File metadata and controls

Parser design