mcpyrate

Advanced macro expander and language lab for Python. The focus is on correctness, feature-completeness for serious macro-enabled work, and simplicity, in that order.

We aim at developer-friendliness. mcpyrate yields correct coverage for macro-enabled code, reports errors as early as possible, and makes it easy to display the steps of any macro expansion - with syntax highlighting, use site filename, and source line numbers:

Figure 1. mcpyrate stepping through letseq from the demos.

mcpyrate builds on mcpy, with a similar explicit and compact approach, but with a lot of new features. Some of our features are strongly inspired by macropy, such as quasiquotes, macro arguments, and expansion tracing. Features original to mcpyrate include a universal bootstrapper, integrated REPL system (including an IPython extension) and support for chainable whole-module source and AST transformers, developed from the earlier prototypes imacropy and pydialect; plus multi-phase compilation (a.k.a. staging; inspired by Racket), and identifier macros.

We use semantic versioning. mcpyrate is almost-but-not-quite compatible with mcpy 2.0.0, hence the initial release is 3.0.0. There are some differences in the named parameters the expander provides to the macro functions; for details, search the main user manual for differences to mcpy.

License: MIT. PRs welcome.

Some hypertext features of this README, such as local links to detailed documentation, are not supported when viewed on PyPI; view on GitHub to have those work properly.

News

26 September, 2024

The mcpyrate project is still alive, but it already does what I need it to do, and has not required maintenance in the past two years. Just now, I finally added Python 3.11 and 3.12 support.

If you would like to help out with anything in this project, start here. Small contributions matter!

First example

mcpyrate gives you macro-enabled Python with just two source files:

# mymacros.py with your macro definitions
def echo(expr, **kw):
    print('Echo')
    return expr

# application.py
from mymacros import macros, echo
echo[6 * 7]

Or even with just one source file:

# application.py
from mcpyrate.multiphase import macros, phase

with phase[1]:
    def echo(expr, **kw):
        print('Echo')
        return expr

from __self__ import macros, echo
echo[6 * 7]

To run either example, use macropython -m application or macropython application.py.
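For intuition about what happens to echo[6 * 7], here is a rough stdlib-only sketch of expanding an expression macro invocation. It is not mcpyrate's implementation: the class ToyExpander and its dict of bindings are made up for illustration, and it assumes Python 3.9+ (where a subscript's slice is the expression itself, and ast.unparse exists).

```python
import ast

def echo(expr, **kw):
    print('Echo')
    return expr

class ToyExpander(ast.NodeTransformer):
    """Hypothetical mini-expander: rewrites `name[expr]` for known macro names."""
    def __init__(self, bindings):
        self.bindings = bindings  # macro name -> macro function

    def visit_Subscript(self, node):
        if isinstance(node.value, ast.Name) and node.value.id in self.bindings:
            macro = self.bindings[node.value.id]
            expansion = macro(node.slice)  # the macro runs now, on the AST
            return self.visit(expansion)   # keep expanding the macro's output
        return self.generic_visit(node)

source = "result = echo[6 * 7]"
tree = ToyExpander({"echo": echo}).visit(ast.parse(source))
ast.fix_missing_locations(tree)

expanded = ast.unparse(tree)
namespace = {}
exec(compile(tree, "<toy>", "exec"), namespace)

print(expanded)             # the code that actually gets compiled
print(namespace["result"])  # the value computed at run time
```

Note that 'Echo' is printed while the tree is being transformed, before the expanded code runs; the macro function operates on code, not on values.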

More examples can be found in the demo/ subfolder. To run the demos after installing mcpyrate, go to the mcpyrate project directory, and invoke them like macropython demo/anaphoric_if.py.

Running the extra examples in the tests

The tests contain even more usage examples, including advanced ones. See the mcpyrate/test/ subfolder.

Tests must be run using the mcpyrate in the source tree (instead of any installed one), because they expect to live in the module mcpyrate.test, but the test subfolder is not part of the installation. Thus, if the mcpyrate top-level module name resolves to an installed copy, there won't be a module named mcpyrate.test.

To run with the mcpyrate in the source tree, replace macropython with python3 -m mcpyrate.repl.macropython. For example, to run a demo, python3 -m mcpyrate.repl.macropython demo/anaphoric_if.py, or to run a test, python3 -m mcpyrate.repl.macropython -m mcpyrate.test.test_compiler. Here the first -m goes to python3, whereas the second one goes to macropython.

To run all tests, use python3 runtests.py.

Features

  • Agile development tools.

    • Multi-phase compilation: Use macros also in the same module where they are defined.
    • Universal bootstrapper: macropython. Import and use macros in your main program.
    • Interactive console: macropython -i. Import, define and use macros in a console session.
      • Embeddable à la code.InteractiveConsole. See mcpyrate.repl.console.MacroConsole.
    • IPython extension mcpyrate.repl.iconsole. Import, define and use macros in an IPython session.
    • See full documentation of the REPL system.
  • Run-time compiler access.

    • Expand, compile and run macro-enabled code snippets on the fly.
    • Accepts source code and AST inputs. (Use quasiquotes to conveniently create ASTs.)
    • Dynamically created code snippets support all the same features as importing code from a source file on disk.
    • See full documentation of the compiler. Examples can be found in mcpyrate/test/test_compiler.py.
  • Testing and debugging.

    • Statement coverage is correctly reported by tools such as Coverage.py.
    • Macro expansion errors are reported at macro expansion time, with use site traceback.
    • Debug output with a step-by-step expansion breakdown. See macro mcpyrate.debug.step_expansion.
      • Has both expr and block modes. Use step_expansion[...] or with step_expansion as appropriate.
      • The output is syntax-highlighted, and line-numbered based on lineno fields from the AST.
        • Also names of macros currently bound in the expander are highlighted by step_expansion.
        • Line numbers are taken from statement AST nodes.
      • The invisible nodes ast.Module and ast.Expr are shown, since especially ast.Expr is a common trap for the unwary.
      • To step the expansion of a run-time AST value, see the macro mcpyrate.metatools.stepr. Documentation.
    • Manual expand-once. See expander.visit_once; get the expander as a named argument of your macro. See also the expand1s and expand1r macros in mcpyrate.metatools.
  • Lightning speed.

    • Bytecode caches (.pyc) are created and kept up-to-date. Saves macro expansion cost at startup for unchanged modules. Makes mcpyrate fast on average.

      Besides a .py source file itself, we look at any macro definition files it imports macros from, recursively, in a make-like fashion.

      The mtime is the latest of those of the source file and its macro-dependencies, considered recursively, so that if any macro definition anywhere in the macro-dependency tree of a source file is changed, Python will treat that source file as "changed", thus re-expanding and recompiling it (hence, updating the corresponding .pyc).

    • CAUTION: PEP 552 - Deterministic pycs is not supported; we support only the default mtime invalidation mode, at least for now.

  • Quasiquotes, with advanced features.

    • Hygienically interpolate both regular values and macro names.
    • Delayed macro expansion inside quasiquoted code. User-controllable.
    • Inverse quasiquote operator. See function mcpyrate.quotes.unastify.
      • Convert a quasiquoted AST back into a direct AST, typically for further processing before re-quoting it.
        • Not an unquote; we have those too, but the purpose of unquotes is to interpolate values into quoted code. The inverse quasiquote, instead, undoes the quasiquote operation itself, after any unquotes have already been applied.
    • See full documentation of the quasiquote system.
  • Macro arguments.

    • Opt-in. Declare by using the @parametricmacro decorator on your macro function.
    • Use brackets to invoke, e.g. macroname[arg0, ...][expr]. If no args, just leave that part out, e.g. macroname[expr].
    • The macroname[arg0, ...] syntax works in expr, block and decorator macro invocations in place of a bare macroname.
    • The named parameter args is a raw list of the macro argument ASTs. Empty if no args were sent, or if the macro function is not parametric.
  • Identifier (a.k.a. name) macros.

    • Opt-in. Declare by using the @namemacro decorator on your macro function.
    • Can be used for creating magic variables that may only appear inside specific macro invocations.
  • Dialects, i.e. whole-module source and AST transforms.

    • Think Racket's #lang, but for Python.
    • Define languages that use Python's surface syntax, but change the semantics; or plug in a per-module transpiler that (at import time) compiles source code from some other programming language into macro-enabled Python. Also an AST optimizer could be defined as a dialect. Dialects can be chained.
    • Sky's the limit, really. See the dialects modules in unpythonic for example dialects.
    • For debugging, from mcpyrate.debug import dialects, StepExpansion.
    • If writing a full-module AST transformer that splices the whole module into a template, see mcpyrate.splicing.splice_dialect.
    • See full documentation of the dialect system.
  • Conveniences.

    • Relative macro-imports (for code in packages), e.g. from .other import macros, kittify.
    • The expander automatically fixes missing ctx attributes in the AST, so you don't need to care about those in your macros.
    • In most cases, the expander also fills in correct source location information automatically (for coverage reporting). If you're discarding nodes from the input, then you may have to be slightly careful and use ast.copy_location appropriately.
    • Several block macros can be invoked in the same with (equivalent to nesting them, with leftmost outermost).
    • AST visitor and transformer à la macropy's Walker, to easily context-manage state for subtrees, and collect items across the whole walk. Full documentation.
    • AST markers (pseudo-nodes) for communication in a set of co-operating macros (and with the expander).
    • gensym to create a fresh, unused lexical identifier.
    • unparse to convert an AST to the corresponding source code, optionally with syntax highlighting (for terminal output).
    • dump to look at an AST representation directly, with (mostly) PEP8-compliant indentation, optionally with syntax highlighting (node types, field names, bare values).
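As a small illustration of the source location point in the conveniences above, this stdlib-only sketch discards a node and copies its location onto the replacement with ast.copy_location. The transformer InlineAnswer is hypothetical and nothing here uses mcpyrate; it only shows what being "slightly careful" can look like in a plain AST transformer.

```python
import ast

class InlineAnswer(ast.NodeTransformer):
    """Hypothetical transformer: replaces the name `answer` with the constant 42."""
    def visit_Name(self, node):
        if node.id == "answer" and isinstance(node.ctx, ast.Load):
            replacement = ast.Constant(value=42)
            # The original node is discarded, so carry its location over;
            # a freshly created node has no lineno/col_offset of its own.
            return ast.copy_location(replacement, node)
        return node

tree = ast.parse("x = 1\ny = answer + x")
tree = InlineAnswer().visit(tree)

new_node = tree.body[1].value.left  # the constant that replaced `answer`
print(new_node.lineno, new_node.col_offset)

namespace = {}
exec(compile(tree, "<sketch>", "exec"), namespace)
print(namespace["y"])
```

The replacement reports the line and column where `answer` used to be, so tracebacks and coverage tools still point at the use site.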

Documentation

The full documentation of mcpyrate lives in the doc/ subfolder.

We aim at complete documentation. If you find something is missing, please file an issue. (And if you already figured out the thing that was missing from the docs, a documentation PR is also welcome!)

Differences to other macro expanders for Python

mcpyrate is not drop-in compatible with macropy or mcpy. This section summarizes the differences.

It should be emphasized that what follows is a minimal list of differences. In addition, we provide many advanced features not available in previous macro expanders for Python, such as dialects (full-module transforms), multi-phase compilation, and run-time compiler access (with dynamic module creation).

Differences to macropy

  • mcpyrate has no macro registry; each macro function is its own dispatcher. See the syntax named parameter.

  • In mcpyrate, macros expand outside-in (and only outside-in) by default.

    • No yield. To expand inside-out, recurse explicitly in your macro implementation at the point where you want to switch to inside-out processing. See the main user manual.
  • Named parameters filled by expander (**kw in the macro definition) are almost totally different.

    • No gen_sym parameter; use the function mcpyrate.gensym. We use UUIDs, avoiding the need for a lexical scan.
  • Macro arguments

    • Passed using brackets.
  • Macro expansion error reporting

    • Raise an exception as usual, don't assert False.
  • Quasiquotes

    • Quasiquoted code isn't automatically macro-expanded.
    • Hygienic quasiquoting works differently.
      • No separate hq[]. Instead, we provide h[], which is a hygienic unquote that captures a value or a macro.
      • You can capture arbitrary expressions, not just identifiers.
      • You can capture macros, too.
    • The equivalents of ast_literal and name have additional features, and an operator to interpolate a list of ASTs as an ast.Tuple has been added (one use case is to splice in a variable number of macro arguments to a quasiquoted macro invocation).
  • AST walkers

    • Similar in spirit to ast.NodeTransformer and ast.NodeVisitor, with functionality equivalent to that of macropy.core.walkers.Walker, but working differently.
    • Particularly, use explicit recursion as in ast.NodeTransformer; there is no stop.
    • For the ctx mechanism, see withstate and generic_withstate (the latter relates to the former as generic_visit relates to visit in ast.NodeTransformer).
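The explicit-recursion style mentioned above, both for the walkers and for switching a macro to inside-out processing, is the same one ast.NodeTransformer uses: the visit method itself decides when (or whether) to recurse into the children. This stdlib-only sketch records the visiting order both ways. The class names are made up for illustration, and mcpyrate's own walker API is not shown here.

```python
import ast

class OutsideIn(ast.NodeTransformer):
    def __init__(self):
        self.order = []

    def visit_Call(self, node):
        self.order.append(node.func.id)  # handle this node first...
        return self.generic_visit(node)  # ...then recurse explicitly

class InsideOut(ast.NodeTransformer):
    def __init__(self):
        self.order = []

    def visit_Call(self, node):
        node = self.generic_visit(node)  # recurse first...
        self.order.append(node.func.id)  # ...then handle this node
        return node

source = "f(g(h(1)))"

outside_in = OutsideIn()
outside_in.visit(ast.parse(source))

inside_out = InsideOut()
inside_out.visit(ast.parse(source))

print(outside_in.order)
print(inside_out.order)
```

Leaving out the generic_visit call altogether is how a visit method declines to descend into a subtree, which is why no separate stop mechanism is needed in this style.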

Differences to mcpy

  • Named parameters filled by expander
    • No to_source; use the function mcpyrate.unparse. You might want unparse(tree, debug=True, color=True).
    • No expand_macros; ask for expander instead, and call expander.visit.

Install & uninstall

From PyPI

pip install mcpyrate

From source

Clone the repo from GitHub. Then, navigate to it in a terminal, and:

pip install . --no-compile

The --no-compile flag is important. It prevents an incorrect precompilation of the mcpyrate modules, without macro support, that pip install would otherwise do at its bdist_wheel step.

For most Python projects such precompilation is just fine - it's just macro-enabled projects that shouldn't be precompiled with standard tools.

If --no-compile is NOT used, the precompiled bytecode cache may cause errors such as ImportError: cannot import name 'macros' from 'mcpyrate.quotes', when you try to import macros from some other macro-enabled package that uses macros from mcpyrate (e.g. from unpythonic.syntax import macros, let). In-tree, it might work, but against an installed copy, it will fail. It has happened that my CI setup did not detect this kind of failure.

This is a common issue when using macro expanders in Python. See the Packaging section in troubleshooting.

Uninstall

pip uninstall mcpyrate

Understanding the implementation

We follow the mcpy philosophy that macro expanders aren't rocket science. See CONTRIBUTING.md.

Emacs syntax highlighting

This Elisp snippet adds syntax highlighting for keywords specific to mcpyrate to your Emacs setup:

  (defun my/mcpyrate-syntax-highlight-setup ()
    "Set up additional syntax highlighting for `mcpyrate` in Python mode."
    ;; adapted from code in dash.el
    (let ((more-keywords '("macros" "dialects"
                           "q" "u" "n" "a" "s" "t" "h"))
          ;; How to make Emacs recognize your magic variables. Only for the anaphoric if demo.
          ;; A list, like `more-keywords`, even though in the example there is only one item.
          (magic-variables '("it")))
      (font-lock-add-keywords 'python-mode `((,(concat "\\_<" (regexp-opt magic-variables 'paren) "\\_>")
                                              1 font-lock-variable-name-face)) 'append)
      (font-lock-add-keywords 'python-mode `((,(concat "\\_<" (regexp-opt more-keywords 'paren) "\\_>")
                                              1 font-lock-keyword-face)) 'append)
  ))
  (add-hook 'python-mode-hook 'my/mcpyrate-syntax-highlight-setup)

Known issue: For some reason, during a given session, this takes effect only starting with the second Python file opened. The first Python file opened during a session shows with the default Python syntax highlighting. Probably something to do with the initialization order of font-lock and whichever python-mode is being used.

Tested with anaconda-mode.

Install (for beginners in Emacs customization)

If you use the Spacemacs kit, the snippet can be inserted into the function dotspacemacs/user-config. (If you use the Emacs key bindings, M-m f e d to open your config file.) Here's my spacemacs.d for reference; the syntax highlight code is in prettify-symbols-config.el, and it's invoked from the function dotspacemacs/user-config in init.el.

In a basic Emacs setup, the snippet goes into the ~/.emacs startup file, or if you have an .emacs.d/ directory, then into ~/.emacs.d/init.el.

Why macros?

Despite their fearsome reputation, syntactic macros are a clean solution to certain classes of problems. Main use cases of macros fall into a few (not necessarily completely orthogonal) categories:

  1. Syntactic abstraction, to extract a pattern that cannot be extracted as a regular run-time function. Regular function definitions are a tool for extracting certain kinds of patterns; macros are another such tool. Both these tools aim at eliminating boilerplate, by allowing the definition of reusable abstractions.

    Macros can replace design patterns, especially patterns that work around a language's limitations. See Norvig's classic presentation on design patterns. For a concrete example, see Seibel.

  2. Source code access. Any operation that needs to get a copy of the source code of an expression (or of a code block) as well as run that same code is a prime candidate for a macro. This is useful for implementing tooling for e.g. debug-logging and testing.

  3. Evaluation order manipulation. By editing code, macros can change the order in which it gets evaluated, as well as decide whether a particular expression or statement runs at all.

    As an example, macros allow properly abstracting delay/force in a strict language. force is just a regular function, but delay needs to be a macro. See our delayed evaluation demo.

  4. Language-level features inspired by other programming languages. For example, unpythonic provides expression-local variables (let), automatic tail call optimization (TCO), autocurry, lazy functions, and multi-shot continuations.

    As the Racket guide notes, this is especially convenient for language-level features not approved by some other language designer. Macros allow users to extend the language. Dialects take that idea one step further.

  5. Compile-time validation. See e.g. Alexis King (2016): Simple, safe multimethods in Racket. Our sister project unpythonic also uses macros to perform some simple checks for whether certain helper constructs appear in a valid position, and when not, errors out at compile time.

  6. Embedded domain-specific languages (eDSLs).

    Here embedded means the DSL seamlessly integrates into the surrounding programming language (the host language). With embedded DSLs, there is no need to implement a whole new parser for the DSL, and many operations can be borrowed from the host language. Infix arithmetic notation and regular expressions are common examples of eDSLs that come embedded in many programming languages.

    (Note that a general-purpose programming language does not strictly need to provide infix arithmetic; many Lisps do not. Of course, a form of infix arithmetic can be added as a macro; here's a very compact Racket solution (search the page for "more operators").)

    The embedded approach significantly decreases the effort needed to implement a DSL, thus making small DSLs an attractive solution for a class of design problems. A language construction kit can be much more useful than how it may sound at first.

  7. Mobile code, as pioneered by macropy. Shuttle code between domains, while still allowing it to be written together in a single code base.
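To make use case 2 (source code access) concrete, here is a stdlib-only sketch. A regular function called as log(x * 7) would only ever receive the value 42; a transformation that sees the AST can keep the expression's source text and still run the expression. The log[...] syntax and the class ExpandLog are hypothetical, this assumes Python 3.9+ for ast.unparse, and it is not how mcpyrate or unpythonic implement their tooling.

```python
import ast

class ExpandLog(ast.NodeTransformer):
    """Hypothetical toy expander: rewrites `log[expr]` into `('<source of expr>', expr)`."""
    def visit_Subscript(self, node):
        self.generic_visit(node)  # expand any nested uses first
        if isinstance(node.value, ast.Name) and node.value.id == "log":
            text = ast.Constant(value=ast.unparse(node.slice))  # the source, as a string
            pair = ast.Tuple(elts=[text, node.slice], ctx=ast.Load())
            return ast.copy_location(pair, node)
        return node

tree = ExpandLog().visit(ast.parse("x = 6\nresult = log[x * 7]"))
ast.fix_missing_locations(tree)

namespace = {}
exec(compile(tree, "<toy>", "exec"), namespace)
print(namespace["result"])
```

The result pairs the text of the expression with its run-time value, which is the raw material for debug-logging and test-reporting tools.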

That said, macros are the 'nuclear option' of software development. Often a good strategy is to implement as much as reasonably possible as regular functions, and then add a small macro on top for the parts that would otherwise be impossible (or overly verbose, overly complex, or just overly hacky). Our delayed evaluation demo is a small example of this strategy.
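A sketch of that strategy for delay/force, using only regular Python: the Promise class and the force function carry all the logic as ordinary run-time code, and the only thing left for a macro would be wrapping the expression in a lambda so that it is not evaluated at the use site. The names here are illustrative assumptions, not the code of the demo.

```python
class Promise:
    """A delayed computation, evaluated at most once."""
    def __init__(self, thunk):
        self.thunk = thunk
        self.done = False
        self.value = None

def force(x):
    """Regular function: compute a promise's value on first use, then cache it."""
    if not isinstance(x, Promise):
        return x  # non-promises pass through unchanged
    if not x.done:
        x.value = x.thunk()
        x.done = True
        x.thunk = None  # release the closure once it has been used
    return x.value

calls = []

def expensive():
    calls.append("ran")
    return 6 * 7

# Without a macro, the lambda must be written by hand at every use site;
# a hypothetical `delay[expensive()]` macro would generate this wrapper.
p = Promise(lambda: expensive())
calls_before_force = list(calls)

first = force(p)
second = force(p)
print(calls_before_force, first, second, calls)
```

Nothing runs when the promise is created, and the computation runs only once however many times it is forced.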

More extensive examples are the macro-enabled test framework unpythonic.test.fixtures, and the let constructs in unpythonic.syntax (though in that case the macros are rather complex, to integrate with Python's lexical scoping). If curious about the "overly hacky" remark, compare the implementations of unpythonic.amb and unpythonic.syntax.forall - the macro version is much cleaner.

For examples of borrowing language features, look at Graham, Python's with in Clojure, unpythonic.syntax, and these creations from the Racket community [1] [2] [3]. But observe also that macros are not always needed for this: pattern matching, resumable exceptions, multiple dispatch [1] [2].