Implementation details

p.s. I think the current code is not organized in good order, any volunteer to reorder it? (CPP is declarative and order doesn't matter.. but it matters for human)

Overall idea

It is known that the ability of CPP is sort of weak, however, it does support FP to some extent, e.g. you can pass a macro name to another macro.

#define E(...) __VA_ARGS__ //Identity macro
#define N(...) //Zero macro
#define __test(something,k,...) k(do something)
#define _test(something,...) __test(something,E)
_test(testit) // => __test(testit,E) => E(do testit) => do testit
_test(E(N,N)) // => __test(N,N,E) => N(do N) =>
//The result is eaten!

This is just a handy trick to construct a zero point of our _test macro. [original]

However, even though it seems possible to implement CAR, CDR, ... using C macro, LAMBDA seems impossible to be implemented. Therefore, I come up with the idea to first implement a LISP interpreter that does not support LAMBDA, then write a LISP self-interpreter on it. See this:

//Line 176, csp.h
#define $zipped_eval(e,a) COND(\
        (ATOM e (EVAL_e $zipped_assoc(e,a)))    \
        (ATOM SAFE_CAR e \
         COND(($eq(SAFE_CAR e (quote))EVAL_e(SAFE_CAR SAFE_CDR e))  \
              ($eq(SAFE_CAR e (atom)) (EVAL_e ATOM DELAY_INT_23($zipped_eval_R)() (SAFE_CAR SAFE_CDR e,a))) \
              ($eq(SAFE_CAR e (eq)) (EVAL_e $eq( DELAY_INT_25($zipped_eval_R)() (SAFE_CAR SAFE_CDR e,a) DELAY_INT_25($zipped_eval_R)() (SAFE_CAR SAFE_CDR SAFE_CDR e,a)))) \
              ($eq(SAFE_CAR e (car)) (EVAL_e SAFE_CAR DELAY_INT_23($zipped_eval_R)() (SAFE_CAR SAFE_CDR e,a))) \
              ($eq(SAFE_CAR e (cdr)) (EVAL_e SAFE_CDR DELAY_INT_23($zipped_eval_R)() (SAFE_CAR SAFE_CDR e,a))) \
              ($eq(SAFE_CAR e (cons)) (EVAL_e CONS DELAY_INT_23($zipped_eval_R)() (SAFE_CAR SAFE_CDR e,a) DELAY_INT_23($zipped_eval_R)() (SAFE_CAR SAFE_CDR SAFE_CDR e,a))) \
              ((T)(EVAL_e DELAY_INT_23($zipped_eval_R)()(($zipped_assoc(SAFE_CAR e,a) EVAL_e SAFE_CDR e),a))) \
                 )                                                      \
                )                                                       \
        ($eq(SAFE_CAR SAFE_CAR e (label)) (EVAL_e DELAY_INT_23($zipped_eval_R)() (CONS SAFE_CAR SAFE_CDR SAFE_CDR SAFE_CAR e SAFE_CDR e,CONS (SAFE_CAR SAFE_CDR SAFE_CAR e SAFE_CAR e) a))) \
        ($eq(SAFE_CAR SAFE_CAR e (lambda))\
        (DELAY_INT_26(EVAL_e_R)() DELAY_INT_23($zipped_eval_R)()(\
        EVAL_e(EVAL_e(EVAL_e(EVAL_e(SAFE_CAR SAFE_CDR SAFE_CDR SAFE_CAR e)))),\
        EVAL_e(APPEND DELAY_INT_13($pair_R)()(EVAL_e(EVAL_e(EVAL_e(SAFE_CAR SAFE_CDR SAFE_CAR e)))\
                                              (DELAY_INT_19($zipped_evlis_R)()(EVAL_e(_e EVAL_e(SAFE_CDR e)), a)))a)))\
        )                                                               \
)

This is a common LISP self-interpreter, written in a sort of weird form that interpreter A (in fact, it is original CPP + some macro definitions) accepts.

The usage of interpreter A is quite clear in the example above:

SAFE_CAR xxxx // Original LISP: (car xxx)
SAFE_CDR xxxx // Original LISP: (cdr xxx)
.....
$pair(xxxx) // Original LISP: (pair xxx)
// Be aware of the $ and the bracket
$eq(xxxx) // Original LISP: (eq xxxx)
.....
$zipped_assoc(a,b) // Original LISP: (assoc a b)
....
//NOTE that the notation of COND is different from the idioms above, this primitive is special itself, too.
COND(a b) // Original LISP: (cond a b)

It is important that, it seems that on CPP there is no way to implement short-circuit condition and COND have to expand every branches first, then choose from it. This means that usually most of the branch expressions are completely illegal, so every primitives used in interpreter A should not cause error or expand to unmatched brackets if illegal input is given. They have to output some nonsense with matching brackets, not to disturb the macro expansion process, and then be eaten by the expansion of COND.

Basic list structure

Various ways came to my mind at first. The first option is:

(a,b,c,d) //This is a list

This seems natural because it is of the same form as C macro arg list and easy to be operated by macros, however there seems to be no way to iterate through a list without using deferred expression, so I've chosen this form:

( (a) (b) (c) (d) ) //This is a list, and (a) is an atom

Then something like this will be able to iterate through the list:

#define sth(x) dosth(x) sth_y
#define sth_y(x) dosth(x) sth
sth(a)(b)(c)(d) // => dosth(a) sth_y(b)(c)(d) ... => dosth(a) dosth(b) ... sth or sth_y

I use 2 identical macros to avoid the macros from being painted blue. In practice I acquire the zero point technology to eat the last sth or sth_y. [original]

Another techique is something like:

#define _sth(list) CAT(sth list,_end)
#define sth_end
#define sth_y_end
_sth((a)(b)(c)(d)) // => CAT( dosth(a) ... sth,_end) => dosth(a) ... sth_end => dosth(a)

This solution has some drawbacks when using ZIP macro to handle multi-arguments iterations. Both the two solutions are used in CSP.

One of my classmates suggested representing lists as nested tuples like (a,(b,(c...., this sounds interesting but kind of hard to read and write. You could do some experiments if you are interested in this idea.

Note

CAT : is a concatenating macro.

#define _CAT(x,y) x##y
#define CAT(x,y) _CAT(x,y)

It is used to paste two macros after they are expanded:

#define a() A
#define b() B
a() ## b() //Error. )b is not a valid CPP token
CAT(a(),b()) // => AB

ZIP : it enables manipulating the first 2 node of a (a)(b)(c)... list. [original]

#define _be(y) y)
#define ZIP(x)   _n() (x,_be //This _n() delays the expansion of do_sth macro, after _be is expanded. otherwise it will an unmatched bracket error. 
do_sth ZIP(a)(b) =>do_sth _n()(a,_be(b) => do_sth (a,b)

List operation example: CAR

The unsafe version of CAR is simple:[Original]

#define CAR(x) (_CAR x ))
#define _CAR(x) x _n(
CAR ( (a) (b) (c) ) //=>(_CAR (a) (b) (c) )) => (a _n( (b) (c) )) => (a)

Work fine, right? however if we provide illegal input:

CAR () //=> (_CAR ))
CAR (a) //=> (_CAR a))

The result has unmatched brackets, this will make it fail completely in COND expressions (recall that even false branches will be first evaluated then removed in Interpreter A COND).

A safe implementation looks like:

//Line 139 csp.h
#define SAFE_CAR_N(x) (())_n
#define SAFE_CAR_EAT_CAR )SAFE_CAR_N((
#define COND_EATY_SCEND (a)
#define COND_EAT_SCEND (a)
#define _SAFE_CAR(a)  _e(_n _n()(CAT(SAFE_CAR_EAT,_E(_e CAR(CAT(COND_EAT_FIRST a,_SCEND)))))(CAR(a)))
#define SAFE_CAR(a) _E(_e _SAFE_CAR(a))

Let's see how it keep bracket balance when given illegal input:

// Let's see this part first
#define _SAFE_CAR(a)  _e(_n _n()(CAT(SAFE_CAR_EAT,_E(_e CAR(CAT(COND_EAT_FIRST a,_SCEND)))))(CAR(a)))
//                                                        ^ fail. expand to (_CAR ))
//                                                 ^ Kill a bracket. => _CAR )
//                                 ^ Make it SAFE_CAR_EAT_CAR =======\
//                                v another fail.                    |
// => _e(_n _n() () SAFE_CAR_N(() (_CAR )) )                      <==/
// => _e( () )
// => () Victorious! This output is very reasonable too. ( a nil list)

The COND_EAT_FIRST serves to make complex legal input reduce to (a), too, to be handled by the following logic.

SAFE_CDR etc. stuff features similar idea.

Primitive example: ATOM

Here's the code: [Original]

#define _DESTROY_E(...) __VA_ARGS__
#define _DESTROY_N(...)
#define _DESTROY_BOTTOM _DESTROY_E(_DESTROY_N,_DESTROY_N)
#define _ODESTROY(car,k,...) k(_n((car)) ODESTROYY)
#define _ODESTROYY(car,k,...) k(_n((car)) ODESTROY)
#define ODESTROY(...) _ODESTROY( __VA_ARGS__,_DESTROY_E)
#define ODESTROYY(...) _ODESTROYY( __VA_ARGS__,_DESTROY_E)

#define _bsem )
#define _T(x) T
#define __ATOM(x) ODESTROY x (_DESTROY_BOTTOM)
#define RODESTROY T _n()(
#define KATOMT(...) (T)
#define KATOMR () _n _n()(
#define ATOM(x) _e(CAT(KATOM,_e(CAT(R,__ATOM(x)) _bsem)))

The ODESTROYxxx stuff serves to eat a whole list (a)(b).... It follows the zero point and 2-macro iteration idea. Let's see the detail of how ATOM works.

__ATOM(a) //=> ODESTROY a (_DESTROY_E(_DESTROY_N,_DESTROY_N)) 
__ATOM((a)) //=> ODESTROY (a) (_DESTROY_E(_DESTROY_N,_DESTROY_N)) =>  (nothing)
//                              ^ Remember? This is the zero point.
ATOM(a) //=>_e(CAT(KATOM,RODESTORY )))=>_e(CAT(KATOM,T _n( )))=>_e(KATOMT ) => (T)
ATOM((a)) //=>_e(CAT(KATOM,R ))) => _e(() _n ()) => ()

It is also safe. You can prove yourself that it works in other condition. If you find a bug, please open an issue. If you prove it right/wrong mathematically, please tell me before you become an immortal.

A little more complex case: AND

NOTE Currently this feature is not included in Interpreter B Eval Recursion yet. Any volunteer? Thanks.

[Original]

#define _AND(a) INTERNAL_EVAL((COND((NOT(a)(COND_EAT))((T)(_AND_Y)))))
#define _AND_Y(a) INTERNAL_EVAL((COND((NOT(a)(COND_EAT))((T)(_AND)))))

#define andtransCOND_EAT ()
#define andtransCOND_EATY ()
#define andtrans_AND_Y (T)
#define andtrans_AND (T)

#define $and(a) CAT(andtrans,_AND a)

This implementation uses COND. I'm not sure if it is safe. However we think we know the meaning of COND so analysing it become much easier.

$and((T)()(T)) // => CAT(andtrans, _AND(T)()(T))
// if _AND accept a T ( NOT(a) is false ) it return _AND_Y and continue the 2-macro iteration.
//=>CAT(andtrans, _AND()(T) )
// This time it returns COND_EAT, and begin a COND_EAT-COND_EATY 2-macro iteration destroying the list after it.
// Finally leaves a COND_EAT or COND_EATY (This time I didn't use zero point trick)
//=>CAT(andtrans,COND_EATY) => ()
// if all the atoms are (T) then a _AND or _AND_Y is left and finally expands to (T)

Provide feedback

Saved searches