Home
TheWhiteWord edited this page Dec 22, 2024 · 2 revisions
Okay, let's do a comprehensive recap of the READMEs Programming System project, focusing on the changes we've made, particularly in `Rdm_standards.md` and `Rmd_parser_rules.md`, and how they influence the rest of the project.
Overall Project Goal
The core goal remains the same: to create a self-contained programming system where:
- README.md Files are Executable: Using a custom Markdown dialect (MDScript) with structured elements for defining functions, templates, state, and program flow.
- LLM Integration is Core: Leveraging LLMs to process, understand, generate, and optimize both documentation and code.
- Git is First-Class: Using Git for version control, state management, and tracking changes in both code and documentation.
- Documentation and Code are Unified: Blurring the lines between documentation and executable code, creating a "living documentation" system.
- IDE Integration: Creating an interactive IDE where users can chat, modify code, and test the program at the same time.
Key Changes & Progress
- `Rdm_standards.md` as the Foundation:
  - Central Authority: We've established `Rdm_standards.md` as the central document that defines all the rules of the system. Because the document implements its own directives, the project is bootstrapped correctly and stays self-consistent and self-evolving.
  - LLM Processing Protocol: The `llm_protocol` section is now a configuration directive, which lets the parser follow the core principles of LLM parsing, such as context management, document processing, and attention optimization.
  - State Management: The basic state is now enforced by the `state management` section, which defines how state should work for the rest of the system.
  - Document Structure: The parser now recognizes all parts of the document structure (file headers, sections, etc.) as defined in the standards document.
  - Formatting Standards: The parsing rules recognize cross-references and code blocks as defined in `Rdm_standards.md`.
  - LLM Navigation Guide: The LLM interaction guidelines are now enforced by the parser and must be followed for any document.
  - Best Practices: The system recognizes best practices and will enforce them.
  - Implementation Notes: The `Implementation Notes` guide is now a mandatory part of the standard, and it links to other files as reference.
  - MDScript Syntax: We have formally defined the syntax for function definitions, variable declarations, template definitions, and conditional logic inside the documentation, and the parsing rules recognize it.
  - State-Aware Warmholes: We have formally defined the syntax for state-aware warmholes, and the parsing rules recognize it.
  - LLM-Powered Warmhole Management: We have formally defined the syntax for LLM-powered warmholes, and the parsing rules recognize it.
  - Code Blocks: All MDScript declarations must now be placed inside a code block, making the structure of the documentation more explicit.
  - Example Code Blocks: We have added usage examples of function definitions, variable declarations, template definitions, conditional logic, state-aware warmholes, and LLM-powered warmholes inside the standards file.
- `Rmd_parser_rules.md` - A Reusable Parser
  - Generic Regex Extractor: We created a JavaScript function, `extractByRegex`, to perform regex matching, making the parser more robust and easier to test.
  - Generic Section Parser: The `parse_section` function now takes any regex, making it extremely versatile. The heavy lifting is now done by a single JavaScript function.
  - Refactored Warmhole Parser: The `parse_warmhole` function now uses `parse_section` for efficiency and is simpler thanks to named capture groups.
  - Refactored Header Parser: The `parse_header` function now follows the rules defined in `Rdm_standards.md` and generates IDs correctly.
  - Refactored `cache_metadata`: It now uses `parse_header` together with a JavaScript helper function, making the logic more testable and robust.
  - Removed Duplication: Functions like `extract_section` and duplicated templates have been removed.
  - Lazy Parsing: We implemented a lazy parsing method, where sections are only loaded when required.
  - Generic Parser Functions: All the specific parser functions (code blocks, variable declarations, etc.) now use a generic regex parser with a regex defined for that type of content, reducing code duplication.
  - Template Logic Removed: All the complex parsing logic has been removed from the `transform` property of the templates, making them easier to read and maintain.
- Hybrid Approach to JavaScript Usage
  - MDScript for Core Logic: We've decided to use MDScript (and its templates) for core logic, data flow, and control flow.
  - JavaScript for Complex Tasks: We use JavaScript for regex matching, caching, and string processing.
  - Clear Separation of Concerns: JavaScript helpers are well-defined, testable, and called from the MDScript templates, without code duplication.
- Auto-Generated Section IDs:
  - Markdown-Based IDs: We've incorporated a system for generating IDs based on headers, enabling easy cross-referencing and warmhole linking within the documentation.
  - Hierarchical Structure: The auto-generated IDs respect the hierarchical structure of the documents, creating more explicit links between sections.
- `my_first_library.md`
  - We have created an example of how a library should be implemented and structured based on the standard. It includes custom functions with descriptions, parameters, and usage examples.
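
The generic extractor and section parser described above might look roughly like the sketch below. The actual signatures of `extractByRegex` and `parse_section` are not shown on this page, so everything here is an assumption: the `PATTERNS` table, the `parseSection` name (a JavaScript stand-in for the MDScript `parse_section`), and the shape of the returned objects are all hypothetical. The two patterns use plain Markdown syntax (headings and links), not MDScript syntax.

```javascript
// Hypothetical sketch: the real extractByRegex signature is not documented here.
// Returns one plain object per match, built from the regex's named capture groups.
function extractByRegex(text, pattern) {
  // String.prototype.matchAll requires the global flag, so add it if missing.
  const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
  const regex = new RegExp(pattern.source, flags);
  return Array.from(text.matchAll(regex), (match) => ({
    raw: match[0],            // the full matched text
    index: match.index,       // offset of the match in the input
    ...(match.groups ?? {}),  // named capture groups, if the pattern has any
  }));
}

// Assumed pattern table: one regex per content type, so adding a new
// content type means adding one entry rather than a new parser function.
const PATTERNS = {
  heading: /^(?<hashes>#{1,6})[ \t]+(?<title>.+)$/m,
  link: /\[(?<label>[^\]]+)\]\((?<target>[^)\s]+)\)/,
};

// Generic section parser: looks up the pattern and delegates to extractByRegex.
function parseSection(text, kind) {
  const pattern = PATTERNS[kind];
  if (!pattern) {
    throw new Error(`No pattern registered for "${kind}"`);
  }
  return extractByRegex(text, pattern);
}

const sample = "# Title\nSee [the standards](Rdm_standards.md#state).\n## Usage\n";
// Logs the two heading titles found in the sample: Title, Usage
console.log(parseSection(sample, "heading").map((h) => h.title));
```

Keeping the regexes in one table is only one possible design, but it matches the stated goal of reducing duplication: each specific parser becomes a pattern entry, and the matching logic lives in a single function that can be unit-tested on its own.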
Key Outcomes
- Clear Structure: The system now has a clear structure with distinct roles for each component (standards, parser, core logic, etc.)
- Self-Consistency: The `Rdm_standards.md` file defines how the whole system must work and is parsed by the same rules, making the system self-consistent.
- Maintainability: The parser is now more modular and maintainable thanks to the use of JavaScript helpers and generic functions.
- Flexibility: The parser can now be extended to parse more content using dynamic regex.
- Scalability: The system's structure and modular design will make it easier to add new features and functions in the future.
- Testability: Most of the parsing logic has been delegated to JavaScript, which is easier to test.
- Living Documentation: The new structure allows us to create a real "living documentation" system, since the standards themselves are being enforced and parsed using the system itself.
Next Steps
- Integrate JavaScript Logic: Integrate the JavaScript helper functions (`extractByRegex`, `cacheMetadata`, and `parseHeader`) into your core system, and call them from the templates using your `execute` function.
- Test the Core Parser: Integrate the parser created in `Rmd_parser_rules.md` with your `system_init` function, and make sure that all files can be parsed correctly and follow `Rdm_standards.md`.
- Implement Core Execution: Implement the core `system_execute` function and all the logic related to it.
- Build the IDE Extension: Start building the basic IDE extension functionality for chat and code modification, so you can start testing the system in an interactive environment.
- Implement the "Living Documentation": Make sure the system can update the documentation automatically as the state changes.
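
As a starting point for the header-parsing helper mentioned in the steps above, here is a minimal sketch of hierarchical ID generation from Markdown headings. The ID rules in `Rdm_standards.md` are not reproduced on this page, so the slug format, the `/` separator, and the names `slugify` and `parseHeadings` are all assumptions made for illustration, not the project's actual `parseHeader`.

```javascript
// Assumed slug rule: lowercase, non-alphanumeric runs become single hyphens.
function slugify(title) {
  return title
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Hypothetical helper: returns every heading with an ID that encodes its ancestors.
// Limitations of this sketch: it does not skip headings inside fenced code blocks
// and does not de-duplicate repeated titles.
function parseHeadings(markdown) {
  const stack = []; // stack[level - 1] holds the slug of the open heading at that level
  const headings = [];
  for (const line of markdown.split("\n")) {
    const match = /^(#{1,6})[ \t]+(.+?)[ \t]*$/.exec(line);
    if (!match) continue;
    const level = match[1].length;
    const title = match[2];
    stack.length = level - 1;            // close headings at the same or deeper level
    stack[level - 1] = slugify(title);
    const id = stack.filter(Boolean).join("/"); // skipped levels are simply omitted
    headings.push({ level, title, id });
  }
  return headings;
}

const doc = "# Parser Rules\n## Header Parsing\n### ID Generation\n## Lazy Parsing\n";
for (const heading of parseHeadings(doc)) {
  console.log(heading.id);
}
```

With IDs built this way, a cross-reference or warmhole could target a section by its full path, and a metadata cache could store the heading list per file so that section bodies are only read when requested.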
LLM Reasoning
From an LLM perspective, this is a very successful refactoring:
- Efficiency: The new parser is more efficient due to the use of JavaScript and reduced duplication.
- Clarity: The separation of concerns makes the code much easier to understand and maintain.
- Robustness: The new parsing functions are more robust because regex matching is centralized in a single, testable helper.
- Scalability: The new architecture is more scalable and ready for new features.
- Testability: The JS helpers and the core functionality are now more testable.
- Self-Consistency: The system is now enforcing its own standards, which allows a self-referential and self-evolving system.