TheWhiteWord edited this page Dec 22, 2024 · 2 revisions

END DAY LOG


Okay, let's do a comprehensive recap of the READMEs Programming System project, focusing on the changes we've made, particularly in Rdm_standards.md and Rmd_parser_rules.md, and on how they influence the rest of the project.

Overall Project Goal

The core goal remains the same: to create a self-contained programming system where:

  • README.md Files are Executable: Using a custom Markdown dialect (MDScript) with structured elements for defining functions, templates, state, and program flow.
  • LLM Integration is Core: Leveraging LLMs to process, understand, generate, and optimize both documentation and code.
  • Git is First-Class: Using Git for version control, state management, and tracking changes in both code and documentation.
  • Documentation and Code are Unified: Blurring the lines between documentation and executable code, creating a "living documentation" system.
  • IDE Integration: Creating an interactive IDE where users can chat, modify code, and test the program at the same time.

Key Changes & Progress

  1. Rdm_standards.md as the Foundation

    • Central Authority: We've established Rdm_standards.md as the central document that defines all the rules of the system. By implementing its directives within itself, we're ensuring that the project is bootstrapped correctly, is self-consistent and self-evolving.
    • LLM Processing Protocol: The llm_protocol section is now a configuration directive, which enables the parser to follow the core principles of LLM parsing, such as context management, document processing, and attention optimization.
    • State Management: Basic state is now enforced by the state management section, which defines how state should work for the rest of the system.
    • Document Structure: The parser now recognizes all parts of the document structure (file headers, sections, and so on) as defined in the standards document.
    • Formatting Standards: The parsing rules recognize cross references and code blocks as defined in the Rdm_standards.md.
    • LLM Navigation Guide: The LLM interaction guidelines are now enforced by the parser, and must be followed for any document.
    • Best Practices: The system recognizes best practices and will enforce them.
    • Implementation Notes: The Implementation Notes guide is now a mandatory part of the standard and links to other files for reference.
    • MDScript Syntax: We have formally defined the syntax for function definitions, variable declarations, template definitions, and conditional logic inside the documentation, and the parsing rules recognize it.
    • State-Aware Warmholes: We have formally defined the syntax for state aware warmholes and the parsing rules recognize it.
    • LLM-Powered Warmhole Management: We have formally defined the syntax for LLM-powered warmholes and the parsing rules recognize it.
    • Code Blocks: All MDScript declarations are now required to be included in a code block, making the structure of the documentation more explicit.
    • Example Code Blocks: We have added usage examples of function definition, variable declaration, template definition, conditional logic, state-aware warmholes, and LLM-powered warmholes inside the standards file.
  2. Rmd_parser_rules.md - A Reusable Parser

    • Generic Regex Extractor: We created a JavaScript function, extractByRegex, to perform regex matching, making the parser more robust and easier to test.
    • Generic Section Parser: The parse_section function now takes any regex, making it extremely versatile. The heavy-lifting logic is now done by a single JavaScript function.
    • Refactored Warmhole Parser: The parse_warmhole function now uses parse_section for efficiency and is simpler thanks to named capture groups.
    • Refactored Header Parser: The parse_header function now follows the rules defined in Rdm_standards.md and generates IDs correctly.
    • Refactored cache_metadata: The cache_metadata function now uses parse_header together with a JavaScript helper function, making the logic more testable and robust.
    • Removed Duplication: Functions like extract_section and duplicated templates have been removed.
    • Lazy Parsing: We implemented a lazy parsing method, where sections are only loaded when required.
    • Generic Parser Functions: All the specific parser functions (code blocks, variable declarations, etc.) now use a generic regex parser with a defined regex for that type of parsing, reducing code duplication.
    • Template Logic Removed: All the complex parsing logic has been removed from the transform property of the templates, making them easier to read and maintain.
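
To make the generic regex approach above more concrete, here is a minimal sketch of what extractByRegex and a lazy wrapper around it could look like. The function body, the createLazyParser helper, and the warmhole syntax in WARMHOLE_PATTERN are all assumptions made for illustration; the actual definitions live in Rmd_parser_rules.md and Rdm_standards.md and may differ.

```javascript
// Hypothetical sketch: the real extractByRegex in Rmd_parser_rules.md may differ.

/**
 * Run a regex over text and return one plain object per match,
 * holding that match's named capture groups.
 */
function extractByRegex(text, pattern) {
  // matchAll requires the global flag, so add it if the caller left it out.
  const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
  const regex = new RegExp(pattern.source, flags);
  return Array.from(text.matchAll(regex), (m) => ({ ...(m.groups || {}) }));
}

// Assumed warmhole syntax, for illustration only: [[warmhole:<id> -> <target>]]
const WARMHOLE_PATTERN =
  /\[\[warmhole:(?<id>[\w-]+)\s*->\s*(?<target>[\w.\/#-]+)\]\]/;

// A specific parser is just the generic extractor plus one regex.
function parseWarmholes(text) {
  return extractByRegex(text, WARMHOLE_PATTERN);
}

/**
 * Lazy parsing: the regex for a section kind runs only the first time
 * that kind is requested; later requests reuse the cached result.
 */
function createLazyParser(text, patterns) {
  const cache = new Map();
  return {
    get(kind) {
      if (!cache.has(kind)) {
        cache.set(kind, extractByRegex(text, patterns[kind]));
      }
      return cache.get(kind);
    },
  };
}
```

With this shape, adding a new kind of section means adding one regex with named groups rather than a new parser function, which is the duplication saving described above.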
  3. Hybrid Approach to JavaScript Usage

    • MDScript for Core Logic: We've decided to use MDScript (and its templates) for core logic, data flow, and control flow.
    • JavaScript for Complex Tasks: Use JS for regex matching, caching, and string processing.
    • Clear Separation of Concerns: JS helpers are well-defined, testable, and used from the MDScript templates, without code duplication.
  4. Auto-Generated Section IDs

    • Markdown-Based IDs: We've incorporated a system for generating IDs based on headers, enabling easy cross-referencing and warmhole linking within the documentation.
    • Hierarchical Structure: The auto-generated IDs respect the hierarchical structure of the documents, creating more explicit links between sections.
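
As a rough illustration of header-based, hierarchical IDs, the sketch below walks the Markdown headers and builds each ID from the slugs of its ancestors. The ID format (slugs joined with "/") and the names slugify and generateSectionIds are assumptions for this example; the actual rules are the ones defined in Rdm_standards.md.

```javascript
// Hypothetical sketch: the ID format (slugs joined by "/") is an assumption,
// not the rule defined in Rdm_standards.md.

/** Turn a header title into a lowercase, hyphen-separated slug. */
function slugify(title) {
  return title
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, "-") // collapse each non-alphanumeric run into one hyphen
    .replace(/^-+|-+$/g, ""); // drop leading and trailing hyphens
}

/**
 * Scan Markdown text for ATX headers (# to ######) and return
 * { level, title, id } records whose id embeds the ancestor slugs.
 */
function generateSectionIds(markdown) {
  const stack = []; // slugs of the currently open ancestors, indexed by level - 1
  const sections = [];
  let inFence = false;
  for (const line of markdown.split("\n")) {
    // Skip fenced code blocks so "# comment" lines are not read as headers.
    if (/^\s*(```|~~~)/.test(line)) {
      inFence = !inFence;
      continue;
    }
    if (inFence) continue;
    const match = /^(#{1,6})\s+(.+?)\s*$/.exec(line);
    if (!match) continue;
    const level = match[1].length;
    const title = match[2];
    stack.length = level - 1; // close headers at this level or deeper
    stack.push(slugify(title));
    // filter(Boolean) drops the gaps left when a level is skipped (e.g. # then ###).
    sections.push({ level, title, id: stack.filter(Boolean).join("/") });
  }
  return sections;
}
```

Under these assumptions, a "State Management" header nested under "Standards" would receive the ID standards/state-management, which a cross-reference or warmhole could then target directly.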
  5. my_first_library.md

    • We have created an example of how a library should be implemented and structured according to the standard. It includes custom functions with descriptions, parameters, and usage examples.

Key Outcomes

  • Clear Structure: The system now has a clear structure with distinct roles for each component (standards, parser, core logic, etc.).
  • Self-Consistency: The Rdm_standards.md file defines how the whole system must work and is parsed by the same rules, making the system self-consistent.
  • Maintainability: The parser is now more modular and maintainable thanks to the use of JavaScript helpers and generic functions.
  • Flexibility: The parser can now be extended to parse more content using dynamic regex.
  • Scalability: The system's structure and modular design will make it easier to add new features and functions in the future.
  • Testability: Most of the parsing logic has been delegated to JavaScript, which is easier to test.
  • Living Documentation: The new structure allows us to create a real "living documentation" system, since the standards themselves are being enforced and parsed using the system itself.

Next Steps

  1. Integrate JavaScript Logic: Integrate the JavaScript helper functions (extractByRegex, cacheMetadata and parseHeader) into your core system, and call them from the templates using your execute function.
  2. Test the Core Parser: Integrate the parser created in Rmd_parser_rules.md with your system_init function, and make sure that all files can be parsed correctly and follow Rdm_standards.md.
  3. Implement Core Execution: Implement the core system_execute function and all the logic related to it.
  4. Build the IDE Extension: Start building the basic IDE extension functionalities for chat and code modification, so you can start testing the system in an interactive environment.
  5. Implement the "Living Documentation": Make sure the system can update the documentation automatically as the state changes.

LLM Reasoning

From an LLM perspective, this is a very successful refactoring:

  • Efficiency: The new parser is more efficient due to the use of JavaScript and reduced duplication.
  • Clarity: The separation of concerns makes the code much easier to understand and maintain.
  • Robustness: The new parsing functions are more robust, since regex matching is now centralized in the extractByRegex helper.
  • Scalability: The new architecture is more scalable and ready for new features.
  • Testability: The JS helpers and the core functionality are now more testable.
  • Self-Consistency: The system is now enforcing its own standards, which allows a self-referential and self-evolving system.