Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

edtlib: Preserve paths of properties from included child bindings #78095

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

dottspina
Copy link
Contributor

@dottspina dottspina commented Sep 6, 2024

The problem to be solved is similar to an issue (#65135) we fixed (#65221) for included property specifications, but here for child bindings.

Issue

All child bindings are initialized at once after included binding files have been merged, with the including binding file set as their origin.
As a result, properties of child bindings all appear to be specified (the last modified semantic of PropertySpec.path) by the top-level binding.

For example, properties of a channel child of an ADC controller with compatible "nordic,nrf-saadc" (e.g. zephyr,gain) will all appear last modified by nordic,nrf-saadc.yaml instead of adc-controller.yaml.

Changes

We can use the same approach we've already applied for property specifications:

  • first, we may inherit child-binding definitions from other bindings when merging included files
  • eventually, we may get child-binding definitions with the content of the child-binding: element of the top-level binding's file, updating any inherited definitions

With this patch:

  • the paths of properties inherited from included child bindings always correspond to where the properties are defined (or last modified)
  • the path of the child binding itself is set to:
    • if all properties of the child binding come from the same binding file, with no filter applied: the path to this binding file, e.g. adc-controller.yaml when nordic,nrf-saadc.yaml is simply included with include: nordic,nrf-saadc.yaml
    • otherwise (filters or multiple property specifications files): the path of the including binding file (taking ownership)

Note: edtlib: fix Pylint error W0101 (unreachable) is just while we're at it, does not deserve a PR for itself alone.

How has this been tested? ?

We have to be careful when it comes to touching child bindings initialization:

  • first, pass all unit tests in scripts/dts/python-devicetree/tests to catch obvious regressions (this patch does not cause regressions in DTSh either, where it's in the testing branch)
  • then add tests covering the multiple ways (levels, filters) of including child bindings, a bit like a recap

What hasn't been tested ?

  • Added unit tests do not check the cell specifiers of the produced child bindings; I'm not sure how this could fail, though (specifier2cells is quite simple)
  • [Edit: redefining a property of an included child binding, typically to set required: true, will truncate its definition, likely loosing its type and description. This is not tested.
    I'll comment the fix when all CI tests pass.]

Thanks.

@dottspina dottspina marked this pull request as ready for review September 6, 2024 12:36
@zephyrbot zephyrbot requested review from decsny and galak September 6, 2024 12:36
@dottspina dottspina force-pushed the pr-preserve-child-bindings-origin branch from 7c72a6c to a253116 Compare September 9, 2024 01:04
@dottspina
Copy link
Contributor Author

Fixed CI error:

  • failed test: west build -p -b native_sim tests/drivers/build_all/led -T drivers.led.build
  • cause: when a binding contains a child-binding: element that redefines a property of a child binding inherited from an included file, typically setting required: true, the property definition is truncated, loosing its type: (and description:) and breaking the DTS generation in gen_defines.py; this was not covered by the added unit tests, sorry I missed that
  • fix: preserve type (and description, etc) inherited from included bindings; update unit tests to cover the cause of the error

Thanks.

@decsny decsny requested review from a user, benediktibk, tejlmand and 57300 September 11, 2024 20:54
Copy link
Collaborator

@benediktibk benediktibk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we stumbled upon the same issue:
a04f0b4

@dottspina
Copy link
Contributor Author

Looks like we stumbled upon the same issue: a04f0b4

@benediktibk , yes, and I don't have eyeballs either, I also got that from the linter.

IIUC, the fix is part of #78159: I'll rebase and remove edtlib: fix Pylint error W0101 (unreachable) from this PR when #a04f0b4 is in main.
Thanks for letting me know.

Copy link
Collaborator

@benediktibk benediktibk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more general thing: I think it would be better to split up the test cases (test functions as well as the test bindings) into separate tests. It is way easier to comprehend the whole setup and the assertions when there is basically one aspect tested per test.

# + child-prop-1
#
# grandchild-binding:
# + gandchild-prop-1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nitpick: There are some "r"s missing in grandchild

@benediktibk
Copy link
Collaborator

Btw, I really like your approach with trying to test all the edge cases.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not reviewing details until we're sure that we really need such a complex fix. See my comment.

@@ -455,6 +474,126 @@ def _add_included_prop2specs(self, fname: str, contents: dict,
if prop not in self.prop2specs:
self.prop2specs[prop] = spec

# Add child-binding definitions inherited from an included binding.
Copy link

@ghost ghost Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm, while this (and #65221) seems to work, it introduces quite some additional complexity to the code.

TLDR;

This (and #65221) seem to work around design problems that we should rather fix.

Side notes:

  • It seems that this change alters the inheritance of child bindings from override to merge. Maybe I overlooked s.th.? Otherwise it would be a breaking change. Then we should ensure that we clarify this with the arch WG maybe? And we might have to document the change in the migration README of the release.
  • While analyzing this problem I think I discovered another bug. It seems as if child bindings are only taken from the direct parent during validation rather than from all ancestors. But maybe I'm wrong on that. The code is a bit hard to follow. If it exists, this problem should go away, with the proposed fixes.

Fixing the binding model:

  • I think the main problem with the binding model is that we keep the path in the binding rather than in the prop spec or even on property attribute level (for overridable property attributes like "required"). Whether an attribute is overridable or not can be kept as a class property.
  • We may derive a (deprecated) binding-level path property for backwards compat. But keeping the info at the right place immutably should simplify the merging.
  • There should also be a way to share more code between merging properties and merging child bindings. And also fix the load/merge duplication.
  • I think we should recurse right in the beginning of the __init__ function to the deepest level and then work our way upwards rather than mixing the two directions to simplify the merge process.
  • Additionally it may make sense to have a Compat class that extends Binding. Things like "requires compatible" would go away with polymorphism.

Update (2024-09-26): fixed an oversight in my first proposal.

Fixing node validation:

Attaching bindings verbatim to nodes seems to make validation unnecessarily complex. It adds path dependent constraints to subnodes with distinct bindings and possibly conflicting rules across node levels. I think we should merge bindings applied to the same node. In the end we only want a list of property specs attached to each node. Or am I being too naive maybe?

If I'm right then we should be able to establish those validations by merging bindings from top to bottom:

  1. Iterate the device tree breadth first
  2. For each node: Merge all property specs of the Compats of the current node and child bindings applied by anchestors to the current node into a single list. Here we can check for conflicts and apply well-defined merging rules.
  3. Validate the current node's values against the merged property specs.
  4. Distribute property specs of child bindings to the current node's descendants.
  5. iterate or recurse

This should do away with the need to have a derived property2specs list. And I wouldn't be surprised if we found that other derived state could also be removed.

Update (2024-09-26): simplified.

I'll have some bandwidth over the next days to look into this and try it out. I'll have to tinker around with edtlib anyway to make a poc for some new features.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh - of course I can help with the refactoring if you're afraid of such a structural fix. I'm onto some refactoring of edtlib anyway.

@ghost
Copy link

ghost commented Sep 12, 2024

redefining a property of an included child binding, typically to set required: true, will truncate its definition

Not sure what you mean by that, but this sounds like a bug to me as the intention of updating only the required field should quite obviously inherit the rest. Should be fixed in transit by the design fixes I propose (see the attribute-level inheritance approach).

@dottspina
Copy link
Contributor Author

redefining a property of an included child binding, typically to set required: true, will truncate its definition

Not sure what you mean by that, but this sounds like a bug to me as the intention of updating only the required field should quite obviously inherit the rest. Should be fixed in transit by the design fixes I propose (see the attribute-level inheritance approach).

Yes, what you quote was actually the description of a bug, caught by the CI and fixed (see this comment).

Thanks for your review.
I agree that this PR and #65221 add a lot of complexity to the model initialization, I think mainly because my use of edtlib (DTSh) was not expected. I'll dig into your proposal when I have time and comment back.

@ghost
Copy link

ghost commented Sep 12, 2024

my use of edtlib (DTSh) was not expected

But your use of edtlib is highly welcome and there are a lot of obvious synergies with more general activities going on in the configuration/provisioning area that might further expand the applicability of edtlib (think DTShell -> Config, DT and Provisioning Shell). That's why I have high stakes in making this work. :-)

dottspina added a commit to dottspina/zephyr that referenced this pull request Oct 10, 2024
These tests show a known issue and will fail with
the version of edtlib at the time of writing.

See also: zephyrproject-rtos#78095

Signed-off-by: Christophe Dufaza <[email protected]>
@dottspina dottspina force-pushed the pr-preserve-child-bindings-origin branch from a253116 to 3584a8d Compare October 10, 2024 23:23
Bindings initialization supports a rich include mechanism:
- a base binding may be "include:"ed at any level: binding,
  child-binding, grandchild-binding, etc, the defined
  properties being injected (in cascade) at the level
  of inclusion
- whenever an "include:" appears, the including binding
  may filter the properties it expects at the binding,
  child-binding, etc, levels; these filters must apply
  transitively across included files
- when base bindings are inherited which contain
  definitions about a same property, rules apply: sometimes
  the last one "wins", but not always,
  and quite a few things are not allowed (e.g. downgrading
  a "required: true" or changing a "const:" value)

All this happens (recursively) in the Binding constructor,
independently of any actual devicetree model (EDT instance).

These additional unit tests try to cover somewhat systematically
what we finally get at the exit of the constructor once the object
is completely initialized.

Signed-off-by: Christophe Dufaza <[email protected]>
These tests show a known issue and will fail with
the version of edtlib at the time of writing.

See also: zephyrproject-rtos#78095

Signed-off-by: Christophe Dufaza <[email protected]>
@dottspina dottspina force-pushed the pr-preserve-child-bindings-origin branch from 3584a8d to 4208824 Compare October 10, 2024 23:35
@dottspina
Copy link
Contributor Author

Three parts in this comment:

  • first, some issues (bugs ?) I may have found while working on this PR, and especially writing the unit tests I thought necessary to prevent regressions: of these tests, five fail, which I don't think is expected (independently of this PR)
  • some brief thought regarding this PR: in one word, to me only the unit tests are worth it, the issue this patch tries to address should be approached with an openness to design changes
  • the last part will perhaps mainly concern @fgrandel: I think this work that the edtlib library could benefit from goes beyond the intent of this PR, and propose to move the thinking (and code) to an RFC with linked PRs; this should eventually fix all issues (sic) incidentally

Unit tests and issues

Working inside the Binding constructor is somewhat perilous, it's really easy to break one thing when fixing another, so I wrote unit tests (the current coverage in test_edtlib.py is not enough to reassure me).

These tests revealed inconsistencies that I initially considered to be regressions introduced while working on this, but:

  • they are not regressions: I can reproduce them without applying this PR, and even with commit b3b5ad8 reverted
  • IIUC correctly what should happen (the applied rules) when loading binding files, there might be a long standing bug

These tests now constitute the main content of this PR:

  • commit 3584a8d: the unit tests that relates to the discussion bellow, which were not expected to fail
  • commit 4208824: the tests directly associated to the issue this PR tried to address (these ones were expected to fail)

You'll find them in a separate source file, test_edtlib_binding_init.py, which is already quite large. Resource files are in the test-bindings-init directory.

Details are also available in the CI tests report for this PR, e.g. in "Devicetree script tests (3.11, ubuntu-22.04) failed".

@benediktibk, you also showed some interest in testing: the additional unit tests in commit 3584a8d may as well be helpful independently of this issue, do you think I should open another PR with just this commit ?

Indirect include:s discard property filters

It seems that the filters set by an including binding to select the property it chooses to inherit, at its child-binding level and bellow, are lost if these properties are included indirectly (as in A includes B includes C).

Consider the test case bellow:

  • simple.yaml: a base binding file, specifies some properties at the binding, child-binding and grandchild-binding levels
  • simple_inherit: just include:s the base binding file above (without modification nor filter)
  • simple_filter: top-level binding of the test case, which intends to simply filter the properties it inherits
# simple_filter.yaml

include:
  - name: simple_inherit.yaml
    property-allowlist: [prop-1]
    child-binding:
      property-allowlist: [child-prop-1]
      child-binding:
        property-allowlist: [grandchild-prop-1]

IIUC, the filters set by the including file (the snippet) should be implemented to the end, and we should see (still from the snippet) only prop-1 at the binding level, child-prop-1 at the child-binding level, etc.

test_filter_inherited_propspecs_basics() suggests that we instead inherit all the properties defined in the base binding file (simple.yaml):

E       AssertionError: assert {'child-prop-1'} == {'child-prop-...child-prop-2'}
E
E         Extra items in the right set:
E         'child-prop-2'
E         Use -v to get more diff

tests/test_edtlib_binding_init.py:859: AssertionError
======== test summary info ========
FAILED tests/test_edtlib_binding_init.py::test_filter_inherited_propspecs_basics - AssertionError: assert {'child-prop-1'} == {'child-prop-...child-prop-2'}

Similar unit tests that fail: test_expect_propspecs_diamond_child_binding() and test_expect_propspecs_diamond_grandchild_binding().

Not caught by the allow-and-blocklist-multilevel.yaml test case in test_edtlib.py: there, "multilevel" means include: appearing at the binding and child-binding levels of the including YAML file (this does not cover A includes B includes C).

Ignored descriptions overrides

IIUC, an including binding file is allowed to amend (overwrite) the description of an inherited property: test_expect_propspecs_diamond_binding() suggests the opposite.

Similarly, test_grandchild_binding_compat_desc() suggests that an including binding cant overwrite the description of an included binding.

Status of this PR

My feeling regarding this PR is really not optimistic:

  • two weeks ago my approach was to clean this thing up a bit, see if a slightly less ugly implementation was possible, at least for documentation purpose, then join efforts with @fgrandel to consider possible design changes in edtlib
  • more than once I thought it was good, and unit tests eventually proved I was wrong
  • I ended with an implementation that passed all unit tests, but even more convoluted than the initial contents of this PR (this may be the result of successively fixing almost orthogonal issues, I'll see if I can do something before I push that, still for documentation purpose ;-)

For me, the interest of this PR now lies mainly in the other issues it raises, the source code being there more to document possible design problems than to resolve the initial bug: I'll keep it open until we moved the valuable material (e.g. the unit tests and fgrandel's thoughts) somewhere we can discuss the design of edtlib more specifically, may be an RFC.
The proper fix will come from that work.

Moving to an RFC

@fgrandel, I agree that this PR as well as #65221 are hacks around deeper issues:

  • an API not really friendly to users besides gen_defines.py, not really thought out for that: whether it should be is itself a question (I know you too think we could do many other things with edtlib)
  • possible sub-optimal design (I'm thinking readability, not optimization)
  • an implementation that evolved in fits and starts (for legitimate reasons)

Bellow are TLDR comments I was writing before I was convinced that we definitely need something like an RFC.

It seems to me that this specific change alters the inheritance approach for child bindings from override to merge.

Not sure what you mean when you say "from override to merge". To me it's actually neither, merge being the less incorrect, though:

  • when child bindings (as well as bindings, grandchild bindings, etc) are included, and there are many ways to do so, rules apply, and the latest does not always "win", so it's not really override (e.g. you can't downgrade a requirement with required: false, change a const: value, etc)
  • but you can merge additional definitions into an included property specification, e.g. set (but not change) a default: value (you can also specify new properties, obviously)

This and #65221 both replicate existing code smells and introduce new ones (structural duplication, broken functional encapsulation, feature envy of functions, premature optimization, the same internal properties being updated from several abstraction layers, therefore unclear abstraction levels, function names not clearly saying what the function does, ...).

Having read this source code for a couple of years, I may share most of your concerns (I'm not sure about "premature optimization", though), but would add a few comments:

  • although that was planed at some point, supporting edtlib use outside of the Zephyr build process is not a priority; additionally this role at the very beginning of the build process is something that may hold back a lot of refactoring in this area unless they prove necessary (and peoples who fully understand the whole thing are not busy with more valuable stuff)
  • the devicetree model initialization is not that simple anyway

As a regular user of this library (and reader of its source code), I'm not against giving it some attention, and up for working on it, but IMHO it's not a simple refactoring, and that can't be done in the context of this PR.

Since it looks like you're in too, I'd propose we open an RFC with the draft status:

  • where we can refine the design changes before going to code (e.g. I'm not convinced with the validation idea, to me even property filters are part of the binding, not validation rules, but I may have misunderstood you)
  • that we can link to separate incremental PRs

Would that suit you ?

Thanks.

@dottspina
Copy link
Contributor Author

A more general thing: I think it would be better to split up the test cases (test functions as well as the test bindings) into separate tests. It is way easier to comprehend the whole setup and the assertions when there is basically one aspect tested per test.

Thanks for your review.

Right, the test cases weren't well organized/structured.
If you have some time, you can take a look at the new version, it should be more readable.

@ghost
Copy link

ghost commented Oct 11, 2024

@dottspina

As a regular user of this library (and reader of its source code), I'm not against giving it some attention, and up for working on it, but IMHO it's not a simple refactoring, and that can't be done in the context of this PR.

Since it looks like you're in too, I'd propose we open an RFC with the draft status:

I did not yet have time to read your comments in detail but I can see that you invested a lot of time and thought into it, so I'll read it very thoroughly over the weekend. Thank you very much for sharing your thoughts!

UPDATE: We don't have the right smileys to say enough thanks to you for adding that much of extra test coverage. Also see my efforts to extract bindings into a separate generic layer. This combines nicely with what you're doing here.

This PR vs. RFC vs. PoC

You're working with the library for much longer than I and I saw that in some aspects, my initial analysis - while being generally correct - shot too quickly. I tend to be overconfident sometimes (sorry for that). So I fully agree with you that a larger refactoring should not be part of this PR. I love the focus of your PR as it stands right now.

In the meantime I worked full time on a PoC with the library and get to know it better. I actually find it well designed except for the few things I mentioned.

So yes, I'd be really happy to work on an RFC together with you. The initial bug you reported can be fixed with reasonable effort by introducing more fine-grained path tracking as proposed in my comment (which I've mostly implemented in the meantime but not published yet). As you say: this is probably better introduced in a separate PR.

For an RFC it's important for you to understand what drives me and see whether we can find common ground to work on...

Zephyr as an application programming framwork

I'm trying to conceive Zephyr more as an application programming framework rather than an OS. One of the major things that differ between the two approaches is: application frameworks provide an integrated view onto configuration and data modeling while an OS is a rather wild collection of independent applications that all have their own data models and configuration approaches. I think Zephyr's usability as an application programming framework could be improved in that sense. Its configuration is sometimes hard to understand, partly redundant and fragmented. That's why I'm so involved in configuration and developer experience.

Related configuration work in the arch WG

It has been decided in the architecture working group that we base future subsystem configuration and provisioning on YAML files. This will be exemplified for network configuration in a first step but should be extensible to arbitrary subsystems in the future. This includes drivers and "upper" subsystems (networking, sensing, bluetooth, ...). There exist isolated solutions to visualize Kconfig and (due to Nordic's efforts) also DT. But nothing similar exists for the envisioned YAML subsystem configuration.

We must therefore be very careful to not increase complexity, inconsistency and fragmentation further when throwing yet another configuration approach into the mix. I'm personally very concerned about that although I understand why we cannot extend existing config approaches to multi-instance software configuration and provisioning.

RFC outline (including dtsh's role in it)

My idea therefore is to enforce a unified view onto configuration in a single tree - something like a combination of sysfs (for read-only, kernel-managed, build-time configuration - think Kconfig + DT) and configfs (for runtime/provisioning read/write, application-managed configuration).

You can imagine that I'd love to see dtsh extended as a potential viewer for such a combined configuration tree as it already shares the "filesystem metaphor".

While you seem to be interested in the visualization part, my interest is in creating the capabilities in the background required to integrate Kconfig, DT and YAML (hence the sysfs/configfs metaphor).

Implementation approach

On the implementation side, my approach is to share the existing, generic overlay/binding/validation system between DT and YAML to re-use existing community knowledge and code and to protect investment into these technologies. [1] As far as Kconfig is still involved in software configuration (as opposed to "feature selection") a partial mapping of legacy Kconfig options to the same data model will be added - initially for the network subsystem configuration which was based on Kconfig so far. My main focus is on integrating hardware (DT) and software (YAML) configuration, though, because that's where the distinction is most ambiguous, confusing and contentious in practice. I want to shortcut dogmatic fights and flame wars that regularly hold the community back in these areas by making the distinction less relevant in practice (at least from a user and migration point-of-view). Currently purists block pragmatists and vice versa. I want all maintainers to strive with their individual approach w/o user experience being damaged.

What has been done so far

To achieve that, I've done a profound refactoring of edtlib to extract generic parts (binding, validation, dependency management, overlays, etc.) and base a YAML-to-config-converter on top of it. DT- and YAML-specific subtrees are built on top of that common library which in a final step are merged into a single tree for validation of cross-references and also to deal with migration issues (e.g. software properties being moved from DT to YAML in the future on an as-needed basis).

I also developed two alternative configuration/provisioning approaches in my PoC: A zero-footprint-by-default macro target (build-time) and a prototypical NVS settings image generator (provisioning and runtime support). The image generator can easily be extended to littlefs, cbor or any other storage format. NVS is just the more complicated legacy case that's why I tested feasibility with it first. The images are compatible ootb with the existing Zephyr settings subsystem. The image can be flashed into its own partition independently from the build.

Compatibility of my changes with edtlib and dtsh

The merged tree API is staying 99% the same as what edtlib exposed before, just a few changed imports, changes to the way the API is instantiated and some casting involved if one wants to work with one of the two subtrees rather than with the generic tree. The differences are not in structure but in processing DT vs. YAML internally, so they can be largely hidden behind existing abstractions.

Proof: All existing edtlib tests run on top of my refactoring with only minor API changes due to the new dual nature of the tree and the newly introduced type composition feature (except for 3 all failing due to a missing update of ControllerAndData which is not relevant for my PoC - so I'd like to postpone the fix).

It should be really easy to adapt dtsh to that combined view and thereby immediately let it visualize configuration, too. I think that in about one week I'll have a preview of this ready for you to inspect.

Relevance of an integrated settings view to provisioning

The unified settings view will be most important when it comes to provisioning and migration/backwards compat.

Provisioning/Runtime settings

What part of the overall tree needs to be provisioned (as opposed to built into the firmware) is strongly application dependent. No dogmatic definition will ever be able to capture for all times what goes where. Users will neither want to provision too much (due to obvious resource overhead as compared to build-time config) nor too little (as some applications will have to be more adaptable than others).

I introduce a very basic "path language" based on full paths or DT/config node labels. It basically selects the "folders" to be exported to the image. These folders will then be exposed 1:1 to the application via the settings subsystem. As the settings subsystem uses the "directory/file" model itself, we won't see any impedance mismatch there. It is important that we keep "folder" selection as well as the configuration model very flexible, such that users can easily extend it to their custom runtime/provisioning time configuration needs.

Migration support

See the implementation approach above. The integrated settings view hides legacy settings and structures from client code so that the client code can evolve freely w/o having to deal with backwards compatibility.

The importance of "dtsh" (aka "configsh") for an integratied settings view

Something like "configsh" would be ideal to let users debug and visualize their configuration variants or select the branches to be provisioned or even script per-device "mass provisioning" in the future. I could also imagine extending the existing settings subsys integration into a "light weight" version of configsh in the application shell to browse through and set configuration at runtime as a replacement of the many individual subsystem configuration approaches we have now. This of course only concerns parts that are actually in flash or memory - no one wants the full config tree to be materialized in an embedded system with restricted resources, hence the per-default macro-based approach and the importance of something like "configsh" to assist users with the full model.

Next steps on my side

Conceptual

I'm currently preparing some slides that I'll present in the arch WG to elevate awareness of the topics raised in this comment and elsewhere before.

PoC

From the merged representation, I'll in a next step extract canonical DT_ and CFG_ macro views no matter whether the sources use legacy DT or even Kconfig properties to describe configuration aspects or whether they already migrated to YAML. This is especially relevant in the network area where software configuration aspects are scattered across Kconfig and DT right now. So network configuration is a good "worst case" scenario to start with. Decoupling of sources from targets becomes even more relevant in the provisioning area where data may be coming from third-party systems like large device databases that are also used for device management, OTA, etc.

An integration of network configuration with the macro target does already exist. So next comes the integration of the now available settings from the settings subsystem with the network configuration code.

Change log:

  • 2024-10-14: Introduced structure to get closer to s/th usable for an RFC, updated to the current state of the PoC

[1] Of course our home-grown binding approach (which is actually just a Zephyr-specific schema language) has its weaknesses. But while evaluating alternative approaches (pykwalify, JSON schema) it also became obvious that those approaches are not very well suited either. And having our own approach has the advantage that we can tailor it exactly to our needs. That's why I've decided to vote for extending the existing schema modeling to YAML rather than having yet another technology being introduced and re-invent all the wheels.

@benediktibk
Copy link
Collaborator

@benediktibk, you also showed some interest in testing: the additional unit tests in commit 3584a8d may as well be helpful independently of this issue, do you think I should open another PR with just this commit ?

Tough question, depends on how far you want to go with refactoring. It might be not worth the time and effort to add tests for stuff which you want to throw out of the window anyway.

In my opinion it would make sense to split the test cases you have added here in two categories:

  1. Fail without any implementation-wise changes
  2. Succeed without any implementation-wise changes

The latter ones could then be added in a separate PR, as they are in general a good thing to have that nothing breaks.
The first ones should rather come in with the appropriate bug fix PRs or PRs which actually change the behaviour.

@benediktibk
Copy link
Collaborator

Right, the test cases weren't well organized/structured.
If you have some time, you can take a look at the new version, it should be more readable.

Way easier to read, IMHO. So far I've skipped only over the tests, as actually verifying their correctness will take me a considerable amount of time. I'll happily invest this, as I will also learn a bunch about these so far to me unknown features, but I would like to do so after you have decided about the approach you want to go ahead.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

I propose that we leave the failing tests in the tree and just inactivate them for CI. These tests are perfect for me to check whether my fix works.

@ghost
Copy link

ghost commented Oct 14, 2024

Indirect include:s discard property filters

Ignored descriptions overrides

I agree with your analysis. Both problems should be documented in their own issues - can you do that? IMO you could use your description verbatim. You may assign me to those issues unless you want to fix them yourself.

Moving to an RFC

@fgrandel, I agree that this PR as well as #65221 are hacks around deeper issues:

Great, I've tried to update my RFC-related comment now such that some of its paragraphs can become part of an RFC almost verbatim if you agree with the basic approach.

My RFC-comment misses the design fix for the "last modified source path" issue from my initial comment, therefore I'll re-summrize it here based on my updated knowledge:

  • I propose to introduce independent "source_path" members to all levels of the tree, both on the binding as well as on the config source sides.
  • These are at least the combined tree (stree) <- 1-to-n -> partial tree (dts vs yaml config) <- 1-to-n -> tree node <- n-to-m -> binding <- 1-to-n -> property spec entities.
  • To fix the problems with description/required overrides and document their source, we'd have to add ... <- 1-to-n -> binding attribute as a new entity into the conceptual data model.
  • Binding attributes might be implemented in PropertySpec if adding a separate class seems to heavy.
  • To document single-property updates on the source side, we'd have to add the source_path member to the EDTProperty (or STProperty) class, too.
  • Only the most granular "source_tree" member will be actual data. All others can be derived from that layer by some static logic.

For me, the interest of this PR now lies mainly in the other issues it raises: [...] we can discuss the design of edtlib more specifically, may be an RFC. The proper fix will come from that work.

100% agree.

The additional design fixes I proposed in my original comment re compositional bindings plus related dependency management improvements have already been implemented in my PoC. I think those should be covered by the arch WG's configuration work, though, IMO no need to include them in another RFC.

There are still some simplifications pending in my PoC:

  • the awkward handling of merged pointers must be fixed
  • I didn't publish my changes to the binding class, yet, because they are not relevant to the PoC.

Not sure what you mean when you say "from override to merge". To me it's actually neither, merge being the less incorrect, though:

Yes, you're right. This is one of the things where I "shot too quickly". I've updated my understanding in the meantime and agree with you. Sorry for the noise.

  • although that was planed at some point, supporting edtlib use outside of the Zephyr build process is not a priority; additionally this role at the very beginning of the build process is something that may hold back a lot of refactoring in this area unless they prove necessary (and peoples who fully understand the whole thing are not busy with more valuable stuff)

I couldn't agree more. This is even more true as our co-operation with the devicetree specification project has officially ended and will not be taken up again in the foreseeable future. That's why I didn't make an effort to consider such artificial stakes.

My policy on edtlib (tentatively renamed to settingslib in my PoC) would clearly be: Touch/add/fix whatever we think is required for Zephyr. Just as always: keep the API as stable as possible when refactoring (what I did) including improved protection against regressions (what you did).

I think with the changes in my PoC and your additional test coverage, edtlib/settingslib would be even more useful externally. I could think of many more tools that could be built on top of it in the Zephyr community in addition to "dtsh" aka "configsh":

  • more configuration tools (actual config + visualization)
  • support standard Zephyr profiles (performance, space, major feature areas, ...) for fully automated interactive setup scripts, similar to what is common in modern application programming frameworks like Spring, React, Vue, several python tools, ... to name just a few I know well myself.
  • mass provisioning tools and integrations
  • integration with device management tools for OTA
  • ...

I therefore propose that we return to publishing and versioning the library on PyPi so that you and others can properly import it without moving the source code out of the Zephyr tree again.

My handling of dtlib would be very conservative, though. I think dtlib doesn't have to be touched at all unless we want to support more recent DT features already built into dtc. The library should be 100% focused on being compatible with the standard and Linux extensions to it.

As a regular user of this library (and reader of its source code), I'm not against giving it some attention, and up for working on it, but IMHO it's not a simple refactoring, and that can't be done in the context of this PR.

No of course not. What I did comes close to a full structural re-write. The thorny logic with all its edge cases could be re-used almost 100%, though, except for shuffling some features around for improved encapsulation. Therefore I didn't have any relevant regressions in the tests. In this sense it was simply a re-factoring.

@mathieuchopstm
Copy link
Collaborator

Echo'ing a discussion from Discord on request of @fgrandel wrt inherited properties. This maybe not be exactly related to this PR but worth mentioning nonetheless.

Currently, it is not possible for a node to obtain properties from both its parent's child-bindings > properties, and its own binding specified by compatible property. Whether this is intended or a bug is not very clear to me though. One example use case is for devices of a given protocol, where the (per-vendor) controller requires additional configuration that is per-device but vendor-specific (this came in mind when discussing MSPI but could apply more broadly).

Tentative implementation details
Dummy controller binding
# Binding for 'xxx' protocol controller implemented by ST
# For example, this could be I²C or SPI.
include: base.yaml

description: ST 'xxx' controller

compatible: "st,xxx-controller"

properties:
  # Controller node must provide its location.
  reg:
    required: true

  # Child nodes ('xxx' devices) are designated
  # by a single-cell address (no size)
  "#address-cells":
    const: 1

  "#size-cells":
    const: 0

child-binding:
  properties:
    # For the ST-specific 'xxx' controller, we must indicate
    # some device-specific information, e.g. timings. But not
    # all vendors will require this - maybe another will want
    # operating volatge or something else...
    st,xxx-dev-property:
      type: int
      required: true
      description: "'xxx' device property for ST controller"
Dummy device binding
# Binding for a device communicated with via 'xxx' protocol.
# This is implemented by a "third-party" vendor (i.e., not SoC maker)
include: base.yaml

description: "'xxx' device by vendor 'yyy'"

compatible: "yyy,dummy-xxx-device"

properties:
  # Address on the 'xxx' protocol
  reg:
    required: true

  # device-specific property
  # this could be a configuration value for example
  device-property:
    type: int
    required: true
    description: yyy device property
Example instanciation in board DTS
	soc {
		custom_xxx_ctrl: custom-controller@80 {
			compatible = "st,xxx-controller";
			reg = <0x80 64>; /* dummy values */
			#address-cells = <1>;
			#size-cells = <0>;

			child_dev: child-device@0 {
				compatible = "yyy,dummy-xxx-device";
				reg = <0>;
				device-property = <10>;
				st,custom-child-property = <20>;
			};
		};
	};

Attempting to build hello_world for a board with this added to the tree, even on top of this PR, results in the following error:

devicetree error: 'st,custom-child-property' appears in /soc/custom-controller@80/child-device@0 in ~/zephyrproject/build/zephyr/zephyr.dts.pre, but is not declared in 'properties:' in ~/zephyrproject/zephyr/dts/bindings/test_device_binding.yaml

So it looks like the child-bindings properties are ignored when a node has a compatible.
I didn't bother testing with grandchild-bindings but I assume the same would be visible.

@dottspina
Copy link
Contributor Author

I agree with your analysis. Both problems should be documented in their own issues - can you do that? IMO you could use your description verbatim. You may assign me to those issues unless you want to fix them yourself.

In my opinion it would make sense to split the test cases you have added here in two categories:

  1. Fail without any implementation-wise changes
  2. Succeed without any implementation-wise changes

Just to let you know I'm on it.

@ghost
Copy link

ghost commented Oct 14, 2024

Currently, it is not possible for a node to obtain properties from both its parent's child-bindings > properties, and its own binding specified by compatible property.

Exactly, we currently implement "oneOf" logic, giving preference to explicitly specified compatibles over child-bindings.

Whether this is intended or a bug is not very clear to me though.

AFAIK this is intended. I found an old message by @mbolivar mentioning it. Just speculating: It seemed too complicated to implement type composition properly at the time.

One example use case is for devices of a given protocol, where the (per-vendor) controller requires additional configuration that is per-device but vendor-specific

Yes, good example. There are really many such examples. And a certain flavor of composition was actually intended by the standard that's why you can add several comptibles in the first place.

Re implementation: I already have this implemented in a way that I believe is well founded on first principles and our needs. Should cover your example case nicely. You'll be invited for review of course as soon as I'm making a PR out of it.

If you follow some of the links in my comments, you'll see that bindings (plural) have replaced binding everywhere. :-)

dottspina added a commit to dottspina/zephyr that referenced this pull request Oct 18, 2024
This unit test was added specifically to cover a regression
reported by the CI while working on [1].

Further work on related issues [2] showed that:
- [1] and [2] are dead end: we need to first rethink
  how bindings (and especially child-bindings) are initialized
- the inclusion mechanism supported by Zephyr deserves more systematic
  testing in edtlib if we want to work with confidence

The approach we choose is to:
- revert all changes made in [1]
- from there, systematically add unit tests as we address
  the issues we identified (or the additional features we need)
  one after the other

[1] edtlib: fix last modified semantic in included property specs
[2] edtlib: Preserve paths of properties from included child bindings

See also: zephyrproject-rtos#65221, zephyrproject-rtos#78095

This reverts commit 33bb3b6.

Signed-off-by: Christophe Dufaza <[email protected]>
dottspina added a commit to dottspina/zephyr that referenced this pull request Oct 18, 2024
This unit test was added to cover the change introduced by [1].

Further work on related issues [2] showed that the chosen approach
is dead end.
We're reverting all changes made in [1].

[1] edtlib: fix last modified semantic in included property specs
[2] edtlib: Preserve paths of properties from included child bindings

See also: zephyrproject-rtos#65221, zephyrproject-rtos#78095

This reverts commit 70eaa61.

Signed-off-by: Christophe Dufaza <[email protected]>
dottspina added a commit to dottspina/zephyr that referenced this pull request Oct 18, 2024
[1] was introduced to get more valuable answers from
the PropertySpec.path API, which is supposed to tell
in which file the property's specification was "last modfied".

Further work on related issues [2] showed that the
approach chosen in [1] is dead end: we need to first rethink
how bindings (and especially child-bindings) are initialized.

[1] edtlib: fix last modified semantic in included property specs
[2] edtlib: Preserve paths of properties from included child bindings

See also: zephyrproject-rtos#65221, zephyrproject-rtos#78095

This reverts commit b3b5ad8.

Signed-off-by: Christophe Dufaza <[email protected]>
@dottspina
Copy link
Contributor Author

dottspina commented Oct 18, 2024

If you follow some of the links in my comments, you'll see that bindings (plural) have replaced binding everywhere. :-)

Regarding scripts: lib: settings: add capability to process YAML source files: from what I've read, this is not a "refactoring of edtlib.py" but its complete disappearance ;-)
IIUC, you're not planning on any backward compatibility with edtlib at the API level: is this discussed somewhere ? I've skimmed through #77638, which I found interesting, but does not seem to document or review this API removal.

I also had a look at scripts:dts:edtlib:type composition: I can roughly see the use cases, but I'm not sure I grasp all the ins and outs (e.g. its seems it breaks the unit tests changes covering "Support enums for array like dt props", commit 7454cb9).

@fgrandel, sorry for such a brief and late feedback, my (less exciting) priority was to try to do something with the most annoying issues accidentally raised by this PR.
Fortunately, #80030 fixed more things than I initially thought, and I'll have a bit more time for the (more exciting) ideas we wanted to work on.

@ghost
Copy link

ghost commented Oct 18, 2024

IIUC, you're not planning on any backward compatibility with edtlib at the API level

Except for the renaming of the public class from EDT to EDTree and the different way in which it is instantiated via STree, there should be no difference in the public API or behavior of the resulting object, because I actually paid a lot of attention to backwards compat. I adapted the tests and you'll see that they run mostly unchanged except for the instantiation of STree/EDTree and the changes that map to new features (e.g. type composition). There should be only 3 failing tests and that's on purpose because some feature is missing there. The tests exemplify how to use the generic settings library for EDT only.

But of course this is just a PoC and if there's any part of the change that would be problematic to you, then I'd try to co-operate with you as best as I can to keep the changes to the bare minimum required.

is this discussed somewhere ?

No, as a PoC it's far from being PR level or being agreed upon in arch WG even in principle. The purpose of this code is to show that certain claims I'm making regarding re-usability of existing code for configuration are actually feasible. This needs to be discussed step by step in arch WG and then transformed into actual reviewable pieces and only if it actually becomes arch WG policy.

I very recently did a first presentation of the PoC in arch WG, though, presenting its high-level design: https://docs.google.com/presentation/d/1JXk1IPPqogRg3Ir6_BK19txXpSQ3KZGaUKNYpIMRGDA/edit?usp=sharing

I've skimmed through #77638, which I found interesting, but does not seem to document or review this API removal.

No, it just gives you a bit of background and it is the discussion that triggered the PoC.

"Support enums for array like dt props", commit 7454cb9).

Oh sure, that's entirely possible. I might not have rebased properly or maybe a test for that feature was missing. But certainly there's not problem in principle, it's probably just a small bug in the PoC.

@fgrandel, sorry for such a brief and late feedback, my (less exciting) priority was to try to do something with the most annoying issues accidentally raised by this PR.

No problem of course. Just fyi: In the next iteration of my PoC that I'll showcase next Tuesday, I'll publish the changes to the binding class that also contain the fix for the path issue. I'll let you know when the code is out, so you can inspect it. The re-design simplifies binding class instantiation considerably, as expected.

mmahadevan108 pushed a commit that referenced this pull request Nov 6, 2024
This unit test was added specifically to cover a regression
reported by the CI while working on [1].

Further work on related issues [2] showed that:
- [1] and [2] are dead end: we need to first rethink
  how bindings (and especially child-bindings) are initialized
- the inclusion mechanism supported by Zephyr deserves more systematic
  testing in edtlib if we want to work with confidence

The approach we choose is to:
- revert all changes made in [1]
- from there, systematically add unit tests as we address
  the issues we identified (or the additional features we need)
  one after the other

[1] edtlib: fix last modified semantic in included property specs
[2] edtlib: Preserve paths of properties from included child bindings

See also: #65221, #78095

This reverts commit 33bb3b6.

Signed-off-by: Christophe Dufaza <[email protected]>
mmahadevan108 pushed a commit that referenced this pull request Nov 6, 2024
This unit test was added to cover the change introduced by [1].

Further work on related issues [2] showed that the chosen approach
is dead end.
We're reverting all changes made in [1].

[1] edtlib: fix last modified semantic in included property specs
[2] edtlib: Preserve paths of properties from included child bindings

See also: #65221, #78095

This reverts commit 70eaa61.

Signed-off-by: Christophe Dufaza <[email protected]>
mmahadevan108 pushed a commit that referenced this pull request Nov 6, 2024
[1] was introduced to get more valuable answers from
the PropertySpec.path API, which is supposed to tell
in which file the property's specification was "last modfied".

Further work on related issues [2] showed that the
approach chosen in [1] is dead end: we need to first rethink
how bindings (and especially child-bindings) are initialized.

[1] edtlib: fix last modified semantic in included property specs
[2] edtlib: Preserve paths of properties from included child bindings

See also: #65221, #78095

This reverts commit b3b5ad8.

Signed-off-by: Christophe Dufaza <[email protected]>
mark-inderhees pushed a commit to mark-inderhees/zephyr that referenced this pull request Nov 7, 2024
This unit test was added specifically to cover a regression
reported by the CI while working on [1].

Further work on related issues [2] showed that:
- [1] and [2] are dead end: we need to first rethink
  how bindings (and especially child-bindings) are initialized
- the inclusion mechanism supported by Zephyr deserves more systematic
  testing in edtlib if we want to work with confidence

The approach we choose is to:
- revert all changes made in [1]
- from there, systematically add unit tests as we address
  the issues we identified (or the additional features we need)
  one after the other

[1] edtlib: fix last modified semantic in included property specs
[2] edtlib: Preserve paths of properties from included child bindings

See also: zephyrproject-rtos#65221, zephyrproject-rtos#78095

This reverts commit 33bb3b6.

Signed-off-by: Christophe Dufaza <[email protected]>
mark-inderhees pushed a commit to mark-inderhees/zephyr that referenced this pull request Nov 7, 2024
This unit test was added to cover the change introduced by [1].

Further work on related issues [2] showed that the chosen approach
is dead end.
We're reverting all changes made in [1].

[1] edtlib: fix last modified semantic in included property specs
[2] edtlib: Preserve paths of properties from included child bindings

See also: zephyrproject-rtos#65221, zephyrproject-rtos#78095

This reverts commit 70eaa61.

Signed-off-by: Christophe Dufaza <[email protected]>
mark-inderhees pushed a commit to mark-inderhees/zephyr that referenced this pull request Nov 7, 2024
[1] was introduced to get more valuable answers from
the PropertySpec.path API, which is supposed to tell
in which file the property's specification was "last modfied".

Further work on related issues [2] showed that the
approach chosen in [1] is dead end: we need to first rethink
how bindings (and especially child-bindings) are initialized.

[1] edtlib: fix last modified semantic in included property specs
[2] edtlib: Preserve paths of properties from included child bindings

See also: zephyrproject-rtos#65221, zephyrproject-rtos#78095

This reverts commit b3b5ad8.

Signed-off-by: Christophe Dufaza <[email protected]>
JA-NXP pushed a commit to nxp-upstream/zephyr that referenced this pull request Nov 19, 2024
This unit test was added specifically to cover a regression
reported by the CI while working on [1].

Further work on related issues [2] showed that:
- [1] and [2] are dead end: we need to first rethink
  how bindings (and especially child-bindings) are initialized
- the inclusion mechanism supported by Zephyr deserves more systematic
  testing in edtlib if we want to work with confidence

The approach we choose is to:
- revert all changes made in [1]
- from there, systematically add unit tests as we address
  the issues we identified (or the additional features we need)
  one after the other

[1] edtlib: fix last modified semantic in included property specs
[2] edtlib: Preserve paths of properties from included child bindings

See also: zephyrproject-rtos#65221, zephyrproject-rtos#78095

This reverts commit 33bb3b6.

Signed-off-by: Christophe Dufaza <[email protected]>
JA-NXP pushed a commit to nxp-upstream/zephyr that referenced this pull request Nov 19, 2024
This unit test was added to cover the change introduced by [1].

Further work on related issues [2] showed that the chosen approach
is dead end.
We're reverting all changes made in [1].

[1] edtlib: fix last modified semantic in included property specs
[2] edtlib: Preserve paths of properties from included child bindings

See also: zephyrproject-rtos#65221, zephyrproject-rtos#78095

This reverts commit 70eaa61.

Signed-off-by: Christophe Dufaza <[email protected]>
JA-NXP pushed a commit to nxp-upstream/zephyr that referenced this pull request Nov 19, 2024
[1] was introduced to get more valuable answers from
the PropertySpec.path API, which is supposed to tell
in which file the property's specification was "last modfied".

Further work on related issues [2] showed that the
approach chosen in [1] is dead end: we need to first rethink
how bindings (and especially child-bindings) are initialized.

[1] edtlib: fix last modified semantic in included property specs
[2] edtlib: Preserve paths of properties from included child bindings

See also: zephyrproject-rtos#65221, zephyrproject-rtos#78095

This reverts commit b3b5ad8.

Signed-off-by: Christophe Dufaza <[email protected]>
Copy link

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

@github-actions github-actions bot added the Stale label Dec 18, 2024
@decsny decsny removed the Stale label Dec 18, 2024
@decsny decsny requested a review from rruuaanng December 18, 2024 01:48
@dottspina
Copy link
Contributor Author

dottspina commented Dec 18, 2024

Just trying to clarify the status of this PR, which may not be obvious:

This PR now contains:

AFAICT we can close this, it won't stop us from linking to interesting parts of the conversation if we want to.

Thanks.

[Edit: I have reopened #65135, and submitted #83457 which amends the PropertySpec.path API documentation.]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants