Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bluetooth: Mesh: Add support for Upload OOB Start #61389

Conversation

ludvigsj
Copy link
Collaborator

This adds support for the Upload OOB Start message to the DFD server, by providing callbacks that the application can use to hook any OOB scheme into the model behavior.

There are also extensive changes to the dfu_slot module, to accomodate the new needs that appeared with the support for OOB transfer (mainly, fwid, size and metadata are no longer available when the slot is allocated, they appear later in the handling).

@ludvigsj ludvigsj force-pushed the develop/NCSDK-14801_upload_oob_start_support branch from f716b95 to 702ec88 Compare August 11, 2023 10:59
* @c bt_mesh_dfd_srv_oob_store_complete.
*
* @param srv Firmware Distribution Server model instance.
* @param slot Slot used for the upload.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this?

Suggested change
* @param slot Slot used for the upload.
* @param slot Slot to be used for the upload.

@@ -75,6 +87,60 @@ struct bt_mesh_dfd_srv_cb {
const struct bt_mesh_dfu_slot *slot,
const struct bt_mesh_blob_io **io);

/** @brief Firware upload OOB start callback.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to this PR, but should we start indicating globablly which params are inputs and which params are outputs or in-out?

* @return BT_MESH_DFD_SUCCESS on success, or error code otherwise.
*/
int (*start_oob_upload)(struct bt_mesh_dfd_srv *srv,
const struct bt_mesh_dfu_slot *slot,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't we discuss somewhere that image identification at application will be based on FWID rather than a slot? or was it in different context?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided to stop using slot index for image identification, and that further changes to this would require more discussion. I just kept the slot pointer since that was what already was used and it doesn't have the problems the idx method has with the index of a slot able to change when earlier slots are removed. If you'd like a different solution here, I'd like that to be its own issue.

{ \
.cb = _cb, \
.dfu = BT_MESH_DFU_CLI_INIT(&_bt_mesh_dfd_srv_dfu_cb), \
.upload = { \
.blob = { .cb = &_bt_mesh_dfd_srv_blob_cb }, \
}, \
.oob_schemes = { \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these will be populated by application, somedocstrings would help.

};

/** @brief Call when an OOB check has completed or failed
*
* This should be called by the application after an OOB check started by the @c start_oob
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* This should be called by the application after an OOB check started by the @c start_oob
* This should be called by the application after an OOB check started by the @c start_oob_upload

@@ -461,7 +490,7 @@ static int handle_upload_start(struct bt_mesh_model *mod, struct bt_mesh_msg_ctx
timeout_base);
if (err) {
LOG_ERR("BLOB Server rejected upload (err: %d)", err);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not realated to this PR, but we don't have uniform debug strings. Sometimes its ... err: %d <something else>, and sometimes it is ... (err: %d). Could be done as a seperate task.

* @return 0 on success, or (negative) error code on failure.
* @return 0 on success, (negative) error code otherwise.
*/
int bt_mesh_dfu_slot_fwid_set(struct bt_mesh_dfu_slot *dfu_slot,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mentioned this, but I don't recollect. Could tell me again why there are seperate APIs for fwid_set and info_set?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because during OOB, fwid must be set after the check is done, whereas size and metadata will be known only after the image is done transferring.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh right. Then, I think, we need to clarify this a bit. Because then "fwid" is the "new fwid" ... and application should not mix that with the "old fwid". Because this part is not explained in this API documentation. Slots are identified by the "fwid" contained in them, and not the "fwid" of the firmware on the chip or the remote node.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep in mind that this file is an internal API, so no application will call these directly. But I'll clarify a bit.

}

sys_slist_init(&list);
committed_slot_count--;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like, the logic in this API will fail if higher layer calls this API twice passing the same *dfu_slot. Potentially causing committed_slot_count to underflow.? Please check that committed_slot_count manipulations are sufficiently guarded since lot of logic in this module depends on correct value of this counter.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did think about just counting the number of COMMITTED slots every time we need the count, but decided on this as a CPU cycle optimization. I'm fine with removing this counter to reduce risk.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, no, it won't fail on passing the same dfu_slot twice. It will return on line 224 in that case, because the first call will set the state to FREE and then if will fail the check.

SYS_SLIST_FOR_EACH_CONTAINER(&list, s, n) {
if (!idx--) {
return &s->slot;
struct bt_mesh_dfu_slot *bt_mesh_dfu_slot_at(uint16_t idx)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid confusion for API users, can we rename slot index (idx) to slot identifier (id)? Index kinds of hints that it is a sequencial thing. Where as identifier need not be sequential.

Suggested change
struct bt_mesh_dfu_slot *bt_mesh_dfu_slot_at(uint16_t idx)
struct bt_mesh_dfu_slot *bt_mesh_dfu_slot_get_by_id(uint16_t id
```suggestion
struct bt_mesh_dfu_slot *bt_mesh_dfu_slot_at(uint16_t idx)
```)

Copy link
Collaborator Author

@ludvigsj ludvigsj Aug 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it is sequential? slot_at(0) will get the first commited slot, slot_at(1) will get the second, etc.? In commit order.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, then why does it doubly checks
slots[i].idx == idx ? Wouldn't it be enough to simply return by using idx for array referencing? But that won't work. AFAICS, If the device has four slots, slot_data_load() can assign idx of 3 to slot[0] (depending on where it is positioned in the settings flash since entries can be stored in different order)? This is clear if someone read the relavent code, but it is very easy incorrectly infer that idx actually represents the physical ordering of the slots on the flash by just reading API decriptions and nomenclature. Hence the suggestion.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that the slots exist in an array is not supposed to exist outside this file. The only place that a slot has "two indices" (i and idx in this function for example) are inside this file. All APIs use idx to mean slot::idx. The name "index" is also what the spec uses for this field, so it makes sense for outside viewers. I don't feel like changing this. Meaning: Where a slot is in the array is an implementation detail. No one should care and no one outside this file should know. They should not even care that it is an array.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, yes. array location can be different from slot::idx. This is the entire point of having the slot::idx field.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only comment is about the name choosen to represent the idx. In all of mesh code base, idx variable name is always used to represent some kind of index. Just like firmware has Firmware ID, slot has a Slot ID?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's not an ID. An ID I would expect to be constant for the lifetime of the slot. This index is not. It's the "Index" from 4.3.5 in the spec. All places in the API that mention idx refer to the "Index" from that chapter. As in: If we order the commited slots in the order they were added, what index of that (abstract) list is the slot at. (This list being the "Firmware Images List" state from 4.3.5). Position in the list may change if an earlier slot is removed. I still can't see why it's confusing to call it index. It's only confusing if you think about the backing array used for storing the slots, but no one should do that, they should just look at the APIs where idx is always == "Index" from 4.3.5.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, now I get it with that. Ok, can this be then img_idx or imglist_idx?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see much point, seeing as there are no other indexes in this API, but sure. I'll change it everywhere then, to be consistent.


return 0;
}

int bt_mesh_dfu_slot_del_all(void)
int bt_mesh_dfu_slot_del(const struct bt_mesh_dfu_slot *dfu_slot)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check ideompotencay of important slot APIs in the test. Like "del", "release", "add", and "del_all". Calling these APIs multiple times with same params, should not result in misbehavior.

tests/bsim/bluetooth/mesh/tests_scripts/dfu/dfu_slot.sh Outdated Show resolved Hide resolved
tests/bsim/bluetooth/mesh/src/test_dfu.c Show resolved Hide resolved
tests/bsim/bluetooth/mesh/src/test_dfu.c Outdated Show resolved Hide resolved
subsys/bluetooth/mesh/shell/dfu.c Show resolved Hide resolved
include/zephyr/bluetooth/mesh/dfd_srv.h Outdated Show resolved Hide resolved
subsys/bluetooth/mesh/dfd_srv.c Outdated Show resolved Hide resolved
subsys/bluetooth/mesh/dfd_srv.c Show resolved Hide resolved
subsys/bluetooth/mesh/dfu_slot.c Outdated Show resolved Hide resolved
subsys/bluetooth/mesh/dfu_slot.c Show resolved Hide resolved
subsys/bluetooth/mesh/dfu_slot.c Outdated Show resolved Hide resolved
* * @c BT_MESH_DFD_ERR_NEW_FW_NOT_AVAILABLE if the check completes successfully but no new
* firmware is available.
*
* If this function returns 0, the application should then download the firmware to the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can leave this as is.

*
* Called at the start of an OOB firmware upload. The application must
* start a firmware check using an OOB mechanism, and then call
* @c bt_mesh_dfd_srv_oob_check_complete. Depending on the return value
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How exactly depending on the return value of that function application makes a decision? It will be more clear, if that aspect is spelt out in doc string. It may be a slight repetition, but it would be useful.

"If return value is , the application must do , otherwise does ."

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation for bt_mesh_dfd_srv_oob_check_complete already explains this. I don't think it's needed twice.


/** @brief Call when an OOB store has completed or failed
*
* This should be called by the application after an OOB store started by the @c start_oob
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't get this.

SYS_SLIST_FOR_EACH_CONTAINER(&list, s, n) {
if (!idx--) {
return &s->slot;
struct bt_mesh_dfu_slot *bt_mesh_dfu_slot_at(uint16_t idx)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only comment is about the name choosen to represent the idx. In all of mesh code base, idx variable name is always used to represent some kind of index. Just like firmware has Firmware ID, slot has a Slot ID?

Copy link
Collaborator

@omkar3141 omkar3141 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check response to comments.

subsys/bluetooth/mesh/dfu_slot.h Outdated Show resolved Hide resolved
subsys/bluetooth/mesh/dfu_slot.c Outdated Show resolved Hide resolved
subsys/bluetooth/mesh/dfu_slot.c Outdated Show resolved Hide resolved
subsys/bluetooth/mesh/dfu_slot.c Outdated Show resolved Hide resolved
@alxelax alxelax dismissed their stale review August 24, 2023 11:24

I want't have ability to control my comments next 3 weeks since vacation time. I remove my review to not block PR. @PavelVPV could you control that comments were considered?

@PavelVPV
Copy link
Collaborator

I want't have ability to control my comments next 3 weeks since vacation time. I remove my review to not block PR. @PavelVPV could you control that comments were considered?

Sure!

@@ -75,6 +87,60 @@ struct bt_mesh_dfd_srv_cb {
const struct bt_mesh_dfu_slot *slot,
const struct bt_mesh_blob_io **io);

/** @brief Firware upload OOB start callback.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/** @brief Firware upload OOB start callback.
/** @brief Firmware upload OOB start callback.

* @param srv Firmware Distribution Server model instance.
* @param slot Slot used for the upload.
* @param uri Pointer to buffer containing the URI used to
* check for new Firmware.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* check for new Firmware.
* check for new firmware.

* callback has completed or failed. The @p status param should be set to one of the following
* values:
*
* * @c BT_MESH_DFD_SUCCESS if the check was succesfull and a new firmware ID was found.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

firmware ID or Firmware ID like used in callbacks further up?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it looks like firmware ID is used more than Firmware ID. So maybe those other instances should be fixed instead.

@ludvigsj ludvigsj force-pushed the develop/NCSDK-14801_upload_oob_start_support branch from 702ec88 to 16c090a Compare September 4, 2023 11:21
@ludvigsj ludvigsj requested a review from nashif as a code owner September 4, 2023 11:21
@zephyrbot zephyrbot added the area: Settings Settings subsystem label Sep 4, 2023
@ludvigsj ludvigsj requested a review from akredalen September 14, 2023 13:48
Copy link
Collaborator

@akredalen akredalen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will approve once Alex's last comment is resolved.

@ludvigsj ludvigsj force-pushed the develop/NCSDK-14801_upload_oob_start_support branch from 005f532 to ee61a15 Compare September 15, 2023 06:36
PavelVPV
PavelVPV previously approved these changes Sep 15, 2023
akredalen
akredalen previously approved these changes Sep 15, 2023
@ludvigsj ludvigsj added the DNM This PR should not be merged (Do Not Merge) label Sep 15, 2023
@ludvigsj ludvigsj dismissed stale reviews from akredalen and PavelVPV via 52e00d3 September 15, 2023 09:06
@ludvigsj ludvigsj force-pushed the develop/NCSDK-14801_upload_oob_start_support branch from ee61a15 to 52e00d3 Compare September 15, 2023 09:06
@ludvigsj
Copy link
Collaborator Author

Sorry, need your reviews again, was notified of a small error I'd missed.

@ludvigsj ludvigsj removed the DNM This PR should not be merged (Do Not Merge) label Sep 15, 2023
@PavelVPV
Copy link
Collaborator

@ludvigsj , the issue with the CI has been fixed on main, you need to rebase you PR to pass CI.

This adds support for the Upload OOB Start message to the DFD server, by
providing callbacks that the application can use to hook any OOB scheme
into the model behavior.

There are also extensive changes to the dfu_slot module, to accomodate
the new needs that appeared with the support for OOB transfer (mainly,
fwid, size and metadata are no longer available when the slot is
allocated, they appear later in the handling).

Signed-off-by: Ludvig Samuelsen Jordet <[email protected]>
@ludvigsj ludvigsj force-pushed the develop/NCSDK-14801_upload_oob_start_support branch from 52e00d3 to 6de0a0f Compare September 18, 2023 08:27
@fabiobaltieri fabiobaltieri merged commit b990a74 into zephyrproject-rtos:main Sep 18, 2023
29 of 35 checks passed
m-alperen-sener added a commit to m-alperen-sener/sdk-nrf that referenced this pull request Oct 3, 2023
Aligne dfu sample with:
zephyrproject-rtos/zephyr#61389
bt_mesh_dfu_slot no longer has 'uri' param.

Signed-off-by: Alperen Şener <[email protected]>
de-nordic pushed a commit to nrfconnect/sdk-nrf that referenced this pull request Oct 5, 2023
Aligne dfu sample with:
zephyrproject-rtos/zephyr#61389
bt_mesh_dfu_slot no longer has 'uri' param.

Signed-off-by: Alperen Şener <[email protected]>
charlieshao5189 pushed a commit to charlieshao5189/sdk-nrf that referenced this pull request Oct 10, 2023
Aligne dfu sample with:
zephyrproject-rtos/zephyr#61389
bt_mesh_dfu_slot no longer has 'uri' param.

Signed-off-by: Alperen Şener <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants