Matter v1.3 bridged node how to achieve 100 endpoints to C3 target (CON-1417) #1155

Open
abu-matterize opened this issue Nov 16, 2024 · 12 comments


@abu-matterize

abu-matterize commented Nov 16, 2024

Using this example (built with Matter v1.3), we tried to create 100 endpoints on the C3 target, but it failed at the 27th endpoint with the error: E (326240) chip[CSL]: PacketBuffer: pool EMPTY.

More log
E (324540) chip[DMG]: Error retrieving data from clusterId: 0x0000_0062, err = b
> I (324550) chip[EM]: <<< [E:54942i S:47980 M:206943514] (S) Msg TX to 1:00000000DBF33283 [8836] [UDP:[FE80::181D:EB98:AB52:90FD%st1]:54351] --- Type 0001:05 (IM:ReportData)
E (324570) chip[EM]: Ignoring transient send error: 3000001 on exchange 54942i
I (325250) chip[EM]: Retransmitting MessageCounter:206943514 on exchange 54942i Send Cnt 1
E (325260) chip[EM]: Ignoring transient send error: 3000001 on exchange 54942i
I (325850) chip[EM]: Retransmitting MessageCounter:206943514 on exchange 54942i Send Cnt 2
E (325850) chip[EM]: Ignoring transient send error: 3000001 on exchange 54942i
E (326240) chip[CSL]: PacketBuffer: pool EMPTY.
I (326240) chip[DIS]: mDNS broadcast full failed in 1 separate send attempts.
E (326250) chip[DIS]: Failed to reply to query: b
I (326930) chip[EM]: Retransmitting MessageCounter:206943514 on exchange 54942i Send Cnt 3
E (326930) chip[EM]: Ignoring transient send error: 3000001 on exchange 54942i
W (326960) Heap: Free Block :     7168 Min size :     1676 Free size :     8812 Total Size :   218076
I (328470) chip[EM]: Retransmitting MessageCounter:206943514 on exchange 54942i Send Cnt 4
E (328470) chip[EM]: Ignoring transient send error: 3000001 on exchange 54942i
E (331070) chip[EM]: Failed to Send CHIP MessageCounter:206943514 on exchange 54942i sendCount: 4 max retries: 4
W (331960) Heap: Free Block :     7168 Min size :     1676 Free size :    10348 Total Size :   218076
I (333590) chip[SC]: SecureSession[0x3fc9d058, LSID:47980]: State change 'kActive' --> 'kDefunct'
E (333590) chip[DMG]: Time out! failed to receive status response from Exchange: 54942i
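
To track how the heap shrinks as endpoints are added, the Heap lines in a log like the one above can be pulled out with a small script. This is only a sketch in Python: it assumes the line format is exactly what appears in this log (Free Block / Min size / Free size / Total Size), and the names HEAP_LINE and parse_heap_lines are made up for illustration.

```python
import re

# Matches lines shaped like the ones in the log above, for example:
# W (326960) Heap: Free Block :     7168 Min size :     1676 Free size :     8812 Total Size :   218076
# ASSUMPTION: the field names and their order are fixed, as seen in this log.
HEAP_LINE = re.compile(
    r"\((?P<ts>\d+)\)\s+Heap:\s*"
    r"Free Block\s*:\s*(?P<free_block>\d+)\s+"
    r"Min size\s*:\s*(?P<min_size>\d+)\s+"
    r"Free size\s*:\s*(?P<free_size>\d+)\s+"
    r"Total Size\s*:\s*(?P<total_size>\d+)"
)


def parse_heap_lines(log_text):
    """Return one dict of integers per Heap line found in log_text."""
    samples = []
    for line in log_text.splitlines():
        match = HEAP_LINE.search(line)
        if match:
            samples.append({key: int(value) for key, value in match.groupdict().items()})
    return samples


# Three lines copied from the log above; only the two Heap lines should match.
sample_log = """\
E (326240) chip[CSL]: PacketBuffer: pool EMPTY.
W (326960) Heap: Free Block :     7168 Min size :     1676 Free size :     8812 Total Size :   218076
W (331960) Heap: Free Block :     7168 Min size :     1676 Free size :    10348 Total Size :   218076
"""

samples = parse_heap_lines(sample_log)
for sample in samples:
    print(sample["ts"], sample["free_block"], sample["free_size"])
```

The largest free block is worth watching alongside the free size, since an allocation needs one contiguous block of the requested size.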

The status above is after enabling the memory optimisation configs shown below:

CONFIG_ESP_MATTER_MAX_DYNAMIC_ENDPOINT_COUNT=42

CONFIG_USE_BLE_ONLY_FOR_COMMISSIONING=y
#CONFIG_ENABLE_CHIP_SHELL=n
CONFIG_NEWLIB_NANO_FORMAT=y
CONFIG_NIMBLE_MAX_CONNECTIONS=1
CONFIG_BTDM_CTRL_BLE_MAX_CONN=1
CONFIG_BTDM_CTRL_BLE_MAX_CONN_EFF=1
CONFIG_BT_NIMBLE_ROLE_CENTRAL=n
CONFIG_BT_NIMBLE_ROLE_OBSERVER=n
CONFIG_EVENT_LOGGING_CRIT_BUFFER_SIZE=1024
CONFIG_ESP32_WIFI_STATIC_RX_BUFFER_NUM=4
CONFIG_ESP32_WIFI_DYNAMIC_RX_BUFFER_NUM=8
CONFIG_ESP32_WIFI_DYNAMIC_TX_BUFFER_NUM=16
Memory profile for endpoint counts (bytes):

              free block   free size   total size
boot              114688      192780       210648
comm open          40960       62992       210648
after comm         36864      124624       218076
25 eps              7168       22728       218076
28 eps              7168       10348       218076
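
As a rough reading of the memory profile above, the per-endpoint heap cost can be estimated from the drop in free size between rows. The sketch below is back-of-the-envelope arithmetic in Python, not a measurement: it assumes the cost per endpoint is roughly linear and that the table values are bytes, and the helper names are made up for illustration.

```python
# Free heap values (bytes) taken from the memory profile in the report.
FREE_AFTER_COMMISSIONING = 124624
FREE_AT_25_ENDPOINTS = 22728
FREE_AT_28_ENDPOINTS = 10348
TOTAL_HEAP = 218076


def bytes_per_endpoint(free_before, free_after, endpoints_added):
    """Average heap consumed per endpoint between two snapshots."""
    return (free_before - free_after) / endpoints_added


def projected_heap_needed(per_endpoint, endpoint_count):
    """Heap needed for endpoint_count endpoints, ASSUMING linear growth."""
    return per_endpoint * endpoint_count


# Average over the first 25 endpoints, and over endpoints 26 to 28.
avg_first_25 = bytes_per_endpoint(FREE_AFTER_COMMISSIONING, FREE_AT_25_ENDPOINTS, 25)
avg_last_3 = bytes_per_endpoint(FREE_AT_25_ENDPOINTS, FREE_AT_28_ENDPOINTS, 3)

needed_for_100 = projected_heap_needed(avg_first_25, 100)

print(f"~{avg_first_25:.0f} bytes/endpoint (first 25)")
print(f"~{avg_last_3:.0f} bytes/endpoint (26 to 28)")
print(f"~{needed_for_100:.0f} bytes for 100 endpoints vs {TOTAL_HEAP} total heap")
```

Both intervals come out at roughly 4 KB per endpoint. Under the linear assumption, 100 endpoints would need about 400 KB for the endpoints alone, which is more than the entire 218076-byte heap on this build.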

After a quick search, found this

Are we missing any optimisation config that would let us reach 100 endpoints?

@github-actions bot changed the title from "Matter v1.3 bridged node how to achieve 100 endpoints to C3 target" to "Matter v1.3 bridged node how to achieve 100 endpoints to C3 target (CON-1417)" on Nov 16, 2024
@jonsmirl
Contributor

jonsmirl commented Nov 17, 2024

Switch to the S3 and use PSRAM; that is never going to fit on a C3. A C3 is good for making a light bulb, not a complete bridge. You can't add PSRAM to the C3, which is why you need to switch to the S3.

@abu-matterize
Author

I think @shubhamdp had some success with 100 endpoints on the C3, didn't he?

@shubhamdp
Contributor

@abu-matterize, #1103 (comment) was only about configuring the endpoint count in sdkconfig. I did not actually create each one. Let me give it a try and get back to you on this.

@abu-matterize
Author

Sure. Does the included memory profile make sense? Or is there still room to bring down the memory used by each endpoint?

@abu-matterize
Author

abu-matterize commented Nov 18, 2024

I've modified the CLI app so that the add command takes a device count: matter esp bridge add <parent_endpoint_id> <device_type_id> <number_of_devices>. Happy to share the patch if you need it.

@jonsmirl
Contributor

jonsmirl commented Nov 18, 2024

I work on a fully functioning bridge which includes all of the code needed to make various types of endpoint proxies. It isn't even close to fitting into an S3 without PSRAM. But it also doesn't require 2MB of PSRAM; in total I need about 200KB more RAM than there is in the base chip. Since I can't get a chip with less than 2MB PSRAM, I use the extra space for more wifi buffers and code caching. I could probably trim that 200KB down to 100KB, but once I added the PSRAM I stopped making much effort to minimize memory consumption.

I did try for a long time to avoid moving to PSRAM, but it is too tight. You get into problems like in the other thread where you can't pass the conformance tests because you need more wifi buffers and there isn't enough memory. Basic devices will fit into the chips without PSRAM, but once you start adding stuff it won't fit any more.

This is how you are going to fail: #1132
During development you will add code and endpoints which consume all of the RAM. But then when you go into the real world with lots of radio interference you will discover that the wifi buffers are dynamically allocated and you don't have any room available to allocate them. And that's exactly the path I went down.
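
To put a rough number on that failure mode, the worst-case demand of the dynamically allocated wifi buffers can be compared with the free heap reported earlier in this thread. This is a sketch in Python under stated assumptions: the buffer counts and the free heap figure come from the original report, while the 1600 bytes per buffer is an assumed upper bound rather than a measured value, and the function names are hypothetical.

```python
# Buffer counts from the sdkconfig posted in the original report.
DYNAMIC_RX_BUFFER_NUM = 8
DYNAMIC_TX_BUFFER_NUM = 16

# ASSUMPTION: upper bound for one dynamic wifi buffer, in bytes.
# Real buffers are sized per frame, so actual usage is usually lower.
ASSUMED_BUFFER_BYTES = 1600

# Free heap (bytes) reported at 28 endpoints in the original report.
FREE_HEAP_AT_28_ENDPOINTS = 10348


def worst_case_wifi_bytes(rx_count, tx_count, buffer_bytes):
    """Heap needed if every dynamic RX and TX buffer is in use at once."""
    return (rx_count + tx_count) * buffer_bytes


def has_headroom(free_heap, demand):
    """True if the free heap could cover the given demand."""
    return free_heap >= demand


demand = worst_case_wifi_bytes(
    DYNAMIC_RX_BUFFER_NUM, DYNAMIC_TX_BUFFER_NUM, ASSUMED_BUFFER_BYTES
)
enough = has_headroom(FREE_HEAP_AT_28_ENDPOINTS, demand)

print(f"worst-case wifi buffers: {demand} bytes")
print(f"free heap at 28 endpoints: {FREE_HEAP_AT_28_ENDPOINTS} bytes")
print("enough headroom" if enough else "not enough headroom")
```

With these assumed numbers, the buffers could ask for far more than the heap has left once the endpoints are created, which matches the "pool EMPTY" and send errors in the original log.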

@abu-matterize
Author

When building with main, the buffer required for a single endpoint dropped by ~40%, and I was able to reach about 40 devices. I understand the RAM bottleneck, but I'm wondering how many endpoints we can squeeze into the C3.

@shubhamdp
Contributor

I tried examples/bridge_apps/bridge_cli with the optimizations below:

CONFIG_EVENT_LOGGING_CRIT_BUFFER_SIZE=1024
CONFIG_NEWLIB_NANO_FORMAT=y
CONFIG_ESP_MATTER_MAX_DYNAMIC_ENDPOINT_COUNT=64

I did not disable the chip shell, and was able to spawn 37 endpoints. Disabling the chip shell would help add a few more (I expect about +10). So I expect approximately 45 bridged endpoints to be possible.

But I'm not sure what your use case is. Once I hit the limit, the device was unusable and I could not interact with it through the console. So if you are trying to build a device that supports many bridged endpoints, you will not have any RAM left to perform operations. For further optimization, you can refer to the RAM and Flash optimization guide.

Also, as mentioned by @jonsmirl, you may run into problems when going through certification.

@abu-matterize
Author

Thanks for your responses @shubhamdp and @jonsmirl

Is there any plan to further optimise memory consumption, given that main is already more optimised than release/v1.3?

@jonsmirl
Contributor

Switch to the $3 ESP32-S3-N4R2 module instead of the $2 ESP32-C3 and you won't have any more issues with running out of memory.

@shubhamdp
Contributor

Is there any plan to further optimise memory consumption, given that main is already more optimised than release/v1.3?

We do not have a written-down plan, but we are trying our best to optimize the SDK. There have been a few attempts at optimization on main (0277732, e4777fa, 0e4cb28).

However, I do not have actual numbers at hand for main and release/v1.3. I'll see if I can get them and post an update here.

@abu-matterize
Author

Sure. Thanks @shubhamdp
