Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add periodic reboot/reboot on wifi disconnection option. #153

Open
Yersi88 opened this issue Mar 21, 2022 · 17 comments
Open

Add periodic reboot/reboot on wifi disconnection option. #153

Yersi88 opened this issue Mar 21, 2022 · 17 comments

Comments

@Yersi88
Copy link

Yersi88 commented Mar 21, 2022

Is your feature request related to a problem? Please describe.
Device goes offline randomy after at least few hours. Disconnects from MQTT and can't load web interface too. Solution is to power cycle the device.

Describe the solution you'd like
Option to have device restarted at periodic intervals. And/or a device to reboot after it has disconnected from the network.

Describe alternatives you've considered
N/A

Additional context
N/A

@giddyhup
Copy link

I also have problems with my ESP32 disconnecting and not coming online again. The reboot would be a welcome workaround but an actual solution would also be appreciated.

@shmuelzon
Copy link
Owner

Hey,

Do either of you have any serial logs to share when the device becomes unresponsive?
Preferably with debug logs enabled. It should give some insight as to what's going on.

Thanks!

@giddyhup
Copy link

Thanks for noticing this. I have only remote logging set up. The BLE device I am monitoring is very chatty. I use the default bin and see lots of Bluetooth message but nothing else.

@giddyhup
Copy link

giddyhup commented Apr 7, 2022

@shmuelzon perhaps you could provide a bin where less/no bluetooth messages are being logged. I would try to catch the ESP32 module's log output on a Raspberry Pi or an ESP8266. From what my research shows the ESP32 not reconnecting seems to be a more common issue.

@shmuelzon
Copy link
Owner

Hey,

@giddyhup Please find attached binaries with some log messages removed. One's with the default INFO level debugs while the other is DEBUG. I've removed some of my debug messages but you'll probably still see some from the other modules.

ble2mqtt.debug.bin.gz
ble2mqtt.info.bin.gz

I'm interested in what you've found regarding the ESP32 not reconnecting. Anything concrete for the root cause or any workarounds?

@giddyhup
Copy link

giddyhup commented Apr 8, 2022

@shmuelzon thanks for the binaries. It may take some time for me to work with them but I will report back.

Regarding re-connection issues. I noticed this comment in wifi.c:

This is a workaround as ESP32 WiFi libs don't currently auto-reassociate.

Googling the text shows other people dealing with it

@giddyhup
Copy link

giddyhup commented Apr 8, 2022

I did first what could be done without much infrastructure change. I remote-flashed the DEBUG image and collected the log remotely. The flash procedure ended with a warm reboot. I used remote logging and after about a minute encountered a disconnect. In the past I experienced disconnects only about once in 12 hours so I was either 'lucky' or something was triggered (wild idea: could it be DHCP renewal?).

The log showed no abnormalities. The last entry is the discovery of a BLE device that the module did not connect to (I only have one address whitelisted). After a forced manual reset via the module's reset button the messages appearing in the remote log still show no abnormalities. There is just some MQTT connection and subscribe chatter and listing of BLE devices.

As I feared, I really have to find a solution for direct logging from a Raspberry Pi to get what is going on when the module disconnects.

@shmuelzon
Copy link
Owner

Unfortunately, you're right. Debugging WiFi issues can't be done over WiFi.
This is a reoccurring problem and some ESP-IDF versions seem more stable than others.
To the best of my knowledge I'm not doing anything special with the way I handle WiFi but I know that using it in conjunction with Bluetooth causes issues.
I have other projects that use the exact same WiFi code (but don't include Bluetooth) that remain connected for weeks without issues... I've considered trying to switch from Bluedroid to nimBLE but I don't know if it will make any difference since both use the same PHY.

@giddyhup
Copy link

giddyhup commented Apr 9, 2022

Please find attached a redacted excerpt where I de-authed the module on purpose to see what happens.

A panic happened, the module rebooted and reconnected. So here, the result at least was desirable. I had remote logging still enabled but disabled it now for the next tests.
esp32ble.log

@shmuelzon
Copy link
Owner

Thanks for that. It looks like we've run or of memory (?)
But, it might be related to the fact that debug logs are enabled, it takes its toll...

@giddyhup
Copy link

Apparently serial logging fixes the reconnection issues. : )

The module is online since I started logging. I'll keep logging.

@Yersi88
Copy link
Author

Yersi88 commented Apr 10, 2022

Apparently serial logging fixes the reconnection issues. : )

The module is online since I started logging. I'll keep logging.

How do I enable this accidental fix?

@giddyhup
Copy link

I was joking, but if you want to try it yourself here's what I did:

I flashed (OTA) ble2mqtt.debug.bin.gz and connected my ESP32 module via its USB port to a Raspberry Pi. On the Pi I use minicom to collect the module's serial output and write it to a file.

@shmuelzon
Copy link
Owner

BTW - Not sure if relevant to your setup, I once had an ESP32 connected to an RPI2 (I think) and it would often fail and get stuck in a boot loop but when I moved it to another machine (with USB3) it didn't happen again. I'm guessing it might be related to the amount of current the USB was able to provide.

@giddyhup
Copy link

@shmuelzon, thanks for pointing that out.

Initially, my ESP32 was connected to a USB charging brick that provides 5V/1A or more. Currently, it is connected to a Pi 4's USB 2 port. So I guess it is currently getting less juice while before (where it had issues reconnecting). Perhaps later will connect it to the charger again and see if failing reconnects reoccur.

@giddyhup
Copy link

@shmuelzon I was offline for a few days and am now back to continue with this issue.

I downgraded to the official version (16) and can confirm that the issue returns. In the log these entries keep repeating:


I (87334265) WiFi: Disconnected
I (87334265) BLE2MQTT: Disconnected from the network, stopping MQTT
I (87334265) MQTT: Disconnecting MQTT client
W (87334265) wifi:m f probe req l=0

W (87334395) wifi:m f probe req l=0

W (87334515) wifi:m f probe req l=0

W (87334635) wifi:m f probe req l=0

W (87334755) wifi:m f probe req l=0

W (87334875) wifi:m f probe req l=0

W (87334995) wifi:m f probe req l=0

W (87335125) wifi:m f probe req l=0

W (87335245) wifi:m f probe req l=0

W (87335365) wifi:m f probe req l=0

W (87335485) wifi:m f probe req l=0

I (87335845) WiFi: Disconnected
I (87335845) BLE2MQTT: Disconnected from the network, stopping MQTT
I (87335845) MQTT: Disconnecting MQTT client
W (87335845) wifi:m f probe req l=0

@giddyhup
Copy link

I am using the debug version again and a few minutes ago the BT connection to the (single whitelisted) device I am listening to got disconnected. Initially, when I checked the web interface the request appeared in the log but during handling the web request the module disconnected:

D (12974347) httpd: httpd_server: processing listen socket 54
D (12974347) httpd: httpd_accept_conn: newfd = 57
D (12974347) httpd_sess: httpd_sess_new: fd = 57
D (12974347) httpd_sess: httpd_sess_new: active sockets: 1
D (12974357) httpd: httpd_accept_conn: complete
D (12974357) httpd: httpd_server: doing select maxfd+1 = 58
D (12974367) httpd: httpd_process_session: processing socket 57
D (12974367) httpd_sess: httpd_sess_process: httpd_req_new
D (12974377) httpd_txrx: httpd_recv_with_opt: requested length = 128
D (12974387) httpd_txrx: httpd_recv_with_opt: received length = 128
D (12974387) httpd_parse: read_block: received HTTP request block size = 128
D (12974397) httpd_parse: cb_url: message begin
D (12974397) httpd_parse: cb_url: processing url = /
D (12974407) httpd_parse: verify_url: received URI = /
D (12974407) httpd_parse: cb_header_field: headers begin
D (12974417) httpd_txrx: httpd_unrecv: length = 112
D (12974417) httpd_parse: pause_parsing: paused
D (12974427) httpd_parse: cb_header_field: processing field = Host
D (12974437) httpd_txrx: httpd_recv_with_opt: requested length = 128
D (12974437) httpd_txrx: httpd_recv_with_opt: pending length = 112
D (12974447) httpd_parse: read_block: received HTTP request block size = 112
D (12974457) httpd_parse: continue_parsing: skip pre-parsed data of size = 5
D (12974457) httpd_parse: continue_parsing: un-paused
D (12974467) httpd_parse: cb_header_value: processing value = ble2mqtt-d4ac.lan
D (12974477) httpd_parse: cb_header_field: processing field = User-Agent
D (12974477) httpd_parse: cb_header_value: processing value = Mozilla/5.0 (Wind9
D (12974487) httpd_parse: parse_block: parsed block size = 112
D (12974497) httpd_txrx: httpd_recv_with_opt: requested length = 128
D (12974507) httpd_txrx: httpd_recv_with_opt: received length = 128
D (12974507) httpd_parse: read_block: received HTTP request block size = 128
D (12974517) httpd_parse: cb_header_value: processing value = 9.0
D (12974527) httpd_parse: cb_header_field: processing field = Accept
D (12974527) httpd_parse: cb_header_value: processing value = text/html,applica8
D (12974547) httpd_parse: cb_header_field: processing field = Accept-Language
D (12974547) httpd_parse: cb_header_value: processing value = en
D (12974557) httpd_parse: cb_header_field: processing field = Accept-
D (12974567) httpd_parse: parse_block: parsed block size = 240
D (12974567) httpd_txrx: httpd_recv_with_opt: requested length = 128
D (12974577) httpd_txrx: httpd_recv_with_opt: received length = 101
D (12974587) httpd_parse: read_block: received HTTP request block size = 101
D (12974587) httpd_parse: cb_header_field: processing field = Encoding
D (12974597) httpd_parse: cb_header_value: processing value = gzip, deflate
D (12974607) httpd_parse: cb_header_field: processing field = DNT
D (12974607) httpd_parse: cb_header_value: processing value = 1
D (12974617) httpd_parse: cb_header_field: processing field = Connection
D (12974617) httpd_parse: cb_header_value: processing value = keep-alive
D (12974627) httpd_parse: cb_header_field: processing field = Upgrade-Insecure-s
D (12974637) httpd_parse: cb_header_value: processing value = 1
D (12974637) httpd_parse: cb_header_field: processing field = Sec-GPC
D (12974647) httpd_parse: cb_header_value: processing value = 1
D (12974657) httpd_parse: cb_headers_complete: bytes read     = 359
D (12974657) httpd_parse: cb_headers_complete: content length = 0
D (12974667) httpd_parse: pause_parsing: paused
D (12974667) httpd_parse: cb_no_body: message complete
D (12974677) httpd_parse: httpd_parse_req: parsing complete
D (12974687) httpd_uri: httpd_uri: request for / with type 1
D (12974687) httpd_uri: httpd_find_uri_handler: [0] = /restart
D (12974697) httpd_uri: httpd_find_uri_handler: [1] = /status
D (12974697) httpd_uri: httpd_find_uri_handler: [2] = /ble/bonding_db
D (12974707) httpd_uri: httpd_find_uri_handler: [3] = /ble/devices
D (12974717) httpd_uri: httpd_find_uri_handler: [4] = /ota/firmware
D (12974717) httpd_uri: httpd_find_uri_handler: [5] = /ota/configuration
D (12974727) httpd_uri: httpd_find_uri_handler: [6] = /fs/*
D (12974727) httpd_uri: httpd_find_uri_handler: [7] = /fs/*
D (12974737) httpd_uri: httpd_find_uri_handler: [8] = /fs/*
D (12974747) httpd_uri: httpd_find_uri_handler: [9] = /
D (12974747) HTTPD: Handling route for /
D (12974747) httpd_txrx: httpd_resp_set_hdr: new header = Content-Encoding: gzip
D (12974757) httpd_txrx: httpd_send_all: sent = 64
D (12974767) httpd_txrx: httpd_send_all: sent = 16
D (12974767) httpd_txrx: httpd_send_all: sent = 2
D (12974777) httpd_txrx: httpd_send_all: sent = 4
D (12974777) httpd_txrx: httpd_send_all: sent = 2
D (12974787) httpd_txrx: httpd_send_all: sent = 2
D (12974787) httpd_txrx: httpd_send_all: sent = 1055
D (12974797) httpd_sess: httpd_sess_process: httpd_req_delete
D (12974797) httpd_sess: httpd_sess_process: success
D (12974807) httpd: httpd_server: doing select maxfd+1 = 58
I (12980727) wifi:bcn_timout,ap_probe_send_start
W (12981737) wifi:m f null

I (12983237) wifi:ap_probe_send over, resett wifi status to disassoc
I (12983237) wifi:state: run -> init (c800)
D (12983237) wifi:connect status 5 -> 3
D (12983237) wifi:send diassoc for beacon timeout(200)
D (12983247) wifi:sta leave
I (12983247) wifi:pm stop, total sleep time: 197398855 us / 233968858 us

D (12983247) wifi:stop CSA timer
W (12983257) wifi:<ba-del>idx
W (12983257) wifi:<ba-del>idx
D (12983257) wifi:remove xx:xx:xx:xx:cd:2e from rc list
I (12983267) wifi:new:<6,0>, old:<6,0>, ap:<255,255>, sta:<6,0>, prof:1
D (12983267) wifi:filter: set rx policy=8
D (12983277) wifi:Send disconnect event, reason=200, AP number=0
D (12983277) wifi:connect status 3 -> 0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants