Skaled does not exit gracefully when it is unable to connect to 2/3 of the nodes. #1679

Open
badrogger opened this issue Sep 29, 2023 · 3 comments
Labels
bug Something isn't working epic:node-rotation

@badrogger
Contributor

badrogger commented Sep 29, 2023

Describe the bug
Skaled crashes on exit when it cannot connect to 2/3 of its peers.

To Reproduce
Steps to reproduce the behavior:

  1. Run a SKALE chain (3.17.0-develop.62).
  2. Turn off the skale admin container.
  3. Disconnect the node from the network.
  4. Observe how skaled exits.

Expected behavior
Skaled should exit gracefully.

Actual behavior
Skaled crashes during the exit procedure with the following error (more logs attached):

terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::log::v2s_mt_posix::system_error> >'
pure virtual method called
terminate called recursively

Caught (first) signal. Signal SIGABRT(6). This is the abort signal. Typically, a process will initiate this kill signal on itself.

more-skaled-exiting.log

@dimalit
Contributor

dimalit commented Sep 29, 2023

Dispatch: All threads stopped
Dispatch: All dispatch queues removed

  /home/dimalit/skaled/build/skaled/skaled : dev::ExitHandler::exitHandler(int, dev::ExitHandler::exit_code_t)+0x44d [0x5647749d187d]
  /home/dimalit/skaled/build/skaled/skaled : dev::ExitHandler::exitHandler(int)+0x1e [0x5647749d07ae]
  /lib/x86_64-linux-gnu/libc.so.6 : ()+0x42520 [0x7f38864bd520]
  /lib/x86_64-linux-gnu/libc.so.6 : pthread_kill()+0x12c [0x7f3886511a7c]
  /lib/x86_64-linux-gnu/libc.so.6 : raise()+0x16 [0x7f38864bd476]
  /lib/x86_64-linux-gnu/libc.so.6 : abort()+0xd3 [0x7f38864a37f3]
  /lib/x86_64-linux-gnu/libstdc++.so.6 : ()+0xb042a [0x7f388685c42a]
  /lib/x86_64-linux-gnu/libstdc++.so.6 : ()+0xae20c [0x7f388685a20c]
  /lib/x86_64-linux-gnu/libstdc++.so.6 : ()+0xae277 [0x7f388685a277]
  /lib/x86_64-linux-gnu/libstdc++.so.6 : ()+0xaefa5 [0x7f388685afa5]
  /home/dimalit/skaled/build/skaled/skaled : boost::system::error_code::message[abi:cxx11]() const+0x5b [0x564774575863]
  /home/dimalit/skaled/build/skaled/skaled : boost::system::system_error::what() const+0x9f [0x564774575e9d]
  /lib/x86_64-linux-gnu/libstdc++.so.6 : ()+0xa2b5d [0x7f388684eb5d]
  /lib/x86_64-linux-gnu/libstdc++.so.6 : ()+0xae20c [0x7f388685a20c]
  /lib/x86_64-linux-gnu/libstdc++.so.6 : ()+0xad1e9 [0x7f38868591e9]
  /lib/x86_64-linux-gnu/libstdc++.so.6 : __gxx_personality_v0()+0x99 [0x7f3886859959]
  /lib/x86_64-linux-gnu/libgcc_s.so.1 : ()+0x16884 [0x7f38866b9884]
  /lib/x86_64-linux-gnu/libgcc_s.so.1 : _Unwind_Resume()+0x12d [0x7f38866ba2dd]
  /home/dimalit/skaled/build/skaled/skaled : boost::log::v2s_mt_posix::sources::basic_composite_logger<char, boost::log::v2s_mt_posix::sources::severity_channel_logger_mt<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::log::v2s_mt_posix::sources::multi_thread_model<boost::log::v2s_mt_posix::aux::light_rw_mutex>, boost::log::v2s_mt_posix::sources::features<boost::log::v2s_mt_posix::sources::severity<int>, boost::log::v2s_mt_posix::sources::channel<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >::open_record()+0xbf [0x5647745d1bbf]
  /home/dimalit/skaled/build/skaled/skaled : SkaleHost::stopWorking()+0x8c [0x564774825fc0]
  /home/dimalit/skaled/build/skaled/skaled : dev::eth::Client::stopWorking()+0x71 [0x56477477596b]
  /home/dimalit/skaled/build/skaled/skaled : dev::eth::Client::~Client()+0x47 [0x56477477565f]
  /home/dimalit/skaled/build/skaled/skaled : dev::eth::EthashClient::~EthashClient()+0x8f [0x5647748f249f]
  /home/dimalit/skaled/build/skaled/skaled : dev::eth::EthashClient::~EthashClient()+0x1c [0x5647748f24ce]
  /home/dimalit/skaled/build/skaled/skaled : std::default_delete<dev::eth::Client>::operator()(dev::eth::Client*) const+0x2c [0x5647745e5920]
  /home/dimalit/skaled/build/skaled/skaled : std::unique_ptr<dev::eth::Client, std::default_delete<dev::eth::Client> >::~unique_ptr()+0x56 [0x5647745d4fba]
  /lib/x86_64-linux-gnu/libc.so.6 : ()+0x45495 [0x7f38864c0495]
  /lib/x86_64-linux-gnu/libc.so.6 : on_exit()+0 [0x7f38864c0610]
  /home/dimalit/skaled/build/skaled/skaled : Schain::healthCheck()+0x363 [0x564774cd1d55]
  /home/dimalit/skaled/build/skaled/skaled : Node::startClients()+0x3a [0x564774bead0a]
  /home/dimalit/skaled/build/skaled/skaled : ConsensusEngine::startAll()+0x802 [0x564774ba4738]
  /home/dimalit/skaled/build/skaled/skaled : ()+0x1f24551 [0x564774825551]
  /home/dimalit/skaled/build/skaled/skaled : ()+0x1f28890 [0x564774829890]

@dimalit
Contributor

dimalit commented Sep 29, 2023

As a solution, I'd propose not calling exit() directly from consensus, but using ConsensusExtFace::terminateApplication() instead.
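
A rough sketch of what this proposal could look like, under stated assumptions. In the trace above, exit() is reached from Schain::healthCheck(), and the Client destructor then runs from the exit handlers on that same stack. The idea here is for the health check to only signal the host and return, so the host can shut down in its normal order. Everything below is hypothetical apart from the name terminateApplication(): the interface is a stand-in (the real declaration may differ), and HostExtFace, healthCheck() and its peer-count parameters are invented for illustration.

```cpp
#include <atomic>
#include <cassert>

// Stand-in for the interface named in the proposal. ASSUMPTION: the real
// declaration in skaled/consensus may have a different shape.
class ConsensusExtFace {
public:
    virtual ~ConsensusExtFace() = default;
    virtual void terminateApplication() = 0;
};

// Hypothetical host-side implementation: it only records the request, so the
// host can stop its components in order from its own main loop.
class HostExtFace : public ConsensusExtFace {
public:
    void terminateApplication() override { exitRequested_.store( true ); }
    bool exitRequested() const { return exitRequested_.load(); }

private:
    std::atomic< bool > exitRequested_{ false };
};

// Hypothetical health check. Instead of calling exit() when too few peers are
// reachable, it asks the host to terminate and returns false to the caller.
bool healthCheck( ConsensusExtFace& ext, int connectedPeers, int requiredPeers ) {
    assert( requiredPeers >= 0 );
    if ( connectedPeers < requiredPeers ) {
        ext.terminateApplication();  // request shutdown, do not exit here
        return false;
    }
    return true;
}
```

With this shape, the caller of healthCheck() unwinds normally, and the host decides when to destroy the client, rather than having its destructor run from inside exit().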

@DmytroNazarenko DmytroNazarenko assigned kladkogex and unassigned dimalit Oct 16, 2023
@PolinaKiporenko PolinaKiporenko modified the milestones: SKALE 2.3, SKALE 2.4 Dec 29, 2023
@kladkogex
Collaborator

Moving to 2.5 since we do not have time for this in 2.4

@kladkogex kladkogex modified the milestones: SKALE 2.4, SKALE 2.5 Mar 21, 2024
@PolinaKiporenko PolinaKiporenko modified the milestones: SKALE 2.5, SKALE 2.6 Apr 24, 2024
@PolinaKiporenko PolinaKiporenko moved this from Ready For Pickup to To Do in SKALE Engineering 🚀 Oct 10, 2024