Hello, I have been trying to play around with the SpecPaxos implementation. The scenario I'm trying is simple: I run five replicas on a single machine (listening on different ports), and I have five clients sending requests in a closed loop. I understand that for SpecPaxos to deliver high throughput and low latency, the network needs to provide ordered delivery (at least most of the time). If it does not, there will be many conflicts, leading to many rollbacks that hurt performance, but the system should still keep making progress.
However, in the above scenario, I see that the replicas start to crash after a while. Once two replicas crash (in a five-node cluster), the clients block indefinitely.
Details:
I compiled with the paranoid flag on.
Here is how I start the servers:
./bench/replica -c ./conf -i 0 -m spec >rep0 2>&1 &
./bench/replica -c ./conf -i 1 -m spec >rep1 2>&1 &
./bench/replica -c ./conf -i 2 -m spec >rep2 2>&1 &
./bench/replica -c ./conf -i 3 -m spec >rep3 2>&1 &
./bench/replica -c ./conf -i 4 -m spec >rep4 2>&1 &
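In case the configuration matters: ./conf uses the usual format for this codebase, an f line followed by one replica host:port line per replica. A minimal sketch of what I mean (the ports here are placeholders, not my exact values; f = 2 so the cluster should tolerate two failures):

f 2
replica 127.0.0.1:51000
replica 127.0.0.1:51001
replica 127.0.0.1:51002
replica 127.0.0.1:51003
replica 127.0.0.1:51004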
Here is how I start the clients:
./bench/client -c ./conf -n 1000 -m spec >cli-0 2>&1 &
./bench/client -c ./conf -n 1000 -m spec >cli-1 2>&1 &
./bench/client -c ./conf -n 1000 -m spec >cli-2 2>&1 &
./bench/client -c ./conf -n 1000 -m spec >cli-3 2>&1 &
./bench/client -c ./conf -n 1000 -m spec >cli-4 2>&1 &
Here is the stack trace of a replica that is crashing:
20190907-154417-2122 17865 * MergeLogs (replica.cc:820): [2] Merging 3 logs
20190907-154417-2124 17865 PANIC MergeLogs (replica.cc:1060): Assertion `newEntry.viewstamp.view == entry.view()' failed
20190907-154417-2124 17865 ! Backtrace (message.cc:169): Backtrace:
20190907-154417-2128 17865 ! Backtrace (message.cc:220): 0: _Z6_Panicv+0x9 [0x440314]
20190907-154417-2130 17865 ! Backtrace (message.cc:220): 1: _ZN9specpaxos4spec11SpecReplica9MergeLogsEmmRKSt3mapIiNS0_5proto19DoViewChangeMessageESt4lessIiESaISt4pairIKiS4_EEERSt6vectorINS_3Log8LogEntryESaISG_EE+0x1a19 [0x40bc9f]
20190907-154417-2132 17865 ! Backtrace (message.cc:220): 2: _ZN9specpaxos4spec11SpecReplica18HandleDoViewChangeERK16TransportAddressRKNS0_5proto19DoViewChangeMessageE+0x965 [0x40e5a3]
20190907-154417-2134 17865 ! Backtrace (message.cc:220): 3: _ZN9specpaxos4spec11SpecReplica14ReceiveMessageERK16TransportAddressRKSsS6_+0x5bd [0x4077c7]
20190907-154417-2136 17865 ! Backtrace (message.cc:220): 4: _ZN12UDPTransport10OnReadableEi+0xb84 [0x4489fc]
20190907-154417-2138 17865 ! Backtrace (message.cc:220): 5: _ZN12UDPTransport14SocketCallbackEisPv+0x39 [0x448f2b]
20190907-154417-2140 17865 ! Backtrace (message.cc:220): 6: event_base_loop+0x754 [0x7f684a341f24]
20190907-154417-2142 17865 ! Backtrace (message.cc:220): 7: _ZN12UDPTransport3RunEv+0x1f [0x447bff]
20190907-154417-2144 17865 ! Backtrace (message.cc:220): 8: main+0x94f [0x40610f]
20190907-154417-2146 17865 ! Backtrace (message.cc:220): 9: __libc_start_main+0xf5 [0x7f684938af45]
20190907-154417-2148 17865 ! Backtrace (message.cc:220): 10: _start+0x29 [0x4056c9]
20190907-154417-2150 17865 ! Backtrace (message.cc:220): 11: ???+0x29 [0x29]
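For context on where the assertion sits: MergeLogs appears to combine the logs carried in the DoViewChange messages from a quorum, and the assertion insists that entries reported for the same opnum agree on the view in which they were accepted. Below is my rough paraphrase of that invariant, not the actual code from replica.cc; the struct names and the MergeEntryForOpnum helper are hypothetical, loosely modeled on the symbols in the backtrace:

// Illustrative paraphrase of the failing invariant, not the code in replica.cc.
// Assumption: each DoViewChange message carries log entries tagged with the
// view in which the sender (speculatively) accepted them; the entry chosen
// for a given opnum must come from the same view across all senders.
#include <cassert>
#include <cstdint>
#include <vector>

struct Viewstamp {
    uint64_t view;
    uint64_t opnum;
};

struct LogEntry {
    Viewstamp viewstamp;
    // ... request, state hash, etc.
};

// Merge one opnum across the logs received from a quorum of replicas.
// The panic corresponds to two quorum members reporting the same opnum
// under different views, which the merge logic assumes cannot happen.
LogEntry MergeEntryForOpnum(uint64_t opnum,
                            const std::vector<std::vector<LogEntry>> &logs) {
    const LogEntry *chosen = nullptr;
    for (const auto &log : logs) {
        for (const auto &entry : log) {
            if (entry.viewstamp.opnum != opnum) continue;
            if (chosen == nullptr) {
                chosen = &entry;
            } else {
                // Roughly the assertion that fires at replica.cc:1060.
                assert(chosen->viewstamp.view == entry.viewstamp.view);
            }
        }
    }
    return *chosen;  // caller guarantees at least one log contains opnum
}

If that reading is right, the crash means the quorum's logs disagreed on the view for some opnum, which I would have expected the view-change protocol to reconcile rather than assert on.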
I can attach the full logs if needed. Ideally, the replicas should not crash but should resolve the conflicts and keep making progress.