Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spinning connection retries? #780

Open
dejonghb opened this issue Aug 9, 2017 · 3 comments
Open

spinning connection retries? #780

dejonghb opened this issue Aug 9, 2017 · 3 comments

Comments

@dejonghb
Copy link
Member

dejonghb commented Aug 9, 2017

rejecting connection to an asd causes large performance drop and proxylog is filled with following messages:

2017-08-09 11:24:11 125571 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 260 - info - closing (fd:26)
2017-08-09 11:24:11 125651 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 261 - info - "(Unix.Unix_error \"Connection refused\" connect \"\")": should_invalidate:false should_retry:false
2017-08-09 11:24:11 125689 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 262 - warning - could not receive enough fragments for namespace 1, object "tlog_abd0ecaa-b09e-40b1-82e7-af4441ba44dc" ("\142l\2477\180w\003&,D\242\157Z;@B|f\180_\206\166\240{\135`\t\166\149POt") chunk 0; got 0 while 4 needed
2017-08-09 11:24:11 125718 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 263 - info - retrying : Alba_client_errors.Error.Exn(8); backtrace:; Raised at file "src/alba_client_errors.ml", line 23, characters 25-32; Called from file "src/alba_client_download.ml", line 453, characters 8-48; Called from file "src/core/lwt.ml", line 670, characters 66-69
2017-08-09 11:24:11 125904 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 264 - info - connect_with : 172.22.186.33 8603 None Net_fd.TCP (fd:26)
2017-08-09 11:24:11 126093 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 265 - info - closing (fd:26)
2017-08-09 11:24:11 126252 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 266 - info - "(Unix.Unix_error \"Connection refused\" connect \"\")": should_invalidate:false should_retry:false
2017-08-09 11:24:11 126405 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 267 - info - Detected missing fragment namespace_id=1 object_name="tlog_abd0ecaa-b09e-40b1-82e7-af4441ba44dc" object_id="\142l\2477\180w\003&,D\242\157Z;@B|f\180_\206\166\240{\135`\t\166\149POt" location=((7L, 0)) (chunk,fragment,version)=(0,1)
2017-08-09 11:24:11 166627 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 268 - info - connect_with : 172.22.186.33 8603 None Net_fd.TCP (fd:26)
2017-08-09 11:24:11 166725 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 269 - info - closing (fd:26)
2017-08-09 11:24:11 166810 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 270 - info - "(Unix.Unix_error \"Connection refused\" connect \"\")": should_invalidate:false should_retry:false
2017-08-09 11:24:11 166835 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 271 - warning - could not receive enough fragments for namespace 1, object "tlog_ef460c4d-7eb8-42fd-a09a-783a22e302e9" ("\129x\150\b\175\211f\020\201\200fz|r\167\242M\192\021:Ne\245\128>BDn\137!\231#") chunk 0; got 0 while 4 needed
2017-08-09 11:24:11 166854 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 272 - info - retrying : Alba_client_errors.Error.Exn(8); backtrace:; Raised at file "src/alba_client_errors.ml", line 23, characters 25-32; Called from file "src/alba_client_download.ml", line 453, characters 8-48; Called from file "src/core/lwt.ml", line 670, characters 66-69
2017-08-09 11:24:11 167013 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 273 - info - connect_with : 172.22.186.33 8603 None Net_fd.TCP (fd:26)
2017-08-09 11:24:11 167144 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 274 - info - closing (fd:26)
2017-08-09 11:24:11 167285 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 275 - info - "(Unix.Unix_error \"Connection refused\" connect \"\")": should_invalidate:false should_retry:false
2017-08-09 11:24:11 167403 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 276 - info - Detected missing fragment namespace_id=1 object_name="tlog_ef460c4d-7eb8-42fd-a09a-783a22e302e9" object_id="\129x\150\b\175\211f\020\201\200fz|r\167\242M\192\021:Ne\245\128>BDn\137!\231#" location=((7L, 0)) (chunk,fragment,version)=(0,2)
2017-08-09 11:24:11 219140 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 277 - info - connect_with : 172.22.186.33 8603 None Net_fd.TCP (fd:26)
2017-08-09 11:24:11 219363 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 278 - info - closing (fd:26)
2017-08-09 11:24:11 219439 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 279 - info - "(Unix.Unix_error \"Connection refused\" connect \"\")": should_invalidate:false should_retry:false
2017-08-09 11:24:11 219464 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 280 - warning - could not receive enough fragments for namespace 1, object "tlog_734253a9-4bae-4aea-9e75-a94b8226eefc" ("\233'\199\155\019g?+\222\030\146 l\014:\027\255A\200r\"\248\156\184VP\155\169\r\187\004\168") chunk 0; got 0 while 4 needed
2017-08-09 11:24:11 219482 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 281 - info - retrying : Alba_client_errors.Error.Exn(8); backtrace:; Raised at file "src/alba_client_errors.ml", line 23, characters 25-32; Called from file "src/alba_client_download.ml", line 453, characters 8-48; Called from file "src/core/lwt.ml", line 670, characters 66-69
2017-08-09 11:24:11 219675 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 282 - info - connect_with : 172.22.186.33 8603 None Net_fd.TCP (fd:26)
2017-08-09 11:24:11 219798 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 283 - info - closing (fd:26)
2017-08-09 11:24:11 219953 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 284 - info - "(Unix.Unix_error \"Connection refused\" connect \"\")": should_invalidate:false should_retry:false
2017-08-09 11:24:11 220132 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 285 - info - Detected missing fragment namespace_id=1 object_name="tlog_734253a9-4bae-4aea-9e75-a94b8226eefc" object_id="\233'\199\155\019g?+\222\030\146 l\014:\027\255A\200r\"\248\156\184VP\155\169\r\187\004\168" location=((7L, 0)) (chunk,fragment,version)=(0,0)
2017-08-09 11:24:11 440521 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 286 - info - connect_with : 172.22.186.33 8603 None Net_fd.TCP (fd:26)
2017-08-09 11:24:11 440726 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 287 - info - closing (fd:26)
2017-08-09 11:24:11 440860 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 288 - info - "(Unix.Unix_error \"Connection refused\" connect \"\")": should_invalidate:false should_retry:false
2017-08-09 11:24:11 440907 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 289 - warning - could not receive enough fragments for namespace 1, object "tlog_88dae73a-5f55-494e-bc6a-876f10d94cba" ("6#\174\019\230]K\138\233\170\156\026\135l\021-to\199\209\142(U%N\196\129U\172\134\202\020") chunk 0; got 0 while 4 needed
2017-08-09 11:24:11 440942 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 290 - info - retrying : Alba_client_errors.Error.Exn(8); backtrace:; Raised at file "src/alba_client_errors.ml", line 23, characters 25-32; Called from file "src/alba_client_download.ml", line 453, characters 8-48; Called from file "src/core/lwt.ml", line 670, characters 66-69
2017-08-09 11:24:11 441406 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 291 - info - connect_with : 172.22.186.33 8603 None Net_fd.TCP (fd:26)
2017-08-09 11:24:11 441668 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 292 - info - closing (fd:26)
2017-08-09 11:24:11 441934 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 293 - info - "(Unix.Unix_error \"Connection refused\" connect \"\")": should_invalidate:false should_retry:false
2017-08-09 11:24:11 442190 +0200 - ftcmp03 - 2933/0000 - alba/proxy - 294 - info - Detected missing fragment namespace_id=1 object_name="tlog_88dae73a-5f55-494e-bc6a-876f10d94cba" object_id="6#\174\019\230]K\138\233\170\156\026\135l\021-to\199\209\142(U%N\196\129U\172\134\202\020" location=((7L, 0)) (chunk,fragment,version)=(0,0)

Performance drop might be due to spinning retries.

iptables -A INPUT -p tcp --destination-port 8603 -j REJECT
@toolslive
Copy link
Member

which alba version is this?

@dejonghb
Copy link
Member Author

dejonghb commented Aug 9, 2017

# alba version --terse
ee-1.5.10

(Jan asked to create ticket here)

@domsj
Copy link
Contributor

domsj commented Aug 9, 2017

Apparently the behavior is the same for an asd that is down as for one that is blocked by iptables.
To avoid these useless connection attempts towards an asd that is down we would need the disqualification concept for reads too.
(Currently it's a bit too write specific, as a full asd is rightfully rejected for new writes, but should still be used for reading.)

@wimpers wimpers added this to the J milestone Aug 23, 2017
@wimpers wimpers removed this from the J milestone Mar 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants