[BUG] linux-test-docker integTests failing on remote cluster #667

Open
finnegancarroll opened this issue Nov 6, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@finnegancarroll
Collaborator

What is the bug?

When testing locally, the integTestRemote task passes but integTest does not. integTestRemote filters for REST tests only, so it may be hiding a silent failure in the remaining integration tests.

How can one reproduce the bug?
Testing with:

export plugin=opensearch-asynchronous-search-3.0.0.0-SNAPSHOT.zip
export version=3.0.0
export plugin_version=3.0.0.0
export qualifier=
export docker_version=3.0.0

echo "FROM opensearchstaging/opensearch:$docker_version" >> Dockerfile
echo "RUN if [ -d /usr/share/opensearch/plugins/opensearch-asynchronous-search ]; then /usr/share/opensearch/bin/opensearch-plugin remove opensearch-asynchronous-search; fi" >> Dockerfile
echo "ADD $plugin /tmp/" >> Dockerfile
echo "RUN /usr/share/opensearch/bin/opensearch-plugin install --batch file:/tmp/$plugin" >> Dockerfile

docker build -t opensearch-asynchronous-search:test .

docker run \
-p 9200:9200 \
-p 9600:9600 \
-d \
-e OPENSEARCH_INITIAL_ADMIN_PASSWORD=myStrongPassword123! \
-e discovery.type=single-node \
opensearch-asynchronous-search:test

./gradlew integTest \
  -Dtests.rest.cluster=localhost:9200 \
  -Dtests.cluster=localhost:9200 \
  -Dtests.clustername="docker-cluster" \
  -Dhttps=true \
  -Duser=admin \
  -Dpassword=myStrongPassword123!
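
Since `docker run -d` returns before the node is ready, a wait step before the Gradle invocation can help rule out a startup race as the cause. The following is a minimal sketch, not part of the plugin repository: `probe_cluster` and `wait_for_cluster` are hypothetical helper names, and the use of `-k` assumes the image serves a self-signed certificate on port 9200.

```shell
# Same admin password as passed to `docker run` above.
password='myStrongPassword123!'

# Default probe (hypothetical helper): succeeds only when the cluster health
# endpoint answers with a 2xx status. -k skips certificate verification,
# which assumes a self-signed demo certificate.
probe_cluster() {
  curl -sfk -u "admin:${password}" \
    "https://localhost:9200/_cluster/health" >/dev/null
}

# wait_for_cluster <max_attempts> <delay_seconds> [probe_command...]
# Returns 0 as soon as the probe succeeds, 1 if every attempt fails.
wait_for_cluster() {
  max_attempts=$1
  delay=$2
  shift 2
  # With no probe command given, fall back to the curl-based default.
  if [ "$#" -eq 0 ]; then
    set -- probe_cluster
  fi
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      echo "cluster ready after ${attempt} attempt(s)"
      return 0
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  echo "cluster not ready after ${max_attempts} attempt(s)" >&2
  return 1
}
```

For example, `wait_for_cluster 30 2` could be run between the `docker run` and `./gradlew integTest` steps. If the failures persist once the cluster reports healthy, a startup race is unlikely to be the cause.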

Test failures:

Tests with failures:
 - org.opensearch.search.asynchronous.integTests.AsynchronousSearchQueryIT.testEmptyQueryString
 - org.opensearch.search.asynchronous.integTests.AsynchronousSearchQueryIT.testHighlighterQuery
 - org.opensearch.search.asynchronous.integTests.AsynchronousSearchQueryIT.testIpRangeQuery
 - org.opensearch.search.asynchronous.integTests.AsynchronousSearchQueryIT.testAggregationQuery
 - org.opensearch.search.asynchronous.integTests.AsynchronousSearchRejectionIT.testSimulatedSearchRejectionLoad
 - org.opensearch.search.asynchronous.integTests.AsynchronousSearchRejectionIT.testSearchFailures
 - org.opensearch.search.asynchronous.integTests.AsynchronousSearchTaskCancellationIT.testCancellationDuringQueryPhase
 - org.opensearch.search.asynchronous.integTests.AsynchronousSearchTaskCancellationIT.testCancellationDuringFetchPhase
 - org.opensearch.search.asynchronous.listener.AsynchronousSearchCancellationIT.testCancellationDuringQueryPhase
 - org.opensearch.search.asynchronous.listener.AsynchronousSearchCancellationIT.testCancellationDuringFetchPhase
 - org.opensearch.search.asynchronous.listener.AsynchronousSearchPartialResponseIT.testPartialReduceBuckets
 - org.opensearch.search.asynchronous.management.AsynchronousSearchManagementServiceIT.testCleansUpExpiredAsynchronousSearchDuringQueryPhase
 - org.opensearch.search.asynchronous.management.AsynchronousSearchManagementServiceIT.testDeletesExpiredAsynchronousSearchResponseFromPersistedStore
 - org.opensearch.search.asynchronous.management.AsynchronousSearchManagementServiceIT.testCleansUpExpiredAsynchronousSearchDuringFetchPhase
 - org.opensearch.search.asynchronous.request.AsynchronousSearchRequestRoutingIT.testRequestForwardingToCoordinatorNodeForPersistedAsynchronousSearch
 - org.opensearch.search.asynchronous.request.AsynchronousSearchRequestRoutingIT.testRequestForwardingToCoordinatorNodeForRunningAsynchronousSearch
 - org.opensearch.search.asynchronous.request.AsynchronousSearchRequestRoutingIT.testInvalidIdRequestHandling

The failures often come with 'failed to connect' errors, which appear to be transport-related:

  1> org.opensearch.transport.ConnectTransportException: [node_s2][127.0.0.1:44359] connect_exception
  1>    at org.opensearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1106) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.core.action.ActionListener.lambda$toBiConsumer$2(ActionListener.java:217) ~[opensearch-core-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:57) ~[opensearch-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]
  1>    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]
  1>    at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
  1>    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2194) ~[?:?]
  1>    at org.opensearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:72) ~[opensearch-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.SocketChannelContext.connect(SocketChannelContext.java:160) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.EventHandler.handleConnect(EventHandler.java:130) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.transport.nio.TestEventHandler.handleConnect(TestEventHandler.java:139) ~[framework-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.attemptConnect(NioSelector.java:446) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.registerChannel(NioSelector.java:469) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.setUpNewChannels(NioSelector.java:458) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.preSelect(NioSelector.java:279) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.singleLoop(NioSelector.java:172) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.runLoop(NioSelector.java:148) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
  1> Caused by: java.net.ConnectException: Connection refused
  1>    at java.base/sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]
  1>    at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:682) ~[?:?]
  1>    at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:973) ~[?:?]
  1>    at org.opensearch.nio.SocketChannelContext.connect(SocketChannelContext.java:157) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    ... 9 more
  1> [2024-11-06T23:20:29,284][WARN ][o.o.c.NodeConnectionsService] [node_s4] failed to connect to {node_s3}{7lJLTNV_SMeqFgtL1SJsew}{DWMlPFFURwiVGq46clX-QQ}{127.0.0.1}{127.0.0.1:45207}{dimr}{shard_indexing_pressure_enabled=true} (tried [1] times)
  1> org.opensearch.transport.ConnectTransportException: [node_s3][127.0.0.1:45207] connect_exception
  1>    at org.opensearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1106) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.core.action.ActionListener.lambda$toBiConsumer$2(ActionListener.java:217) ~[opensearch-core-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:57) ~[opensearch-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]
  1>    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]
  1>    at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
  1>    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2194) ~[?:?]
  1>    at org.opensearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:72) ~[opensearch-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.SocketChannelContext.connect(SocketChannelContext.java:160) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.EventHandler.handleConnect(EventHandler.java:130) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.transport.nio.TestEventHandler.handleConnect(TestEventHandler.java:139) ~[framework-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.attemptConnect(NioSelector.java:446) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.registerChannel(NioSelector.java:469) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.setUpNewChannels(NioSelector.java:458) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.preSelect(NioSelector.java:279) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.singleLoop(NioSelector.java:172) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.runLoop(NioSelector.java:148) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
  1> Caused by: java.net.ConnectException: Connection refused
  1>    at java.base/sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]
  1>    at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:682) ~[?:?]
  1>    at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:973) ~[?:?]
  1>    at org.opensearch.nio.SocketChannelContext.connect(SocketChannelContext.java:157) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    ... 9 more
  1> [2024-11-06T23:20:29,285][WARN ][o.o.c.NodeConnectionsService] [node_s5] failed to connect to {node_s3}{7lJLTNV_SMeqFgtL1SJsew}{DWMlPFFURwiVGq46clX-QQ}{127.0.0.1}{127.0.0.1:45207}{dimr}{shard_indexing_pressure_enabled=true} (tried [1] times)
  1> org.opensearch.transport.ConnectTransportException: [node_s3][127.0.0.1:45207] connect_exception
  1>    at org.opensearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1106) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.core.action.ActionListener.lambda$toBiConsumer$2(ActionListener.java:217) ~[opensearch-core-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:57) ~[opensearch-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]
  1>    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]
  1>    at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
  1>    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2194) ~[?:?]
  1>    at org.opensearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:72) ~[opensearch-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.SocketChannelContext.connect(SocketChannelContext.java:160) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.EventHandler.handleConnect(EventHandler.java:130) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.transport.nio.TestEventHandler.handleConnect(TestEventHandler.java:139) ~[framework-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.attemptConnect(NioSelector.java:446) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.registerChannel(NioSelector.java:469) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.setUpNewChannels(NioSelector.java:458) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.preSelect(NioSelector.java:279) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.singleLoop(NioSelector.java:172) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.runLoop(NioSelector.java:148) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
  1> Caused by: java.net.ConnectException: Connection refused
  1>    at java.base/sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]
  1>    at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:682) ~[?:?]
  1>    at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:973) ~[?:?]
  1>    at org.opensearch.nio.SocketChannelContext.connect(SocketChannelContext.java:157) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    ... 9 more
  1> [2024-11-06T23:20:29,285][WARN ][o.o.c.NodeConnectionsService] [node_s5] failed to connect to {node_s2}{jkL01aTcRJu2tQDHvKWbyg}{0LFBkj48Q9G2NAG2z50obg}{127.0.0.1}{127.0.0.1:44359}{dimr}{shard_indexing_pressure_enabled=true} (tried [1] times)
  1> org.opensearch.transport.ConnectTransportException: [node_s2][127.0.0.1:44359] connect_exception
  1>    at org.opensearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1106) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.core.action.ActionListener.lambda$toBiConsumer$2(ActionListener.java:217) ~[opensearch-core-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:57) ~[opensearch-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]
  1>    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]
  1>    at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
  1>    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2194) ~[?:?]
  1>    at org.opensearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:72) ~[opensearch-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.SocketChannelContext.connect(SocketChannelContext.java:160) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.EventHandler.handleConnect(EventHandler.java:130) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.transport.nio.TestEventHandler.handleConnect(TestEventHandler.java:139) ~[framework-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.attemptConnect(NioSelector.java:446) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.registerChannel(NioSelector.java:469) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.setUpNewChannels(NioSelector.java:458) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.preSelect(NioSelector.java:279) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.singleLoop(NioSelector.java:172) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.runLoop(NioSelector.java:148) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
  1> Caused by: java.net.ConnectException: Connection refused
  1>    at java.base/sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]
  1>    at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:682) ~[?:?]
  1>    at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:973) ~[?:?]
  1>    at org.opensearch.nio.SocketChannelContext.connect(SocketChannelContext.java:157) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
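
The repeated stack traces can be reduced to the distinct nodes that refused connections. The following is a sketch only; `list_unreachable_nodes` is a hypothetical helper that reads the test log on standard input.

```shell
# Print each distinct "node address" pair that appears in a
# ConnectTransportException line, for example:
#   ... ConnectTransportException: [node_s2][127.0.0.1:44359] connect_exception
list_unreachable_nodes() {
  sed -n 's/.*ConnectTransportException: \[\([^]]*\)]\[\([^]]*\)] connect_exception.*/\1 \2/p' \
    | sort -u
}
```

On the excerpt above this yields `node_s2 127.0.0.1:44359` and `node_s3 127.0.0.1:45207`. Both are loopback addresses on ports other than the published 9200 and 9600, which may indicate that the refused connections are between nodes started by the test run itself rather than to the Docker container. That reading is an inference and is not confirmed in this issue.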

What is the expected behavior?
The integTest task passes against the remote cluster.

@finnegancarroll finnegancarroll added bug Something isn't working untriaged labels Nov 6, 2024
@dblock dblock removed the untriaged label Nov 25, 2024
@dblock
Member

dblock commented Nov 25, 2024

[Catch All Triage - 1, 2, 3, 4, 5]

@finnegancarroll finnegancarroll self-assigned this Nov 25, 2024