Repeated "Read timed out" errors when recovering large shards from an S3 repository #157

Open
cregev opened this issue Dec 29, 2014 · 0 comments

cregev commented Dec 29, 2014

When I try to restore a large index (750 GB, split into 6 shards) from S3, "Read timed out" errors are raised and the restore never completes; it looks as if nothing is happening.

Details about our Elasticsearch cluster:

Elasticsearch version: 1.4.2
AWS Cloud Plugin version: 2.4.1

[2014-12-29 00:51:20,706][WARN ][indices.cluster ] [es-test-hist01] [2014_11][1] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [2014_11][1] failed recovery
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:185)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.index.snapshots.IndexShardRestoreFailedException: [2014_11][1] restore failed
    at org.elasticsearch.index.snapshots.IndexShardSnapshotAndRestoreService.restore(IndexShardSnapshotAndRestoreService.java:130)
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:127)
    ... 3 more
Caused by: org.elasticsearch.index.snapshots.IndexShardRestoreFailedException: [2014_11][1] failed to restore snapshot [es-test-transfer]
    at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository.restore(BlobStoreIndexShardRepository.java:165)
    at org.elasticsearch.index.snapshots.IndexShardSnapshotAndRestoreService.restore(IndexShardSnapshotAndRestoreService.java:124)
    ... 4 more
Caused by: org.elasticsearch.index.snapshots.IndexShardRestoreFailedException: [2014_11][1] Failed to recover index
    at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository$RestoreContext.restore(BlobStoreIndexShardRepository.java:787)
    at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository.restore(BlobStoreIndexShardRepository.java:162)
    ... 5 more
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:150)
    at java.net.SocketInputStream.read(SocketInputStream.java:121)
    at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
    at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:577)
    at sun.security.ssl.InputRecord.read(InputRecord.java:532)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:954)
    at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:911)
    at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.read(AbstractSessionInputBuffer.java:204)
    at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:182)
    at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:138)
    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:71)
    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:71)
    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:71)
    at org.elasticsearch.index.snapshots.blobstore.SlicedInputStream.read(SlicedInputStream.java:92)
    at java.io.InputStream.read(InputStream.java:101)
    at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository$RestoreContext.restoreFile(BlobStoreIndexShardRepository.java:834)
    at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository$RestoreContext.restore(BlobStoreIndexShardRepository.java:784)
    ... 6 more
[2014-12-29 00:51:20,734][WARN ][cluster.action.shard ] [es-test-hist01] [2014_11][1] sending failed shard for [2014_11][1], node[BU9hbOrJSnmdASggfIzEEg], [P], restoring[my_s3_repository:es-test-transfer], s[INITIALIZING], indexUUID [k9lXKiIDQGe-2zqPBsP78w], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[2014_11][1] failed recovery]; nested: IndexShardRestoreFailedException[[2014_11][1] restore failed]; nested: IndexShardRestoreFailedException[[2014_11][1] failed to restore snapshot [es-test-transfer]]; nested: IndexShardRestoreFailedException[[2014_11][1] Failed to recover index]; nested: SocketTimeoutException[Read timed out]; ]]
[2014-12-29 01:04:47,413][WARN ][indices.cluster ] [es-test-hist01] [2014_11][3] failed to start shard

Please advise.

Thanks,
Costya.
