Add long running log store tests. #419

Merged
merged 1 commit into from
Jun 12, 2024

Conversation

sanebay
Contributor

@sanebay sanebay commented May 13, 2024

Changes.

  1. Fix truncation in corner cases.
  2. Add a long-running test for logstore, a stripped-down version of test_log_store. It sends bursts of requests to all logstores, truncates, restarts, adds/removes logstores and logdevs, performs rollbacks, etc.
  3. Enable the logstore tests except the parallel truncation UT.

More details.
Journalvdev maintains a list of chunks that stores all the log entries. New log entries are always appended to the last chunk in the list (the right side, or tail offset), and truncation is applied at the head of the chunk list (the left side, or data start offset). When an append does not have enough space in the last chunk, we create a new chunk and add it to the list, so a log group (a batch of log entries) never spans chunks. This leaves a hole at the end of a chunk, whose start is recorded as end_of_chunk in the chunk's private data. The hole lies between end_of_chunk and (chunk_start + chunk_size). When a read reaches this hole there is no data, so we skip to the next chunk. Similarly, if the truncation point falls inside the hole, we release the whole chunk and set data_start_offset to the start of the next chunk.
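
To make the hole handling concrete, here is a minimal sketch of the behavior described above. It is a simplified in-memory model, not the actual journal vdev code: the `Chunk` and `ChunkJournal` types, their method names, the fixed chunk size, and the rule that the tail chunk is never released are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <stdexcept>

// Hypothetical, simplified model of a chunked journal. Names are illustrative
// and are not the HomeStore journal vdev API.
struct Chunk {
    uint64_t start;        // logical offset where the chunk begins
    uint64_t size;         // chunk capacity
    uint64_t end_of_chunk; // logical offset where valid data ends; the hole follows
};

class ChunkJournal {
public:
    explicit ChunkJournal(uint64_t chunk_size) : m_chunk_size{chunk_size} {}

    // Appends one log group and returns its logical offset. A group never spans chunks.
    uint64_t append(uint64_t len) {
        if (len == 0 || len > m_chunk_size) { throw std::invalid_argument("bad log group size"); }
        if (m_chunks.empty() ||
            m_chunks.back().end_of_chunk + len > m_chunks.back().start + m_chunks.back().size) {
            // Not enough room at the tail: open a new chunk. Whatever was left in the
            // previous chunk becomes a hole after its end_of_chunk.
            m_chunks.push_back(Chunk{m_next_chunk_start, m_chunk_size, m_next_chunk_start});
            m_next_chunk_start += m_chunk_size;
        }
        Chunk& tail = m_chunks.back();
        const uint64_t offset = tail.end_of_chunk;
        tail.end_of_chunk += len;
        return offset;
    }

    // Truncates everything before `offset` and returns the resulting data start offset.
    uint64_t truncate(uint64_t offset) {
        assert(!m_chunks.empty() && offset <= m_chunks.back().end_of_chunk);
        // Release head chunks whose valid data lies entirely before `offset`. This also
        // covers an offset that falls inside a hole. The tail chunk is never released.
        while (m_chunks.size() > 1 && offset >= m_chunks.front().end_of_chunk) {
            m_chunks.pop_front();
            m_data_start = m_chunks.front().start; // jump over the hole to the next chunk
        }
        if (offset > m_data_start) { m_data_start = offset; }
        return m_data_start;
    }

    uint64_t data_start() const { return m_data_start; }
    std::size_t chunk_count() const { return m_chunks.size(); }

private:
    uint64_t m_chunk_size;
    uint64_t m_next_chunk_start{0};
    uint64_t m_data_start{0};
    std::deque< Chunk > m_chunks;
};
```

In this model, with a chunk size of 100 and two appends of 60 bytes each, the first chunk holds data in [0, 60) and a hole in [60, 100). A truncate to offset 80 lands in the hole, so the first chunk is released and the returned data start is 100, the start of the next chunk.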

@sanebay sanebay force-pushed the logstore_long_running branch from 6f1a673 to 0db6b23 Compare May 15, 2024 19:39
@sanebay sanebay requested review from yamingk and hkadayam May 15, 2024 19:41
@sanebay sanebay linked an issue May 16, 2024 that may be closed by this pull request
@sanebay sanebay force-pushed the logstore_long_running branch from 0db6b23 to 54dd65e Compare May 21, 2024 20:30
@yamingk yamingk added this to the MileStone4.2 milestone May 21, 2024
@yamingk
Contributor

yamingk commented Jun 7, 2024

We also need to add a flip point where we add new chunks to the journal_vdev, since that is where we persist the private data. We should think through what will happen on reboot at that point and verify it with a test case.

@yamingk
Contributor

yamingk commented Jun 7, 2024

Ideally we should have a long-running test, at either the log store level or the journal vdev level, that truncates periodically for a few hours to make sure everything keeps running fine.
Another test case: run the journal or logstore test for 30 minutes (with truncation happening more frequently), do a clean shutdown, run for another 15 minutes, and repeat the clean shutdown multiple times.

We can review these, see if there are any comments from others, and create issues (they do not necessarily need to be included in this PR).

@sanebay
Contributor Author

sanebay commented Jun 10, 2024

Ideally we should have a long-running test, at either the log store level or the journal vdev level, that truncates periodically for a few hours to make sure everything keeps running fine. Another test case: run the journal or logstore test for 30 minutes (with truncation happening more frequently), do a clean shutdown, run for another 15 minutes, and repeat the clean shutdown multiple times.

We can review these, see if there are any comments from others, and create issues (they do not necessarily need to be included in this PR).

test_log_store_long_run.cpp does what you mentioned. We will have to run it on the 85 namespace.

@sanebay
Contributor Author

sanebay commented Jun 10, 2024

We also need to add a flip point where we add new chunks to the journal_vdev, since that is where we persist the private data. We should think through what will happen on reboot at that point and verify it with a test case.

Added this in the doc.

@sanebay sanebay force-pushed the logstore_long_running branch from 54dd65e to 52e158d Compare June 10, 2024 23:50
@yamingk
Contributor

yamingk commented Jun 11, 2024

Ideally we should have a long-running test, at either the log store level or the journal vdev level, that truncates periodically for a few hours to make sure everything keeps running fine. Another test case: run the journal or logstore test for 30 minutes (with truncation happening more frequently), do a clean shutdown, run for another 15 minutes, and repeat the clean shutdown multiple times.
We can review these, see if there are any comments from others, and create issues (they do not necessarily need to be included in this PR).

test_log_store_long_run.cpp does what you mentioned. We will have to run it on the 85 namespace.

Right, is the truncation point randomized?

@yamingk
Contributor

yamingk commented Jun 11, 2024

We also need to add a flip point where we add new chunks to the journal_vdev, since that is where we persist the private data. We should think through what will happen on reboot at that point and verify it with a test case.

Added this in the doc.

Better to create an issue; otherwise we will lose track of which items are already done and which are still to-dos.

@sanebay
Contributor Author

sanebay commented Jun 11, 2024

Ideally we should have a long-running test, at either the log store level or the journal vdev level, that truncates periodically for a few hours to make sure everything keeps running fine. Another test case: run the journal or logstore test for 30 minutes (with truncation happening more frequently), do a clean shutdown, run for another 15 minutes, and repeat the clean shutdown multiple times.
We can review these, see if there are any comments from others, and create issues (they do not necessarily need to be included in this PR).

test_log_store_long_run.cpp does what you mentioned. We will have to run it on the 85 namespace.

Right, is the truncation point randomized?

Yes, see test_log_store_long_run.cpp:459.

@sanebay
Contributor Author

sanebay commented Jun 11, 2024

Created a ticket to use version instead of created time (#441)

@sanebay sanebay force-pushed the logstore_long_running branch from 52e158d to a728a20 Compare June 11, 2024 22:19
yamingk previously approved these changes Jun 11, 2024
Fix truncation issues in boundary cases. Release chunks
if a truncate crosses an end-of-chunk boundary.
Enable the logstore tests except the parallel write and truncate
test case. A truncate can move the data start to the next chunk's
start offset, so change the truncate API to return that offset.
@sanebay sanebay merged commit 6ba47be into eBay:master Jun 12, 2024
21 checks passed
@sanebay sanebay deleted the logstore_long_running branch June 12, 2024 17:03
Development

Successfully merging this pull request may close these issues.

Logstore long duration testing with multiple logdevs
3 participants