raft test framework for homeobject #218
Conversation
Codecov Report

Attention: Patch coverage is …

```diff
@@            Coverage Diff             @@
##             main     #218      +/-   ##
==========================================
- Coverage   68.69%   63.95%    -4.74%
==========================================
  Files          30       32        +2
  Lines        1581     1784      +203
  Branches      163      193       +30
==========================================
+ Hits         1086     1141       +55
- Misses        408      546      +138
- Partials       87       97       +10
```
```diff
@@ -37,6 +37,19 @@ class BitsGenerator {
    }

    static void gen_random_bits(sisl::blob& b) { gen_random_bits(b.size(), b.bytes()); }
```
Why don't we add blob_id to the parameters? If no value is provided (the default), use a random device; otherwise use the blob_id to initialize.
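A minimal sketch of what this suggestion could look like, assuming a hypothetical free-function signature; the optional blob_id parameter and the seeding policy are the reviewer's proposal, not the PR's actual code:

```cpp
#include <cstdint>
#include <optional>
#include <random>

// Hypothetical variant of gen_random_bits: seed from blob_id when one is
// provided, otherwise fall back to a random device, so the same blob_id
// always reproduces the same bytes.
static void gen_random_bits(uint32_t size, uint8_t* buf,
                            std::optional< uint64_t > blob_id = std::nullopt) {
    std::mt19937_64 rng{blob_id ? *blob_id : std::random_device{}()};
    // uint8_t is not a valid distribution type, so generate uint16_t and cast
    std::uniform_int_distribution< uint16_t > byte_dist{0, 255};
    for (uint32_t i = 0; i < size; ++i) {
        buf[i] = static_cast< uint8_t >(byte_dist(rng));
    }
}
```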
```cpp
    sync_for(cleanup_start_count_, repl_test_phase_t::CLEANUP, num_members);
}

void sync_for_uint64_id(uint32_t max_count = 0) {
```
This is very bad naming.
This is used to sync a shard_id or blob_id among the different replicas. Actually, I wrote two functions, one for shard_id and one for blob_id, but they were almost the same; the only difference was shard_id_t vs blob_id_t, so I merged them into this one.

Do you have any suggestion for a better name? ^v^
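One hypothetical alternative to collapsing both IDs into a uint64_t-named helper: keep the single implementation but let the call site name the ID type. The sync_for_id name, the aliases, and the stub body below are illustrative, not the PR's code:

```cpp
#include <cstdint>

// Assumed aliases mirroring the PR's ID types.
using shard_id_t = uint64_t;
using blob_id_t  = uint64_t;

// One implementation, parameterized on the ID type so the intent stays
// readable at the call site.
template < typename IdT >
void sync_for_id(IdT id, uint32_t max_count = 0) {
    // ...the shared synchronization body from the PR would go here...
    (void)id;
    (void)max_count;
}

int main() {
    sync_for_id< shard_id_t >(42);  // reads as "sync on a shard id"
    sync_for_id< blob_id_t >(7);    // reads as "sync on a blob id"
}
```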
removed
```cpp
}

ShardInfo seal_shard(shard_id_t shard_id) {
    // before sealing a shard, we need to wait for all the members to complete shard state verification
    g_helper->sync_for_verify_start();
```
This is very weird. Why can we only do seal_shard in the verify stage?
We cannot use sync_for_verify_start or sync_for_test_start twice in a row without a different sync type between them. sync_for_verify_start() can be replaced by sync_for_test_start(); what I want is just a sync point.

I will try to create another PR to change this to a more unified sync point.
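For illustration only, a minimal in-process analogy of such a unified sync point; the SyncPoint class and thread-based waiting are assumptions (the real helper synchronizes processes, not threads). The point is just that a generation-counted barrier can be crossed back-to-back any number of times, unlike two named phase barriers that must alternate:

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>

class SyncPoint {
    std::mutex m_;
    std::condition_variable cv_;
    uint32_t const num_members_;
    uint32_t arrived_{0};
    uint64_t generation_{0};

public:
    explicit SyncPoint(uint32_t n) : num_members_{n} {}

    // Every member blocks until all num_members_ have arrived; the barrier
    // then resets itself, so sync() can be called again immediately.
    void sync() {
        std::unique_lock lk{m_};
        auto const gen = generation_;
        if (++arrived_ == num_members_) {
            arrived_ = 0;
            ++generation_;  // releases everyone waiting on this generation
            cv_.notify_all();
        } else {
            cv_.wait(lk, [&] { return generation_ != gen; });
        }
    }
};
```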
remove this
Force-pushed from 9ef1875 to 6656058.
There are some limitations in the current implementation:

- We cannot do multi-threaded put_blob, because the implementation is based on blob_id prediction: it generates blob_data from the predicted_blob_id, and if the blob gets assigned a different blob_id it will fail either in the write phase or in the verify phase.
- The sync_for_sth() calls are not friendly and get put into common places like seal_shard(). The fundamental problem is that we need to sync before/after each write operation, but we don't implement that well.

Jie and I had an offline discussion about whether we can use IPC for data sharing across replicas, i.e. the leader puts an array of {blobid, digest} into IPC and the followers consume it (see the sketch below). I think that solution simplifies the implementation, makes the code cleaner, and also supports concurrent writes. But there are surely other solutions that address the problem in other ways. This PR moves all the existing UTs to a multi-process version, which is a nice step forward, so I don't want to block it over personal preference.

Would like to hear more from the users of this UT framework, @sanebay and @Hoolu, on whether the UTs in your cases can be well supported.
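A rough sketch of that IPC idea, assuming POSIX shared memory; the record layout and every name here are hypothetical, not something this PR implements:

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// One {blob_id, digest} record the leader publishes after each put_blob.
struct BlobRecord {
    uint64_t blob_id;
    uint8_t digest[32];  // e.g. SHA-256 of the blob payload
};

// Fixed-size shared log; count is published with release semantics so a
// follower that observes count == n can safely read records[0..n).
// Zero-filled shared memory is treated as count == 0, which is fine for
// lock-free atomics in practice.
struct SharedLog {
    std::atomic< uint32_t > count{0};
    BlobRecord records[1024];
};

SharedLog* open_shared_log(const char* name, bool create) {
    int fd = shm_open(name, create ? (O_CREAT | O_RDWR) : O_RDWR, 0600);
    if (fd < 0) return nullptr;
    if (create && ftruncate(fd, sizeof(SharedLog)) != 0) { close(fd); return nullptr; }
    void* p = mmap(nullptr, sizeof(SharedLog), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    return (p == MAP_FAILED) ? nullptr : static_cast< SharedLog* >(p);
}

// Leader side: append the blob_id actually assigned (no prediction needed)
// plus its digest. Bounds checks omitted for brevity.
void leader_publish(SharedLog* log, uint64_t blob_id, const uint8_t digest[32]) {
    uint32_t const idx = log->count.load(std::memory_order_relaxed);
    log->records[idx].blob_id = blob_id;
    std::memcpy(log->records[idx].digest, digest, sizeof(log->records[idx].digest));
    log->count.store(idx + 1, std::memory_order_release);  // publish
}

// Follower side: verify every record published so far.
uint32_t follower_consume(SharedLog* log, void (*verify)(const BlobRecord&)) {
    uint32_t const n = log->count.load(std::memory_order_acquire);
    for (uint32_t i = 0; i < n; ++i) verify(log->records[i]);
    return n;
}
```

Because the leader records the blob_id it was actually assigned, followers verify against real IDs, which is what would remove the blob_id-prediction constraint and allow concurrent put_blob.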
Force-pushed from 3cf48da to 6656058.
LGTM
```cpp
}

void TearDown() override { app->clean(); }

// schedule create_pg to replica_num
void create_pg(pg_id_t pg_id, uint32_t replica_num = 0) {
```
Why do we need create_pg with replica_num? Isn't it executed only on the leader, with the other replicas getting it via raft?
HSHomeObject::_create_pg will first create a repl_dev, then replicate a create_pg message across the raft group (the repl_dev). Only once we have a repl_dev can we have a leader.

So, at the homeobject level, before we create a pg (before HSHomeObject::_create_pg is called), we do not have a repl_dev for this pg and thus we do not have a leader. Here, replica_num means which replica will be the leader when creating the repl_dev of this pg.
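A minimal sketch of the dispatch described above; the helpers and the alias below are illustrative stand-ins, not the PR's actual code:

```cpp
#include <cstdint>

using pg_id_t = uint16_t;  // assumed alias

// Hypothetical stand-ins for the test fixture's real members.
uint32_t my_replica_num() { return 0; }   // which replica index this process is
void do_create_pg(pg_id_t) { /* the fixture's real pg-creation call */ }

// Only the process whose replica index matches replica_num drives pg
// creation; the other processes return and receive the pg through raft
// once the repl_dev (and hence a leader) exists.
void create_pg(pg_id_t pg_id, uint32_t replica_num = 0) {
    if (my_replica_num() != replica_num) {
        return;  // not the chosen replica: the pg arrives via replication
    }
    do_create_pg(pg_id);  // the chosen replica creates the repl_dev and leads
}
```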
Can you add another PR to add support for spare replicas? This is needed for PG move.
OK, will do.
Please do a squash merge with a commit message along these lines:
1. Add a raft test framework for homeobject, which enables 3-replica raft-based tests.
2. Move all the current UTs from single-replica to 3-replica raft-based.
3. Find and fix a blob sequence number bug surfaced by the new test framework.