
Adopt device type #167

Merged
merged 11 commits into eBay:main from dev_type on Apr 19, 2024
Conversation

xiaoxichen
Collaborator

No description provided.

conanfile.py Outdated (resolved)
xiaoxichen force-pushed the dev_type branch 4 times, most recently from 1e133ac to 187a74f on April 18, 2024 03:30
Signed-off-by: Xiaoxi Chen <[email protected]>
Signed-off-by: Xiaoxi Chen <[email protected]>
Signed-off-by: Xiaoxi Chen <[email protected]>
Signed-off-by: Xiaoxi Chen <[email protected]>
Signed-off-by: Xiaoxi Chen <[email protected]>
https://jubianchi.github.io/semver-check/#/^6.2/6.5

Caret constraint

^6.2 is a caret (range) constraint, satisfied by versions matching >=6.2.0 <7.0.0-0, so it will match several versions (for example 6.5.0, but not 7.0.0).

Given the constraint you entered, you will get:

The next minor releases, which will provide new features
The next patch releases, which will fix bugs

Signed-off-by: Xiaoxi Chen <[email protected]>
conanfile.py Outdated
@@ -40,8 +40,8 @@ def build_requirements(self):
         self.build_requires("gtest/1.14.0")

     def requirements(self):
-        self.requires("homestore/[>=6.2, include_prerelease=True]@oss/master")
         self.requires("sisl/[>=12.1, include_prerelease=True]@oss/master")
+        self.requires("homestore/[^6.2, include_prerelease=True]@oss/master")
Contributor

question: if HS makes an API change that breaks the HO build, previously we only bumped the minor version. Does that still hold, i.e. we don't need to bump the major version for HS, just the minor?

Collaborator Author

I am not sure... previously we bumped the major version. We discussed it in the HS meeting, but I am not sure a decision has been made.

If that is the case, we need to pin to [^6.2.0].

{HS_SERVICE::REPLICATION,
hs_format_params{.dev_type = HSDevType::Data,
.size_pct = 99.0,
.num_chunks = 65000,
Contributor

@yamingk Apr 18, 2024

Shouldn't num_chunks from the NVMe drives plus num_chunks from the HDDs equal 65000?

Or is it because the maximum is 64K (65536), and we leave the remaining 536 for NVMe chunks?

Collaborator Author

@xiaoxichen Apr 19, 2024

Yes, that is the idea. As discussed, I will lower it to 60000 to give more chunks to NVMe.

HomeStore::instance()->format_and_start({
{HS_SERVICE::META, hs_format_params{.dev_type = HSDevType::Fast, .size_pct = 9.0, .num_chunks = 64}},
{HS_SERVICE::LOG,
hs_format_params{.dev_type = HSDevType::Fast, .size_pct = 45.0, .chunk_size = 32 * Mi}},
Contributor

I understand that if chunk_size is specified, HS starts from 0 num_chunks, but how does it know how many chunks have been created in total, and how many chunks are available for this LOG service to use? Is there a place where we assert that the total number of chunks created is less than 64K (that is the FIXME you've put in create_vdev, right?)?

I did some calculation: say one NVMe drive is 500GB, 45% is around 200GB, and that will require 6400 chunks with a 32MB chunk size. 6400 + 64 + 128 + 65000 will exceed 64K, right? Please correct me if I missed something here.

Collaborator Author

@xiaoxichen Apr 19, 2024

Yes, this is the very tricky part. The log device goes with chunk_size and creates chunks dynamically, so in the worst case it could create FAST_SIZE * PCT / CHUNK_SIZE chunks. But we don't yet have the intelligence to distribute #chunks across services; that is part of the FIXME.

For the current configuration, I tend to believe the behavior will be that only 344 chunks are available for the log, bounding the max logstore size to 344 * 32MB = 11G.

I will reduce the 65000 to 60000 for now. Other enhancements can be taken care of later.
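For reference, here is a minimal standalone sketch (not HomeStore code) of the chunk-ID budget being discussed. It assumes the 64K (65536) total limit and the per-service chunk counts quoted in this thread, and it reproduces the 344-chunk / ~11G bound for the log device as well as the 5536 figure mentioned further down.

// Back-of-the-envelope check only, not HomeStore code. Assumes the 64K (65536)
// chunk-ID limit and the per-service chunk counts quoted in the review comments.
#include <cstdint>
#include <iostream>

int main() {
    constexpr uint64_t Mi              = 1024ull * 1024ull;
    constexpr uint32_t total_chunk_ids = 65536;    // the 64K limit discussed above
    constexpr uint32_t data_chunks     = 65000;    // REPLICATION on Data (HDD) devices
    constexpr uint32_t meta_chunks     = 64;       // META service
    constexpr uint32_t other_fast      = 128;      // the "128" in the reviewer's sum
    constexpr uint64_t log_chunk_size  = 32 * Mi;  // LOG chunk_size from the diff

    constexpr uint32_t log_chunks = total_chunk_ids - data_chunks - meta_chunks - other_fast;
    std::cout << "chunk IDs left for LOG: " << log_chunks << "\n";  // 344
    std::cout << "max logstore size: " << (log_chunks * log_chunk_size) / Mi
              << " MiB\n";  // 11008 MiB, i.e. ~11G

    // Lowering data_chunks to 60000 (as proposed) leaves 65536 - 60000 = 5536
    // chunk IDs for the fast (NVMe) services.
    return 0;
}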

Contributor

@yamingk Apr 19, 2024

So the ideal solution (with your change, in HomeStore) seems to be: the log device should use that chunk_size, and num_chunks should not exceed the 64K total, regardless of the pct set for this log service. That will result in some space going unused, which is better than creating more chunks than the chunk-number limit allows and causing correctness issues.

Collaborator Author

Yes, I think we need to change the log vdev: when creating it, a max_num_chunks parameter can be provided, and the logstore needs to stay within that limit to avoid create_chunk failures, and also flush/truncate the logdev properly based on max_num_chunks. The logstore can take this chance to decide whether it wants to adjust the chunk_size.

There will not be a correctness issue, as pdev::create_chunk will throw std::out_of_range, but that will probably crash the process. https://github.com/eBay/HomeStore/blob/2c500c573fa2c5733218e0cbdd44226fb3e6504f/src/lib/device/physical_dev.cpp#L275
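A rough sketch of the suggested guard, with entirely hypothetical names (LogVDev, max_num_chunks, alloc_chunk) that are not HomeStore's actual API; it only illustrates capping dynamic chunk creation so the out_of_range path in pdev::create_chunk is never reached.

// Hypothetical illustration only: none of these types or signatures come from HomeStore.
// The log vdev tracks a max_num_chunks budget given at creation time and declines to
// allocate past it, so the caller can flush/truncate and retry instead of crashing.
#include <cstdint>
#include <optional>
#include <vector>

struct Chunk { uint64_t size_bytes; };

class LogVDev {
public:
    LogVDev(uint32_t max_num_chunks, uint64_t chunk_size)
        : max_num_chunks_{max_num_chunks}, chunk_size_{chunk_size} {}

    // Returns std::nullopt when the budget is exhausted instead of asking the pdev
    // for a chunk it is not allowed to have.
    std::optional<Chunk> alloc_chunk() {
        if (chunks_.size() >= max_num_chunks_) { return std::nullopt; }
        chunks_.push_back(Chunk{chunk_size_});
        return chunks_.back();
    }

    uint32_t num_chunks() const { return static_cast<uint32_t>(chunks_.size()); }

private:
    uint32_t max_num_chunks_;
    uint64_t chunk_size_;
    std::vector<Chunk> chunks_;
};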

{HS_SERVICE::REPLICATION,
hs_format_params{.dev_type = run_on_type,
.size_pct = 79.0,
.num_chunks = 65000,
Contributor

@yamingk Apr 19, 2024

This should also be adjusted, right? 10 pct of the total device would still be very large and exceed 64K in total?

Also, can we be more conservative, say setting this to 40000, before the FIXME part is done? I am not sure what the total disk size will be in production; we probably need some careful calculation.

For mixed mode, I remember hearing from John D that we would have around 900GB of NVMe per SM, which would result in around 12800 num_chunks for the logstore with a 32MB chunk_size.

Collaborator Author

Yes, let's postpone this calculation and get it from testing on a real environment.

That gives 5536 chunks for NVMe.

Signed-off-by: Xiaoxi Chen <[email protected]>
Signed-off-by: Xiaoxi Chen <[email protected]>
cxxopts uses >> to parse opts.

Signed-off-by: Xiaoxi Chen <[email protected]>
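To illustrate the commit message above: cxxopts falls back to std::stringstream extraction (operator>>) for option value types it does not handle natively, so a device-type enum can be parsed directly once it has an extractor. The enum, option name, and string mapping below are assumptions for illustration, not code from this PR.

// Illustrative only: the enum, option name, and mapping are assumptions, not this PR's code.
#include <cxxopts.hpp>
#include <istream>
#include <string>

enum class DevType { Data, Fast };

// cxxopts parses value types it does not know through a std::stringstream, so
// providing operator>> is enough to use cxxopts::value<DevType>() directly.
std::istream& operator>>(std::istream& in, DevType& t) {
    std::string s;
    in >> s;
    t = (s == "Fast" || s == "fast") ? DevType::Fast : DevType::Data;
    return in;
}

int main(int argc, char** argv) {
    cxxopts::Options options("homeobject_test", "device type option demo");
    options.add_options()
        ("device_type", "Data or Fast", cxxopts::value<DevType>()->default_value("Data"));
    auto result = options.parse(argc, argv);
    return result["device_type"].as<DevType>() == DevType::Fast ? 1 : 0;
}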
yamingk previously approved these changes Apr 19, 2024
conanfile.py Outdated
@@ -40,8 +40,8 @@ def build_requirements(self):
         self.build_requires("gtest/1.14.0")

     def requirements(self):
-        self.requires("homestore/[>=6.2, include_prerelease=True]@oss/master")
         self.requires("sisl/[>=12.1, include_prerelease=True]@oss/master")
+        self.requires("homestore/[~6.2.0, include_prerelease=True]@oss/master")
Contributor

This should be ~6.2, right? So that HO can still pick up any patch changes automatically from HS?
Or does ~6.2.0 also do the work?

xiaoxichen merged commit 7dda682 into eBay:main on Apr 19, 2024
25 checks passed