Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

github issue #159: Setup CP periodic timer and dirty buf exceed callback #182

Merged
merged 2 commits into from
Sep 27, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion conanfile.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@

class HomestoreConan(ConanFile):
name = "homestore"
version = "4.5.0"
version = "4.5.1"

homepage = "https://github.com/eBay/Homestore"
description = "HomeStore Storage Engine"
Expand Down
3 changes: 2 additions & 1 deletion src/include/homestore/checkpoint/cp_mgr.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -163,6 +163,7 @@ class CPManager {
std::unique_ptr< CPWatchdog > m_wd_cp;
superblk< cp_mgr_super_block > m_sb;
std::vector< iomgr::io_fiber_t > m_cp_io_fibers;
iomgr::timer_handle_t m_cp_timer_hdl;

public:
CPManager();
Expand Down Expand Up @@ -206,7 +207,7 @@ class CPManager {

/// @brief Trigger a checkpoint flush on all subsystems registered. There is only 1 checkpoint per checkpoint
/// manager. Checkpoint flush will wait for cp to exited all critical io sections.
/// @param force : Do we need to force queue the checkpoint flush, in case previous checkpoint is been flushed
/// @param force : Do we need to force queue the checkpoint flush, in case previous checkpoint is being flushed
folly::Future< bool > trigger_cp_flush(bool force = false);

const std::array< std::unique_ptr< CPCallbacks >, (size_t)cp_consumer_t::SENTINEL >& consumer_list() const {
Expand Down
14 changes: 13 additions & 1 deletion src/lib/checkpoint/cp_mgr.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -18,9 +18,10 @@
#include <homestore/homestore.hpp>
#include <homestore/meta_service.hpp>
#include <homestore/checkpoint/cp_mgr.hpp>

#include <homestore/homestore.hpp>
#include "common/homestore_assert.hpp"
#include "common/homestore_config.hpp"
#include "common/resource_mgr.hpp"
#include "cp.hpp"

namespace homestore {
Expand All @@ -35,6 +36,9 @@ CPManager::CPManager() :
[this](meta_blk* mblk, sisl::byte_view buf, size_t size) { on_meta_blk_found(std::move(buf), (void*)mblk); },
nullptr);

resource_mgr().register_dirty_buf_exceed_cb(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we wait for cp flush to complete in this exceeded size case ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cp manager controls this, right? If a cp is running, a back-2-back cp will be created.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: will dirty_buf_exceed_cb every time a new dirty_buf added after exceeding the limit?
Trying to avoid the case that we exceeded the dirty limit for a moment and a lot of unnecessary CP get triggered.

Copy link
Contributor Author

@yamingk yamingk Sep 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CP manager will garunteen only one cp can be running at anytime. e.g. if a cp is already running, it returns immediately. And if consumer asks for a forced trigger_cp, a back-2-back cp will be created.
If cp is slow or stuck (due to bugs or disk issues), it push the back pressure to consumers, e.g. no more dirty buffer can be allocated and push back I/O failures.

[this]([[maybe_unused]] int64_t dirty_buf_count) { this->trigger_cp_flush(false /* false */); });

start_cp_thread();
}

Expand All @@ -46,6 +50,11 @@ void CPManager::start(bool first_time_boot) {
create_first_cp();
m_sb.write();
}

LOGINFO("cp timer is set to {} usec", HS_DYNAMIC_CONFIG(generic.cp_timer_us));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is here instead of constructor to make some of the recover test happy.

m_cp_timer_hdl = iomanager.schedule_global_timer(
HS_DYNAMIC_CONFIG(generic.cp_timer_us) * 1000, true, nullptr /*cookie*/, iomgr::reactor_regex::all_worker,
[this](void*) { trigger_cp_flush(false /* false */); }, true /* wait_to_schedule */);
}

void CPManager::on_meta_blk_found(const sisl::byte_view& buf, void* meta_cookie) {
Expand All @@ -70,6 +79,9 @@ void CPManager::shutdown() {
m_wd_cp->stop();
m_wd_cp.reset();
}

iomanager.cancel_timer(m_cp_timer_hdl, true);
m_cp_timer_hdl = iomgr::null_timer_handle;
}

void CPManager::register_consumer(cp_consumer_t consumer_id, std::unique_ptr< CPCallbacks > callbacks) {
Expand Down
4 changes: 2 additions & 2 deletions src/lib/common/homestore_config.fbs
Original file line number Diff line number Diff line change
Expand Up @@ -116,8 +116,8 @@ table LogStore {
}

table Generic {
// blk alloc cp timer in us
blkalloc_cp_timer_us: uint64 = 60000000 (hotswap);
// cp timer in us
cp_timer_us: uint64 = 60000000 (hotswap);

// writeback cache flush threads
cache_flush_threads : int32 = 1;
Expand Down
2 changes: 1 addition & 1 deletion src/lib/common/resource_mgr.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ class RsrcMgrMetrics : public sisl::MetricsGroup {
~RsrcMgrMetrics() { deregister_me_from_farm(); }
};

typedef std::function< void(int64_t) > exceed_limit_cb_t;
typedef std::function< void(int64_t /* dirty_buf_cnt */) > exceed_limit_cb_t;
const uint32_t max_qd_multiplier = 32;

class ResourceMgr {
Expand Down