Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sqlite database framework and simple tests #277

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

dailinsubjam
Copy link
Contributor

Closes #184

This PR:

  • Add sqlite database framework by adding basic function like new(), load(), append(), remove()
  • Add tests for these functions

This PR does not:

  • Connect to a real-world database in a reliable way yet, now it can only be tested out locally by creating a database file locally.
  • Interact with real transaction submitted to builder yet, this will be done in the next PR.
  • Add txn_status yet.
  • Touch other types of database(e.g. postgres) yet, this might be touched in future PR.

Key places to review:

All changes.

How to test this PR:

You can run these tests locally after creating transactions.db under shared/src/persistence/test_data/sqlite:
cargo test test_persistence_append_and_load_txn -- --nocapture and cargo test test_persistence_remove_txn -- --nocapture

@dailinsubjam dailinsubjam requested review from shenkeyao and QuentinI and removed request for shenkeyao December 9, 2024 12:02
Comment on lines 9 to 11
async fn append(&self, tx_data: Vec<u8>) -> Result<(), sqlx::Error>;
async fn load(&self, timeout_after: Instant) -> Result<Vec<Vec<u8>>, sqlx::Error>;
async fn remove(&self, tx: Vec<u8>) -> Result<(), sqlx::Error>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't these functions be public? Also, can we add documentation for them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the trait BuilderPersistence is public these functions should be public automatically, correct me if I'm wrong. But this is a good reminder that clear() and new() should also be public. Also added more comments.

Comment on lines 15 to 16
// Sishan TODO: make it more clean and find a more reliable way
let current_dir = env::current_dir().expect("Failed to get current working directory");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this TODO isn't going to be resolved in this PR, can we make a GH issue and add the link here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be resolved now!

Comment on lines +13 to +15
pub struct SqliteTxnDb {
pool: SqlitePool,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Are we going to add more fields to this struct? If not, can we make it SqliteTxnDb(SqlitePool)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future we're going to add some pruning parameters here like minimum retention or target retention. It will be expanded later but I don't have a strong opinion for now.

Comment on lines 17 to 18
#[allow(dead_code)]
async fn new(database_url: String) -> Result<Self, sqlx::Error> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this constructor and fn clear both be public? Then we can remove #[allow(dead_code)].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's the reason!

Comment on lines 60 to 63
// Convert Instant to SystemTime
let now = SystemTime::now();
let elapsed = timeout_after.elapsed();
let target_time = now - elapsed;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to directly convert timeout_after from Instant to SystemTime without doing now - elapsed? The current conversion seems a bit redundant, and may even be inaccurate because it relies on an additional now time.

Copy link
Contributor Author

@dailinsubjam dailinsubjam Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be the most straightforward way as far as I know. The reason for this is that Instant represents a point in time relative to an unspecified reference (it doesn't have a fixed "epoch" like SystemTime), while SystemTime represents an absolute point in time from the UNIX epoch. So without using now - elapsed, we want to first obtaining the system time at the moment of the Instant's creation.

Ok(())
}

async fn load(&self, timeout_after: Instant) -> Result<Vec<Vec<u8>>, sqlx::Error> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the parameter a bit confusing. Can we call this before to indicate we want transactions before the given timestamp? I don't think the function itself cares whether the input time is relevant to a timeout or not.

Besides the argument name, I was also wondering whether this function would be what we want. I thought we would use load to get transactions after a certain timestamp, not before. See my comment on the test about why the current implementation doesn't seem to provide all the information we want.

Copy link
Contributor Author

@dailinsubjam dailinsubjam Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about the naming thing, I just realized it's really confusing, updated to before_instant. For the use case of the function load, see my reply on the test.

Comment on lines 142 to 151
// Append one more transaction
db.append(vec![7, 8, 9]).await.expect("In test_persistence_append_and_load_txn, there shouldn't be any error when doing append");

// Load transactions before timeout_after
let tx_data_list = db.load(timeout_after).await.expect(
"In test_persistence_append_and_load_txn, it should be able to load some transactions.",
);
tracing::debug!("Transaction data before timeout: {:?}", tx_data_list);

assert_eq!(tx_data_list, test_tx_data_list);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use case seems a bit off or incomplete to me. If we append a transaction after the timeout, don't we want a load function to provide us with the new transaction? In this example, it's not clear how the builder will eventually submit [1, 2, 3], [4, 5, 6], and [7, 8, 9]. After it gets [1, 2, 3], [4, 5, 6] by calling load, is the builder going to call load again to get [7, 8, 9]?

I was thinking load would give us all three transactions at once, or even if not by one call, there would be a convenient way to get all transactions, but this test seems to suggest that we don't need [7, 8, 9].

Copy link
Contributor Author

@dailinsubjam dailinsubjam Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

load(self, instant) would give us all the transactions created at some point before the given instant. Sorry about the confusion from naming. The test is just to check load could behave correctly, i.e. if we have some transactions submitted after the given instant, they shouldn't be returned, and only those submitted before should be returned.

load(self, instant) is designed this way because we have a function collect_txns(timeout_after) to collect all the submitted txn till an instant. Now I think for database it might be the same to return all the transactions, as the only timeout_after we'll set is near future, and this is our only use case. I'd vote to remove it if there's more overhead due to this argument or else I'd keep it. How do you think?


// Load all transactions
let mut all_transactions = db
.load(Instant::now())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to my comment on load about the argument name: Here Instant::now() isn't a timeout timestamp at all, so the argument name shouldn't be tied with the timeout concept.

"get_sqlite_test_db_path() = {:?}",
get_sqlite_test_db_path()
);
let db = SqliteTxnDb::new(get_sqlite_test_db_path()).await.expect(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this create temporary files? If any assertion fails before the test completes, will those files be deleted?

Also, will it be better if we add an index or test name, etc. to distinguish the get_sqlite_test_db_path calls in tests and avoid race conditions (is it necessary?)? E.g., We can call get_sqlite_test_db_path("test_append_and load") in this function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is updated to use the crate tempfile which will create a new temporary directory on the filesystem and the directory will be automatically deleted when it goes out of scope (if the returned TempDir value is dropped).

@coveralls
Copy link

coveralls commented Dec 10, 2024

Coverage Status

coverage: 90.884% (+0.08%) from 90.807%
when pulling 42c1fb8 on sishan/introduce_sqlite
into 7df48a9 on main.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Persistence Mempool] - Add database framework and simple tests
3 participants