Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iterator: unify behavior and implement prev and to_last to all Iterators #184

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

GanZiheng
Copy link
Contributor

@GanZiheng GanZiheng commented Jul 21, 2022

Signed-off-by: GanZiheng [email protected]

Unify all kinds of iterators which implement the AgateIterator trait.

Suppose we have another two extra elements in the data which the iterator iterates over, one is before the original first element, called before first, and another is after the last original element, namely after last.

  1. If we arrives the first element, and call prev, we will arrive before first, and valid will return false;
  2. If we arrives the last element, and call next, we will arrive after last, and valid will return false;
  3. At before first, if we call prev, nothing will change, and if we call next, we will move to the first element;
  4. At after last, if we call next, nothing will change, and if we call prev, we will move to the last element;

When entries are set but not committed yet, they will stay in pending_writes of the transaction, and if we build PendingWritesIterator to iterate over them, the timestamp of each entry is set to read timestamp of the transaction. So, we may get duplicate keys in MergeIterator of Iterator generated by Transaction.
However, the logic of prev in MergeIterator is complicated if there exists duplicated keys. So I also update the timestamp appended to key. When writing an entry, we will append commit timestamp * 10 to the key and we will use read timestamp * 10 + 5 to search an entry or build PendingWritesIterator.
Since all keys are different in PendingWritesIterator and other iterators respectively, to a specific key, there may be at most one pair of identical keys in all iterators. We can simplify logic of MergeIterator with this limitation.

* update prev and to last

Signed-off-by: GanZiheng <[email protected]>

* update

Signed-off-by: GanZiheng <[email protected]>

* add tests

Signed-off-by: GanZiheng <[email protected]>
…v` and `to_last` to `Iterator` (#5)

* simply update iterator and need refine

Signed-off-by: GanZiheng <[email protected]>

* update table iterator

Signed-off-by: GanZiheng <[email protected]>

* update skl iterator

Signed-off-by: GanZiheng <[email protected]>

* update concat iterator

Signed-off-by: GanZiheng <[email protected]>

* fix clippy

Signed-off-by: GanZiheng <[email protected]>

* update pending writes

Signed-off-by: GanZiheng <[email protected]>

* update merge iterator

Signed-off-by: GanZiheng <[email protected]>
@BusyJay
Copy link
Member

BusyJay commented Jul 21, 2022

Why not define seek_to_first and seek_to_last just like what rocksdb does?

@GanZiheng
Copy link
Contributor Author

GanZiheng commented Jul 21, 2022

Why not define seek_to_first and seek_to_last just like what rocksdb does?

Currently, rewind is the same as seek_to_first, and to_last added int this pull request is actually equivalent with seek_to_last. maybe I should rename these methods?

@codecov
Copy link

codecov bot commented Jul 21, 2022

Codecov Report

Merging #184 (b7138ea) into master (475e17b) will increase coverage by 1.19%.
The diff coverage is 98.55%.

@@            Coverage Diff             @@
##           master     #184      +/-   ##
==========================================
+ Coverage   89.32%   90.52%   +1.19%     
==========================================
  Files          39       39              
  Lines        8752     9487     +735     
==========================================
+ Hits         7818     8588     +770     
+ Misses        934      899      -35     

Signed-off-by: GanZiheng <[email protected]>
@BusyJay BusyJay changed the title iterator: uniform behavior and implement prev and to_last to all Iterators iterator: unify behavior and implement prev and to_last to all Iterators Jul 21, 2022
GanZiheng and others added 3 commits July 23, 2022 12:36
Signed-off-by: GanZiheng <[email protected]>
Signed-off-by: GanZiheng <[email protected]>
Signed-off-by: GanZiheng <[email protected]>
@GanZiheng GanZiheng marked this pull request as ready for review July 25, 2022 06:48
Signed-off-by: GanZiheng <[email protected]>
src/iterator.rs Outdated Show resolved Hide resolved
/// Just like `parse_item`, but used for `prev`.
///
/// This function advances the iterator.
fn parse_item_for_prev(&mut self) -> bool {
Copy link
Member

@Connor1996 Connor1996 Jul 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we reuse parse_item by passing a parameter to indicate the direction?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can combine parse_item and parse_item_for_prev once the logic is reviewed.

src/iterator.rs Outdated
let key_with_ts = key_with_ts(key, self.item().version);

// TODO: Consider add direction flag to avoid seek.
self.table_iter.seek(&key_with_ts);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why seek

Copy link
Contributor Author

@GanZiheng GanZiheng Jul 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since parse_item and parse_item_for_prev would advance and retreat the table_iter inside the Iterator respectively, what we should do in case of next after next and prev after next are different. We may add flag or update parse_item and parse_item_for_prev to avoid seek.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a flag current_direction to indicate the "direction" of the iterator. If the last operation is prev, it will be set to false, and we will advance the inner iterator double times in that case if we call next subsequently.

Signed-off-by: GanZiheng <[email protected]>
@skyzh
Copy link
Member

skyzh commented Jul 27, 2022

Generally LGTM. But I wonder if there is any use case inside AgateDB that we will need to do both forward iterating and backward iterating on a single iterator? If not, we can use const generics to implement forward/backward iterator using one piece of code. It will also simplifies the case on how to seek.

@skyzh
Copy link
Member

skyzh commented Jul 27, 2022

@GanZiheng
Copy link
Contributor Author

Generally LGTM. But I wonder if there is any use case inside AgateDB that we will need to do both forward iterating and backward iterating on a single iterator? If not, we can use const generics to implement forward/backward iterator using one piece of code. It will also simplifies the case on how to seek.

Previously, AgateDB has both forward and reverse iterators, but once the kind of iterator is determined, it could only go ahead but cannot go back. Current I support both forward and reverse iterator to go back.

@skyzh
Copy link
Member

skyzh commented Jul 27, 2022

Generally LGTM. But I wonder if there is any use case inside AgateDB that we will need to do both forward iterating and backward iterating on a single iterator? If not, we can use const generics to implement forward/backward iterator using one piece of code. It will also simplifies the case on how to seek.

Previously, AgateDB has both forward and reverse iterators, but once the kind of iterator is determined, it could only go ahead but cannot go back. Current I support both forward and reverse iterator to go back.

Yes, so I'm wondering whether this is necessary... I can hardly come up with a use case that users want to change direction on-the-fly.

@GanZiheng
Copy link
Contributor Author

Generally LGTM. But I wonder if there is any use case inside AgateDB that we will need to do both forward iterating and backward iterating on a single iterator? If not, we can use const generics to implement forward/backward iterator using one piece of code. It will also simplifies the case on how to seek.

Previously, AgateDB has both forward and reverse iterators, but once the kind of iterator is determined, it could only go ahead but cannot go back. Current I support both forward and reverse iterator to go back.

Yes, so I'm wondering whether this is necessary... I can hardly come up with a use case that users want to change direction on-the-fly.

In fact, that's because RocksDB supports it and TiKV needs it 🤣.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants