Add fast lookup feature spec #127

Closed
31 changes: 31 additions & 0 deletions docs/specs/lookup.md
@@ -0,0 +1,31 @@
# Product Requirements Specification

Implement fast lookup via primary key or secondary indices

## Problem:

Some big-data use cases benefit from fast, index-based lookup. In stream processing in particular,
lookup latency has a significant impact on overall throughput. TiBigData currently implements only
full table scans, which makes it suboptimal for workloads that require low latency. For this reason,
flink-tidb-connector uses a JDBC lookup table source instead. Flink, like other big-data computation
frameworks, is itself a computing layer, so sending lookup requests to the TiDB server, which is also
a computing layer, is redundant and wastes resources. Going through the TiDB server also increases
latency because of the extra hops in the access path.

## Goals:

Implement primary key and secondary index based lookup in TiBigData. Reduce lookup latency on a
best-effort basis and avoid wasting the resources of load balancers and TiDB servers.

## Solutions:

1. Extract columns from predicates
2. Try to match columns with PK and secondary indices
3. Get row handle from PK or secondary index
4. Get row data by handle, or by PK if the table uses a clustered index
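
The four steps above can be sketched roughly as follows. This is an illustration only, not TiBigData code: it is written in Python for brevity, the names `Table`, `Index`, `NoIndexMatch`, and `lookup` are hypothetical, and plain dictionaries stand in for the key-value store. It assumes that only equality predicates on a primary key or a unique index qualify for a point lookup, and that a non-clustered table resolves the primary key to a row handle first.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Predicate = Tuple[str, str, object]  # (column, operator, value)


class NoIndexMatch(Exception):
    """No primary key or unique index is fully covered by the predicates."""


@dataclass(frozen=True)
class Index:
    name: str
    columns: Tuple[str, ...]


@dataclass
class Table:
    primary_key: Tuple[str, ...]
    clustered: bool
    unique_indices: List[Index] = field(default_factory=list)
    # Hypothetical in-memory stand-ins for the key-value store.
    # rows is keyed by row handle, or by the PK tuple for a clustered table.
    rows: Dict[object, Dict[str, object]] = field(default_factory=dict)
    # index_entries maps (index name, key values) to a row handle.
    index_entries: Dict[Tuple[str, tuple], object] = field(default_factory=dict)


def extract_equalities(predicates: List[Predicate]) -> Dict[str, object]:
    """Step 1: collect the columns constrained by an equality predicate."""
    return {col: val for col, op, val in predicates if op == "="}


def choose_access_path(table: Table, eq: Dict[str, object]) -> Optional[Tuple[str, Tuple[str, ...]]]:
    """Step 2: prefer the primary key, then the first fully covered unique index."""
    if table.primary_key and all(c in eq for c in table.primary_key):
        return "PRIMARY", table.primary_key
    for idx in table.unique_indices:
        if all(c in eq for c in idx.columns):
            return idx.name, idx.columns
    return None


def lookup(table: Table, predicates: List[Predicate]) -> Optional[Dict[str, object]]:
    eq = extract_equalities(predicates)
    path = choose_access_path(table, eq)
    if path is None:
        raise NoIndexMatch("caller should fall back to a full table scan")
    name, cols = path
    key = tuple(eq[c] for c in cols)
    if name == "PRIMARY" and table.clustered:
        # Step 4, clustered case: the PK itself addresses the row.
        row = table.rows.get(key)
    else:
        # Step 3: resolve the row handle, then step 4: fetch the row by handle.
        handle = table.index_entries.get((name, key))
        row = table.rows.get(handle) if handle is not None else None
    # Equality predicates not covered by the chosen index still have to hold.
    # Non-equality predicates are ignored in this sketch.
    if row is not None and all(row.get(c) == v for c, v in eq.items()):
        return row
    return None


if __name__ == "__main__":
    users = Table(
        primary_key=("id",),
        clustered=False,
        unique_indices=[Index("uk_email", ("email",))],
        rows={1: {"id": 10, "email": "a@example.com"}},
        index_entries={("PRIMARY", (10,)): 1, ("uk_email", ("a@example.com",)): 1},
    )
    print(lookup(users, [("email", "=", "a@example.com")]))
```

Raising a dedicated exception when no index matches keeps "no suitable index" distinct from "index matched but no row found", so a caller can decide whether to fall back to a scan.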

## What reports do we need:

* Latency benchmark for native point get and point get via TiDB server.
* TiDB server CPU usage.
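
A latency report of this kind could be collected with a small harness along the following lines. This is a minimal sketch under stated assumptions: `measure`, `percentile`, and `summarize` are hypothetical helpers, the two operations being compared (native point get and point get via TiDB server) would be passed in as zero-argument callables that are not shown here, and percentiles use the nearest-rank method. TiDB server CPU usage would have to come from separate monitoring and is not covered.

```python
import time
from typing import Callable, Dict, List


def measure(op: Callable[[], object], iterations: int, warmup: int = 0) -> List[int]:
    """Run op repeatedly and return one latency sample per iteration, in nanoseconds."""
    for _ in range(warmup):
        op()  # warm-up runs are executed but not recorded
    samples = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        op()
        samples.append(time.perf_counter_ns() - start)
    return samples


def percentile(samples: List[int], p: int) -> int:
    """Nearest-rank percentile for an integer p in the range 1..100."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be between 1 and 100")
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100  # integer ceiling of p * n / 100
    return ordered[rank - 1]


def summarize(samples: List[int]) -> Dict[str, int]:
    """Reduce raw samples to the figures a latency report would typically list."""
    return {
        "p50": percentile(samples, 50),
        "p95": percentile(samples, 95),
        "p99": percentile(samples, 99),
        "max": max(samples),
    }
```

Running `summarize(measure(op, iterations=10000, warmup=1000))` once per access path, with the same keys and the same cluster, would give two directly comparable sets of figures; the iteration counts here are placeholders.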