
[Enhancement]: improve ProtoBuf marshal and unmarshal with vtprotobuf #38847

Open

jaime0815 opened this issue Dec 30, 2024 · 4 comments

Labels: kind/enhancement (Issues or changes related to enhancement)

jaime0815 (Contributor) commented Dec 30, 2024
Is there an existing issue for this?

  • I have searched the existing issues

What would you like to be added?

See https://github.com/planetscale/vtprotobuf for details.
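For context, vtprotobuf is an extra protoc plugin that generates `MarshalVT`/`UnmarshalVT` fast paths alongside the standard Go bindings. A typical invocation looks roughly like the following (the `.proto` path is illustrative; see the vtprotobuf README for the authoritative flags):

```shell
# Install the code generator (illustrative; check the vtprotobuf README
# for the recommended version pinning).
go install github.com/planetscale/vtprotobuf/cmd/protoc-gen-go-vtproto@latest

# Generate the standard Go code plus the vtprotobuf fast paths.
# The pool feature mentioned below needs additional per-message options.
protoc \
  --go_out=. \
  --go-vtproto_out=. \
  --go-vtproto_opt=features=marshal+unmarshal+size \
  protos/insert_request.proto
```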

This benchmark for InsertRequest shows that UnmarshalVT is roughly 2.7x faster than the standard proto.Unmarshal (3.30 ms vs 8.88 ms per op), while MarshalVT is about 65% slower than proto.Marshal (2.87 ms vs 1.74 ms per op).

goos: darwin
goarch: amd64
pkg: vtprotobuf-bench
cpu: Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz
BenchmarkInsertRequest
BenchmarkInsertRequest/Marshal
BenchmarkInsertRequest/Marshal-8         	     703	   1739091 ns/op	   3078891 bytes
BenchmarkInsertRequest/MarshalVT
BenchmarkInsertRequest/MarshalVT-8       	     450	   2874546 ns/op	   3078891 bytes
BenchmarkInsertRequest/Unmarshal
BenchmarkInsertRequest/Unmarshal-8       	     180	   8882166 ns/op	   3078891 bytes
BenchmarkInsertRequest/UnmarshalVT
BenchmarkInsertRequest/UnmarshalVT-8     	     384	   3296314 ns/op	   3078891 bytes

Another key advantage is reduced CPU usage: in profiling, unmarshalling used roughly half the CPU time of the original and marshalling about 25% less. Using vtprotobuf's memory-pool feature would reduce CPU usage further.

[CPU profiling screenshots attached in the original issue]
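The memory-pool saving mentioned above can be sketched with a plain sync.Pool. Note that vtprotobuf generates pooling helpers for you when its pool feature is enabled; the type, field, and helper names below are illustrative, not vtprotobuf's actual generated API.

```go
package main

import (
	"fmt"
	"sync"
)

// InsertRequest stands in for the generated protobuf message; with
// vtprotobuf's pool feature enabled, equivalent helpers are generated
// for you (names and fields here are illustrative).
type InsertRequest struct {
	DbName   string
	HashKeys []uint32
}

var insertRequestPool = sync.Pool{
	New: func() any { return new(InsertRequest) },
}

// InsertRequestFromPool borrows a reset message from the pool.
func InsertRequestFromPool() *InsertRequest {
	return insertRequestPool.Get().(*InsertRequest)
}

// ReturnToPool clears the message while keeping slice capacity, then
// hands it back so the next unmarshal can reuse the allocation instead
// of allocating fresh backing arrays.
func (m *InsertRequest) ReturnToPool() {
	m.DbName = ""
	m.HashKeys = m.HashKeys[:0]
	insertRequestPool.Put(m)
}

func main() {
	req := InsertRequestFromPool()
	req.DbName = "db1"
	req.HashKeys = append(req.HashKeys, 1, 2, 3)
	fmt.Println(req.DbName, len(req.HashKeys))
	req.ReturnToPool() // slice capacity is reused on a later Get
}
```

The reset-before-Put step is what keeps pooled messages safe to reuse; the capacity retained in HashKeys is where the allocation saving comes from.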

Benchmark code snippet:

func getInsertReq() *protos.InsertRequest {
	return &protos.InsertRequest{
		DbName:         "db1",
		CollectionName: "col1",
		FieldsData:     []*protos.FieldData{NewFloatVectorFieldData("f1", 1000, 768)},
		HashKeys:       GenerateHashKeys(1000),
		NumRows:        uint32(1000),
	}
}

func BenchmarkInsertRequest(b *testing.B) {
	req := getInsertReq()
	bs, err := proto.Marshal(req)
	if err != nil {
		b.Fatal(err)
	}
	b.Run("Marshal", func(b *testing.B) {
		total := 0
		for i := 0; i < b.N; i++ {
			bs, err := proto.Marshal(req)
			total += len(bs)
			if err != nil {
				b.Fatal(err)
			}
		}
		b.ReportMetric(float64(total)/float64(b.N), "bytes")
	})

	b.Run("MarshalVT", func(b *testing.B) {
		total := 0
		for i := 0; i < b.N; i++ {
			bs, err := req.MarshalVT()
			total += len(bs)
			if err != nil {
				b.Fatal(err)
			}
		}
		b.ReportMetric(float64(total)/float64(b.N), "bytes")
	})

	b.Run("Unmarshal", func(b *testing.B) {
		total := 0
		for i := 0; i < b.N; i++ {
			var l protos.InsertRequest
			total += len(bs)
			if err := proto.Unmarshal(bs, &l); err != nil {
				b.Fatal(err)
			}
		}
		b.ReportMetric(float64(total)/float64(b.N), "bytes")
	})

	b.Run("UnmarshalVT", func(b *testing.B) {
		total := 0
		for i := 0; i < b.N; i++ {
			var l protos.InsertRequest
			total += len(bs)
			if err := l.UnmarshalVT(bs); err != nil {
				b.Fatal(err)
			}
		}
		b.ReportMetric(float64(total)/float64(b.N), "bytes")
	})
}

Why is this needed?

No response

Anything else?

No response

@jaime0815 jaime0815 added the kind/enhancement Issues or changes related to enhancement label Dec 30, 2024
@jaime0815 jaime0815 self-assigned this Dec 30, 2024
xiaofan-luan (Collaborator) commented:

https://github.com/planetscale/vtprotobuf

  1. How is it compatible with the current implementation?
  2. What is the size comparison between vtprotobuf and protobuf?

jaime0815 (Contributor, Author) commented:

> 1. How is it compatible with the current implementation?
> 2. What is the size comparison between vtprotobuf and protobuf?

  1. It is fully compatible with the current implementation, but the API requires changes.
  2. The serialized sizes remain the same after marshaling or unmarshaling.
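On the compatibility point, one low-risk migration path is to feature-detect the generated fast path at the call site and fall back to the standard marshaller when it is absent. A minimal sketch (the helper name is hypothetical, and JSON stands in for proto.Marshal so the example is self-contained):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// vtMarshaler matches the method set vtprotobuf-generated messages expose.
type vtMarshaler interface {
	MarshalVT() ([]byte, error)
}

// marshalAny prefers the generated fast path when present and falls back
// to a generic serializer otherwise (JSON stands in for proto.Marshal
// here), so existing callers need no per-message code changes.
func marshalAny(msg any) ([]byte, error) {
	if m, ok := msg.(vtMarshaler); ok {
		return m.MarshalVT()
	}
	return json.Marshal(msg)
}

// fastMsg simulates a message with vtprotobuf-generated methods.
type fastMsg struct{ Payload string }

func (m *fastMsg) MarshalVT() ([]byte, error) { return []byte(m.Payload), nil }

// plainMsg simulates a message without generated fast paths.
type plainMsg struct{ Payload string }

func main() {
	fast, _ := marshalAny(&fastMsg{Payload: "vt"})
	plain, _ := marshalAny(plainMsg{Payload: "std"})
	fmt.Printf("fast=%q plain=%q\n", fast, plain)
}
```

The type assertion costs a few nanoseconds per call, so the wrapper keeps one code path for both generated and non-generated messages during a gradual rollout.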

alexanderguzhva (Contributor) commented:

@jaime0815

  1. The results may be misleading because of the CPU type (cpu: Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz).
  2. It needs to be ensured that the most recent versions of both the baseline and candidate libraries are used. Is that requirement satisfied?

xiaofan-luan (Collaborator) commented:

> @jaime0815
> 1. The results may be misleading because of the CPU type (cpu: Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz).
> 2. It needs to be ensured that the most recent versions of both the baseline and candidate libraries are used. Is that requirement satisfied?

Good suggestion. We can test it on an R7gd instance. Ideally this won't make much difference, because the optimization is more on the data-structure side.

3 participants