Support for specifying an array of metadata objects to use for the outgoing requests #234

Kami · 2020-10-20T15:10:48Z

This pull request updates the code so that, in addition to specifying a single metadata object which is used for all outgoing gRPC calls / requests, the user can also specify a list of metadata objects to use for the outgoing requests.

This allows people to use different metadata for different requests. It's similar to the change implemented in #87, but this one is for the actual request metadata and not for the payload / body / data.

Background, Context, Rationale

#87 implemented support which allows users to specify different messages for unary calls. Those messages are then used in a round robin fashion for the outgoing requests (thanks to @ezsilmar for implementing that).

That functionality comes in handy in many scenarios where sending the same data with every request will not produce a representative result (e.g. due to the server logic, caching or similar).

In our case, actual processing performed by the server also depends on the metadata field values.

This means that if we want to get a representative result when performing a simple gRPC server-level benchmark using this tool, we need to send different (and specific) metadata for each outgoing request.

And in that specific case we can't utilize the call template data functionality, since that data is not available in the template context.

Proposed Implementation

This PR implements a change which allows the user to specify either a single object or an array of objects for the metadata argument.

If an array is specified, we will use different metadata values for different outgoing requests in a round-robin fashion.
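
To illustrate the round-robin selection described above, here is a minimal sketch. It is not the code from this PR: `pickMetadata` is a hypothetical helper, and it assumes the metadata has already been parsed into a slice of string maps and that each request has a 0-based counter.

```go
package main

import "fmt"

// pickMetadata returns the metadata for the n-th request (0-based),
// cycling through the configured list. Hypothetical helper for illustration.
func pickMetadata(all []map[string]string, n uint64) map[string]string {
	if len(all) == 0 {
		return nil // no metadata configured
	}
	return all[n%uint64(len(all))]
}

func main() {
	mds := []map[string]string{
		{"tenant": "a"},
		{"tenant": "b"},
		{"tenant": "c"},
	}
	// Five requests over three metadata objects: the fourth request wraps
	// around to the first object again.
	for n := uint64(0); n < 5; n++ {
		fmt.Println(n, pickMetadata(mds, n)["tenant"])
	}
}
```

With a single-element list this degenerates to the existing behavior, where every request gets the same metadata.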

This follows a similar approach to the one already used for the actual messages, which was implemented in #86.

The proposed implementation in this PR is just a quick WIP version of this change.

If other people agree this is indeed a good feature / functionality to have, I will clean it up and finish it.

Open Questions

Making sure the change is fully backward compatible (i.e. that both approaches, a single object and an array of objects, are supported) is pretty straightforward when metadata is either specified directly as a command line argument or read from a file.

This becomes more problematic though when reading configuration from a JSON or TOML configuration file.

One approach I could think of is to "rewrite" the config on load and ensure the metadata field value is always an array. This approach is somewhat fragile and not ideal, so I wonder if there is a better way to handle that (suggestions and feedback are welcome).

Another approach is to have two Config struct definitions, one for the map and one for the array of maps, and then internally, after parsing, always convert that field to an array.

EDIT: I pushed a change so we use the same generic approach as we use for the Data argument - this way we can handle this change config wise in a fully backward compatible manner.
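
For readers wondering what accepting both notations in a config file can look like, below is one possible sketch using a custom `UnmarshalJSON` method. The names `MetadataList`, `config` and `parseConfig` are hypothetical, the sketch covers JSON only, and it is not necessarily how this PR or the existing Data handling is implemented.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// MetadataList is a hypothetical type that accepts either a single JSON
// object or an array of objects and always stores a slice.
type MetadataList []map[string]string

// UnmarshalJSON implements json.Unmarshaler.
func (m *MetadataList) UnmarshalJSON(b []byte) error {
	b = bytes.TrimSpace(b)
	if len(b) > 0 && b[0] == '[' {
		// Array notation: decode directly into a slice of maps.
		var many []map[string]string
		if err := json.Unmarshal(b, &many); err != nil {
			return err
		}
		*m = many
		return nil
	}
	// Object notation: decode one map and wrap it in a slice.
	var one map[string]string
	if err := json.Unmarshal(b, &one); err != nil {
		return err
	}
	if one == nil { // JSON null: treat as "no metadata"
		*m = nil
		return nil
	}
	*m = MetadataList{one}
	return nil
}

// config is a stand-in for a real configuration struct.
type config struct {
	Metadata MetadataList `json:"metadata"`
}

// parseConfig decodes a JSON config document.
func parseConfig(s string) (config, error) {
	var c config
	err := json.Unmarshal([]byte(s), &c)
	return c, err
}

func main() {
	for _, s := range []string{
		`{"metadata": {"trace": "1"}}`,
		`{"metadata": [{"trace": "1"}, {"trace": "2"}]}`,
	} {
		c, err := parseConfig(s)
		fmt.Println(len(c.Metadata), err)
	}
}
```

The rest of the code then only ever sees a slice, so existing configs with a single object keep working unchanged.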

TODO

- Finish the implementation
- Make sure the change is fully backward compatible (ensure we support both notations, a single object or an array of objects, for metadata specified either via command line or config file)
- Clean up the code
- Tests
- Update docs
- Update examples
- Update readme

Kami · 2020-10-21T10:20:47Z

Alright I pushed additional changes:

- The change is now fully backward compatible no matter how you specify the data (command line argument with actual values, command line argument with file path, json / yaml / toml config file)
- Added end to end tests
- Updated readme, docs and examples

I believe functionality wise this should be more or less complete; I just need to clean up and refactor some duplicated code.

If I missed something or some place, please let me know.

bojand · 2020-10-22T21:17:08Z

cmd/ghz/main.go

+			if err := json.Unmarshal([]byte(*md), &metadataArray); err != nil {
+				return fmt.Errorf("Error unmarshaling metadata '%v': %v", *md, err.Error())
+			}
+		}


With respect to determining the underlying type we are trying to Unmarshal, instead of relying on strings.Contains() of the error message, I think we can take advantage of the actual UnmarshalTypeError. The approach may be a little bit more reliable. Example of what I mean:

md := `["foo", "bar"]`
var metadataMap map[string]string
if err := json.Unmarshal([]byte(md), &metadataMap); err != nil {
	if e, ok := err.(*json.UnmarshalTypeError); ok && e.Value == "array" {
		fmt.Println("trying to Unmarshal array into map", e)
	}
} else {
	fmt.Println(metadataMap)
}
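
Building on that idea, a complete helper might look like the sketch below. `parseMetadata` is a hypothetical name, not a function from this PR; the sketch uses `errors.As` instead of a direct type assertion and falls back to array decoding only when the decoder reports that it found an array where it expected an object.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// parseMetadata tries to decode s as a single object first and falls back
// to an array of objects when the decoder reports an array type mismatch.
// Hypothetical helper for illustration.
func parseMetadata(s string) ([]map[string]string, error) {
	var one map[string]string
	err := json.Unmarshal([]byte(s), &one)
	if err == nil {
		return []map[string]string{one}, nil
	}

	var typeErr *json.UnmarshalTypeError
	if errors.As(err, &typeErr) && typeErr.Value == "array" {
		var many []map[string]string
		if err := json.Unmarshal([]byte(s), &many); err != nil {
			return nil, fmt.Errorf("error unmarshaling metadata array: %w", err)
		}
		return many, nil
	}
	// Syntax errors and other type mismatches are reported as-is.
	return nil, fmt.Errorf("error unmarshaling metadata: %w", err)
}

func main() {
	for _, s := range []string{
		`{"trace": "1"}`,
		`[{"trace": "1"}, {"trace": "2"}]`,
		`["foo", "bar"]`,
	} {
		mds, err := parseMetadata(s)
		fmt.Println(len(mds), err)
	}
}
```

The last input is an array, so the fallback is attempted, but its elements are strings rather than objects and the second decode returns an error.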

bojand · 2020-10-22T21:21:54Z

runner/call_template_data.go

+// an array. If the input is an object, but not an array, it's converted to an array.
+func (td *callTemplateData) executeMetadataArray(metadata string) ([]map[string]string, error) {
+	var mdArray []map[string]string
+	var metadataSanitized = strings.TrimSpace(metadata)


Maybe I am missing something, but can we ensure that the metadata string is sanitized / trimmed earlier, wherever we set it as w.config.metadata? That way we do not have to trim on every invocation of executeMetadataArray().

bojand · 2020-10-22T21:23:00Z

runner/call_template_data.go

+	var metadataSanitized = strings.TrimSpace(metadata)
+
+	// If the input is an object and not an array, we ensure we always work with an array
+	if !strings.HasPrefix(metadataSanitized, "[") && !strings.HasSuffix(metadataSanitized, "]") {


Probably not a big deal, but should this go inside the if len(metadata) > 0 { check?

bojand · 2020-10-22T21:25:11Z

runner/config.go

+							for k, v := range objData3 {
+								sk, isString := k.(string)
+								if !isString {
+									return errors.New("Data key must string")


I think the errors here should reference "Metadata key..."?

bojand · 2020-10-22T21:32:35Z

runner/config.go

+					var array []map[string]interface{}
+					for _, item := range arrData {
+						objData3, isObjData3 := item.(map[interface{}]interface{})
+						newItem := make(map[string]interface{})


I believe a valid metadata object has to be map[string]string, so here we should be asserting that both the key and the value in the internal representation of the objects in the array are strings, and then converting them to that type?
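
A sketch of what such a conversion could look like is below. `toStringMap` is a hypothetical helper, not code from this PR; it assumes the decoded config yields `map[interface{}]interface{}` values, as in the diff above, and it uses "Metadata"-specific error messages.

```go
package main

import (
	"errors"
	"fmt"
)

// toStringMap converts a generically decoded map into map[string]string,
// rejecting any key or value that is not a string. Hypothetical helper.
func toStringMap(in map[interface{}]interface{}) (map[string]string, error) {
	out := make(map[string]string, len(in))
	for k, v := range in {
		sk, ok := k.(string)
		if !ok {
			return nil, errors.New("metadata key must be a string")
		}
		sv, ok := v.(string)
		if !ok {
			return nil, fmt.Errorf("metadata value for key %q must be a string", sk)
		}
		out[sk] = sv
	}
	return out, nil
}

func main() {
	md, err := toStringMap(map[interface{}]interface{}{"trace-id": "abc"})
	fmt.Println(md, err)

	// A numeric value is rejected instead of being silently kept.
	_, err = toStringMap(map[interface{}]interface{}{"retries": 3})
	fmt.Println(err)
}
```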

bojand · 2020-10-22T21:37:21Z

Hello, thank you for the PR and the detailed description! I understand the problem, and I think this is a sensible approach. I will probably need another read through, but I have left a few comments to double check on from my initial review. I'll probably re-read again when I get some more time. Once we get these addressed and when you feel you've cleaned up the code we can get this merged. Thanks again!

Kami · 2021-01-06T12:19:42Z

@bojand Sorry for the delay - I somehow missed your review feedback.

I'll try to address this review feedback this weekend (I've been using those changes for a while now, but I do agree that it needs more clean up, etc. :)).

Kami · 2021-01-06T18:52:26Z

runner/call_template_data.go

+
+	if len(metadata) > 0 {
+		input := []byte(metadata)
+		tpl, err := td.execute(metadata)


@bojand I started looking at and profiling this code again, and I think it would be good to add another optimization here so we don't re-render / evaluate the template and unmarshal the string for every single request.

That's especially important when using large metadata JSON objects like in my case.

Basically, I'm looking at adding a new flag (e.g. --plaintext-metadata, open to better naming). When this flag is used, we don't evaluate metadata as a template; instead we cache it on the worker object and re-use it for subsequent requests, similar to what we do with w.cachedMessages.

This should substantially speed things up and reduce memory usage (I'm just testing this change to confirm that).

WDYT?

Here we go, here is a quick WIP version - d586c35.

I confirmed that using the --plaintext-metadata flag is much more efficient when working with large metadata JSON objects / arrays (like in my case) and results in lower worker CPU usage.

Which is kinda expected, because trying to render the template + parsing JSON for every single request will never be efficient when working with large metadata objects.

When this flag is used, we don't render the metadata JSON as a template and parse it for every single request; we only do that once and re-use the cached version for subsequent requests. This results in much less overhead when the metadata template functionality is not needed and large metadata objects are used.
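
The caching idea can be sketched roughly as follows. This is not the code from d586c35: the `worker` type, its fields and `metadataFor` are hypothetical stand-ins, and the sketch assumes the raw metadata has already been normalized to a JSON array of objects.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// worker is a hypothetical stand-in for the tool's worker type.
type worker struct {
	rawMetadata string // JSON array of objects, assumed already normalized
	once        sync.Once
	cached      []map[string]string
	cacheErr    error
	parses      int // number of JSON parses, tracked for illustration only
}

// metadataFor returns the metadata for request n (0-based). The JSON is
// parsed on the first call only; later calls re-use the cached slice.
func (w *worker) metadataFor(n uint64) (map[string]string, error) {
	w.once.Do(func() {
		w.parses++
		w.cacheErr = json.Unmarshal([]byte(w.rawMetadata), &w.cached)
	})
	if w.cacheErr != nil || len(w.cached) == 0 {
		return nil, w.cacheErr
	}
	return w.cached[n%uint64(len(w.cached))], nil
}

func main() {
	w := &worker{rawMetadata: `[{"id": "1"}, {"id": "2"}]`}
	for n := uint64(0); n < 4; n++ {
		md, _ := w.metadataFor(n)
		fmt.Println(md["id"])
	}
	fmt.Println("parses:", w.parses)
}
```

Note that the cached maps are shared between requests in this sketch, so callers must treat them as read-only.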

Kami added 6 commits October 20, 2020 16:54

- ff8c81c: Add WIP change which allows user to specify a list of metadata objects which will be used for subsequent requests in a round-robin fashion (in the same way we handle the actual payload / body if multiple protobuf objects are specified).
- 439f571: Update config loading code and ensure we handle both supported notations for the metadata config option - either an object or an array of objects.
- 51f3353: Fix failing test.
- d294b1b: Immediately propagate fatal errors when trying to de-serialize metadata JSON string into an object.
- a59228f: Update readme, command line argument description and examples.
- f46a37e: Add "end to end" tests for variable meta data round robin functionality.

Kami changed the title ~~[RFC] [WIP] Support for specifying an array of metadata objects to use for the outgoing requests~~ Support for specifying an array of metadata objects to use for the outgoing requests Oct 21, 2020
