This is an uber-ticket to track work for CST validation when converting (both to and from) between Recap schemas and other serdes (Avro, JSON Schema, Proto).
The idea is to define a test directory like:
/tests/converters/0000
    from.avsc
    from.proto
    from.json
    recap.yaml
/tests/converters/0001
    to.avsc
    to.proto
    to.json
    recap.yaml
/tests/converters/0002
    to.avsc
    to.proto
    to.json
    recap.yaml
    from.avsc
The test harness would iterate over every test subdirectory and verify that running a serde's converter together with types.py's to_dict/from_dict produces the expected structures.
For example, in 0000, the test harness would load the from.avsc into a dict, then use AvroConverter.to_recap to convert the Avro schema to a Recap StructType (the Recap AST). It'd then use recap.types.to_dict to convert the Recap AST to the Recap CST (dict). Finally, it'd load the recap.yaml and compare it to the Recap CST--they should be identical. It'd then do the same thing for the .proto and .json files.
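A minimal sketch of that check, assuming AvroConverter lives under recap.converters.avro, that to_recap accepts the parsed Avro schema dict (it may instead take the raw JSON string), and that recap.types.to_dict returns a plain dict; the module paths and signatures here are assumptions, not the final API:

```python
# Sketch of the "to Recap" half of the harness for an Avro fixture (e.g. 0000).
import json
from pathlib import Path

import pytest
import yaml

from recap.converters.avro import AvroConverter  # assumed module path
from recap.types import to_dict

TEST_ROOT = Path(__file__).parent

# Every test directory that carries an Avro "from" fixture.
FROM_AVRO_DIRS = sorted(d for d in TEST_ROOT.iterdir() if (d / "from.avsc").exists())


@pytest.mark.parametrize("test_dir", FROM_AVRO_DIRS, ids=lambda d: d.name)
def test_avro_to_recap(test_dir: Path):
    avro_schema = json.loads((test_dir / "from.avsc").read_text())
    expected_cst = yaml.safe_load((test_dir / "recap.yaml").read_text())

    recap_ast = AvroConverter().to_recap(avro_schema)  # Avro -> Recap AST (StructType)
    actual_cst = to_dict(recap_ast)                    # Recap AST -> Recap CST (dict)

    assert actual_cst == expected_cst
```

The same pattern would repeat for the .proto and .json fixtures with their respective converters.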
In the second directory example it'd do the inverse. In the third directory example (0002) it'd do both to and from checks for the Avro schemas. The reason for splitting to/from is that you might not get the same schema when converting to Recap as when converting from Recap since some conversions are lossy or require coercion.
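The inverse direction for a to.avsc fixture (0001) would look similar. This sketch reuses the imports above, assumes recap.types.from_dict rebuilds the AST from the CST, and assumes a hypothetical from_recap method on the converter for going back to Avro:

```python
# Sketch of the "from Recap" half for an Avro "to" fixture (e.g. 0001).
from recap.types import from_dict


def check_recap_to_avro(test_dir: Path):
    recap_cst = yaml.safe_load((test_dir / "recap.yaml").read_text())
    expected_avro = json.loads((test_dir / "to.avsc").read_text())

    recap_ast = from_dict(recap_cst)                     # Recap CST -> Recap AST
    actual_avro = AvroConverter().from_recap(recap_ast)  # hypothetical reverse converter
    assert actual_avro == expected_avro
```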
I think building out a test suite like that will be very valuable. There are a lot of good ideas expressed here, though it was a little hard for me to parse. I think you should put more explicit meaning into the names and directories of the tests, and group the tests semantically (a possible layout is sketched after this list):

- Rather than "from" vs "to" tests, I think using input, output, and expected in the file names inside the directory will make it clearer.
- The directory name should identify which conversion direction is happening.
- Use .schema.json to differentiate JSON Schema files from plain .json.
- Instead of numbers, name tests based on what they're trying to test. I'm not so convinced on this one; it may be worth just putting a readme.md in each test directory to explain what each test is trying to cover.
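For instance, a directory following those conventions might look something like this (the direction names and test names here are only illustrative, not a final proposal):

    /tests/converters/avro_to_recap/enum_with_default
        input.avsc
        expected.recap.yaml
        readme.md
    /tests/converters/recap_to_json_schema/nested_structs
        input.recap.yaml
        expected.schema.json
        readme.md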
I like all your suggestions. I'll incorporate them.
Regarding the third test's semantics, what I was trying to emphasize is that a single test could validate both avsc -> Recap and Recap -> avsc. The first two tests only validate one direction or the other. I don't think this is strictly required, since you could just have two tests, but I thought it might be more convenient.
@mjperrone LMK if this makes sense.