The Cassandra Sink Connector Kamelet descriptor is configured by default with a Jackson unmarshalling step.
The implicit unmarshalling is problematic because, in my opinion, it doesn't really fit the Kafka Connect (converter/transforms) data flow: it requires additional serialization/deserialization steps just for the Apache Camel pipeline.
In the case of the Cassandra connector specifically, the problem is more apparent as the JSON array format is not usable for the CQL statement builder.
The core problem with the JSON format is data-type ambiguity: in Cassandra you might have an int64 column, but in JSON it's just a number that can be deserialized as either int32 or int64, depending on the value. There are further problems, for example how to pass a timestamp value through JSON in a reliable way.
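To illustrate the ambiguity (the field name below is a hypothetical example, not taken from any real schema): the two payloads are structurally identical schema-less JSON, yet a naive deserializer would infer int32 for the first value and int64 for the second, even if the target CQL column is bigint in both cases:

```json
{"user_id": 42}
{"user_id": 3000000000}
```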
Normally, in Kafka Connect, you would use transformation plugins to cast the values to specific types, or use the Struct data type instead of schema-less JSON.
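As a sketch of that usual approach, a sink connector configuration can apply Kafka Connect's built-in Cast single-message transform; the field names and target types below are hypothetical:

```properties
# Hypothetical example: cast ambiguous JSON numbers to concrete Connect types
# before the sink connector sees the record value.
transforms=castTypes
transforms.castTypes.type=org.apache.kafka.connect.transforms.Cast$Value
transforms.castTypes.spec=user_id:int64,created_at:int64
```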
But because of the Jackson unmarshalling step, the Connect record value, even when it already carries the right types, must first be marshalled back into JSON, which reintroduces the type-ambiguity problem described above.
IMHO the default Kamelet definitions for all connectors should be stripped of any implicit marshalling or unmarshalling steps.
Alternatively, there should be a way to remove this step by means of connector configuration.
It would be a breaking change indeed, so I understand the hesitation.
As a more incremental solution, I would at least appreciate an option to disable the default unmarshalling via configuration, for example something like: camel.sink.unmarshal: none.
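Concretely, the proposal above might look like this in a connector properties file; note that camel.sink.unmarshal is a suggested option from this thread, not a configuration key that exists today:

```properties
# Hypothetical: 'camel.sink.unmarshal' does not currently exist; it is the
# option proposed in this issue to disable the Kamelet's implicit Jackson step.
camel.sink.unmarshal=none
```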