Implicit unmarshalling in Cassandra Sink Connector Kamelet descriptor #1579

Open
jakubmalek opened this issue Dec 18, 2023 · 4 comments

@jakubmalek
Contributor

jakubmalek commented Dec 18, 2023

The Cassandra Sink Connector Kamelet descriptor is configured by default with the Jackson unmarshalling step:

spec:
...
  template:
    from:
      uri: "kamelet:source"
      steps:
      - unmarshal:
          json: 
            library: Jackson
            useList: true

In my opinion, the implicit unmarshalling is problematic because it doesn't fit the Kafka Connect (converters/transforms) data flow: it requires additional serialization/deserialization steps for the Apache Camel pipeline.

In the case of the Cassandra connector specifically, the problem is more apparent because the JSON array format is not usable for the CQL statement builder.
The problem with the JSON format is data-type ambiguity. For example, in Cassandra you might have an int64 column, but in JSON it's just a number that can be deserialized to either int32 or int64, depending on the value. There are further problems, for example how to pass a timestamp value in JSON reliably.

Normally, in a Kafka connector, you would use transformation plugins to cast the types into a specific format, or use the Struct data type instead of schema-less JSON.
But because of the Jackson unmarshalling step, the connect record value, even with the right types, must still be marshalled into JSON, which leads to the type ambiguity problem described above.
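
To illustrate the round trip described above, here is a minimal, self-contained sketch. It does not use Jackson or the connector code. The class `JsonRoundTripSketch` and its helpers `toJsonNumber` and `parseJsonInteger` are hypothetical names introduced only for this example. The parsing helper assumes the common behaviour of untyped JSON mapping, where an integer token becomes the narrowest type that fits its value.

```java
import java.math.BigInteger;

// Illustration only: hypothetical names, no Jackson or Kafka Connect dependency.
final class JsonRoundTripSketch {

    // Writes a 64-bit value as a bare JSON number; the declared width is not part of the text.
    static String toJsonNumber(long value) {
        return Long.toString(value);
    }

    // Assumption: untyped JSON mapping picks the narrowest integer type that fits the value.
    static Number parseJsonInteger(String token) {
        BigInteger value = new BigInteger(token.trim());
        if (value.bitLength() <= 31) {
            return value.intValue();   // fits in a signed 32-bit int
        }
        if (value.bitLength() <= 63) {
            return value.longValue();  // fits in a signed 64-bit long
        }
        return value;                  // larger than 64 bits
    }

    public static void main(String[] args) {
        // Two values meant for the same 64-bit column.
        long[] columnValues = {42L, 3_000_000_000L};
        for (long original : columnValues) {
            String json = toJsonNumber(original);
            Number restored = parseJsonInteger(json);
            // Prints "42 -> Integer" and then "3000000000 -> Long".
            System.out.println(json + " -> " + restored.getClass().getSimpleName());
        }
    }
}
```

Both inputs start out as 64-bit values, yet after the JSON round trip the Java type depends on the magnitude of each value, so a statement builder cannot rely on receiving a consistent type for the column.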

IMHO the default Kamelet definitions for all connectors should be stripped of any implicit marshalling or unmarshalling steps.
Alternatively, there should be a way to remove this step through connector configuration.

@oscerd
Contributor

oscerd commented Dec 18, 2023

The main problem is just that Kamelets are not exclusively for the Kafka Connect runtime.

@oscerd
Contributor

oscerd commented Dec 18, 2023

I can think about changing it, but I don't think it's that easy.

@jakubmalek
Contributor Author

It would be a breaking change indeed, so I understand the hesitation.
As an intermediate solution, I would at least appreciate an option to remove the default unmarshalling step through configuration, for example something like: camel.sink.unmarshal: none.

@oscerd
Contributor

oscerd commented Sep 9, 2024

This is done now and should be available in 4.8.0.
