What would be really cool is the ability to write into Kafka.
Imagine I read from a source, transform all the data in DuckDB and then copy that data into Kafka.
This would entail:
- Read the latest version of the user-provided subject from the schema registry.
- If that version exists and is compatible with the dataset, use it. If it does not exist, or must be evolved into a new version, generate a schema from the dataset, but only if the user has allowed that; otherwise raise an error.
- Each row is serialized to Avro and published to Kafka.
- The partition to write to can optionally be taken from an integer column of the table.
- The partition and offset of each published record are made available somehow. If the source was a table, we could add an offset column that the writer updates.
- An alternative would be an optional feature to write into a commit topic: a second Avro topic that stores the transaction ID and the max offset for each partition.
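The schema-generation step above could be sketched roughly as follows. This is only an illustration under assumptions: the DuckDB-to-Avro type mapping, the `avro_schema_for` helper, and the column list shape (name/type pairs as returned by DuckDB's `DESCRIBE`) are all hypothetical, not an existing API.

```python
import json

# Hypothetical mapping from DuckDB column types to Avro primitive types.
DUCKDB_TO_AVRO = {
    "BIGINT": "long",
    "INTEGER": "int",
    "DOUBLE": "double",
    "VARCHAR": "string",
    "BOOLEAN": "boolean",
}

def avro_schema_for(name, columns):
    """Build an Avro record schema dict from (column_name, duckdb_type) pairs.

    Columns are made nullable (union with null) by default, which is a
    common convention for registry-managed schemas; unknown types fall
    back to string.
    """
    fields = [
        {"name": col, "type": ["null", DUCKDB_TO_AVRO.get(dtype, "string")]}
        for col, dtype in columns
    ]
    return {"type": "record", "name": name, "fields": fields}

# Example: columns as DESCRIBE might report them for a transformed table.
cols = [("id", "BIGINT"), ("msg", "VARCHAR")]
schema = avro_schema_for("demo_value", cols)
print(json.dumps(schema))
```

The generated schema would then be registered under the subject (if allowed) or checked for compatibility against the registry's latest version before producing.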
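The offset-tracking and commit-topic idea could look something like this. A minimal sketch, assuming a per-transaction tracker fed by producer delivery callbacks (in practice, e.g. the `on_delivery` callback of confluent-kafka's `produce`); the `OffsetTracker` class and the commit-record shape are invented for illustration.

```python
class OffsetTracker:
    """Track the max delivered offset per partition for one transaction."""

    def __init__(self, txn_id):
        self.txn_id = txn_id
        self.max_offsets = {}

    def on_delivery(self, partition, offset):
        # Called once per acknowledged record; keep only the max offset.
        if offset > self.max_offsets.get(partition, -1):
            self.max_offsets[partition] = offset

    def commit_record(self):
        # One record for the commit topic: transaction ID plus the max
        # offset reached in each partition. This dict would itself be
        # Avro-serialized and published to the commit topic.
        return {
            "txn_id": self.txn_id,
            "offsets": [
                {"partition": p, "max_offset": o}
                for p, o in sorted(self.max_offsets.items())
            ],
        }

tracker = OffsetTracker("txn-001")
for partition, offset in [(0, 10), (1, 3), (0, 12)]:
    tracker.on_delivery(partition, offset)
print(tracker.commit_record())
```

A downstream reader could then treat data up to those offsets as one committed unit; the table-column variant from the notes would instead write each record's offset back into the source table.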