While a job is running, users may add partitions to an upstream Kafka topic. If Flink does not detect the new partitions, the records written to them are never consumed, which results in data loss. The job must therefore detect Kafka partition expansion.
- Check the number of Kafka partitions every five minutes; the interval is configurable.
- Each task must run the partition check itself.
- Newly discovered partitions are consumed starting from the earliest available record.
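
The rules above can be illustrated with a minimal sketch. It is not Flink's actual implementation: the class `PartitionDiscoverer`, the enum `StartPosition`, and the modulo-based assignment of partitions to tasks are assumptions made for illustration. The current partition count is passed in by the caller, who would obtain it from Kafka.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical start position marker; only EARLIEST is needed here. */
enum StartPosition { EARLIEST }

/** Hypothetical per-task discoverer; each task owns one instance. */
class PartitionDiscoverer {
    private final int taskIndex;    // index of this task, 0-based
    private final int parallelism;  // total number of tasks
    private final Set<Integer> knownPartitions = new HashSet<>();

    PartitionDiscoverer(int taskIndex, int parallelism) {
        this.taskIndex = taskIndex;
        this.parallelism = parallelism;
    }

    /**
     * Compares the current partition count with what this task already knows.
     * Returns the partitions that are new and assigned to this task, each
     * mapped to EARLIEST so that no record in a new partition is skipped.
     * Assumption: partition p belongs to task (p % parallelism).
     */
    Map<Integer, StartPosition> discover(int currentPartitionCount) {
        Map<Integer, StartPosition> added = new TreeMap<>();
        for (int p = 0; p < currentPartitionCount; p++) {
            boolean mine = p % parallelism == taskIndex;
            if (mine && knownPartitions.add(p)) {
                added.put(p, StartPosition.EARLIEST);
            }
        }
        return added;
    }
}
```

Each task could call `discover` from a timer, for example with `ScheduledExecutorService.scheduleAtFixedRate` and a period of five minutes read from configuration. Because every task filters the partition list by its own index, the tasks do not need to coordinate with each other.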