Add a basic flatten operator for data pipelines#1229
Add a basic flatten operator for data pipelines#1229MartinGleize wants to merge 1 commit intoybe/optfrom
Conversation
cbalioglu
left a comment
There was a problem hiding this comment.
Thanks for the PR! Left a couple comments.
| std::unique_ptr<data_source> inner_; | ||
| std::optional<std::string> selector_; | ||
| std::queue<data> elements_queue_; | ||
| //std::exception_ptr exception_ptr_{}; |
| if (selector_) { | ||
| std::string error_msg = "Cannot use selector with flatten."; | ||
| std::cerr << error_msg << std::endl; | ||
| std::terminate(); // Force immediate program termination |
There was a problem hiding this comment.
I would prefer to raise a "not implemented exception" here, instead of forcibly terminating the process. In data_pipeline.cc:93 we mark the pipeline as broken if we catch an unhandled exception.
| // If no selector is provided, assume the example itself is a list | ||
| else if (example.is_list()) { | ||
| auto example_list = example.as_list(); | ||
| for (size_t i = 0; i < example_list.size(); ++i) { |
There was a problem hiding this comment.
My understanding when looking at next() implementation, we can actually use move semantics here. So the signature of extract_elements can be:
extract_elements(data &&example);
If that works, you can instead use element.emplace(std::move(example_list[i])).
| elements.push(example_list.at(i)); | ||
| } | ||
| } | ||
| // If no valid elements were found, the queue remains empty |
There was a problem hiding this comment.
Does this mean we simply discard/drop the example we just read from the underlying data source?
|
|
||
| // If the queue is empty (no elements could be extracted), recursively call next() | ||
| if (elements_queue_.empty()) | ||
| return next(); |
There was a problem hiding this comment.
Wouldn't be cleaner if we had this:
while (elements_queue_.empty()) {
std::optional<data> maybe_example = ...
elements_queue = extract_elements(*maybe_example);
|
I realize it pinged you as a reviewer, I didn't intend this to be ready for review and merged into main. I'll revisit this to make it cleaner overall, this was a rushed change. |
|
i think i can be achieved with yield_from pattern ? |
I think you're right, I hadn't taken a look at this operator before. I'll close this PR. |
What does this PR do? Please describe:
Adds an operator turning a pipeline of lists into a pipeline of their flattened elements.
Does your PR introduce any breaking changes? If yes, please list them:
None
Check list: