fix/run_cluster-additional-patch#899
Conversation
…h run_cluster + nwb_template
| merged_nwb = cls.__deep_merge(merged_nwb, nwb_data) | ||
|
|
||
| result_input_info["nwbfile"] = merged_nwb | ||
| result_input_info["nwbfile"] = {"input": merged_nwb} |
There was a problem hiding this comment.
What is the purpose of this fix (what problem does it aim to avoid)?
Please provide additional explanation.
There was a problem hiding this comment.
Overall, the validation enhancements in this PR are a solution to post_process (a node? specific feature of optinist-for-cloud).
Here, when running a workflow from the regular GUI (optinist-for-cloud), I believe post_process is currently processed successfully.
Excuse me, but could you please explain the logic behind why a process (runner.py) that succeeds via the GUI results in an error when using run_cluster?
There was a problem hiding this comment.
post_process occurs in the nwb_template version of snakemake.yaml
Please check snakemake_nwb_template.yaml.txt
post_process:
input:
- 1/b6b0ae2e/eta_elysjdrr7j/eta.pkl
return_arg: null
params: null
output: 1/b6b0ae2e/post_process_0/post_process.pkl
type: post_process
nwbfile: null
hdf5Path: null
matPath: null
path: null
last_output:
- 1/b6b0ae2e/eta_elysjdrr7j/eta.pkl
- 1/b6b0ae2e/post_process_0/post_process.pkl
There was a problem hiding this comment.
Just to confirm,
If I run it from the GUI (at optinist-for-cloud) instead of using run_cluster, will an error occur if I don't apply this fix?
(Am I correct in thinking that the original logic in optinist-for-cloud's runner.py was flawed?)
There was a problem hiding this comment.
Sorry, I do not understand. This is barebone, and the nwb_template snakefile.yaml was created using barebone
There was a problem hiding this comment.
I understand that post_process.pkl is a pkl that is not generated by barebone-studio (it is only generated by optinist-for-cloud), but is this correct?
*Does this mean that in this PR, potential issues that may occur in optinist-for-cloud are being fixed in barebone first?
| for key in list(input_info): | ||
| if key not in __rule.return_arg.values(): | ||
| input_info.pop(key) | ||
| if __rule.return_arg is not None: |
There was a problem hiding this comment.
Here, if __rule.return_arg is None, input_info remains empty and the process continues. Is this okay? (Is it necessary to make it an Assertion?)
There was a problem hiding this comment.
Current Logic Flow:
- __align_input_info_content_keys() is called first:
- If return_arg is None: Returns {"nwbfile": {"input": {}}}
- If return_arg is not None: Returns processed input_info with mapped arguments - Then the filtering logic runs:
- If return_arg is None: The filtering is skipped, so input_info remains as {"nwbfile": {"input": {}}}
- If return_arg is not None: Keys are filtered to only include those needed
Analysis:
For return_arg is None case (like post_process):
- input_info = {"nwbfile": {"input": {}}}
- This gets passed to __execute_function()
- But __execute_function() returns early because path is None
- So the empty input_info is never actually used for function execution
This is actually correct behavior because:
- Rules with return_arg: None typically also have path: None (like post_process)
- These rules don't execute actual functions, so they don't need input arguments
- The nwbfile structure is preserved for data flow continuity
Should We Add an Assertion?
No, an assertion is not necessary because:
- It's expected behavior: Rules with return_arg: None are designed to not need input arguments
- It's safe: The empty input_info is never passed to actual function execution due to the path is None
check - It maintains data flow: The nwbfile structure is preserved for downstream processing
The design ensures that:
- Function-executing rules get properly filtered input arguments
- Non-function rules (like post_process) get minimal input_info and skip function execution
- Data flow continuity is maintained through the nwbfile structure
So no assertion or additional safety check is needed here.
Content
Additional update to runner.py to fix some non-critical errors that occurred wit to runner.py to work with nwb_template snakemake.yaml
Summary
What it fixed: Prevents crashes when return_arg is None (as in the post_process rule). Without this check,
the code would crash trying to iterate over None.values().
Original:
New:
What it fixed: Prevents NWB processing crashes for rules with path: null (like post_process). These rules
don't execute actual functions, so they shouldn't try to create NWB files.
What it fixed: Prevents AttributeError when trying to call path.split("/") on None. Rules like post_process
have path: null and don't need to execute wrapper functions.
Source: Added for this fix (develop-main was missing this safety)
In __align_input_info_content_keys:
What it fixed:
Tests