Add --wait option to databricks runs submit CLI command #487
Changes from 8 commits
654301e
8781f67
950fce6
7fb6c5c
54838e5
6a92534
a81db2b
1ab5c09
6e92837
c1bc00c
6f4d5b4
4862b46
b4bd046
561e8c5
3b4012e
2c4382d
e28bcfe
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -21,13 +21,17 @@ | |
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| import sys | ||
| import time | ||
| from json import loads as json_loads | ||
|
|
||
| import click | ||
| from tabulate import tabulate | ||
|
|
||
| from databricks_cli.click_types import OutputClickType, JsonClickType, RunIdClickType | ||
| from databricks_cli.jobs.cli import check_version | ||
| from databricks_cli.utils import eat_exceptions, CONTEXT_SETTINGS, pretty_format, json_cli_base, \ | ||
| truncate_string | ||
| from databricks_cli.utils import eat_exceptions, CONTEXT_SETTINGS, pretty_format, truncate_string, \ | ||
| error_and_quit | ||
| from databricks_cli.configure.config import provide_api_client, profile_option, debug_option, \ | ||
| api_version_option | ||
| from databricks_cli.runs.api import RunsApi | ||
|
|
@@ -39,21 +43,42 @@ | |
| help='File containing JSON request to POST to /api/2.*/jobs/runs/submit.') | ||
| @click.option('--json', default=None, type=JsonClickType(), | ||
| help=JsonClickType.help('/api/2.*/jobs/runs/submit')) | ||
| @click.option('--wait', is_flag=True, | ||
| help='If specified, the CLI will wait for the submitted run to complete.') | ||
| @api_version_option | ||
| @debug_option | ||
| @profile_option | ||
| @eat_exceptions | ||
| @provide_api_client | ||
| def submit_cli(api_client, json_file, json, version): | ||
| def submit_cli(api_client, json_file, json, wait, version): | ||
| """ | ||
| Submits a one-time run. | ||
|
jerrylian-db marked this conversation as resolved.
Outdated
|
||
|
|
||
| The specification for the request json can be found | ||
| https://docs.databricks.com/api/latest/jobs.html#runs-submit | ||
| """ | ||
| check_version(api_client, version) | ||
| json_cli_base(json_file, json, lambda json: RunsApi( | ||
| api_client).submit_run(json, version=version)) | ||
| if json_file: | ||
| with open(json_file, 'r') as f: | ||
| json = f.read() | ||
| submit_res = RunsApi(api_client).submit_run(json_loads(json), version=version) | ||
| click.echo(pretty_format(submit_res)) | ||
|
jerrylian-db marked this conversation as resolved.
|
||
| if wait: | ||
| run_id = submit_res['run_id'] | ||
| completed_states = set(['TERMINATED', 'SKIPPED', 'INTERNAL_ERROR']) | ||
| # Wait for run to complete | ||
| while True: | ||
|
I wonder if we should have a time-out? So it's not perpetually waiting if something goes wrong.
Contributor
Author
Leaning towards not having a time-out for now. I believe that submitted runs themselves have internal timeouts in Databricks. Users can also force exit on their own.
Contributor
+1 Users can always CTRL-C
||
| run = RunsApi(api_client).get_run(run_id, version=version) | ||
| run_state = run['state'] | ||
| if run_state['life_cycle_state'] in completed_states: | ||
|
jerrylian-db marked this conversation as resolved.
Outdated
|
||
| if run_state['result_state'] == 'SUCCESS': | ||
| sys.exit(0) | ||
| else: | ||
| error_and_quit('job failed with state ' + run_state['result_state'] + | ||
|
Contributor
Author
QQ:
jerrylian-db marked this conversation as resolved.
Outdated
|
||
| ' and state message ' + run_state['state_message']) | ||
| click.echo('Job still running with lifecycle state ' + run_state['life_cycle_state'] + | ||
|
jerrylian-db marked this conversation as resolved.
Outdated
|
||
| '. URL: ' + run['run_page_url'], err=True) | ||
| time.sleep(5) | ||
|
Contributor
I wonder if we're ok with this polling interval to start with, or if we need to add more complex backoff/jitter logic upfront. To start with, I'd prefer to keep this simple and not implement backoff logic, but just hardcode a constant polling interval that the JAWS stability reviewer (@shivamdixit) is comfortable with, even if it's >5 seconds (I think e.g. up to 10s would be fine for detecting whether a run submitted to integration-test a notebook succeeded or failed).
Contributor
I think this is fine. The same is used elsewhere (e.g. dbx and your GH action). Beware that there are uses for this beyond integration tests.
jerrylian-db marked this conversation as resolved.
Outdated
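For reference, the two review threads above (time-out and polling interval) could be explored with a small standalone helper like the one below. This is only a sketch and not part of this PR: `wait_for_run`, its parameters (`interval`, `jitter`, `timeout`, `sleep`, `clock`), and the injected `get_run` callable are all hypothetical names. The only thing taken from the diff is the set of completed life cycle states and the shape of the `state` dict. It assumes Python 3 (for `time.monotonic` and the built-in `TimeoutError`).

```python
import random
import time

# Same terminal life cycle states as `completed_states` in the diff above.
TERMINAL_STATES = frozenset(['TERMINATED', 'SKIPPED', 'INTERNAL_ERROR'])


def wait_for_run(get_run, run_id, interval=5.0, jitter=0.0, timeout=None,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll get_run(run_id) until the run reaches a terminal life cycle state.

    Hypothetical helper: `get_run` is any callable that returns a dict with a
    'state' entry containing 'life_cycle_state'. Returns the final state dict.
    Raises TimeoutError if `timeout` (in seconds) is set and elapses first.
    """
    deadline = None if timeout is None else clock() + timeout
    while True:
        state = get_run(run_id)['state']
        if state['life_cycle_state'] in TERMINAL_STATES:
            return state
        if deadline is not None and clock() >= deadline:
            raise TimeoutError(
                'run %s did not finish within %s seconds' % (run_id, timeout))
        # Constant interval by default; `jitter` adds up to that many extra
        # seconds so many concurrent waiters do not poll in lockstep.
        sleep(interval + random.uniform(0, jitter))


if __name__ == '__main__':
    # Usage with a fake API so the sketch runs without a workspace.
    fake_states = iter(['PENDING', 'RUNNING', 'TERMINATED'])
    final = wait_for_run(
        lambda run_id: {'state': {'life_cycle_state': next(fake_states),
                                  'result_state': 'SUCCESS'}},
        run_id=1, interval=0.01)
    print(final.get('result_state'))
```

With the defaults (`timeout=None`, `jitter=0.0`) this behaves like the loop in the diff: it polls at a constant interval and waits indefinitely, so the user can still interrupt with CTRL-C. Passing `sleep` and `clock` as parameters is just a design choice to make the loop testable without real delays.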
|
||
|
|
||
|
|
||
| def _runs_to_table(runs_json): | ||
|
|
||