
Commit 477af33

ericfe-google, gcf-owl-bot[bot], and tswast authored
feat: support visualization of graph queries by adding the --graph argument. (#94)
* Implement graph visualization with the --graph argument.
* Skip graph server tests when spanner-graph-notebook is missing. Also, add a test to bigquery magic for when spanner-graph-notebook is missing.
* 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
* Fix lint errors and unit tests under nox.
* 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
* Add spanner_graphs to owlbot. Also, fix typo in package name in error message.
* 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
* Fix owlbot entry
* 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
* Add spanner_graphs as optional dependency with minimum version 1.1.1. This is required for graph visualization.
* Change owlbot so spanner-graph-notebook is added to the config using python runtime 3.12 instead of 3.8, as spanner-graph-notebook does not support runtime version 3.8.
* 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
* Fix typo
* 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
* Fix dependent package name: spanner_graphs -> spanner-graph-notebook
* 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
* Add unit test coverage for the GraphServer object.
* 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
* Add more unit tests for convert_graph_data() to boost code coverage.
* 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
* Add more tests.
* Move get_ping() and post_ping() out of the GraphServer class, into the unit test
* 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
* Add unit test for handle_post_query() in GraphServer.
* 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
* Fix lint, remove a line of dead code, mark test_post_query for skipping if spanner_graphs is not present.
* Remove more dead code in graph server, add test for --graph without spanner-graph-notebook present.
* Add unit tests for colab paths.
* 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
* Use pytest.raises() instead of try/except
* Add test coverage for the case where a graph query is run after the graph server is already running, due to another graph query having run previously.
* Add docstrings
* 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
* Move networkx and portpicker to extras under "spanner-graph-notebook".
* 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
* Finish making graph dependencies optional
* 🦉 Updates from OwlBot post-processor See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
* fix port
* Convert graph server to singleton object.
* reformat
* Fix handling of null json elements.
* reformat
* Apply suggestions from code review
* Pin spanner-graph-notebook to exactly version 1.1.1, as subsequent changes to that repository broke our use of it.

---------

Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
Co-authored-by: Tim Sweña (Swast) <swast@google.com>
1 parent 60e1d57 commit 477af33

7 files changed

Lines changed: 1259 additions & 9 deletions

packages/bigquery-magics/bigquery_magics/bigquery.py

Lines changed: 82 additions & 4 deletions
@@ -53,6 +53,8 @@
         amount of time for the query to complete will not be cleared after the
         query is finished. By default, this information will be displayed but
         will be cleared after the query is finished.
+    * ``--graph`` (Optional[line argument]):
+        Visualizes the query result as a graph.
     * ``--use_geodataframe <params>`` (Optional[line argument]):
         Return the query result as a geopandas.GeoDataFrame.
         If present, the argument that follows the ``--use_geodataframe`` flag
@@ -61,7 +63,6 @@
 
         See geopandas.GeoDataFrame for details.
         The Coordinate Reference System will be set to “EPSG:4326”.
-
     * ``--params <params>`` (Optional[line argument]):
         If present, the argument following the ``--params`` flag must be
         either:
@@ -97,14 +98,15 @@
 import ast
 from concurrent import futures
 import copy
+import json
 import re
 import sys
+import threading
 import time
 from typing import Any, List, Tuple
 import warnings
 
 import IPython  # type: ignore
-from IPython import display  # type: ignore
 from IPython.core import magic_arguments  # type: ignore
 from IPython.core.getipython import get_ipython
 from google.api_core import client_info
@@ -114,10 +116,12 @@
 from google.cloud.bigquery.dataset import DatasetReference
 from google.cloud.bigquery.dbapi import _helpers
 from google.cloud.bigquery.job import QueryJobConfig
+import pandas
 
 from bigquery_magics import line_arg_parser as lap
 import bigquery_magics._versions_helpers
 import bigquery_magics.config
+import bigquery_magics.graph_server as graph_server
 import bigquery_magics.line_arg_parser.exceptions
 import bigquery_magics.version
 
@@ -391,6 +395,12 @@ def _create_dataset_if_necessary(client, dataset_id):
         "Defaults to engine set in the query setting in console."
     ),
 )
+@magic_arguments.argument(
+    "--graph",
+    action="store_true",
+    default=False,
+    help=("Visualizes the query results as a graph"),
+)
 def _cell_magic(line, query):
     """Underlying function for bigquery cell magic
 
@@ -425,7 +435,7 @@ def _cell_magic(line, query):
 
 def _parse_magic_args(line: str) -> Tuple[List[Any], Any]:
     # The built-in parser does not recognize Python structures such as dicts, thus
-    # we extract the "--params" option and inteprpret it separately.
+    # we extract the "--params" option and interpret it separately.
     try:
         params_option_value, rest_of_args = _split_args_line(line)
 
@@ -586,6 +596,72 @@ def _handle_result(result, args):
     return result
 
 
+def _is_colab() -> bool:
+    """Check if code is running in Google Colab"""
+    try:
+        import google.colab  # noqa: F401
+
+        return True
+    except ImportError:
+        return False
+
+
+def _colab_callback(query: str, params: str):
+    return IPython.core.display.JSON(
+        graph_server.convert_graph_data(query_results=json.loads(params))
+    )
+
+
+singleton_server_thread: threading.Thread = None
+
+
+def _add_graph_widget(query_result):
+    try:
+        from spanner_graphs.graph_visualization import generate_visualization_html
+    except ImportError as err:
+        customized_error = ImportError(
+            "Use of --graph requires the spanner-graph-notebook package to be installed. Install it with `pip install 'bigquery-magics[spanner-graph-notebook]'`."
+        )
+        raise customized_error from err
+
+    # In Jupyter, create an http server to be invoked from the Javascript to populate the
+    # visualizer widget. In colab, we are not able to create an http server on a
+    # background thread, so we use a special colab-specific api to register a callback,
+    # to be invoked from Javascript.
+    if _is_colab():
+        from google.colab import output
+
+        output.register_callback("graph_visualization.Query", _colab_callback)
+    else:
+        global singleton_server_thread
+        alive = singleton_server_thread and singleton_server_thread.is_alive()
+        if not alive:
+            singleton_server_thread = graph_server.graph_server.init()
+
+    # Create html to invoke the graph server
+    html_content = generate_visualization_html(
+        query="placeholder query",
+        port=graph_server.graph_server.port,
+        params=query_result.to_json().replace("\\", "\\\\").replace('"', '\\"'),
+    )
+    IPython.display.display(IPython.core.display.HTML(html_content))
+
+
+def _is_valid_json(s: str):
+    try:
+        json.loads(s)
+        return True
+    except (json.JSONDecodeError, TypeError):
+        return False
+
+
+def _supports_graph_widget(query_result: pandas.DataFrame):
+    num_rows, num_columns = query_result.shape
+    if num_columns != 1:
+        return False
+    return query_result[query_result.columns[0]].apply(_is_valid_json).all()
+
+
 def _make_bq_query(
     query: str,
     args: Any,
@@ -634,7 +710,7 @@ def _make_bq_query(
         return
 
     if not args.verbose:
-        display.clear_output()
+        IPython.display.clear_output()
 
     if args.dry_run:
         # TODO(tswast): Use _handle_result() here, too, but perhaps change the
@@ -671,6 +747,8 @@ def _make_bq_query(
     else:
         result = result.to_dataframe(**dataframe_kwargs)
 
+    if args.graph and _supports_graph_widget(result):
+        _add_graph_widget(result)
     return _handle_result(result, args)
 
 
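Note on the `_supports_graph_widget` gate: `--graph` only triggers the widget when the result has exactly one column and every cell in it parses as JSON; otherwise the magic falls through to its normal output. The following is a dependency-free sketch of that rule, using a list of row tuples as a stand-in for the DataFrame. The name `supports_graph_widget`, its `num_columns` parameter, and the sample rows are hypothetical and not part of the commit.

```python
import json
from typing import Any, Sequence


def is_valid_json(value: Any) -> bool:
    """Mirror of the diff's _is_valid_json: True only if json.loads accepts the value."""
    try:
        json.loads(value)
        return True
    except (json.JSONDecodeError, TypeError):
        return False


def supports_graph_widget(rows: Sequence[Sequence[Any]], num_columns: int) -> bool:
    """Hypothetical stand-in for _supports_graph_widget, operating on plain rows."""
    if num_columns != 1:
        return False
    return all(is_valid_json(row[0]) for row in rows)


graph_rows = [('{"kind": "node", "id": 1}',), ('{"kind": "edge", "id": 2}',)]
plain_rows = [("not json",)]
null_rows = [(None,)]  # json.loads(None) raises TypeError, so this is rejected

print(supports_graph_widget(graph_rows, 1))
print(supports_graph_widget(plain_rows, 1))
print(supports_graph_widget(null_rows, 1))
```

Catching `TypeError` alongside `json.JSONDecodeError` is what lets a null cell count as "not JSON" instead of raising.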

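Note on the `params=` escaping in `_add_graph_widget`: the JSON text is escaped in two steps, backslashes first and double quotes second. The source does not show `generate_visualization_html`, so the reason is an assumption here: presumably the value ends up inside a double-quoted string literal in the generated HTML/JavaScript. A minimal sketch of why the order matters follows; `escape_for_double_quoted_literal` and `unescape_literal` are hypothetical helper names used only for illustration.

```python
import json
import re


def escape_for_double_quoted_literal(text: str) -> str:
    """Same two-step escape as the diff: backslashes first, then double quotes."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def unescape_literal(text: str) -> str:
    """Hypothetical inverse, used here only to check the round trip."""
    return re.sub(r"\\(.)", r"\1", text)


payload = json.dumps({"label": 'say "hi"'})
escaped = escape_for_double_quoted_literal(payload)

# Reversing the order escapes the backslashes that the quote step just added,
# so the result no longer decodes back to the original payload.
wrong_order = payload.replace('"', '\\"').replace("\\", "\\\\")

print(unescape_literal(escaped) == payload)
print(unescape_literal(wrong_order) == payload)
```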