[WIP] feat: executor to gcp custom container #6136
Conversation
| raise HTTPException(status_code=499, detail=status.description)
| else:
|     return {"predictions": resp.docs}
|     return output_model(predictions=resp.docs)
I guess the first return is not good
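A hedged sketch of the point above, using a plain dataclass as a stand-in for the generated `output_model` (names here are illustrative, not the real generated class): return the typed model in every branch instead of a raw dict in one of them.

```python
from dataclasses import dataclass, field


# Hypothetical stand-in for the generated output_model in the diff.
@dataclass
class OutputModel:
    predictions: list = field(default_factory=list)


def make_response(docs):
    # one typed return shape for every code path, instead of a raw
    # {"predictions": ...} dict in some branches
    return OutputModel(predictions=list(docs))
```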
| client.containers.prune()
| class chdir(AbstractContextManager):
why do we need this? I do not like chdir behavior in tests
| def test_provider_gcp_pod_inference():
|     with chdir(os.path.join(os.path.dirname(__file__), 'SampleExecutor')):
do not use chdir, pass the proper path
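A minimal sketch of the suggested fix, assuming the test only needs the fixture directory's location (the helper name here is mine, not from the diff): build the path once and pass it to the code under test, instead of chdir-ing into it.

```python
import os


def fixture_path(test_file: str, fixture: str = 'SampleExecutor') -> str:
    # build the absolute fixture path from the test module's location
    # and hand it around, instead of changing the process-wide cwd
    return os.path.join(os.path.dirname(os.path.abspath(test_file)), fixture)
```

In the test this would be called as `fixture_path(__file__)`, which keeps the working directory untouched for other tests running in the same process.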
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## master #6136 +/- ##
===========================================
- Coverage 75.06% 47.90% -27.17%
===========================================
Files 152 151 -1
Lines 14061 14133 +72
===========================================
- Hits 10555 6770 -3785
- Misses 3506 7363 +3857
Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
| from jina.serve.runtimes.gateway.models import _to_camel_case
| if not docarray_v2:
|     logger.warning('Only docarray v2 is supported with GCP. ')
would this raise an Exception at some point? it should
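One way to make it fail fast, sketched with a plain `RuntimeError` (the real code might prefer a jina-specific exception type):

```python
def ensure_docarray_v2(docarray_v2: bool) -> None:
    # raise at startup instead of only warning, so the GCP entrypoint
    # cannot silently keep running against docarray v1
    if not docarray_v2:
        raise RuntimeError('Only docarray v2 is supported with GCP.')
```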
| allow_methods=['*'],
| allow_headers=['*'],
| )
| logger.warning('CORS is enabled. This service is accessible from any website!')
| return await process(input_model(data=[]))
| else:
|     raise HTTPException(
|         status_code=400,
please use constants and not magic numbers.
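For reference, the stdlib already names the 400 (`fastapi.status` offers equivalent constants); 499 is nginx's non-standard "client closed request" code, so it is not in `HTTPStatus` and would need its own named constant:

```python
from http import HTTPStatus

# named constants instead of magic numbers in the HTTPException calls
BAD_REQUEST = HTTPStatus.BAD_REQUEST  # == 400
CLIENT_CLOSED_REQUEST = 499           # non-standard, absent from HTTPStatus
```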
| if status.code == jina_pb2.StatusProto.ERROR:
|     raise HTTPException(status_code=499, detail=status.description)
| else:
|     return VertexAIResponse(predictions=req.data.docs)
I wonder why I got this error when running `pytest -v -s -x test_gcp.py -k test_provider_gcp_deployment_inference`; it looks like it's failing on some docarray validation:
File "pydantic/main.py", line 449, in pydantic.main.BaseModel.dict
File "pydantic/main.py", line 868, in _iter
File "pydantic/main.py", line 794, in pydantic.main.BaseModel._get_value
File "/Users/zacli/Code/jina/env/lib/python3.9/site-packages/docarray/array/doc_list/doc_list.py", line 135, in __init__
super().__init__(docs)
File "/Users/zacli/Code/jina/env/lib/python3.9/site-packages/docarray/array/doc_list/doc_list.py", line 163, in _validate_docs
yield self._validate_one_doc(doc)
File "/Users/zacli/Code/jina/env/lib/python3.9/site-packages/docarray/array/doc_list/doc_list.py", line 170, in _validate_one_doc
raise ValueError(f'{doc} is not a {self.doc_type}')
ValueError: {'id': '5d814334ba65cd66b356f363262dd2ad', 'text': 'hello world', 'embeddings': NdArray([[0.72778929, 0.23900942, 0.03286786, 0.95298338, 0.77614308,
0.03805365, 0.52810364, 0.70904448, 0.03221386, 0.40657516,
0.05665328, 0.73416605, 0.04400287, 0.79719694, 0.42839442,
0.43580189, 0.07334091, 0.60751732, 0.10342848, 0.51660762,
0.06586024, 0.88129511, 0.42874586, 0.88633873, 0.31858046,
0.71453457, 0.31093969, 0.92729799, 0.25257687, 0.92740828,
0.61681301, 0.51788409, 0.00997884, 0.21210052, 0.91713272,
0.37283237, 0.53357638, 0.77158797, 0.96699236, 0.13919022,
0.92879329, 0.78211408, 0.65285461, 0.89399441, 0.42854324,
0.94649221, 0.1966403 , 0.39643899, 0.42736609, 0.73116549,
0.88923411, 0.9894739 , 0.9124282 , 0.83614148, 0.27662465,
0.69740887, 0.34939623, 0.07094385, 0.38615401, 0.04148779,
0.0133029 , 0.68877819, 0.60318039, 0.64945002]])} is not a <class 'executor.EmbeddingResponseModel'>
Goals
Similar to #6046, we need to use jina to serve as a FastAPI app for GCP. A bunch of setup code is copied, as the SageMaker and GCP requirements are mostly the same, except that GCP requires API requests and responses in a certain format:
Request:
https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements#request_requirements
Response:
https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements#response_requirements
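Per those linked requirements, Vertex AI POSTs a body shaped like `{"instances": [...]}` (with an optional `"parameters"` field) and expects `{"predictions": [...]}` back. Illustrative payloads (the field values inside each instance/prediction are made up, since they are model-defined):

```python
import json

# request body Vertex AI sends to the custom container
request_body = {
    "instances": [{"text": "hello world"}],
    "parameters": {"dimensions": 64},  # optional, model-defined shape
}

# response body the container must return
response_body = {
    "predictions": [{"embeddings": [0.1, 0.2, 0.3]}],
}

# both must be valid JSON bodies
json.dumps(request_body) and json.dumps(response_body)
```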
TODO