Conversation
Since ES lacks analogous operation primitive we need to handle it differently depending when - only aggregations present we must specify track_total_hits to true, so as to get an accurate count above the default 10000 - when grouping clauses are present, the doc_count field suffices, which is present by default.
mildbyte
left a comment
There was a problem hiding this comment.
This one was simple enough, good work covering it by tests in the main repo! Moving on to the Multicorn part.
pg_es_fdw/_es_query.py
Outdated
|
|
||
| if op == "~~": | ||
| return {"match": {col: value.replace("%", "*")}} | ||
| return {"wildcard": {col: value.replace("%", "*")}} |
There was a problem hiding this comment.
Ideally this would handle underscores (single-character match) and escapes (e.g. \% matches %), so something like:
%(when not preceded by a single\) becomes*_(when not preceded by a single\) becomes?\%becomes%(since it has no special meaning in ES wildcards)\_becomes_(same)
It does become complex though. I'd be happy with us returning false positives (since PG will recheck and filter them out) but I don't know how to process the input LIKE pattern to get to a pattern that definitely over-matches instead of possibly under-matching (e.g. right now the x LIKE 'a_ple' won't match apple AFAIU)
There was a problem hiding this comment.
Right, I think this should do it, let me add a couple of tests in the engine repo.
There was a problem hiding this comment.
Now that I think of it, I need to handle case like \\%(becomes \*), and \\\% (becomes \%?).
There was a problem hiding this comment.
\\%in PG patterns means "backslash + any characters", so it'd become\\*\\\%means "backslash +%", so it'd become\\%(since%has no special meaning)
There was a problem hiding this comment.
Got it, this should now be covered as well. Here are the corresponding tests.
There was a problem hiding this comment.
Can you also test / check that this escapes special ES pattern characters, e.g.:
*becomes\*?becomes\?
Building on the changes from splitgraph/Multicorn#2, extend the ES FDW so that we can push down
WHERE) clausesCOUNT(*), which does not have a simple equivalent that is suitable for all query translationsCU-1z461e4