Skip to content

Conversation

@fivetran-ashashankar
Copy link
Collaborator

Annotates APPROXIMATE_TOP_K_ACCUMULATE expressions as returning VARIANT.

Updates the type mapping in the Snowflake dialect.

Includes generic tests in snowflake, databricks and spark to validate type inference for this function.

…n. Tests added all dialects that support APPROX_TOP_K_ACCUMULATE
Copy link
Collaborator

@geooo109 geooo109 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments, also can you share the docs ofAPPROX_TOP_K_ACCUMULATE for Spark and Databricks ? I'm not sure if the function exists in these dialects.

@fivetran-ashashankar
Copy link
Collaborator Author

Left some comments, also can you share the docs ofAPPROX_TOP_K_ACCUMULATE for Spark and Databricks ? I'm not sure if the function exists in these dialects.

https://spark.apache.org/docs/preview/api/sql/index.html#approx_top_k_accumulate

@geooo109
Copy link
Collaborator

Left some comments, also can you share the docs ofAPPROX_TOP_K_ACCUMULATE for Spark and Databricks ? I'm not sure if the function exists in these dialects.

https://spark.apache.org/docs/preview/api/sql/index.html#approx_top_k_accumulate

I see - does it exist in Databricks? I tried to run a test but it fails.

self.assertEqual(null_type.sql(), "NULL")
self.assertEqual(null_type.sql("databricks"), "VOID")

self.validate_identity("SELECT APPROX_TOP_K_ACCUMULATE(col, 10)")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also believe that DBX should be compatible with Spark but I'll go ahead and remove this test for now since it's not documented properly and @geooo109 mentioned it doesn't seem to work.

We can always revisit when we iterate through DBX

Suggested change
self.validate_identity("SELECT APPROX_TOP_K_ACCUMULATE(col, 10)")

@VaggelisD VaggelisD merged commit e903883 into main Nov 12, 2025
7 checks passed
@VaggelisD VaggelisD deleted the RD-1060135-annotate-type-for-APPROX_TOP_K_ACCUMULATE branch November 12, 2025 12:35
@fivetran-ashashankar
Copy link
Collaborator Author

Left some comments, also can you share the docs ofAPPROX_TOP_K_ACCUMULATE for Spark and Databricks ? I'm not sure if the function exists in these dialects.

https://spark.apache.org/docs/preview/api/sql/index.html#approx_top_k_accumulate

https://docs.databricks.com/aws/en/sql/language-manual/functions/approx_top_k

@fivetran-ashashankar
Copy link
Collaborator Author

Left some comments, also can you share the docs ofAPPROX_TOP_K_ACCUMULATE for Spark and Databricks ? I'm not sure if the function exists in these dialects.

https://spark.apache.org/docs/preview/api/sql/index.html#approx_top_k_accumulate

I see - does it exist in Databricks? I tried to run a test but it fails.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants