The Isthmus library converts Substrait plans to and from SQL Plans. There are two examples showing conversion in each direction.
The Calcite library is used to do parsing and generation of the SQL String. Calcite has its own relational object model, distinct from Substrait's. There are classes within Isthmus to convert Substrait to and from Calcite's object model.
The conversion flows work as follows:
SQL to Substrait:
SQL ---[Calcite parsing]---> Calcite Object Model ---[Isthmus conversion]---> Substrait
Substrait to SQL:
Substrait ---[Isthmus conversion]---> Calcite Object Model ---[Calcite SQL generation]---> SQL
The examples:
- FromSql - creates a plan starting from SQL
- ToSql - reads a plan and creates the SQL
- DynamicFnToSql - uses the
substrait-javaAPI to create a plan using a Dynamic Function - CustomDialectDynamicFnToSql - using the same approach with a dynamic function but using a custom SQL dialect to make it applicable to SparkSQL
To run these you will need Java 17 or greater, and this repository cloned to your local system.
To run FromSql.java, execute the command below from the root of this repository.
./gradlew examples:isthmus-api:run --args "FromSql substrait.plan"The example writes a binary plan to substrait.plan and outputs the text format of the protobuf to stdout. The output is quite lengthy, so it has been abbreviated here.
> Task :examples:isthmus-api:run
extension_uris {
extension_uri_anchor: 2
uri: "/functions_aggregate_generic.yaml"
}
extension_uris {
extension_uri_anchor: 1
uri: "/functions_comparison.yaml"
}
extensions {
extension_function {
extension_uri_reference: 1
function_anchor: 1
name: "equal:any_any"
extension_urn_reference: 1
}
}
extensions {
extension_function {
extension_uri_reference: 2
function_anchor: 2
name: "count:"
extension_urn_reference: 2
}
}
relations {....}
}
version {
minor_number: 77
producer: "isthmus"
}
extension_urns {
extension_urn_anchor: 1
urn: "extension:io.substrait:functions_comparison"
}
extension_urns {
extension_urn_anchor: 2
urn: "extension:io.substrait:functions_aggregate_generic"
}
File written to substrait.planPlease see the code comments for details of how the conversion is done.
To run ToSql.java, execute the command below from the root of this repository.
./gradlew examples:isthmus-api:run --args "ToSql substrait.plan"The example reads from substrait.plan (likely the file created by FromSql) and outputs SQL. The text format of the protobuf has been abbreviated.
> Task :examples:isthmus-api:run
Reading from substrait.plan
extension_uris {
extension_uri_anchor: 2
uri: "/functions_aggregate_generic.yaml"
}
extension_uris {
extension_uri_anchor: 1
uri: "/functions_comparison.yaml"
}
extensions {
extension_function {
extension_uri_reference: 1
function_anchor: 1
name: "equal:any_any"
extension_urn_reference: 1
}
}
extensions {....}
relations {....}
version {
minor_number: 77
producer: "isthmus"
}
extension_urns {
extension_urn_anchor: 1
urn: "extension:io.substrait:functions_comparison"
}
extension_urns {
extension_urn_anchor: 2
urn: "extension:io.substrait:functions_aggregate_generic"
}
SELECT `t2`.`colour0` AS `COLOUR`, `t2`.`$f1` AS `COLOURCOUNT`
FROM (SELECT `vehicles`.`colour` AS `colour0`, COUNT(*) AS `$f1`
FROM `vehicles`
INNER JOIN `tests` ON `vehicles`.`vehicle_id` = `tests`.`vehicle_id`
WHERE `tests`.`test_result` = 'P'
GROUP BY `vehicles`.`colour`
ORDER BY COUNT(*) IS NULL, 2) AS `t2`
The SQL statement in the selected dialect will be created (MySql is used in the example).