Skip to content

Conversation

@MuhammadTahaNaveed
Copy link
Member

@MuhammadTahaNaveed MuhammadTahaNaveed commented Jan 10, 2026

  • Introduce vertex and edge as pg composite types
    vertex: (id, label, properties)
    edge: (id, label, end_id, start_id, properties)
  • Property access (a.name) now directly uses a.properties for agtype_access_operator instead of rebuilding via _agtype_build_vertex/edge
  • Optimize accessor functions (id, properties, label, type, start_id, end_id) to use direct FieldSelect on composite types instead of agtype functions
  • Add casts: vertex/edge to agtype, vertex/edge to json/jsonb
  • Add eq and not eq ops for vertex/edge composite types.
  • Fix label_name specific routine to use cache instead of ag_label scan
  • Write/update clauses have executors strictly tied to agtype, due to which the variables after any write/update clause are carried forward as agtype.
  • Allows users to completely skip agtype build functions and return vertex/edge for pure read queries.
  • Change _label_name to return agtype since record comparisons are not allowed with cstring. Consequently, _agtype_build_vertex/edge now accept agtype as label.
  • Fix MERGE clause type mismatch when accessing properties from previous MATCH clauses by wrapping columns with agtype_volatile_wrapper before namespace lookup.
  • Update expression index in pgvector.sql, since now it uses raw properties column instead of _agtype_build_vertex/edge.
  • Add regression tests

Assisted-by AI

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request introduces vertex and edge as PostgreSQL composite types to optimize graph query performance. The changes enable direct field access for properties and accessor functions instead of rebuilding agtype structures.

Changes:

  • Introduces vertex (id, label, properties) and edge (id, label, start_id, end_id, properties) composite types
  • Optimizes property access to use direct field selection instead of agtype rebuild
  • Changes _label_name to return agtype instead of cstring for type consistency
  • Adds implicit casts from vertex/edge to agtype and json
  • Fixes MERGE clause type mismatches with volatile wrapper

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
src/include/utils/agtype.h Added OID accessors for VERTEXOID and EDGEOID composite types
src/include/parser/cypher_expr.h Added helper functions for entity type checking and field extraction
src/include/parser/cypher_clause.h Added field access helper functions for composite types
src/include/catalog/ag_label.h Added get_label_name function signature
src/backend/utils/adt/agtype.c Implemented composite type OID caching, vertex/edge to agtype casts, label parameter type change
src/backend/parser/cypher_expr.c Implemented accessor function optimization and entity type coercion
src/backend/parser/cypher_clause.c Updated entity expression builders to use RowExpr, added field selection helpers
src/backend/executor/*.c Updated executors to use agtype labels instead of cstrings
src/backend/catalog/ag_label.c Changed _label_name to return agtype, added cache-based get_label_name
sql/agtype_graphid.sql Defined vertex/edge composite types and cast functions
sql/age_scalar.sql Added _get_vertex_by_graphid helper function
sql/age_main.sql Removed duplicate _label_name definition
regress/sql/*.sql Added comprehensive tests for accessor optimizations and composite types

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

src/backend/utils/adt/agtype.c:2394

  • Potential memory leak: pnstrdup allocates memory for label at line 2377, but if an error occurs after this point (e.g., at line 2391 when properties validation fails), the allocated memory for label is not freed before the error is thrown. Consider using a PG_TRY/PG_CATCH block or allocating in the current memory context that will be cleaned up on error.
    label = pnstrdup(label_value->val.string.val, label_value->val.string.len);

    if (fcinfo->args[2].isnull)
    {
        bstate = init_agtype_build_state(0, AGT_FOBJECT);
        properties = build_agtype(bstate);
        pfree_agtype_build_state(bstate);
    }
    else
    {
        properties = AG_GET_ARG_AGTYPE_P(2);

        if (!AGT_ROOT_IS_OBJECT(properties))
        {
            ereport(ERROR,
                    (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                     errmsg("_agtype_build_vertex() properties argument must be an object")));
        }

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +2175 to +2210
/* For startNode/endNode functions on edges, look up the vertex */
if (is_node_func && entity_type == EDGEOID)
{
Oid func_oid;
Node *graph_name;

graph_name = (Node *) makeConst(TEXTOID, -1, InvalidOid, -1,
CStringGetTextDatum(cpstate->graph_name),
false, false);

/* Call _get_vertex_by_graphid(graph_name, vertex_id) */
func_oid = get_ag_func_oid("_get_vertex_by_graphid", 2, TEXTOID, GRAPHIDOID);
result = (Node *)makeFuncExpr(func_oid, AGTYPEOID,
list_make2(graph_name, field),
InvalidOid, InvalidOid,
COERCE_EXPLICIT_CALL);
}
else
{
result = field;
}
Copy link

Copilot AI Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition at line 2176 checks for is_node_func AND EDGEOID, but startNode/endNode can only be called on edges, never on vertices. When entity_type is VERTEXOID and is_node_func is true, neither the if block (line 2176) nor the else block (line 2192) is entered, so result remains uninitialized. The logic should handle VERTEXOID with is_node_func as an error case.

Copilot uses AI. Check for mistakes.
STABLE
RETURNS NULL ON NULL INPUT
PARALLEL SAFE
AS 'MODULE_PATHNAME';
Copy link

Copilot AI Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new function _get_vertex_by_graphid(text, graphid) exposes a low-level vertex lookup to any caller without performing any privilege or RLS checks on the underlying label tables. Its C implementation uses get_vertex, which directly calls table_open / table_beginscan on the graph’s label tables by name, bypassing PostgreSQL GRANT and row-level security, so any role with EXECUTE on _get_vertex_by_graphid can fetch vertex data from graphs whose schemas/tables they may not otherwise be allowed to SELECT. To prevent unauthorized graph data access, restrict EXECUTE on _get_vertex_by_graphid (and similar helpers) to trusted roles only, or add explicit ACL/RLS checks before opening the target relation (e.g., verifying the caller has appropriate privileges on the resolved label table).

Suggested change
AS 'MODULE_PATHNAME';
AS 'MODULE_PATHNAME';
REVOKE ALL ON FUNCTION ag_catalog._get_vertex_by_graphid(text, graphid) FROM PUBLIC;

Copilot uses AI. Check for mistakes.
@jrgemignani
Copy link
Contributor

jrgemignani commented Jan 13, 2026

@MuhammadTahaNaveed Cursory thoughts so far -

  1. There aren't any additions to the age--1.6.0--y.y.y.sql file which is needed for upgrading. I won't be able to test actual performance without this file.
  2. These changes to the types will generally make it difficult to upgrade.
  3. The composite vertex and edge types are an interesting touch, but, it doesn't appear that you removed how the vertices and edges are represented in agtype. If this is intended, won't this increase the space taken by these objects?

@jrgemignani
Copy link
Contributor

jrgemignani commented Jan 13, 2026

@MuhammadTahaNaveed Ambiguous operator added -

Current master branch

psql-17.4-5432-pgsql=#  SELECT create_graph('bug_demo');
NOTICE:  graph "bug_demo" has been created
 create_graph
--------------

(1 row)

psql-17.4-5432-pgsql=# SELECT * FROM cypher('bug_demo', $$ CREATE (:Person {name: 'Alice'}) $$) AS (v agtype);
 v
---
(0 rows)

psql-17.4-5432-pgsql=# SELECT * FROM cypher('bug_demo', $$ CREATE (:Person {name: 'Bob'}) $$) AS (v agtype);
 v
---
(0 rows)

psql-17.4-5432-pgsql=# SELECT * FROM cypher('bug_demo', $$
psql-17.4-5432-pgsql$#   MATCH (a:Person), (b:Person)
psql-17.4-5432-pgsql$#   WHERE a IN [a, b]
psql-17.4-5432-pgsql$#   RETURN a.name
psql-17.4-5432-pgsql$# $$) AS (name agtype);
  name
---------
 "Alice"
 "Alice"
 "Bob"
 "Bob"
(4 rows)

PR 2303

psql-17.7-5432-pgsql=# SELECT create_graph('bug_demo');
SELECT * FROM cypher('bug_demo', $$ CREATE (:Person {name: 'Alice'}) $$) AS (v agtype);
SELECT * FROM cypher('bug_demo', $$ CREATE (:Person {name: 'Bob'}) $$) AS (v agtype);
SELECT * FROM cypher('bug_demo', $$
  MATCH (a:Person), (b:Person)
  WHERE a IN [a, b]
  RETURN a.name
$$) AS (name agtype);
NOTICE:  graph "bug_demo" has been created
 create_graph 
--------------
 
(1 row)

 v 
---
(0 rows)

 v 
---
(0 rows)

ERROR:  operator is not unique: vertex = vertex
LINE 3:   WHERE a IN [a, b]
                  ^
HINT:  Could not choose a best candidate operator. You might need to add explicit type casts.

Edge too

psql-17.7-5432-pgsql=# SELECT * FROM cypher('bug_demo', $$
  MATCH ()-[r1]->(), ()-[r2]->()
  WHERE r1 IN [r1, r2]
  RETURN id(r1)
$$) AS (id agtype);
ERROR:  operator is not unique: edge = edge
LINE 3:   WHERE r1 IN [r1, r2]
                   ^
HINT:  Could not choose a best candidate operator. You might need to add explicit type casts.
psql-17.7-5432-pgsql=#

…tion

- Introduce vertex and edge as pg composite types
  vertex: (id, label, properties)
  edge: (id, label, start_id, end_id, properties)
- Property access (a.name) now directly uses a.properties for
  agtype_access_operator instead of rebuilding via _agtype_build_vertex/edge
- Optimize accessor functions (id, properties, label, type, start_id, end_id)
  to use direct FieldSelect on composite types instead of agtype functions
- Add casts: vertex/edge to agtype, vertex/edge to json/jsonb
- Add eq and not eq ops for vertex/edge composite types.
- Fix label_name specific routine to use cache instead of ag_label scan
- Write/update clauses have executors strictly tied to agtype, due to which
  the variables after any write/update clause are carried forward as agtype.
- Allows users to completely skip agtype build functions and return vertex/edge
  for pure read queries.
- Change _label_name to return agtype since record comparisons are not
  allowed with cstring. Consequently, _agtype_build_vertex/edge now accept
  agtype as label.
- Fix MERGE clause type mismatch when accessing properties from previous
  MATCH clauses by wrapping columns with agtype_volatile_wrapper
  before namespace lookup.
- Update expression index in pgvector.sql, since now it uses raw properties column
  instead of _agtype_build_vertex/edge.
- Add regression tests

Assisted-by AI
@MuhammadTahaNaveed
Copy link
Member Author

@jrgemignani

  1. There aren't any additions to the age--1.6.0--y.y.y.sql file which is needed for upgrading. I won't be able to test actual performance without this file.

Updated age--1.6.0--y.y.y.sql

  1. These changes to the types will generally make it difficult to upgrade.

The changed functions (_label_name, _agtype_build_vertex, _agtype_build_edge) are intended for internal use only. They are invoked by the parser/executor and are not meant to be called directly by users.

The upgrade script in age--1.6.0--y.y.y.sql manages the transition by dropping the old function signatures and creating the new ones. No data migration is required, as the underlying table structure remains unchanged.

From a user perspective, existing queries that return agtype continue to work without any changes. The composite types are introduced to improve internal query transformation and serve as an additional output option, rather than a breaking change.

  1. The composite vertex and edge types are an interesting touch, but, it doesn't appear that you removed how the vertices and edges are represented in agtype. If this is intended, won't this increase the space taken by these objects?

The composite types don’t increase storage, as they are only a query-time representation. The underlying tables continue to store the same data as before.

The agtype representation is still required for paths (which are agtype arrays of vertices/edges), for storing entities in agtype containers, and for compatibility with existing agtype operators. When agtype is needed, the implicit casts(backed by functions vertex_to_agtype and edge_to_agtype) handle the conversion.

Ambiguous operator added

Good catch. Fixed it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

master override-stale To keep issues/PRs untouched from stale action

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants