Skip to content

Latest commit

 

History

History
922 lines (690 loc) · 18.1 KB

File metadata and controls

922 lines (690 loc) · 18.1 KB

VCL Type System Specification

Table of Contents

1. Overview

This document specifies the static type system for VCL, including:

  1. Type syntax and semantics

  2. Typing rules for all VCL constructs

  3. Type checking algorithm

  4. Subtyping and type equivalence

  5. Differences between dependent-type and slipstream paths

This complements VCL Formal Semantics which covers operational semantics.

2. Type Language

2.1. Type Syntax

The VCL type language \(\tau\) is defined by:

\[\begin{aligned} \tau ::= &\ \text{UUID} \mid \text{String} \mid \text{Int} \mid \text{Float} \mid \text{Bool} \\ | &\ \text{Vector}[n] \mid \text{Tensor}[d_1, \ldots, d_k] \\ | &\ \text{Timestamp} \mid \text{RDFTriple} \\ | &\ \text{Octad} \mid \text{OctadRef} \\ | &\ \text{Modality} \mid \text{ModalitySet} \\ | &\ \text{List}(\tau) \mid \text{Option}(\tau) \\ | &\ \tau_1 \times \tau_2 \quad \text{(product)} \\ | &\ \tau_1 \to \tau_2 \quad \text{(function)} \\ | &\ \{x : \tau \mid \phi(x)\} \quad \text{(refinement)} \\ | &\ \Pi x : \tau_1. \tau_2(x) \quad \text{(dependent product)} \\ | &\ \Sigma x : \tau_1. \tau_2(x) \quad \text{(dependent sum)} \\ | &\ \text{Proof}[\phi] \quad \text{(proof type)} \end{aligned}\]

2.2. Type Constructors

2.2.1. Product Types

Represent pairs/tuples:

\[\tau_1 \times \tau_2 = \{(v_1, v_2) \mid v_1 : \tau_1 \land v_2 : \tau_2\}\]

Example: (UUID, Timestamp) - a octad ID paired with a timestamp

2.2.2. Function Types

\[\tau_1 \to \tau_2 = \{f \mid \forall v : \tau_1. f(v) : \tau_2\}\]

Example: Octad → Bool - predicate on octads (used in WHERE clauses)

2.2.3. List Types

\[\text{List}(\tau) = \{[], [v_1], [v_1, v_2], \ldots \mid v_i : \tau\}\]

Example: List(Octad) - query results

2.2.4. Option Types

\[\text{Option}(\tau) = \{\text{None}\} \cup \{\text{Some}(v) \mid v : \tau\}\]

Example: Option(Vector[n]) - optional embedding

2.2.5. Refinement Types

\[\{x : \tau \mid \phi(x)\} = \{v : \tau \mid \phi(v) = \text{true}\}\]

Example: {n : Int | n > 0} - positive integers

2.2.6. Dependent Product (Pi Types)

\[\Pi x : \tau_1. \tau_2(x) = \{f \mid \forall v : \tau_1. f(v) : \tau_2(v)\}\]

Example: Vector dimension-indexed types:

\[\Pi n : \text{Nat}. \text{Vector}[n] \to \text{Float}\]

A function that takes vectors of any dimension and returns a float (e.g., norm).

2.2.7. Dependent Sum (Sigma Types)

\[\Sigma x : \tau_1. \tau_2(x) = \{(v, w) \mid v : \tau_1 \land w : \tau_2(v)\}\]

Example: Dimensioned vectors:

\[\Sigma n : \text{Nat}. \text{Vector}[n]\]

A pair of a dimension n and a vector of that dimension.

2.3. Modality Types

Each modality has an associated type:

Modality Type Signature

GRAPH

\(\text{Octad} \to \text{Set}(\text{RDFTriple})\)

VECTOR

\(\Pi n : \text{Nat}. \text{Octad} \to \text{Option}(\text{Vector}[n\))]

TENSOR

\(\Pi d_1, \ldots, d_k : \text{Nat}. \text{Octad} \to \text{Option}(\text{Tensor}[d_1, \ldots, d_k\))]

SEMANTIC

\(\text{Octad} \to \text{Set}(\text{TypeAnnotation})\)

DOCUMENT

\(\text{Octad} \to \text{Option}(\text{String})\)

TEMPORAL

\(\text{Octad} \to \text{List}(\text{Version})\)

Note: Vector and tensor modalities have dependent types - the return type depends on the dimension parameters.

3. Type Environments

3.1. Context (Type Environment)

A type environment \(\Gamma\) is a finite map from variables to types:

\[\Gamma ::= \emptyset \mid \Gamma, x : \tau\]

Operations:

  • \(\Gamma(x)\) - lookup type of x in \(\Gamma\)

  • \(\Gamma, x : \tau\) - extend \(\Gamma\) with binding x : τ

  • \(\text{dom}(\Gamma)\) - domain of \(\Gamma\) (set of variables)

3.2. Well-Formed Environments

\[\frac{ \Gamma \vdash \tau : \text{Type} \quad x \notin \text{dom}(\Gamma) }{ \Gamma, x : \tau \text{ well-formed} }\]

4. Typing Judgments

We use the following typing judgments:

Judgment Meaning

\(\Gamma \vdash e : \tau\)

Expression e has type \(\tau\) in context \(\Gamma\)

\(\Gamma \vdash Q : \tau\)

Query Q has type \(\tau\) in context \(\Gamma\)

\(\Gamma \vdash C : \text{Octad} \to \text{Bool}\)

Condition C is a predicate on octads

\(\Gamma \vdash \tau : \text{Type}\)

\(\tau\) is a well-formed type

\(\Gamma \vdash \tau_1 \leq \tau_2\)

\(\tau_1\) is a subtype of \(\tau_2\)

5. Core Typing Rules

5.1. Variables

\[\frac{ x : \tau \in \Gamma }{ \Gamma \vdash x : \tau } \text{(T-Var)}\]

5.2. Literals

\[\frac{ n \in \mathbb{Z} }{ \Gamma \vdash n : \text{Int} } \text{(T-Int)} \quad \frac{ f \in \mathbb{R} }{ \Gamma \vdash f : \text{Float} } \text{(T-Float)}\]
\[\frac{ s \in \text{UTF-8} }{ \Gamma \vdash s : \text{String} } \text{(T-String)} \quad \frac{ b \in \{\text{true}, \text{false}\} }{ \Gamma \vdash b : \text{Bool} } \text{(T-Bool)}\]

5.3. UUID Literals

\[\frac{ u \text{ matches UUID format} }{ \Gamma \vdash u : \text{UUID} } \text{(T-UUID)}\]

5.4. Vector Literals

\[\frac{ \Gamma \vdash v_1 : \text{Float} \quad \cdots \quad \Gamma \vdash v_n : \text{Float} }{ \Gamma \vdash [v_1, \ldots, v_n] : \text{Vector}[n] } \text{(T-VectorLit)}\]

5.5. List Construction

\[\frac{ \Gamma \vdash v_1 : \tau \quad \cdots \quad \Gamma \vdash v_n : \tau }{ \Gamma \vdash [v_1, \ldots, v_n] : \text{List}(\tau) } \text{(T-ListLit)}\]

6. Query Typing Rules

6.1. Slipstream Query (No Proof)

\[\frac{ \Gamma \vdash \mathcal{M} : \text{ModalitySet} \quad \Gamma \vdash S : \text{Source} \quad \Gamma \vdash C : \text{Octad} \to \text{Bool} }{ \Gamma \vdash \texttt{SELECT } \mathcal{M} \texttt{ FROM } S \texttt{ WHERE } C : \text{QueryResult}[\mathcal{M}] } \text{(T-SlipstreamQuery)}\]

Where:

\[\text{QueryResult}[\mathcal{M}] = \text{List}(\text{Octad}_\mathcal{M})\]

And:

\[\text{Octad}_\mathcal{M} = \{h : \text{Octad} \mid \forall m \in \mathcal{M}. m(h) \neq \text{None}\}\]

6.2. Dependent-Type Query (With Proof)

\[\frac{ \Gamma \vdash \mathcal{M} : \text{ModalitySet} \quad \Gamma \vdash S : \text{Source} \quad \Gamma \vdash C : \text{Octad} \to \text{Bool} \quad \Gamma \vdash P : \text{ProofSpec}[\phi] }{ \Gamma \vdash \texttt{SELECT } \mathcal{M} \texttt{ ... PROOF } P : \text{ProvedResult}[\mathcal{M}, \phi] } \text{(T-ProvedQuery)}\]

Where:

\[\text{ProvedResult}[\mathcal{M}, \phi] = \Sigma r : \text{QueryResult}[\mathcal{M}]. \text{Proof}[\phi(r)]\]

Interpretation: A dependent sum - a pair of query results r and a proof that \(\phi(r)\) holds.

6.3. Modality Set Typing

\[\frac{ \forall m \in \{m_1, \ldots, m_k\}. m \in \{\texttt{GRAPH}, \texttt{VECTOR}, \ldots\} }{ \Gamma \vdash \{m_1, \ldots, m_k\} : \text{ModalitySet} } \text{(T-ModalitySet)}\]
\[\frac{ }{ \Gamma \vdash * : \text{ModalitySet} } \text{(T-AllModalities)}\]

6.4. Source Typing

6.4.1. Octad Source

\[\frac{ \Gamma \vdash u : \text{UUID} }{ \Gamma \vdash \texttt{HEXAD } u : \text{Source} } \text{(T-OctadSource)}\]

6.4.2. Federation Source

\[\frac{ \Gamma \vdash p : \text{Pattern} \quad \Gamma \vdash d : \text{DriftMode} }{ \Gamma \vdash \texttt{FEDERATION } p \texttt{ WITH DRIFT } d : \text{Source} } \text{(T-FederationSource)}\]

6.4.3. Store Source

\[\frac{ \Gamma \vdash \text{id} : \text{String} }{ \Gamma \vdash \texttt{STORE } \text{id} : \text{Source} } \text{(T-StoreSource)}\]

7. Condition Typing Rules

7.1. Simple Conditions

7.1.1. Graph Condition (SPARQL Pattern)

\[\frac{ \Gamma \vdash p : \text{SPARQLPattern} }{ \Gamma \vdash p : \text{Octad} \to \text{Bool} } \text{(T-GraphCond)}\]

7.1.2. Vector Similarity

\[\frac{ \Gamma \vdash v : \text{Vector}[n] \quad \Gamma \vdash t : \text{Float} }{ \Gamma \vdash \texttt{h.embedding SIMILAR TO } v \texttt{ WITHIN } t : \text{Octad} \to \text{Bool} } \text{(T-VectorSimilarity)}\]

7.1.3. Vector Nearest-K

\[\frac{ \Gamma \vdash k : \text{Int} \quad k > 0 \quad \Gamma \vdash m : \text{MetricType} }{ \Gamma \vdash \texttt{h.embedding NEAREST } k \texttt{ USING } m : \text{Octad} \to \text{Bool} } \text{(T-VectorNearest)}\]

7.1.4. Tensor Condition

\[\frac{ \Gamma \vdash \text{field} : \text{Octad} \to \text{Tensor}[\ldots] \quad \Gamma \vdash \text{op} : \text{TensorOp} \quad \Gamma \vdash \text{literal} : \text{Tensor}[\ldots] }{ \Gamma \vdash \text{field op literal} : \text{Octad} \to \text{Bool} } \text{(T-TensorCond)}\]

7.1.5. Semantic Condition (Contract Satisfaction)

\[\frac{ \Gamma \vdash c : \text{Contract} \quad \Gamma \vdash p : \text{Params} }{ \Gamma \vdash \texttt{SATISFIES } c(p) : \text{Octad} \to \text{Bool} } \text{(T-SemanticCond)}\]

7.1.6. Temporal Condition (As Of)

\[\frac{ \Gamma \vdash t : \text{Timestamp} }{ \Gamma \vdash \texttt{AS OF } t : \text{Octad} \to \text{Bool} } \text{(T-TemporalAsOf)}\]

7.2. Compound Conditions

7.2.1. Conjunction

\[\frac{ \Gamma \vdash C_1 : \text{Octad} \to \text{Bool} \quad \Gamma \vdash C_2 : \text{Octad} \to \text{Bool} }{ \Gamma \vdash C_1 \texttt{ AND } C_2 : \text{Octad} \to \text{Bool} } \text{(T-And)}\]

7.2.2. Disjunction

\[\frac{ \Gamma \vdash C_1 : \text{Octad} \to \text{Bool} \quad \Gamma \vdash C_2 : \text{Octad} \to \text{Bool} }{ \Gamma \vdash C_1 \texttt{ OR } C_2 : \text{Octad} \to \text{Bool} } \text{(T-Or)}\]

7.2.3. Negation

\[\frac{ \Gamma \vdash C : \text{Octad} \to \text{Bool} }{ \Gamma \vdash \texttt{NOT } C : \text{Octad} \to \text{Bool} } \text{(T-Not)}\]

8. Proof Specification Typing

8.1. Existence Proof

\[\frac{ \Gamma \vdash c : \text{ExistenceContract} }{ \Gamma \vdash \texttt{PROOF EXISTENCE}(c) : \text{ProofSpec}[\exists h. \text{Valid}(h)] } \text{(T-ProofExistence)}\]

8.2. Citation Proof

\[\frac{ \Gamma \vdash c : \text{CitationContract} }{ \Gamma \vdash \texttt{PROOF CITATION}(c) : \text{ProofSpec}[\forall h. \text{CitationValid}(h)] } \text{(T-ProofCitation)}\]

8.3. Access Proof

\[\frac{ \Gamma \vdash c : \text{AccessContract} \quad \Gamma \vdash u : \text{User} }{ \Gamma \vdash \texttt{PROOF ACCESS}(c) : \text{ProofSpec}[\forall h. \text{hasPermission}(u, h)] } \text{(T-ProofAccess)}\]

8.4. Integrity Proof

\[\frac{ \Gamma \vdash c : \text{IntegrityContract} }{ \Gamma \vdash \texttt{PROOF INTEGRITY}(c) : \text{ProofSpec}[\forall h. \text{Untampered}(h)] } \text{(T-ProofIntegrity)}\]

8.5. Provenance Proof

\[\frac{ \Gamma \vdash c : \text{ProvenanceContract} }{ \Gamma \vdash \texttt{PROOF PROVENANCE}(c) : \text{ProofSpec}[\forall h. \text{ValidLineage}(h)] } \text{(T-ProofProvenance)}\]

9. Subtyping

9.1. Subtyping Relation

\(\tau_1 \leq \tau_2\) means "\(\tau_1\) is a subtype of \(\tau_2\)" (values of type \(\tau_1\) can be used where \(\tau_2\) is expected).

9.1.1. Reflexivity

\[\frac{ }{ \tau \leq \tau } \text{(Sub-Refl)}\]

9.1.2. Transitivity

\[\frac{ \tau_1 \leq \tau_2 \quad \tau_2 \leq \tau_3 }{ \tau_1 \leq \tau_3 } \text{(Sub-Trans)}\]

9.1.3. Refinement Subsumption

\[\frac{ \forall v. \phi_1(v) \Rightarrow \phi_2(v) }{ \{x : \tau \mid \phi_1(x)\} \leq \{x : \tau \mid \phi_2(x)\} } \text{(Sub-Refine)}\]

Intuition: If \(\phi_1\) is stronger (more restrictive) than \(\phi_2\), then the refined type with \(\phi_1\) is a subtype.

9.1.4. List Covariance

\[\frac{ \tau_1 \leq \tau_2 }{ \text{List}(\tau_1) \leq \text{List}(\tau_2) } \text{(Sub-List)}\]

9.1.5. Function Contravariance (Input) and Covariance (Output)

\[\frac{ \tau_1' \leq \tau_1 \quad \tau_2 \leq \tau_2' }{ \tau_1 \to \tau_2 \leq \tau_1' \to \tau_2' } \text{(Sub-Arrow)}\]

Note: Input is contravariant, output is covariant.

9.1.6. Modality Subsumption

\[\frac{ \mathcal{M}_1 \subseteq \mathcal{M}_2 }{ \text{Octad}_{\mathcal{M}_2} \leq \text{Octad}_{\mathcal{M}_1} } \text{(Sub-Octad)}\]

Intuition: A octad with more modalities can be used where fewer are required (contravariant).

9.2. Subsumption in Typing

\[\frac{ \Gamma \vdash e : \tau_1 \quad \tau_1 \leq \tau_2 }{ \Gamma \vdash e : \tau_2 } \text{(T-Sub)}\]

10. Type Equivalence

10.1. Definitional Equality

Two types \(\tau_1\) and \(\tau_2\) are definitionally equal (\(\tau_1 \equiv \tau_2\)) if they are syntactically identical up to alpha-renaming.

Examples:

  1. \(\text{Vector}[10\) \equiv \text{Vector}[10]]

  2. \(\Pi x : \tau. \sigma \equiv \Pi y : \tau. \sigma[y/x\)]

10.2. Semantic Equality

Two types are semantically equal if they denote the same set of values:

\[\tau_1 \simeq \tau_2 \iff \forall v. (v : \tau_1 \iff v : \tau_2)\]

Example:

\[\{x : \text{Int} \mid x \geq 0\} \simeq \{x : \text{Int} \mid x > -1\}\]

11. Type Checking Algorithm

11.1. Bidirectional Type Checking

VCL uses bidirectional type checking with two modes:

  1. Synthesis (\(\Uparrow\)): Infer type of an expression

  2. Checking (\(\Downarrow\)): Check expression against expected type

11.1.1. Synthesis Rules

\[\Gamma \vdash e \Uparrow \tau\]

"In context \(\Gamma\), synthesize that expression e has type \(\tau\)"

Examples:

\[\frac{ x : \tau \in \Gamma }{ \Gamma \vdash x \Uparrow \tau } \text{(Synth-Var)}\]
\[\frac{ n \in \mathbb{Z} }{ \Gamma \vdash n \Uparrow \text{Int} } \text{(Synth-Int)}\]

11.1.2. Checking Rules

\[\Gamma \vdash e \Downarrow \tau\]

"In context \(\Gamma\), check that expression e has type \(\tau\)"

Example:

\[\frac{ \Gamma \vdash e \Uparrow \tau' \quad \tau' \leq \tau }{ \Gamma \vdash e \Downarrow \tau } \text{(Check-Sub)}\]

11.2. Type Checking Queries

11.2.1. Slipstream Query

Input: Query Q, context \(\Gamma\) Output: Type \(\tau\) or type error

check_slipstream_query Γ (SELECT M FROM S WHERE C) =
  let τ_M = check_modalities Γ M in
  let τ_S = check_source Γ S in
  let τ_C = check_condition Γ C in
  if τ_C  (Octad  Bool) then
    QueryResult[M]
  else
    TypeError "Condition must be predicate on Octad"

11.2.2. Dependent-Type Query

Input: Query Q with PROOF clause, context \(\Gamma\) Output: Proved result type or type error

check_proved_query Γ (SELECT M ... PROOF P) =
  let τ_base = check_slipstream_query Γ (SELECT M ...) in
  let (ProofSpec[φ]) = check_proof_spec Γ P in
  ProvedResult[M, φ]

11.3. Type Inference

Some VCL constructs support type inference:

11.3.1. Vector Dimension Inference

From vector literal:

WHERE h.embedding SIMILAR TO [0.1, 0.2, 0.3]

Inferred: Vector[3]

11.3.2. Refinement Predicate Inference

From conditions:

WHERE h.embedding SIMILAR TO v WITHIN 0.8

Inferred: {h : Octad | similarity(h.embedding, v) ≥ 0.8}

12. Type Safety Properties

12.1. Progress

Theorem (Progress):

If \(\Gamma \vdash Q : \tau\) and Q is a closed query (no free variables), then either:

  1. Q is a value, or

  2. \(\exists Q'. Q \rightarrow Q'\) (Q can take a step)

12.2. Preservation

Theorem (Preservation):

If \(\Gamma \vdash Q : \tau\) and \(Q \rightarrow Q'\), then \(\Gamma \vdash Q' : \tau\).

Informal: Type is preserved during evaluation.

12.3. Soundness

Theorem (Type Soundness):

If \(\Gamma \vdash Q : \tau\) and \(Q \Rightarrow^* v\), then \(\Gamma \vdash v : \tau\).

Informal: Well-typed queries don’t "go wrong" - they either diverge or produce a value of the expected type.

13. Dependent Types vs Slipstream Paths

13.1. Type Differences

Aspect Slipstream Path Dependent-Type Path

Result Type

\(\text{List}(\text{Octad})\)

\(\Sigma r : \text{List}(\text{Octad}). \text{Proof}[\phi(r)\)]

Guarantees

Best-effort filtering

Formally verified via ZKP

Performance

Fast (no proof overhead)

Slower (proof generation)

Use Case

Exploratory queries

Compliance, audits

Type Complexity

Simple (no dependent types)

Complex (refinements, dependent products)

13.2. Type Erasure

For runtime optimization, dependent types can be erased from slipstream queries:

\[\text{erase}(\Pi x : \tau_1. \tau_2) = \text{erase}(\tau_1) \to \text{erase}(\tau_2)\]
\[\text{erase}(\Sigma x : \tau_1. \tau_2) = \text{erase}(\tau_1) \times \text{erase}(\tau_2)\]
\[\text{erase}(\{x : \tau \mid \phi\}) = \text{erase}(\tau)\]
\[\text{erase}(\text{Proof}[\phi]) = \text{Unit}\]

Dependent-type path performs full type checking with refinements. Slipstream path can use erased types for performance.

14. Type System Extensions

14.1. Row Polymorphism (Future)

Support for flexible octad projections:

\[\text{Octad}\{\text{graph} : \text{Graph}, \text{vector} : \text{Vector} \mid r\}\]

Where r is a row variable representing "possibly more modalities".

14.2. Effect Types (Future)

Track side effects in query execution:

\[\text{Query} : \text{OctadSet} \xrightarrow{\{\text{IO}, \text{Network}\}} \text{List}(\text{Octad})\]

Indicating the query performs IO and network operations.

14.3. Linear Types (Future)

Ensure ZKP witnesses are used exactly once:

\[\text{Witness} : \text{LinearType}\]

15. References

  • VCL Formal Semantics

  • VCL Grammar

  • Pierce, B. C. (2002). Types and Programming Languages. MIT Press.

  • Pierce, B. C. (ed.) (2005). Advanced Topics in Types and Programming Languages. MIT Press.

  • Chlipala, A. (2013). Certified Programming with Dependent Types. MIT Press.

  • Norell, U. (2007). Towards a practical programming language based on dependent type theory. PhD thesis, Chalmers.