VoID (Vocabulary of Interlinked Datasets) Setup

What is VoID?

VoID is an RDF vocabulary for expressing metadata about RDF datasets. It helps:

  • Discover datasets
  • Understand dataset structure
  • Find linked data connections
  • Locate SPARQL endpoints
  • Interoperate with semantic web tools

Why VoID for verisim?

VoID enables verisim.dev to:

  1. Publish structured data as Linked Open Data
  2. Describe links to other datasets (DBpedia, Wikidata) and the vocabularies in use (Schema.org)
  3. Advertise a SPARQL endpoint for semantic queries
  4. Improve discoverability in semantic web search engines
  5. Support research and data integration

VoID Files

  • .well-known/void.ttl - Turtle format (human-readable)
  • .well-known/void.rdf - RDF/XML format (tool-compatible)

Accessing VoID Metadata

# Turtle format
curl https://verisim.dev/.well-known/void.ttl

# RDF/XML format
curl https://verisim.dev/.well-known/void.rdf

Example: Querying with SPARQL

PREFIX void: <http://rdfs.org/ns/void#>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT ?dataset ?title ?triples
WHERE {
  ?dataset a void:Dataset ;
           dcterms:title ?title ;
           void:triples ?triples .
}
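
To send a query like the one above to an endpoint, the SPARQL 1.1 Protocol allows passing it as a URL-encoded `query` parameter on a GET request. The following is a minimal sketch, not verisim's actual client: the helper name `buildSparqlGet` is hypothetical, and the endpoint URL is the planned one and may not be live.

```javascript
// Build a SPARQL 1.1 Protocol GET request for a query string.
// `buildSparqlGet` and the endpoint URL are illustrative assumptions.
function buildSparqlGet(endpoint, query) {
  const url = new URL(endpoint)
  // The protocol carries the query text in the `query` parameter.
  url.searchParams.set('query', query)
  return {
    url: url.toString(),
    headers: { Accept: 'application/sparql-results+json' },
  }
}

const query = `
PREFIX void: <http://rdfs.org/ns/void#>
SELECT ?dataset WHERE { ?dataset a void:Dataset }
`.trim()

const request = buildSparqlGet('https://verisim.dev/sparql', query)
console.log(request.url)
// With a live endpoint: fetch(request.url, { headers: request.headers })
```

Building the URL with `URL` and `searchParams` keeps the percent-encoding of characters such as `#`, `<`, and `?` out of hand-written string concatenation.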

Integration with verisim

VoID is a good fit for verisim because:

  1. Semantic Database: verisim can expose its data as RDF
  2. Linksets: Connect verisim entities to external datasets
  3. SPARQL Endpoint: Query verisim using SPARQL
  4. Schema Alignment: Map verisim schema to standard ontologies

Example verisim Integration

@prefix void: <http://rdfs.org/ns/void#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# verisim dataset with linksets
<https://verisim.example.com/dataset> a void:Dataset ;
    dcterms:title "VerisimDB Verified Data" ;
    void:triples 1000000 ;
    void:entities 50000 ;

    # Link to DBpedia
    void:subset <https://verisim.example.com/linkset/dbpedia> ;

    # Link to Wikidata
    void:subset <https://verisim.example.com/linkset/wikidata> ;

    # SPARQL endpoint
    void:sparqlEndpoint <https://verisim.example.com/sparql> ;
.

# Linkset to DBpedia
<https://verisim.example.com/linkset/dbpedia> a void:Linkset ;
    void:linkPredicate owl:sameAs ;
    void:target <https://verisim.example.com/dataset> ;
    void:target <http://dbpedia.org> ;
    void:triples 25000 ;
.
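
Linkset descriptions follow a fixed shape, so they can be generated from a small record instead of being written by hand. The sketch below is one possible approach: the function name `linksetToTurtle` and the field names are hypothetical, and the output assumes the `void:` and `owl:` prefixes are declared elsewhere in the file.

```javascript
// Render a void:Linkset description as Turtle.
// Field names (uri, linkPredicate, targets, triples) are illustrative assumptions.
function linksetToTurtle({ uri, linkPredicate, targets, triples }) {
  if (targets.length !== 2) {
    // A linkset names the two datasets it connects.
    throw new Error('a linkset needs exactly two void:target values')
  }
  const lines = [
    `<${uri}> a void:Linkset ;`,
    `    void:linkPredicate ${linkPredicate} ;`,
    ...targets.map(t => `    void:target <${t}> ;`),
    `    void:triples ${triples} .`,
  ]
  return lines.join('\n')
}

console.log(linksetToTurtle({
  uri: 'https://verisim.example.com/linkset/dbpedia',
  linkPredicate: 'owl:sameAs',
  targets: ['https://verisim.example.com/dataset', 'http://dbpedia.org'],
  triples: 25000,
}))
```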

Serving VoID via SSG

For static site generators (SSG), VoID files can be:

  1. Pre-generated during build
  2. Served as static files from .well-known/
  3. Content-negotiated where the host supports it (Turtle for Accept: text/turtle, RDF/XML for Accept: application/rdf+xml)
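
Content negotiation needs a small piece of logic in front of the static files, for example in an edge function. The following sketch shows one way to choose between the two files from an `Accept` header. The function name `pickVoidFile` and the Turtle fallback are assumptions, and wildcard ranges such as `*/*` are deliberately ignored here, so they fall through to the default.

```javascript
// Map supported media types to the static VoID files.
const VOID_FILES = new Map([
  ['text/turtle', '/.well-known/void.ttl'],
  ['application/rdf+xml', '/.well-known/void.rdf'],
])

// Pick the file for an Accept header; falls back to Turtle (an assumed default).
function pickVoidFile(acceptHeader) {
  const ranked = (acceptHeader || '')
    .split(',')
    .map(part => {
      const [type, ...params] = part.trim().split(';')
      const q = params.map(p => p.trim()).find(p => p.startsWith('q='))
      // A missing q parameter means the default weight of 1.
      return { type: type.trim().toLowerCase(), q: q ? parseFloat(q.slice(2)) : 1 }
    })
    .filter(entry => VOID_FILES.has(entry.type) && entry.q > 0)
    .sort((a, b) => b.q - a.q)
  return ranked.length > 0
    ? VOID_FILES.get(ranked[0].type)
    : VOID_FILES.get('text/turtle')
}

console.log(pickVoidFile('application/rdf+xml'))
console.log(pickVoidFile('text/turtle;q=0.5, application/rdf+xml;q=0.9'))
```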

Example: ReScript SSG Integration

// void-generator.res
let generateVoID = (dataset: Dataset.t) => {
  let ttl = `
@prefix void: <http://rdfs.org/ns/void#> .

<https://example.com/dataset> a void:Dataset ;
    void:triples ${Int.toString(dataset.tripleCount)} ;
    void:entities ${Int.toString(dataset.entityCount)} .
`
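  // Write to .well-known/void.ttl (the directory must already exist).
  // Assumes ReScript's bundled Node bindings, where writeFileSync also
  // expects an encoding argument; writeFileAsUtf8Sync avoids that.
  Node.Fs.writeFileAsUtf8Sync(".well-known/void.ttl", ttl)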
}

Linking to External Datasets

DBpedia

void:subset [
    a void:Linkset ;
    void:linkPredicate owl:sameAs ;
    void:target <http://dbpedia.org> ;
    void:exampleResource <https://verisim.dev/entity/example> ;
] .

Wikidata

void:subset [
    a void:Linkset ;
    void:linkPredicate owl:sameAs ;
    void:target <https://www.wikidata.org/> ;
    void:exampleResource <https://verisim.dev/entity/example> ;
] .

Schema.org

void:vocabulary <https://schema.org/> ;
void:classPartition [
    void:class schema:Person ;
    void:entities 1000 ;
] .
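
Per-class statistics are expressed in VoID with `void:classPartition`, and the counts can be derived from the data at build time. The sketch below assumes a hypothetical input shape (`{ id, type }` with the type already written as a prefixed name) and a hypothetical helper name `classPartitions`. It only illustrates the counting step.

```javascript
// Count entities per class and render one void:classPartition per class.
// The input shape ({ id, type }) is an assumption for illustration.
function classPartitions(entities) {
  const counts = new Map()
  for (const { type } of entities) {
    counts.set(type, (counts.get(type) || 0) + 1)
  }
  // Map iteration follows insertion order, so classes appear as first seen.
  return [...counts]
    .map(([type, n]) => `void:classPartition [ void:class ${type} ; void:entities ${n} ] ;`)
    .join('\n')
}

console.log(classPartitions([
  { id: 'a', type: 'schema:Person' },
  { id: 'b', type: 'schema:Person' },
  { id: 'c', type: 'schema:Organization' },
]))
```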

SPARQL Endpoint (Future)

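Options for adding a SPARQL endpoint:

  1. A Cloudflare Worker that proxies SPARQL queries to an RDF store
  2. GitHub Pages serving pre-computed results for a fixed set of queries
  3. A dedicated backend (for dynamic queries)

// sparql-worker.js (Cloudflare Worker)
addEventListener('fetch', event => {
  event.respondWith(handleSPARQL(event.request))
})

async function handleSPARQL(request) {
  // SPARQL Protocol: the query arrives as ?query=... on GET,
  // or as the request body on a direct POST
  const query = request.method === 'GET'
    ? new URL(request.url).searchParams.get('query')
    : await request.text()
  if (!query) {
    return new Response('Missing SPARQL query', { status: 400 })
  }
  // TODO: parse the SPARQL query, execute it against an RDF store,
  // and return results as JSON-LD or Turtle
  return new Response('SPARQL execution is not implemented yet', { status: 501 })
}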

Validation

Validate VoID files:

# Using rapper (RDF parser)
rapper -i turtle .well-known/void.ttl

# Using Apache Jena
riot --validate .well-known/void.ttl

Discovery

VoID metadata is discoverable via:

  • SPARQL endpoint (once deployed): https://verisim.dev/sparql
  • .well-known/: https://verisim.dev/.well-known/void.ttl (the VoID specification registers /.well-known/void, so consider serving or redirecting that path as well)
  • HTTP headers: Link: </.well-known/void.ttl>; rel="meta"
  • HTML <link>: <link rel="meta" href="/.well-known/void.ttl">
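
A client can follow the HTTP header route by reading the `Link` header of any response and resolving the `rel="meta"` target. The following is a simplified sketch with a hypothetical helper name `findVoidLink`. It splits on commas, so it will not handle target URLs that themselves contain commas.

```javascript
// Find the first target of a Link header whose rel includes "meta".
// Simplified parsing: assumes no commas inside the <target> URLs.
function findVoidLink(linkHeader, baseUrl) {
  // Each link-value looks like: <target>; rel="meta"; other="params"
  for (const part of linkHeader.split(',')) {
    const match = part.match(/<([^>]*)>(.*)/)
    if (!match) continue
    const [, target, params] = match
    const rel = params.match(/;\s*rel\s*=\s*"?([^";]*)"?/i)
    if (rel && rel[1].split(/\s+/).includes('meta')) {
      // Resolve relative targets against the page that sent the header.
      return new URL(target, baseUrl).toString()
    }
  }
  return null
}

console.log(findVoidLink('</.well-known/void.ttl>; rel="meta"', 'https://verisim.dev/'))
```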

Next Steps for verisim

  1. Export verisim data as RDF (Turtle, N-Triples, RDF/XML)
  2. Update VoID statistics (triple count, entities, etc.)
  3. Create linksets to DBpedia and Wikidata, and declare Schema.org as a vocabulary
  4. Deploy SPARQL endpoint (Cloudflare Worker or dedicated server)
  5. Add content negotiation (serve different formats based on Accept header)

License

PMPL-1.0-or-later