Graph Databases: Introduction, Standardization, Opportunities
Peter Eisentraut
[email protected]
@petereisentraut
graph
terms: vertex, node; edge, relationship, arc
directed graph
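
As a minimal sketch of these terms, a directed graph can be held as an adjacency list. The vertex names and edges below are hypothetical example data, not taken from the talk.

```python
from collections import defaultdict

# Hypothetical example data: each pair is one directed edge (arc)
# from a source vertex (node) to a target vertex.
edges = [("a", "b"), ("b", "c"), ("a", "c")]

# Adjacency list: for every vertex, the vertices its outgoing edges point to.
successors = defaultdict(list)
for source, target in edges:
    successors[source].append(target)

def has_edge(source, target):
    """Return True if there is an edge from source to target."""
    return target in successors.get(source, [])

print(has_edge("a", "b"))  # True
print(has_edge("b", "a"))  # False: in a directed graph, direction matters
```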
property graph
[Figure: example property graph]
vertices: Person (name=Alice, name=Bob, name=Carol); Company (name=Acme); Account (number=3916, number=6058, number=3224, number=9794)
edges: ownerOf; worksFor; transaction (amount=$100.00, $150.00, $300.00, $450.00, $500.00)
terms: property, label
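
A property graph can be sketched in a few lines: every vertex and every edge carries one label and a set of key/value properties. The class names, numeric ids and the choice of which vertices the edges connect are assumptions made for illustration; only the labels and property values are borrowed from the example graph.

```python
from dataclasses import dataclass, field

@dataclass
class Vertex:
    id: int
    label: str                      # e.g. Person, Account
    properties: dict = field(default_factory=dict)

@dataclass
class Edge:
    source: int                     # id of the vertex the edge starts at
    target: int                     # id of the vertex the edge points to
    label: str                      # e.g. ownerOf, transaction
    properties: dict = field(default_factory=dict)

# Assumed wiring: which account Alice owns and where the transaction
# goes are illustrative choices, not read off the figure.
vertices = [
    Vertex(1, "Person", {"name": "Alice"}),
    Vertex(2, "Account", {"number": 3916}),
    Vertex(3, "Account", {"number": 6058}),
]
edges = [
    Edge(1, 2, "ownerOf"),
    Edge(2, 3, "transaction", {"amount": 100.00}),
]

def out_edges(vertex_id, label):
    """Return the edges with the given label that start at vertex_id."""
    return [e for e in edges if e.source == vertex_id and e.label == label]

print([e.target for e in out_edges(1, "ownerOf")])  # [2]
```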
RDF
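[Figure: example RDF graph, shown here as subject, predicate, object]
http://www.example.org/index.html  http://purl.org/dc/elements/1.1/creator  http://www.example.org/staffid/85740
http://www.example.org/index.html  http://www.example.org/terms/creation-date  August 16, 1999
http://www.example.org/index.html  http://purl.org/dc/elements/1.1/language  en
http://www.example.org/staffid/85740  http://www.example.org/terms/name  John Smith
http://www.example.org/staffid/85740  http://www.example.org/terms/age  27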
terms: triple, subject, predicate, object
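
A rough sketch of the triple model: the graph is just a list of (subject, predicate, object) tuples, and a lookup is a pattern with wildcards. The assignment of values to subjects follows the W3C RDF Primer example as read from the figure; the helper name `match` is hypothetical.

```python
EX = "http://www.example.org/"
DC = "http://purl.org/dc/elements/1.1/"

# Each entry is one triple: (subject, predicate, object).
triples = [
    (EX + "index.html", DC + "creator", EX + "staffid/85740"),
    (EX + "index.html", EX + "terms/creation-date", "August 16, 1999"),
    (EX + "index.html", DC + "language", "en"),
    (EX + "staffid/85740", EX + "terms/name", "John Smith"),
    (EX + "staffid/85740", EX + "terms/age", 27),
]

def match(subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

# Follow two edges: page -> creator -> name of the creator.
creator = match(subject=EX + "index.html", predicate=DC + "creator")[0][2]
name = match(subject=creator, predicate=EX + "terms/name")[0][2]
print(name)  # John Smith
```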
property graph vs. RDF
                 PG                                RDF
standardization  ISO                               W3C
languages        Cypher, PGQL, G-CORE, GSQL, GQL   SPARQL, OWL
serialization    (CSV)                             XML, JSON
vendors          Neo4j, Oracle, TigerGraph, AWS    Virtuoso, Apache, AWS, many
logic            closed-world                      open-world
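
The last row can be illustrated with a small sketch. Under the closed-world assumption a fact that is not stored is false; under the open-world assumption it is merely unknown. The stored fact and both function names are hypothetical, chosen only to show the difference.

```python
# Assumed single stored fact, for illustration only.
facts = {("Bob", "worksFor", "Acme")}

def closed_world(fact):
    """Closed world: anything not stored is taken to be false."""
    return fact in facts

def open_world(fact):
    """Open world: anything not stored is unknown (None), not false."""
    return True if fact in facts else None

query = ("Carol", "worksFor", "Acme")
print(closed_world(query))  # False
print(open_world(query))    # None
```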
GraphQL
graph database uses
social network
recommendations
knowledge representation
bioinformatics
logistics
public infrastructure
finance analytics
access control
SPARQL
W3C RDF query language
PREFIX ex: <http://example.com/exampleOntology#>
SELECT ?capital
       ?country
WHERE
  {
    ?x  ex:cityname       ?capital   ;
        ex:isCapitalOf    ?y         .
    ?y  ex:countryname    ?country   ;
        ex:isInContinent  ex:Africa  .
  }
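
To show what this query asks for, here is a naive evaluation of the same pattern in Python over a handful of triples. The data and the identifiers city1, country1 and so on are hypothetical, prefixes are omitted, and a real SPARQL engine would plan and index the joins rather than loop.

```python
# Hypothetical sample data as (subject, predicate, object) triples.
triples = [
    ("city1", "cityname", "Nairobi"),
    ("city1", "isCapitalOf", "country1"),
    ("country1", "countryname", "Kenya"),
    ("country1", "isInContinent", "Africa"),
    ("city2", "cityname", "Paris"),
    ("city2", "isCapitalOf", "country2"),
    ("country2", "countryname", "France"),
    ("country2", "isInContinent", "Europe"),
]

def objects(subject, predicate):
    """Return all objects o such that (subject, predicate, o) is stored."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]

results = []
for (x, p, capital) in triples:
    if p != "cityname":                      # ?x ex:cityname ?capital
        continue
    for y in objects(x, "isCapitalOf"):      # ?x ex:isCapitalOf ?y
        if "Africa" not in objects(y, "isInContinent"):
            continue                         # ?y ex:isInContinent ex:Africa
        for country in objects(y, "countryname"):
            results.append((capital, country))

print(results)  # [('Nairobi', 'Kenya')]
```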
Cypher
graph query language by Neo4j
MATCH (nicole:Actor {name: 'Nicole Kidman'})-[:ACTED_IN]->(movie:Movie)
WHERE movie.year < $yearParameter
RETURN movie
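
The pattern reads: start at the Actor vertex named Nicole Kidman, follow outgoing ACTED_IN edges, keep the Movie vertices whose year is below the parameter. A minimal in-memory sketch of that traversal follows; the ids, the movie data and the function name are illustrative assumptions.

```python
# Illustrative data: vertices keyed by an assumed numeric id.
actors = {1: {"name": "Nicole Kidman"}}
movies = {
    10: {"title": "Moulin Rouge!", "year": 2001},
    11: {"title": "Australia", "year": 2008},
}
# ACTED_IN edges as (actor id, movie id).
acted_in = [(1, 10), (1, 11)]

def movies_before(actor_name, year_parameter):
    """Follow ACTED_IN edges from the named actor; filter movies by year."""
    return [
        movies[m]
        for (a, m) in acted_in
        if actors[a]["name"] == actor_name
        and movies[m]["year"] < year_parameter
    ]

print([m["title"] for m in movies_before("Nicole Kidman", 2005)])
# ['Moulin Rouge!']
```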
PGQL
graph query language by Oracle
SELECT owner.name AS account_holder,
       SUM(t.amount) AS total_transacted
  FROM financial_transactions
 MATCH (p:Person) -[:ownerOf]-> (:Account)
         -[t:transaction]- (:Account) <-[:ownerOf]- (owner:Person|Company)
 WHERE p.name = 'Alice'
 GROUP BY owner
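
Two things stand out in this query: the transaction edge is matched in either direction, and the matches are then aggregated per counterparty. The sketch below mimics that in plain Python. Which owner holds which account and which accounts each transaction connects are assumptions, and owners are grouped by name rather than by vertex to keep the code short.

```python
from collections import defaultdict

# Assumed ownership and transactions, for illustration only.
owner_of = [("Alice", 3916), ("Bob", 6058), ("Acme", 3224)]   # (owner, account)
transactions = [                                              # (from, to, amount)
    (3916, 6058, 100.00),
    (3224, 3916, 150.00),
    (6058, 3224, 300.00),
]

def total_transacted(person_name):
    """Sum transaction amounts per counterparty of person_name's accounts."""
    my_accounts = {acct for (owner, acct) in owner_of if owner == person_name}
    holder = {acct: owner for (owner, acct) in owner_of}
    totals = defaultdict(float)
    for (src, dst, amount) in transactions:
        # -[t:transaction]- has no arrow, so both directions are checked.
        if src in my_accounts and dst in holder:
            totals[holder[dst]] += amount
        if dst in my_accounts and src in holder:
            totals[holder[src]] += amount
    return dict(totals)

print(total_transacted("Alice"))  # {'Bob': 100.0, 'Acme': 150.0}
```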
G-CORE
graph query research language by LDBC
CONSTRUCT (c)<-[:worksAt]-(n)
MATCH (c: Company) ON company_graph,
(n: Person) ON social_graph
WHERE c.name = n.employer
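
The notable point is that the query returns a graph, not a table: it joins vertices from two input graphs and constructs new worksAt edges between them. A rough Python equivalent, with hypothetical data and a hypothetical function name, could look like this.

```python
# Hypothetical input graphs, each given as a list of vertices.
company_graph = [{"label": "Company", "name": "Acme"}]
social_graph = [
    {"label": "Person", "name": "Bob", "employer": "Acme"},
    {"label": "Person", "name": "Carol", "employer": None},
]

def construct_works_at(companies, people):
    """Build a new graph with a worksAt edge for every matching pair."""
    vertices, edges = [], []
    for c in companies:
        for n in people:
            if c["name"] == n["employer"]:       # WHERE c.name = n.employer
                for v in (c, n):
                    if v not in vertices:        # add each vertex only once
                        vertices.append(v)
                edges.append((n["name"], "worksAt", c["name"]))
    return {"vertices": vertices, "edges": edges}

result = construct_works_at(company_graph, social_graph)
print(result["edges"])  # [('Bob', 'worksAt', 'Acme')]
```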
The GQL Manifesto
https://gql.today/
Cypher + PGQL + G-CORE = GQL?
GQL
proposed new standardization project of ISO/IEC JTC1 SC32 WG3 (ISO 39075?)
could be ready in 3–4 years
not compatible with SQL
SQL/PGQ
will be new SQL:202x part 16
read-only graph queries on top of tables
SQL/PGQ: create graph
CREATE PROPERTY GRAPH my_graph
  VERTEX TABLES (person, message)
  EDGE TABLES (
    created SOURCE person DESTINATION message,
    commented SOURCE person DESTINATION message
  );
SQL/PGQ: query graph
SELECT gt.creation_date, gt.content
FROM my_graph GRAPH_TABLE (
  MATCH
    (creator IS person WHERE creator.email = '[email protected]')
    -[ IS created ]->
    (m IS message)
    <-[ IS commented ]-
    (commenter IS person)
  WHERE creator.email <> commenter.email
  COLUMNS (m.creation_date, m.content)
) AS gt;
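
Since SQL/PGQ graphs are defined on top of tables, the same question can be written as ordinary joins. The sketch below does this with Python's sqlite3 module. The column names, sample rows and email addresses are assumptions, since the slides show neither the table definitions nor the original address.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: two vertex tables and two edge tables.
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE message (id INTEGER PRIMARY KEY, creation_date TEXT, content TEXT);
    CREATE TABLE created (person_id INTEGER, message_id INTEGER);
    CREATE TABLE commented (person_id INTEGER, message_id INTEGER);
    INSERT INTO person VALUES (1, 'alice@example.com'), (2, 'bob@example.com');
    INSERT INTO message VALUES (10, '2019-01-01', 'hello'),
                               (11, '2019-01-02', 'unanswered');
    INSERT INTO created VALUES (1, 10), (1, 11);
    INSERT INTO commented VALUES (2, 10), (1, 10);
""")

# Each edge in the graph pattern becomes a join through an edge table.
rows = conn.execute("""
    SELECT m.creation_date, m.content
    FROM person AS creator
    JOIN created ON created.person_id = creator.id
    JOIN message AS m ON m.id = created.message_id
    JOIN commented ON commented.message_id = m.id
    JOIN person AS commenter ON commenter.id = commented.person_id
    WHERE creator.email = ?
      AND creator.email <> commenter.email
""", ("alice@example.com",)).fetchall()

print(rows)  # [('2019-01-01', 'hello')]
```

The graph pattern states the path once, while the join version has to spell out every edge table and key column by hand.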
conclusion
links and credits
RDF
https://www.w3.org/TR/rdf-primer/
SPARQL
https://en.wikipedia.org/wiki/SPARQL
https://www.w3.org/TR/sparql11-overview/
Cypher
https://en.wikipedia.org/wiki/Cypher_Query_Language
https://neo4j.com/docs/cypher-manual/current/
https://www.opencypher.org/
PGQL
http://pgql-lang.org/
G-CORE
http://ldbcouncil.org/sites/default/files/main-cr.pdf
GQL
https://www.gqlstandards.org/
SQL/PGQ
https://www.w3.org/Data/events/data-ws-2019/assets/lightning/OskarVanRest.pdf
https://www.w3.org/Data/events/data-ws-2019/