Introduction
Recently I read an article about yet another query language, HypergraphQL. But do we really need another query language? I do agree that it is always good to offer developers a variety of choices for querying a database, especially if the new query language makes it easier, or in some sense better, to fetch or input data. But that is not really a big deal, and I will briefly explain why here.
Whether it is a standards-based language such as SPARQL, SQL, TMQL or XQuery, or a more graph-oriented query language such as GraphQL (originally developed at Facebook), Gremlin or Cypher, the end user (the developer) has to write the query in some form that follows the formal system of the language. That also implies that you know your data model in some detail, i.e. the structural form of the data, instances vs. object types, and the kinds of connections/links. This knowledge is essential for expressing your query in the language.
Therefore all query languages depend on their data model, and naturally they also carry the deficiencies of that data model and several quirks of their vendor. In my opinion, that is where all bets are off. Better to show me a new data model, describe how it deviates from other established data models and in what terms it is better, and, most important, present the functional operations, the basic mechanisms behind the implementation of a query language that is based on this data model.
As for HypergraphQL and Facebook's GraphQL, they are not something new. They originate from the Freebase MQL language which, according to Barak Michener, a former employee of Metaweb Technologies and later of Google, was mainly developed between 2006 and 2008 to elevate GQL, a query language for a kind of triple store database (graphd). And if we search further back, the MQL approach looks like the Query by Example (QBE) language for relational databases that was devised during the mid-1970s. I still remember the Ashton-Tate dBase product that I played with as a BSc student.
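To give a feeling for what "query by example" means, here is a toy sketch in Python. It is not the actual syntax or semantics of MQL or QBE. The function name `query_by_example`, the convention that `None` marks a field to be filled in, and the sample records are all assumptions invented for this illustration.

```python
def query_by_example(template, records):
    """Return the records that match a partially filled-in template.

    Assumed convention (hypothetical, for illustration only):
    a concrete value in the template is a constraint, while None
    means "I do not know this value, please fill it in".
    """
    results = []
    for record in records:
        matches = all(
            key in record and (wanted is None or record[key] == wanted)
            for key, wanted in template.items()
        )
        if matches:
            # The answer has the same shape as the question.
            results.append({key: record[key] for key in template})
    return results


# Invented sample data, not taken from any real database.
records = [
    {"type": "person", "name": "Ada", "city": "London"},
    {"type": "person", "name": "Bob", "city": "Athens"},
    {"type": "person", "name": "Eve", "city": "London"},
    {"type": "company", "name": "Acme", "city": "London"},
]

# "Give me the names of all persons in London."
template = {"type": "person", "city": "London", "name": None}
answers = query_by_example(template, records)
```

Even in this tiny form, the template cannot be written without knowing which keys exist and what they mean, which is exactly the dependency on the data model described earlier.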
But what really puzzles me is that I hardly see a strong mathematical foundation, such as Codd’s relational algebra, that covers the important features of all the query languages I mentioned above. I am specifically referring to:
- closed vs open world assumption
- closure under operations
- constraints
- updates
- joins
That is why, in a previous article of mine, I proposed returning to the roots and taking a closer look at Relational Algebra and at the reasons that made both SQL and NoSQL DBMSs deviate from the original relational model.
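To make two of the features listed above concrete, namely closure under operations and joins, here is a minimal Python sketch of three relational algebra operators. It assumes a relation is modelled as a set of rows and a row as a set of (attribute, value) pairs. The helper names (`relation`, `select`, `project`, `natural_join`) and the sample data are my own choices for the illustration, not part of any library or product.

```python
def relation(*rows):
    """Build a relation (a set of rows) from plain dictionaries."""
    return frozenset(frozenset(row.items()) for row in rows)


def select(rel, predicate):
    """Selection: keep the rows for which the predicate holds."""
    return frozenset(row for row in rel if predicate(dict(row)))


def project(rel, attrs):
    """Projection: keep only the given attributes; duplicates collapse."""
    return frozenset(
        frozenset((k, v) for k, v in row if k in attrs) for row in rel
    )


def natural_join(left, right):
    """Natural join: combine rows that agree on all shared attributes."""
    joined = set()
    for l_row in left:
        l = dict(l_row)
        for r_row in right:
            r = dict(r_row)
            if all(l[k] == r[k] for k in l.keys() & r.keys()):
                joined.add(frozenset({**l, **r}.items()))
    return frozenset(joined)


# Sample data invented for the illustration.
languages = relation(
    {"language": "SQL", "model": "relational"},
    {"language": "Cypher", "model": "property graph"},
    {"language": "SPARQL", "model": "RDF"},
)
models = relation(
    {"model": "relational", "structure": "tables"},
    {"model": "property graph", "structure": "graph"},
    {"model": "RDF", "structure": "triples"},
)

# Closure: every operator takes relations and returns a relation,
# so the operators can be nested freely.
answer = project(
    select(natural_join(languages, models),
           lambda row: row["structure"] != "tables"),
    {"language"},
)
print(sorted(dict(row)["language"] for row in answer))  # ['Cypher', 'SPARQL']
```

Because the result of each operator is again a relation, any expression can be fed into any other operator. That is the property I find hard to locate, in a clearly stated form, in many of the query languages mentioned above.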
Those who follow my posts are aware that I am not speaking purely from a theoretical point of view. There is an alternative data model (R3DM/S3DM) that I propose, a particular software technology based on it (Associative Semiotic Hypergraph), and two prototypes (TRIADB, HyperMorph) that have been implemented and demonstrated with an intuitive, functional, declarative way to query things. Nevertheless, performance is a top priority, and that is what I am currently investigating.