Neurorganon Upper Level Ontology (NULON)

Introduction

In information science, an upper ontology (also known as a top-level ontology or foundation ontology) is an ontology that describes very general concepts that are the same across all knowledge domains. An important function of an upper ontology is to support broad semantic interoperability among a large number of ontologies that rank "under" it. As the rank metaphor suggests, it is usually a hierarchy of entities and associated rules (both theorems and regulations) that attempts to describe those general entities that do not belong to a specific problem domain.

Neurorganon Upper Level Ontology (NULON) is a top-level ontology, also known as a foundation ontology. It includes a list of very general terms grouped in ontologically distinct categories, such as 'address', 'agent', 'artifact', 'device', 'document', 'space', 'time' and many others. These categories determine the most fundamental and broadest classes of information resources. In NULON the members of each class, i.e. the vocabulary terms, preserve the same meaning across more specific knowledge domains and other reference works. In that respect it is similar to a thesaurus, because it groups terms into disjoint collections of mostly similar concepts, and it is like a dictionary, because it can also include multiple, multilingual definitions. It is also an ontology, because it defines relations and properties that describe how the terms can be linked together or bound to web-based, computerized information resources. The most recent version of NULON numbers more than 500 core vocabulary terms grouped in 47 maximally disjoint categories. All these terms have been consistently mapped to standard definitions from at least two major reference works in the realm of semantically linked data: OpenCyc and Wikipedia. Their information resources are used, respectively, as identifiers and references for NULON terms. Most importantly, at this stage of architectural design, what really distinguishes NULON from other upper-level ontologies is the way the terms are grouped and the categories that the ontology includes and describes. This is the result of a year-long effort by a single person to discover and efficiently organize most terms related to information technology, including abstraction, management, computer programming, data storage, metadata, data modeling, identity, reference, design patterns, web use, digital resources and devices, to name a few.

Information Resource and Information Reference

The first public release of NULON is based on the Topic Maps standard for information interchange. Any concept can be represented by a Topic. In Topic Map Data Model (TMAP-DM) terminology, a Topic is both a container and a handle for everything, including abstract concepts, web resources, physical objects, and others. NULON adopts the Topic in an attempt to unify the reference mechanism from both a human and a machine perspective; see also the presentation "From WWW, the uncharted World Wide Web, to GGG, the charted Giant Global Graph" at Ignite Athens 2012. Hopefully this fundamental unit of information will reference all kinds of information resources and might become the core mechanism of the next generation of the Internet, i.e. Web 3.0 or GGG (Giant Global Graph). Nevertheless, we are reluctant to use the term "Topic" because it carries connotations of the word "Subject" or the term "Item of Discussion". Instead, we advocate a meaningful and purposeful use of the term "Information Reference/Resource" (TMAP-IR). The reader should feel comfortable interchanging the (R) in Resource with the (R) in Reference, because we treat TMAP-IR both as a container for data and as a reference to other information resources. In the container view, TMAP-IR is modeled like a frame with data held in three types of key-value slots: name identifiers, URL addresses, and any other type of property, also known as a Topic Map Occurrence (TMAP-PROP). This powerful conception of a single data construct used to represent entities in the real world has already had great success in object-oriented programming (OOP). The technical approach that information architects follow in the original WWW and in the present Semantic Web is very different. There, the fundamental unit for linking information is the Uniform Resource Identifier (ID-URI), which has eventually been downgraded to the semantically overloaded Uniform Resource Locator (ADDR-URL). Since then, everything else has revolved around that concept (see RDF/OWL).
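The container view of TMAP-IR can be illustrated with a minimal Python sketch. The class name, the slot keys and the sample values are assumptions made for this illustration; they are not part of NULON or of the Topic Maps standard.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class InformationResource:
    """Hypothetical frame-like container for a TMAP-IR (illustration only)."""

    # Slot type 1: name identifiers, keyed by the kind of name.
    names: Dict[str, str] = field(default_factory=dict)
    # Slot type 2: URL addresses, keyed by the role the address plays.
    addresses: Dict[str, str] = field(default_factory=dict)
    # Slot type 3: any other property (a Topic Map occurrence, TMAP-PROP).
    properties: Dict[str, Any] = field(default_factory=dict)


# Assumed sample data: one resource standing for the concept "tree".
tree = InformationResource(
    names={"NAM-prefer": "tree"},
    addresses={"reference": "https://en.wikipedia.org/wiki/Tree"},
    properties={"definition": "a perennial woody plant"},
)
```

Keeping the three slot types in separate dictionaries mirrors the separation of names, addresses and other properties described above; a single flat dictionary would blur that distinction.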

Information Reference vs Information Realization

NULON unifies, but at the same time keeps distinct, the constituent characteristics of TMAP-IR, namely the identifiers, the references, and other descriptive, structural and administrative metadata properties. One of the innovative aspects of the design of NULON is the explicit distinction it makes between a Binary Information Resource (TMAP-BIR) and a Term Information Resource (TMAP-TIR). The latter shares the same definition as the 'subject indicator' in topic map terminology. TIRs are described by words, compound words, and phrases that are given specific meaning in specific contexts. TMAP-TIR in NULON is defined mainly by using the Wikipedia and OpenCyc controlled vocabularies. A TMAP-TIR is like an index term. To distinguish the notion of the term from that of TMAP-BIR, one has to think in terms of the container of information. For example, natural objects such as a tree or an animal have a non-digital container. These are natural candidates for TMAP-TIR. On the other hand, an electronic document in the form of a computer file is held in a digital container, so it is a clear case of a TMAP-BIR. A Binary Information Resource/Realization (TMAP-BIR) represents web resources, computer files and folders, web or database services: anything that appears in a digital format and can be retrieved with an application protocol such as http, file, smtp, or ftp. In fact, this is the definition of information resource given in the Topic Map Data Model (TMAP-TMDM). NULON uses TMAP-BIR instead, to avoid confusion with a similar term used in the Semantic Web and the OWL/RDF model.
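As a rough illustration of the "digital container" test, the sketch below treats anything addressed through a retrieval protocol as a BIR. It is a deliberately naive heuristic, not how NULON itself classifies resources: the function name is hypothetical, and https is added to the protocols named in the text as an assumption.

```python
from urllib.parse import urlparse

# Protocols named in the text, plus https as an assumed addition.
DIGITAL_SCHEMES = {"http", "https", "file", "smtp", "ftp"}


def is_binary_information_resource(address: str) -> bool:
    """Return True when the address uses a retrieval protocol (TMAP-BIR)."""
    return urlparse(address).scheme.lower() in DIGITAL_SCHEMES
```

Under this heuristic an address such as ftp://example.org/report.pdf counts as a BIR, while the bare term "tree" has no protocol and falls on the TIR side.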

Information vs Resource

Information resource is the central term of information science. It is the cornerstone of modeling for any information system that manages data. The term has caused a great deal of confusion among scientists; take, for example, the identity crisis problem, which for many is still an issue. Even the definition of the term in the W3C architecture depends heavily on other terms such as information, resource, representation, data encoding, essential characteristics, and message. This blending of unclear semantics and muddled statements leads to naive conclusions such as "dog is a resource". But what kind of resource is a dog? In NULON, by contrast, an information resource always represents "Information". Therefore "an information resource" is considered to be a representation of information.

Information fully describes some entity: a concept, an object, a subject, a being, or an idea. An information resource can therefore be a representation of any of these. We can only approach anything that can be conceived or perceived from a certain perspective, in a subjective way and in a very limited sense, because of the obstacles in our communication. We have access to only part of the information, depending on our intelligence and spiritual endeavor.

In W3C the term "resource" is associated with the same notion as the Greek translation of the term "information", i.e. "πληροφορία": that which "carries, conveys context thoroughly, exhaustively, in a complete way conceptually, verbally". In section 2.2, URI/Resource Relationships, we read: "although it is possible to describe a great many things about a car or a dog in a sequence of bits, the sum of those things will invariably be an approximation of the essential character of the resource".

The reader may already have had trouble absorbing the definitions of these terms and the relations among them. Perhaps the easiest way to obtain an accurate perspective is to think in terms of reference. From the W3C perspective, a resource is the referent of an information resource. From the NULON perspective, information refers to an entity, a concept, an object, a subject, a being, an idea. Information is an abstract thing: it is something conceived in the mind, or something that exists before conception. Information does not have a form of its own. This is contrary to the 15th-century Latin loan definition of the term "Information" given in Wikipedia. When we start discussing the form of information, we enter the world of information resources and the various ways these are realized by humans. Information resources represent information not wholly but partially, i.e. some aspect of it. Hopefully the diagram Unified Information Representation in Four Levels (Data - Information Resources - Information - Entities), a concept map designed especially to clarify the basic concepts behind NULON, will help you see things from a different perspective and understand why we need to keep separate the addressing mechanism, i.e. the URL, from identity; the content from the container and the content format; the description of a "resource" from the description of "information"; and encoding/decoding at a human level from the same process simulated on a machine.

Key Design Aspects and Use

Part of the NULON effort is an attempt to define the core ontology of Web 3.0 in a practical, well-defined, standardized and consistent way, so that it can be shared among those who generate or use ontologies. NULON differs in many respects from other upper ontologies. First, it is constructed and described entirely with its own ontology entries. Being based on the topic map standard, it is very easy to extend with definitions from other languages and ontologies, and it includes a minimal set of elements, categorized and abbreviated for easy memorization.

Secondly, TIR to BIR, BIR to BIR, BIR to TIR, and TIR to TIR relations are defined at a logical layer, and instances of these relations are asserted. NULON follows the topic map standard and allows the definition of n-ary, undirected relations in which each argument plays a specific role. The NULON ontology is a system of ontological categories (classes) defined in such a way as to ease the work of the information architect. One of these categories deals with the abstraction principles (hierarchy, taxonomy, meronomy, grouping, faceting, and instantiation) and how these are modeled and used in topic maps (see also Metadata Modeling). Another category includes terms such as Is Defined By, Subject of, Is About, and Refers To, which are used for mapping BIRs to TIRs, i.e. subject indexing. The combination of these two kinds of relations makes for a well-defined and intuitive method of searching and retrieving information resources. See also the relevant concept map, which portrays these concepts from the NULON point of view.
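A role-based, undirected relation and its use for subject indexing can be sketched as follows. This is a minimal illustration under stated assumptions: the class, the helper functions, the role names "document" and "subject" and the identifiers are all hypothetical, and each role is limited to one member for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """A typed, undirected relation stored as a set of (role, member) pairs."""

    kind: str
    members: frozenset  # of (role, resource_id) tuples; order is irrelevant


def associate(kind, **roles):
    """Create an association from role=resource_id keyword arguments."""
    return Association(kind, frozenset(roles.items()))


def find_about(associations, term_id):
    """Return the ids of BIRs indexed under a TIR through 'Is About'."""
    hits = set()
    for assoc in associations:
        if assoc.kind != "Is About":
            continue
        roles = dict(assoc.members)
        if roles.get("subject") == term_id and "document" in roles:
            hits.add(roles["document"])
    return sorted(hits)


# Assumed sample data: two BIRs indexed under two different TIRs.
index = [
    associate("Is About", document="bir:report.pdf", subject="tir:tree"),
    associate("Is About", document="bir:page.html", subject="tir:animal"),
]
```

Because the members are stored as a set of role-player pairs, the order in which the arguments are written does not matter, which is what makes the relation undirected. Calling find_about(index, "tir:tree") retrieves the documents indexed under that term.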

Last but not least, NULON keeps the concepts of naming, addressing, and identification of information resources distinct. Each information resource (TMAP-IR) included in NULON is assigned a GUID on creation. This identifier is used to create permanent URL addresses (ADDR-PURL) and uniform resource names (ID-URN) within the NULON namespace. Labelling of information resources, also known as naming, is supported by the preferred lexical string, i.e. the base name (NAM-prefer), alternative forms of the base name (NAM-alt), and other types of naming such as abbreviations, human names, and prefixes. NULON can easily be extended to include multilingual definitions and names in any natural language (LANG-natural).
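The separation of identification, addressing and naming can be sketched in a few lines of Python. The namespace strings below are placeholders invented for this example, since the actual NULON namespace is not given here, and the function name is likewise hypothetical.

```python
import uuid

# Hypothetical namespace values; NULON's real ones are not stated in the text.
PURL_BASE = "https://purl.example.org/nulon/"
URN_PREFIX = "urn:nulon:"


def mint_identifiers(preferred_name, alternative_names=()):
    """Assign a GUID and derive an address and a name-independent identifier."""
    guid = str(uuid.uuid4())  # random GUID assigned at creation time
    return {
        "GUID": guid,
        "ADDR-PURL": PURL_BASE + guid,   # addressing
        "ID-URN": URN_PREFIX + guid,     # identification
        "NAM-prefer": preferred_name,    # naming: base name
        "NAM-alt": list(alternative_names),
    }
```

Deriving both the address and the identifier from the GUID rather than from the name means a resource can be renamed or given names in other languages without breaking existing references.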

Purposeful Design

To solve the puzzle of data modeling efficiently, you have to match three central pieces. These are linked in the Neurorganon concept map as data model layers: the conceptual layer, the logical layer, and the implementation layer.

Purposeful design therefore occurs naturally when you solve the puzzle above efficiently. We will talk about the construction of ontologies, which is closely related to the logical layer. We will refer to the integration and subject indexing of information resources, which depends on the conceptual layer. And we will mention the storage and retrieval of data, which naturally refers to the implementation layer.

Construction of ontologies

Scaffolding for the construction of other domain ontologies. Use NULON as a template to construct any other, more specific domain ontology. The template includes the set of relations that have been defined in NULON, as well as the administrative, descriptive and structural metadata. Most important, though, are the key design principles that other domain experts will have to follow to make their ontology compatible with NULON. The logical layer of the database model is critical for this task: for example, how you define classes of entities, how you define properties and link them to classes, and how you define relations between members of classes. This kind of analysis and construction of data models has been given the term Metadata Modeling and includes:
- Generalization
- Association
- Multiplicity
- Aggregation/Meronomy
- Properties

Integration of information resources

Promote interoperability with other metadata sets and domains. Data mapping between a data model constructed with NULON and any other data model should require minimal effort. For any upper-level ontology to succeed, it should become the lingua franca of information technology. Currently that role is played by HTML, with a very limited scope and mainly for presentation purposes, while other ontologies that have been expressed with a markup language, i.e. XML-oriented ones, have not been adopted by a wide user base. From this point of view, NULON looks very promising for that purpose because of the way it binds together definitions, references, identifiers, prefixes, and names from other vocabularies and ontologies. But the ultimate test is its application in real use cases.

Subject indexing of information resources

In social networks built around tagging bookmarks, this practice is also known as a folksonomy: a method of collaboratively creating and managing tags. But a tag is simply a name for a TIR, and the web resource that is tagged is a BIR. The biggest problem with folksonomies is that they lack a common basis for relating similar content. The best of these systems assign tags from a controlled vocabulary such as Wikipedia. But even in that case, either the tags have no structure or the user relies on a hierarchical taxonomy based on folder-style containers for the bookmarks. Semantic tagging with NULON, i.e. mapping TIR to BIR, provides a means of asserting a common organizational framework for these information resources. In the most general case, the Content Format of a BIR may have any kind of Data Structure stored as a Data Model. Ontology-based information extraction (OBIE) is the task of using an ontology to automatically extract structured data from BIRs.

Storage and retrieval of data

The current version of NULON is based on the Topic Map standard, which can be modeled and visualized as an undirected graph. Linked data, social networks and relations (associations) in an ontology all have a graph structure. Graph databases have recently become extremely popular and a hot topic in the evolution of database management systems. Topic maps are very close to a graph model: the nodes are the information resources (IRs, i.e. Topics) of NULON, and the edges are the associations (TMAP-ASSOC) that relate these nodes.
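The graph view can be made concrete with a small sketch in which information resources are nodes and binary associations are undirected edges. The function names and sample data are assumptions for illustration; a real topic map store would also need to handle n-ary associations and roles, which this sketch leaves out.

```python
from collections import defaultdict, deque


def build_graph(associations):
    """Build an undirected adjacency map from (node, node) association pairs."""
    graph = defaultdict(set)
    for left, right in associations:
        graph[left].add(right)
        graph[right].add(left)  # undirected: store the edge in both directions
    return graph


def reachable(graph, start):
    """Return every node connected to `start`, found by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


# Assumed sample data: two unconnected clusters of resources.
graph = build_graph([
    ("tree", "plant"),
    ("plant", "organism"),
    ("file", "folder"),
])
```

Retrieval then becomes graph traversal: reachable(graph, "tree") collects every resource linked to "tree" through a chain of associations, regardless of the direction in which each association was asserted.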

Inferencing

We consider inference to be the final act of any system based on the NULON ontology. This process depends on all the previously defined tasks.
