A Survey of Top-level Ontologies
3.1 General ontological requirements
The general requirements provide a context for the framework and comprise:
- the need for an ontological framework (3.1.1);
- how the need for an ontological framework translates into making choices of ontological commitments (3.1.2 and 3.1.3);
- the requirements that arise from the lack of prior knowledge (3.1.4); and
- the need for consistent independent (federated) development (3.1.5).
3.1.1 Real world ontology framework
If one wants to share data from different systems, then one needs something like a common framework within which to share it. When the data is in this common framework, its meaning needs to be clear and unambiguous to the systems sharing it. This is often called semantic interoperability. For example, it needs to be clear and unambiguous whether data items (for example, rows in a table) from two systems are referring to the same object or different objects in the ‘real world’ – for example, the UNICLASS Code ‘Ac_05_50_91 – Timber sourcing’ is marked as mapping to the NBS Code ‘45-60-90/340 – Timber procurement’. To implement this systematically, one needs to be clear and unambiguous about what these objects in the real world are. This involves knowing the ontology – in other words, knowing the set of objects that the common framework assumes exist.
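The kind of identity the common framework records can be sketched in a few lines of code. This is an illustrative sketch only: the row contents and the `obj:` identifier are hypothetical, not actual UNICLASS or NBS data.

```python
# Each source system's rows, keyed by its own local code (illustrative data).
uniclass_rows = {"Ac_05_50_91": "Timber sourcing"}
nbs_rows = {"45-60-90/340": "Timber procurement"}

# The common framework assigns each data item an identifier for the
# real-world object it refers to; identity between systems is then explicit
# rather than inferred from similar-looking labels.
real_world_referent = {
    ("UNICLASS", "Ac_05_50_91"): "obj:timber-sourcing-activity",  # hypothetical id
    ("NBS", "45-60-90/340"): "obj:timber-sourcing-activity",
}

def co_refer(item_a, item_b):
    """True when two data items are mapped to the same real-world object."""
    return real_world_referent[item_a] == real_world_referent[item_b]

print(co_refer(("UNICLASS", "Ac_05_50_91"), ("NBS", "45-60-90/340")))  # True
```

The point of the sketch is that co-reference is settled by the framework's ontology (the set of objects assumed to exist), not by comparing the two systems' labels.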
3.1.2 Choices of ontological commitments
Unfortunately, when one starts to look closely, it is neither clear nor unambiguous exactly what the objects in the real world are. Ontologically, there are a variety of ways that one can take the real world to be. However, for our assessment, these can be crystallised into a small number of focussed choices – called ontological commitments – which build up into an integrated ontological architecture.
A key purpose of this paper is to provide a framework for understanding the range and nature of the ontological commitments and to apply this to the collected top-level ontologies, thus providing the groundwork for the choice of an appropriate ontological architecture.
3.1.3 Implicit and explicit choices
Understandably, most of the datasets currently available are not clear about their ontological architecture or about which of these ontological commitments have been made – their choices, such as they are, are implicit. In practice, datasets often make these choices without realising it – choosing one way in one area and another way in another. This point is often made in philosophy textbooks; see, for example, Lowe (1998).
The assessment framework gives a clear picture of the range of these choices. With this in hand, when selecting or developing a top-level ontology, one can be clear about which choices are being made (and which are left to chance), and so have some idea of how these will, in turn, be reflected in the data structures and data that are implemented.
3.1.4 Lack of prior knowledge
Usually the developers of a system of systems (which will include behavioural, societal and human elements) have prior knowledge of some of the systems that will use the common framework. However, other requirements will often arise as new systems – of which the developers have no prior knowledge – are added to the common framework. Hence, the framework needs to be sufficiently expressive to accommodate them. More specifically, care needs to be taken not to adopt ontological commitments which unnecessarily restrict the framework’s ability to express meanings that will probably occur in data from new source systems. For example, the commitment choices include whether to restrict the types to first order (where types cannot have types as instances). One cannot just assume that, because the current data set only has first order types, one can restrict oneself to these – one also needs some confidence that a requirement for higher order types will not emerge in the future. There is, then, a requirement to be sensitive to how a choice of ontological commitment might restrict useful expressivity.
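The first order versus higher order distinction can be illustrated by analogy with Python’s type system (an analogy only – ontological types are not programming-language classes, and the names below are invented for illustration):

```python
class Eagle:
    """A first order type: its instances are individuals, not types."""
    pass

sam = Eagle()  # an individual, an instance of the first order type Eagle

# A higher order type has types as its instances. In Python this role is
# played by a metaclass: Species is a type whose instances are themselves types.
class Species(type):
    pass

class Owl(metaclass=Species):
    """Owl is a type, and is itself an instance of the type Species."""
    pass

print(isinstance(sam, Eagle))    # True: an individual instantiates a first order type
print(isinstance(Owl, Species))  # True: a type instantiates a higher order type
```

A framework restricted to first order types can say that `sam` is an `Eagle`, but cannot express the claim that `Owl` is a `Species` – which is the kind of expressivity that may be needed when new source systems arrive.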
3.1.5 Consistent independent (federated) development
Many systems for connecting systems, like the NDT, have a hub-and-spoke structure in which spoke systems map their data into the central hub system. It is likely that these mappings will be done independently. In many cases, the content from different systems will overlap. Where this happens, the mappings produced should be equivalent. Adopting an ontological approach is a big step towards achieving this, because it provides an independent basis for establishing identity between the systems. Fine-tuning the choice of ontological commitments to ensure a clear notion of what is referred to is a good further step. There is, then, a requirement to be sensitive to how a choice of ontological commitment can be clearer about what is referred to and so give rise to equivalent independent mappings.
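The equivalence requirement can be made concrete with a small sketch. The spoke codes and hub identifiers below are hypothetical; the check simply asks whether two independently produced spoke-to-hub mappings agree wherever their content overlaps:

```python
# Two independently produced mappings from spoke codes to hub objects
# (illustrative codes and identifiers).
mapping_a = {
    "Ac_05_50_91": "hub:timber-sourcing",
    "Ac_05_50_92": "hub:timber-felling",
}
mapping_b = {
    "Ac_05_50_91": "hub:timber-sourcing",
}

def conflicts(m1, m2):
    """Codes mapped by both systems, but to different hub objects."""
    return {code for code in m1.keys() & m2.keys() if m1[code] != m2[code]}

print(conflicts(mapping_a, mapping_b))  # set() – the overlapping content agrees
```

An ontological approach makes such a check meaningful: both mappers are targeting the same independently specified set of hub objects, rather than each inventing their own.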
3.2 Overarching ontological architecture framework
As mentioned earlier, there have been several attempts to get to grips with the kinds of choices of ontological commitments that TLOs should make; these are listed in Appendix G. These attempts provide a good starting point and we refer to them where useful, usually in the technical appendices. However, all these lists are partial and, in some cases, not based upon sufficient familiarity with the relevant research. Furthermore, none of them provides an overarching organising structure – one that provides a framework for understanding and assessing choices across a range of commitments. We develop this framework in 3.2.1 below.
There needs to be a clear and solid underlying basis for the framework. There is an established set of criteria for assessing what makes a good ontology, based upon what makes a good scientific theory – listed in Appendix H. One of these criteria, simplicity, provides a good basis for a broad assessment of the architecture and is outlined below, with more technical detail in Appendix I.
Simplicity can be thought of as having two aspects: structural and ontological. Where structural (syntactic) simplicity is roughly concerned with the shape of the organising structure, ontological simplicity is roughly concerned with the number of objects.
For structural (syntactic) simplicity we look at the characteristic ways the ontological commitments shape the organising structure (see section 4). For ontological simplicity we do some more analysis to establish broad brush accounting principles (as set out in 3.2.1 below).
3.2.1 The laser
There is a revised approach that seems to capture some of the relevant complexity. This uses a distinction between fundamental and derived objects and updates Occam’s Razor with what Jonathan Schaffer (Schaffer, 2015) calls the laser – “do not multiply fundamental objects without necessity”. He illustrates the difference between the razor and the laser with this example. Imagine Esther posits a fundamental theory with 100 types of fundamental particle. Her theory is predictively excellent and is adopted by the scientific community. Then Feng comes along and – in a moment of genius – builds on Esther’s work to discover a deeper fundamental theory with 10 types of fundamental string, which in varying combinations make up Esther’s 100 types of particle. This looks like a paradigm case of scientific progress, in which a deeper, more unified and more elegant theory replaces a shallower, less unified and less elegant one. However, under razor accounting, both particles and strings are counted; Feng’s theory therefore has 10 more objects and so should be rejected in favour of Esther’s. Under laser accounting, though, 100 fundamental objects have been replaced by 10 – so Feng has made an improvement. Here again we have a distinction between making a commitment and its cost. We make a commitment to both fundamental and derived objects, but the cost of derived objects is significantly less than that of fundamental objects.
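The arithmetic of the two accounting schemes can be set out explicitly. This is a sketch of Schaffer’s example only; the cost functions are our illustrative rendering of the razor and the laser, not his formalism:

```python
# Each theory is summarised by how many fundamental and derived types it posits.
esther = {"fundamental": 100, "derived": 0}    # 100 fundamental particle types
feng = {"fundamental": 10, "derived": 100}     # 10 string types, deriving the 100

def razor_cost(theory):
    # The razor counts every posited object, fundamental or not.
    return theory["fundamental"] + theory["derived"]

def laser_cost(theory):
    # The laser counts only fundamental objects.
    return theory["fundamental"]

print(razor_cost(esther), razor_cost(feng))  # 100 110 – razor penalises Feng
print(laser_cost(esther), laser_cost(feng))  # 100 10  – laser rewards Feng
```

The numbers make the intuition plain: razor accounting charges Feng for the 100 derived particle types he keeps, while laser accounting treats them as nearly free.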
As Schaffer notes, what emerges from this approach is a general pressure towards a permissive and abundant view of what there is, coupled with a restrictive and sparse view of what is fundamental. As he notes, classical mereology (the study of the relations of parts to wholes) and pure set theory (where the only sets – well-determined collections of objects – under consideration are those whose members are also sets) come out as paradigms of methodological virtue, for making so much from so little. This suggests a preference for what has been called plenitude – not placing unnecessary constraints on what can exist: if it is possible for something to exist, then it does. Both classical mereology and (impure) set theory exhibit this. Simplifying a little, in classical mereology, given any two objects, their fusion exists – in set theory, their set exists. Where the candidate TLOs make their mereological position explicit, many choose classical mereology. However, where they make their position on types explicit, only a significant minority adopt a position of plenitude. Schaffer suggests a principle to capture this, the Ontological Bang for the Buck principle: optimally balance minimization of fundamental objects with maximization of derivative objects, especially useful ones.
Earlier, fruitfulness was mentioned as being associated with simplicity. As the examples above show, derivative objects are part of what makes a package of fundamental objects fruitful. In other words, they show that these fundamental objects can be used to produce something useful. However, as discussed, there is a need to be sensitive to both costs and benefits. If two very similar theories had roughly the same cost in terms of fundamental objects, but one was committed to many useless entities and the other was not – and they were similar in all other relevant respects – then the first seems guilty of overgeneration. The additional useless plenitude is more like profligacy or promiscuity – it is not fruitfulness. This gives us a ‘useful’ basis for assessing the TLOs.