The concept-oriented model (COM) of data is a general-purpose unified model. In this post, we describe one aspect of this model. More specifically, we describe how this model can unite two branches currently existing in computer science: value or domain modeling and relation modeling. It is achieved by introducing a new data modeling and programming construct, called concept, which is used for typing both domains and relations.
1. Relations and Domains
In the relational model, a domain is a set of values and a tuple is a combination of values from some domains. For example, a domain could consist of all integer values like 1, 2, 3 and so on.
A relation is defined over some domains via its schema and tuples. A relation schema lists the domains of the relation's attributes; these domains are also called attribute types. Relation tuples are composed of values taken from these domains. For example, a ColorTable could be defined as a set of triples, each composed of three integers taken from one domain. To define a new relation, we have to specify domains for its attributes.
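As a minimal sketch of the ColorTable example in Python, a relation can be modeled as a set of tuples whose attributes are all typed by the integer domain. The attribute names red, green and blue are assumptions made for illustration; the text only says that each triple consists of three integers.

```python
from typing import NamedTuple


class ColorRow(NamedTuple):
    """One tuple of the ColorTable relation (attribute names are hypothetical)."""
    red: int
    green: int
    blue: int


# A relation is a set of tuples over the domains named in its schema.
color_table = {
    ColorRow(255, 0, 0),
    ColorRow(0, 255, 0),
    ColorRow(255, 0, 0),  # same tuple again; a set keeps only one copy
}
```

Using a set rather than a list reflects the fact that a relation contains no duplicate tuples.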
This classical approach clearly separates relations from domains. Here, relations and domains are modeled differently by using different modeling constructs and patterns. Data modeling is broken into two isolated areas: relation modeling and domain (or value) modeling. Relations are normally modeled using the relational model while domains are modeled using object-oriented methods. For instance, domains can be extended. In figures, relations will be shown in light blue color and domains will be shown in light green color.
2. Complex Values
Domains can be used not only to define relations. They can also be used to define complex domains which are sets of complex values. A complex value is a combination of several simpler values taken from other domains. Thus, complex values may have arbitrary structure which is defined in terms of existing domains. For example, we could define a new domain for colors where one value is composed of three integers. It is very similar to how we have defined a new color relation except that now colors are represented as values within a domain rather than tuples within a relation.
These complex values can now be used in relations as if they were primitive values. For instance, the ColorTable could have an attribute with the type of the complex domain.
Thus complex domains (also known as user-defined types) allow us to model domains with arbitrary structure. These complex domains can then be used to define relations.
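The same idea can be sketched in Python, assuming that a frozen dataclass is an acceptable stand-in for a complex domain. The Color type and the name attribute of the relation are hypothetical and serve only to show a relation attribute typed by a user-defined domain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Color:
    """A complex value: three integers treated as one value."""
    red: int
    green: int
    blue: int


@dataclass(frozen=True)
class ColorRow:
    """A tuple of ColorTable with one attribute typed by the Color domain."""
    name: str     # hypothetical attribute, added for illustration
    color: Color  # attribute whose type is the complex domain


color_table = {
    ColorRow("red", Color(255, 0, 0)),
    ColorRow("green", Color(0, 255, 0)),
}
```

Because the dataclasses are frozen, their instances are compared and hashed by content, which is how values (as opposed to objects) are expected to behave.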
3. Problem
So existing domains can be used to define either new relations or new domains. In other words, relations and domains are defined in terms of already existing domains.
The problem here is that it is not possible to use existing relations when defining new relations or domains. Attributes in both domains and relations are typed using only domains and there is no possibility to have relation-typed attributes. Thus relations and domains are not only isolated but they are also asymmetric in their use because only domains are used when extending a schema.
Another problem is that relations cannot be extended like domains using the traditional object-oriented approach. For example, we can extend the domain People when defining a new domain Employees by adding more specific attributes, but we cannot naturally extend the relation People by introducing a new relation Employees.
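This asymmetry can be illustrated with a rough Python sketch. Domains are modeled as classes, so Employees extends People by inheritance, while relation schemas are modeled here as plain mappings from attribute names to domains, which have no comparable extension mechanism. All attribute names, and the dictionary representation of a schema, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Domain People (attribute names are hypothetical)."""
    name: str
    age: int


@dataclass(frozen=True)
class Employee(Person):
    """Domain Employees extends People; only the new attribute is declared."""
    salary: int


# Relation schemas sketched as attribute-name -> domain mappings.
people_schema = {"name": str, "age": int}

# Relations have no "extends": the inherited attributes are restated by hand.
employees_schema = {"name": str, "age": int, "salary": int}
```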
There exist some solutions to this problem:
- One consists in introducing objects, which are modeled by classes, instead of using tuples and relations. In contrast to relations, classes can be used as attribute types. In this approach however, we will not be able to model custom references with arbitrary structure. In addition, this essentially means switching to the object-oriented approach which has always been controversial in data modeling. For example, it is not very suitable for set-oriented operations.
- Another solution consists in using foreign keys. Yet a foreign key is not a type – it is a constraint. Therefore, it can be used as a pattern or workaround, but not as a principled solution.
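To see why a foreign key is a constraint rather than a type, consider the following sketch. The referencing attribute is typed by the plain integer domain, so any integer is accepted, and the link to the referenced relation exists only as a separate check. The relations, their contents and the function name are hypothetical.

```python
# Referenced relation: person id -> name.
people = {1: "Alice", 2: "Bob"}

# Referencing relation: (order_id, person_id). The person_id attribute is
# typed as a plain integer; nothing in its type says it refers to people.
orders = [(100, 1), (101, 2), (102, 7)]


def dangling_rows(rows, referenced):
    """Return the rows whose second attribute violates the foreign key."""
    return [row for row in rows if row[1] not in referenced]


dangling = dangling_rows(orders, people)
```

The row (102, 7) is well-typed because 7 is a valid integer, and it is rejected only when the constraint is checked.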
Our goal is to make relations and domains fully symmetric. So the main question is whether it is possible to combine relation modeling and domain modeling by making them symmetric with respect to each other, as well as integral parts of one construct. This is a non-trivial problem which touches the foundations of not only data modeling but also other branches of computer science.
4. Concepts
The solution provided within the concept-oriented model consists in introducing a new construct, called concept:
A concept is defined as a couple of classes: one identity class and one entity class.
Identity and entity classes are also referred to as reference and object classes, respectively, in concept-oriented programming (COP). The main difference between them is that identities are always values and are passed and stored by-value while entities are passed by-reference.
Concept instances are identity-entity couples which are informally analogous to complex numbers in mathematics. Indeed, complex numbers also have two constituents but are manipulated as one whole. A domain in this case is defined as a set of identity-entity couples rather than either values or tuples. As a result, there is no need to distinguish between value domains and relations. Concepts are used instead of both relation types and domain types by unifying relation modeling and value modeling.
Concept-typed attributes contain references in the format of the identity class. Simultaneously, they reference an object in the format of the entity class. In this way, we can freely vary between by-value and by-reference constituents of data. If a concept has an empty entity class, then its instances are values. If a concept has an empty identity class, then its instances are represented by primitive references like objects.
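The identity-entity couple can be approximated in Python as follows. This is only a sketch under stated assumptions and not the syntax of concept-oriented programming: the Account example, all class and method names, and the use of a dictionary to resolve identities to entities are hypothetical. The identity class is a frozen dataclass (a value, compared by content), and the entity class is an ordinary mutable object (accessed by reference).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountIdentity:
    """Identity class: a value that serves as a custom reference."""
    bank_code: str
    account_no: int


@dataclass(eq=False)
class AccountEntity:
    """Entity class: a mutable object accessed by reference."""
    balance: float


class AccountConcept:
    """A domain modeled as a set of identity-entity couples."""

    def __init__(self):
        self._instances = {}  # identity -> entity

    def create(self, identity, entity):
        self._instances[identity] = entity
        return identity  # concept-typed attributes store the identity

    def resolve(self, identity):
        return self._instances[identity]


accounts = AccountConcept()
ref = accounts.create(AccountIdentity("ABC", 42), AccountEntity(100.0))

# An equal identity value resolves to the same entity object.
accounts.resolve(AccountIdentity("ABC", 42)).balance += 50.0
```

In this sketch, a concept with an empty entity class would reduce to a plain value type such as Color, while a concept with an empty identity class would reduce to an ordinary class whose instances are reached through the language's built-in references.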
5. Conclusion
In summary, concepts in the concept-oriented model allow us to unify domain and relation modeling by using only one construct for both purposes. Concepts provide a type-based mechanism for modeling domain-specific references or foreign keys. It is also important that concepts generalize conventional classes and are also used in concept-oriented programming.
More information on the concept-oriented model and concept-oriented programming can be found here.
Links
- Concept-oriented model: unifying domain and relation modeling. Youtube video
- Concept-oriented model: unifying domain and relation modeling. Slideshare slides
- A. Savinov, Concept-Oriented Model: Classes, Hierarchies and References Revisited, Journal of Emerging Trends in Computing and Information Sciences 3(4), 456-470, 2012. PDF