Introduction
In the previous article, I introduced an alternative database solution for .NET, the post-relational database Matisse. I used SQL programming demos to demonstrate its object-oriented features such as user-defined types, inheritance, and polymorphism.
Before we start writing .NET programs for Matisse, we need to know how to define database schemas. Although this is slightly different from relational databases, this is not difficult. If you are familiar with UML, you are almost ready to start. The key differentiator in schema definition has to do with the way in which you define relationships. Relationships work in ways similar to the relational Primary Key - Foreign Key concept, but they come with some specific advantages, as I will demonstrate in this article.
This article is part of a series. The next ones will cover .NET programming step by step, which is the ultimate goal of this series of articles.
UML, SQL DDL, or ODL?
Matisse supports three types of schema definition languages, UML, SQL DDL, and ODL (Object Definition Language). So, which one should you use? The answer is whichever one you like, because you can switch from one to the other whenever you need. For instance, you can define the database schema with SQL DDL at first, and then export the schema into an ODL file for a later update.
Here are a few tips:
SQL DDL is very handy when you need to do some tests, e.g., adding a class with a couple of attributes, or defining/dropping an index on an attribute. DDL is also useful when you need to migrate a relational schema into Matisse.
If you are familiar with C++ or IDL (Interface Definition Language), ODL is easy to understand. With ODL, you define your schema in a declarative way.
Rational Rose UML provides you with a more intuitive understanding of the model, and helps categorize a large schema into packages. However, Rational Rose does not come with a standard installation of Matisse.
Classes and Attributes
There are only three fundamental elements in schema definition for Matisse. These are class, attribute, and relationship. First, let us look at class and attribute.
Class is like a relational table, and attribute is like a relational column. The major differences are:
- You can use inheritance with classes
- You can define methods for classes
- Attribute can be of list type, e.g., list of integers
Here is the definition of the class Manager
in DDL, which inherits from Employee
and has an attribute Title
:
CREATE CLASS Manager UNDER Employee (
Title VARCHAR(64)
);
Note that "CREATE CLASS
" and "CREATE TABLE
" are identical. By default, an attribute defined by DDL is NULLABLE
. If the Title
attribute cannot be NULL
, the definition should use NOT NULL
:
CREATE CLASS Manager UNDER Employee (
Title VARCHAR(64) NOT NULL
);
Do you need to define a primary key in each class? Basically, the answer is no, because each class has an implicit OID attribute. But, if relational columns used as primary keys are meaningful in the application domain, you will obviously include these attributes in the class.
All the available built-in data types are listed in the Matisse Data Type Reference.
Relationships
The third fundamental element in schema definition is the relationship. A relationship defines an association between two classes. It works like the relational Primary Key - Foreign Key concept, but relationship is more intuitive, easier to maintain, and works faster for queries.
I will use the same schema as the one that I used in my first article. There are two relationships in the UML diagram, ManagedBy
/Manages
and Members
/WorksIn
.
The first relationship ManagedBy
/Manages
associates the Project
class with the Manager
class, and puts two constraints which are:
- A
Manager
manages zero or more projects, and - A
Project
always needs to be managed by exactly one Manager
.
If you are using Rational Rose, you simply export the diagram to a Matisse database. Then, the database has the relationship with the constraints. The relevant part of the equivalent SQL DDL statements look like this:
CREATE CLASS Manager UNDER Employee (
Manages REFERENCES (Project)
INVERSE Project.ManagedBy
);
CREATE CLASS Project (
ManagedBy REFERENCES (Manager)
CARDINALITY (1, 1)
INVERSE Manager.Manages
);
INVERSE
is a new syntax to indicate that the two references (Manages
and ManagedBy
) are related to each other. Another new syntax is CARDINALITY
, which specifies the minimum and maximum number of objects that can participate in the relationship. CARDINALITY
can be omitted, in which case the cardinality of the relationship is 0 to many (the case for Manages
in class Manager
). Although the most common cardinalities are (0, 1), (1, 1), (0, n) and (1, n), you may use any number like (0, 5).
Some of the things I really like about the Matisse schema definition are that you do not need to write triggers to maintain referential integrity, worry about orphan rows after deletion or update of parent rows, or use some tricks for insertion of rows. You never have orphan rows (or dangling reference) problems with Matisse. Matisse always enforces the referential integrity.
The other relationship, Members
/WorksIn
, associates the class Employee
and the class Project
with many-to-many cardinality. With relational databases, you need to introduce an intermediate table in this case, but this is not needed here with Matisse. You simply express the many-to-many relationship:
CREATE CLASS Project (
Members REFERENCES (Employee)
CARDINALITY (1, -1)
INVERSE Employee.WorksIn
);
CREATE CLASS Employee (
WorksIn REFERENCES (Project)
INVERSE Project.Members
);
Although both relationships above are bi-directional, mono-directional relationships are also available. Suppose that a project consists of many sub tasks (class Task
), a task object does not need to know to which project it belongs. In other words, you do not have to navigate from a task object to a project object. Then, the class definition in DDL would look like this:
CREATE CLASS Project (
Tasks REFERENCES (Task)
);
Another Example
As an example of database schema, which is more naturally expressed with the post-relational database, I will show a tree structure based document management model. It has two classes Document
and Folder
, where a Folder
contains Document
objects and Folder
objects.
CREATE CLASS Document (
Name VARCHAR(255) NOT NULL,
CreationDate DATE NOT NULL,
Content TEXT
);
CREATE CLASS Folder UNDER Document (
Documents REFERENCES LIST(Document)
);
The above schema definition allows a Folder
or a Document
to be contained in more than a Folder
. If you need a Document
or a Folder
to be contained only in a Folder
, the DDL would be like this:
CREATE CLASS Document (
Name VARCHAR(255) NOT NULL,
CreationDate DATE NOT NULL,
Content TEXT,
ContainedIn REFERENCES (Folder)
CARDINALITY (0, 1)
INVERSE Folder.Documents
);
CREATE CLASS Folder UNDER Document (
Documents REFERENCES LIST(Document)
INVERSE Document.ContainedIn
);
You do not need an intermediate table.
Indexes and Entry-point
After completing your logical model, you may need to define some indexes for better performance. When you define an index on an attribute (or up to four attributes), there are two things you need to know:
- The attribute(s) for the index criteria must be "
NOT NULL
", i.e., not NULLABLE
- If the attribute is of the string type, it needs to be defined with maximum size using
VARCHAR(n)
In order to define an index on BirthDate
of Employee
and another one on Name
of Employee
, the class needs to be defined like this:
CREATE TABLE Employee (
Name VARCHAR(255) NOT NULL,
BirthDate DATE NOT NULL,
);
The next DDL statements define an index on the attribute BirthDate
of the Employee
class, and another index on the attribute Name
of Employee
.
CREATE INDEX birthdate_idx ON Employee (BirthDate);
CREATE INDEX emp_name_idx ON Employee (Name);
Matisse also has a full-text indexing feature, which is called 'Entry-Point dictionary'. For example, the next DDL statement defines an entry-point dictionary with full-text indexing on the Description
attribute of Project
:
CREATE ENTRY_POINT DICTIONARY proj_desc_ep_dict
ON Project (Description)
MAKE_ENTRY "make-full-text-entry";
The next SELECT
query returns projects whose description contains the word '.NET':
SELECT * FROM Project
WHERE ENTRY_POINT(proj_desc_ep_dict) = '.NET';
Schema Templates
In the Enterprise Manager's Query Analyser window, you can get the basic schema templates in SQL DDL by right-clicking in the window. There are templates for defining classes with inheritance or relationships, SQL methods, and indexes. Note that there is also Help
in the same menu, which covers other DDL statements.
Summary and Next Article
I outlined here the basics of schema definition with Matisse.
Although relationships may be a new notion as the way to define database schema for some developers, I believe it is not difficult but actually intuitive especially for those already familiar with UML modeling. If you have questions, do not hesitate to post a comment to this article. I will be happy to respond.
My next article will describe the first steps for .NET programming: how to access the database from .NET.
Appendix 1: The Complete SQL DDL Used in this Article
CREATE TABLE Task (
TaskName STRING,
StartDate DATE,
EndDate DATE
);
CREATE TABLE Project (
ProjectName VARCHAR(255),
Budget NUMERIC(19,2),
Description STRING,
Members REFERENCES (Employee)
CARDINALITY (1, -1)
INVERSE Employee.WorksIn,
ManagedBy REFERENCES (Manager)
CARDINALITY (1, 1)
INVERSE Manager.Manages,
Tasks REFERENCES (Task)
);
CREATE TABLE Employee (
Name VARCHAR(255) NOT NULL,
BirthDate DATE NOT NULL,
WorksIn REFERENCES (Project)
INVERSE Project.Members
);
CREATE TABLE Manager UNDER Employee (
Title VARCHAR(255),
Manages REFERENCES (Project)
INVERSE Project.ManagedBy
);
CREATE INDEX birthdate_idx ON Employee (BirthDate);
CREATE INDEX emp_name_idx ON Employee (Name);
CREATE ENTRY_POINT DICTIONARY proj_desc_ep_dict
ON Project ( Description )
MAKE_ENTRY "make-full-text-entry";
Note that VARCHAR
and STRING
in DDL are identical except that VARCHAR
can specify maximum size of string.
Appendix 2: Equivalent Schema in ODL
interface Employee : persistent {
attribute String<255> Name;
attribute Date BirthDate;
relationship Set<Project> WorksIn
inverse Project::Members;
mt_index birthdate_idx
unique_key FALSE
criteria {Employee::BirthDate MT_ASCEND};
mt_index emp_name_idx
unique_key FALSE
criteria {Employee::Name MT_ASCEND};
};
interface Manager : Employee : persistent {
attribute String<64> Nullable Title;
relationship Set<Project> Manages
inverse Project::ManagedBy;
};
interface Project : persistent {
attribute String<64> Nullable ProjectName;
attribute Numeric(19,2) Nullable Budget;
attribute String Nullable Description;
mt_entry_point_dictionary proj_desc_ep_dict entry_point_of Description
unique_key FALSE
make_entry_function "make-full-text-entry";
relationship Set<Employee> Members[1, -1]
inverse Employee::WorksIn;
relationship Manager ManagedBy
inverse Manager::Manages;
relationship Set<Task> Tasks;
};
interface Task : persistent {
attribute String Nullable TaskName;
attribute Date Nullable StartDate;
attribute Date Nullable EndDate;
};
<< Back | Next >>
History
- 24th May, 2004: Initial version
License
This article has no explicit license attached to it, but may contain usage terms in the article text or the download files themselves. If in doubt, please contact the author via the discussion board below. A list of licenses authors might use can be found here.