ConteXt Technology

Introduction

This text is not for people who are sure that in the year 2050 most databases will be using relational model and SQL. I am writing this article for those who have different opinions about the subject such as myself.

What is wrong with the relational model and why don't I like SQL?

Well the model does not seem to be bad and the language is good. However FORTRAN also worked very well. Why then develop other languages like C and C++? FORTRAN gurus probably believed that it was nonsense and called those working on new languages idiots. FORTRAN is well suited for math applications however it does not fare well for developing applications with advanced GUI. It worked well in character based screen environments but the interface has changed since the IBM370 times.

I think that somewhat the same situation is happening with relational databases. The relational model works well with the documents that came to computer world from being paper based. In the pre-computer era those documents were kept on cards or paper based tables. Such tables were often based on fixed number of columns and variable number of rows. The same paradigm was transferred to computers. Obviously E.F. Codd made a great contribution since he developed the mathematical foundation on which processing of such information is based. Operations on tables produce another table. That is all wonderful, but the shape of the table was still the same. I do not want to say that hierarcical or network models of data existing at the same time were better than the relational one. The relational model is an excellent match for financial application, however this model puts strong restrictions on the way the data can be organized. Representation of all information in relational form is not always possible and can be very artificial at best.

Concepts of ConteXt

Let's take a table from a relational database, but tweak the relational paradigm a little bit and see what will happen:
One of the cornerstones of the relational model is atomicity of the field. But why can the field not be a structure? Let's say that this is possible. In that case the fields of that structure could be structures themselves. Looks like the field has acquired hierarchy and the table got another dimension. Such structuring of the information allows retrieval of all information about the object in one call.

C/C++ has a very powerful concept called pointers. A pointer does not itself hold value but points to the location where the value is held. In case of databases the pointer should point to the location of the information. This pointer is not the common key field that joins two relational tables. The key as it is does not exist. The pointer is resembles a link from one web page to another one.
The pointer holds the value of a unique identifier that can be translated to the address of the record. Therefore each record must have a unique identifier. We can retrieve information from another table using the pointer. The pointer links a field in one record and a record in another (or the same) table. When retrieving the information by pointer there is no need for a search, so the access rate is faster.

Let's move further. Why must the number of fields in different records be the same? Suppose some information is the same for every record but we also give each record some individuality. Actually the same goal can be achieved in the relational model but it will require creation of one extra table per new field. In our case we will extend our definition of record - the record will have constant set of fields with any regular relational table plus variable set of fields that we will call properties.

There is one more suggestion. Why should any particular field have the same type in each row of the table? Why can't the field hold its own definition? Whether the field is atomic or a structure, it can hold its own definitions.

One of the very important changes in the approach to programming during last 10-15 years was wide acceptance of object-oriented languages. I think that simple translation of that paradigm to the world of databases did not prove itself but there are some qualities that we can still borrow. The main building block of any object-oriented language is a class, which is roughly a set of data and methods that work with that data. Add methods (functions) to the data we have been discussing and the result resembles an OO class.

Now our table has a more complicated structure. It still has rows but the rows are of variable length and type. One can see that the relation model is a subset of the one proposed here. It also incorporates features of earlier hierarchical and network database models as well as includes the so-called Entity-Attribute-Value model.

We have one little problem here however. It is not very logical to name such a complex structure as table. I think it would be better to transform the term 'Table' to 'Class' and the term 'Record' to 'Object'.

So what do we finally have here?

  • Class has an arbitrary number of fields and methods.
  • The object (instance of the class) has a constant set of fields defined at the class level and variable set of fields that are pertinent to that particular instance.
  • The field can hold either a value or be a pointer to the object.
  • The field can be atomic or be a structure itself.
  • The structure of the field can be self-describing (holds its own definition)

    Based on these principles, I have been developing a DBMS that I call ConteXt. During this process I have discovered that the major portion of application programmer time is spent on writing user interfaces. Usually the DBMS has only the "database engine" that supplies data access. An application programmer writes the interface between application and the "database engine" using SQL as well as the user interface. Some databases such as Access and FoxPro has a tool where a user interface were included. I follow in the path of such database products. ConteXt has a database browser (I call it DB-browser) that allows interaction between the end user and the data held in ConteXt. This browser incorporates various user interface and administration functions. The use of such tools changes the way to write database applications. Now the programmer does not write the whole application but just a few compoments to "help" the browser. All standard functions responsible for data display, manipulation, search, editing etc. are already incorporated into the browser. The programmer can create methods that are triggered when the user performs some actions. If there are no methods defined the browser invokes default actions.
    Along with the regular fields ConteXt also has "virtual fields". They are not present in instances of the class but are being calculated at runtime. Other than that, there is virtually no difference between regular and virtual fields.

    To display information the browser uses so called forms. The forms are pertinent to a class and define how the instances of that class are being displayed. Class can have an unlimited number of associated forms. The database browser can also work over the web in the context of any standard web browser that supports Java Applets.


    Alexander Lashenko Toronto, Canada

    DBMS ConteXt front page: DBMS ConteXt

    Questions or comments? Send a letter to info@unixspace.com