Introduction
This is the second part of a series concerning Cachalot DB. The first part can be found here.
The business objects are stored internally in a type-agnostic format. Index fields are stored as Int64 or string, and all the object data is stored as UTF-8 encoded JSON. The process of transforming a .NET object into the internal format is called “packing”. Packing is done client-side; the server only uses the indexes and manipulates the object as raw data. It has no dependency on the concrete .NET datatype.
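To make the idea of packing concrete, here is a minimal sketch. It is not Cachalot DB's actual implementation: the PackedObject and Packer types, the property names on Home, and the choice of mapping a date to its tick count are all assumptions made for illustration. It only shows the principle: index values are reduced to Int64 or string, and the whole object travels as UTF-8 encoded JSON.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

// Hypothetical business type, used only for this illustration.
public class Home
{
    public int Id { get; set; }
    public string Town { get; set; } = "";
    public DateTime AvailableFrom { get; set; }
}

// Hypothetical packed form: type-agnostic index values plus the object as UTF-8 JSON.
public sealed class PackedObject
{
    public long PrimaryKey { get; set; }
    public Dictionary<string, long> IntIndexes { get; } = new Dictionary<string, long>();
    public Dictionary<string, string> StringIndexes { get; } = new Dictionary<string, string>();
    public byte[] Json { get; set; } = Array.Empty<byte>();
}

public static class Packer
{
    // Client-side step: a server receiving this needs no knowledge of the Home type.
    public static PackedObject Pack(Home home)
    {
        var packed = new PackedObject
        {
            PrimaryKey = home.Id,
            Json = JsonSerializer.SerializeToUtf8Bytes(home)
        };
        packed.StringIndexes["Town"] = home.Town;
        // Assumption: a date is mapped to Int64 through its tick count.
        packed.IntIndexes["AvailableFrom"] = home.AvailableFrom.Ticks;
        return packed;
    }

    // Client-side step in the other direction: rebuild the concrete .NET object.
    public static Home Unpack(PackedObject packed) =>
        JsonSerializer.Deserialize<Home>(packed.Json)!;
}
```

Calling Packer.Pack on a Home instance yields an object whose index dictionaries can be compared and sorted without ever deserializing the JSON payload, which is the property the server relies on.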
By default, the object data is not compressed, but for objects larger than a few kilobytes, compression can be very useful. For an object that takes 10 KB in JSON, the compression ratio is around 1:10.
To enable compression, add a single attribute on the business data type.
[Storage(compressed:true)]
public class Home
{
    °°°
}
Using compressed objects is transparent for the client code. However, it increases the packing time, and packing is done on the client. When objects are retrieved, they are unpacked (which may imply decompression).
In conclusion, compression may be very useful starting with medium-size objects if you are ready to pay a small price, client-side only, for data insertion and retrieval.
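The trade-off described above can be explored with a small experiment. The sketch below uses GZip from the .NET base class library as a stand-in; the article does not say which algorithm Cachalot DB uses, and the JsonCompression helper is a hypothetical name. The ratio you observe depends heavily on how repetitive your JSON is.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class JsonCompression
{
    // Compress a JSON document into a GZip byte array (stand-in algorithm).
    public static byte[] Compress(string json)
    {
        var raw = Encoding.UTF8.GetBytes(json);
        using var output = new MemoryStream();
        using (var gzip = new GZipStream(output, CompressionLevel.Optimal, leaveOpen: true))
        {
            gzip.Write(raw, 0, raw.Length);
        } // disposing the GZipStream writes the final compressed block
        return output.ToArray();
    }

    // Restore the original JSON text from the compressed bytes.
    public static string Decompress(byte[] compressed)
    {
        using var input = new MemoryStream(compressed);
        using var gzip = new GZipStream(input, CompressionMode.Decompress);
        using var reader = new StreamReader(gzip, Encoding.UTF8);
        return reader.ReadToEnd();
    }
}
```

Comparing Encoding.UTF8.GetByteCount(json) with the length of the array returned by Compress, and timing both calls, gives a rough idea of the space saved and of the extra client-side cost for your own objects.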
Polymorphic collections are natively managed. Type information is stored internally in the JSON and it is used to deserialize the proper concrete type.
A small example from a trading system:
In order to store a collection of events, we must expose all required indexes on the base type. Null values are perfectly acceptable for index fields, which makes it possible to expose indexed properties that make sense only for a specific child type.
public abstract class ProductEvent
{
    [PrimaryKey(KeyDataType.IntKey)]
    public int Id { get; set; }

    [Index(KeyDataType.StringKey)]
    public abstract string EventType { get; }

    [Index(KeyDataType.IntKey, ordered: true)]
    public DateTime EventDate { get; set; }

    [Index(KeyDataType.IntKey, ordered: true)]
    public DateTime ValueDate { get; set; }

    °°°
}

public abstract class NegotiatedProductEvent : ProductEvent
{
    °°°
}

public class IncreaseDecrease : NegotiatedProductEvent
{
    °°°
    public override string EventType => "IncreaseDecrease";
}
This is an example of code which retrieves a collection of concrete events from a DataSource typed with an abstract base class.
var events = connector.DataSource<ProductEvent>();
var increaseEvents = events.Where(
    evt => evt.EventType == "IncreaseDecrease" &&
           evt.EventDate == DateTime.Today
).Cast<IncreaseDecrease>();
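The mechanism behind polymorphic collections, type information stored inside the JSON and used to pick the concrete type, can be sketched as follows. This is only an illustration of the principle, not the format Cachalot DB uses internally: the "$type" and "data" property names, the EventJson helper, the Fixing class, and the Delta and Value properties are all assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Simplified versions of the event types; Delta, Value and Fixing are hypothetical.
public abstract class ProductEvent
{
    public int Id { get; set; }
    public abstract string EventType { get; }
}

public class IncreaseDecrease : ProductEvent
{
    public decimal Delta { get; set; }
    public override string EventType => "IncreaseDecrease";
}

public class Fixing : ProductEvent
{
    public decimal Value { get; set; }
    public override string EventType => "Fixing";
}

public static class EventJson
{
    // Store the concrete type name next to the object data.
    public static string Write(ProductEvent evt)
    {
        var envelope = new Dictionary<string, object>
        {
            ["$type"] = evt.GetType().Name,
            ["data"] = evt // serialized using its runtime type
        };
        return JsonSerializer.Serialize(envelope);
    }

    // Read the type name first, then deserialize the proper concrete type.
    public static ProductEvent Read(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var typeName = doc.RootElement.GetProperty("$type").GetString();
        var data = doc.RootElement.GetProperty("data").GetRawText();
        return typeName switch
        {
            "IncreaseDecrease" => JsonSerializer.Deserialize<IncreaseDecrease>(data)!,
            "Fixing" => JsonSerializer.Deserialize<Fixing>(data)!,
            _ => throw new NotSupportedException($"Unknown event type: {typeName}")
        };
    }
}
```

Because Read returns the abstract ProductEvent, the caller can cast the result to the concrete type, in the same way the query above uses Cast<IncreaseDecrease>().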
A normal “put” operation adds an object or updates an existing one using the primary key as object identity.
More advanced use cases are implemented:
- Add an object only if it is not already there, and report whether it was actually added
- Update an existing object only if the current version in the database satisfies a condition
The first one is available through the TryAdd operation on the DataSource class. If the object was already there, it is not modified, and the operation returns false. The test on the object's existence and the insertion are executed as a single atomic operation. The object cannot be updated or deleted by another client in between.
That can be useful for data initialization, creating singleton objects, distributed locks, etc.
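The semantics of an atomic "add only if absent" can be illustrated with an in-memory stand-in. The InMemorySource class below is hypothetical and is not the Cachalot DB DataSource; it relies on ConcurrentDictionary.TryAdd, which gives the same guarantee inside a single process: the existence test and the insertion cannot be interleaved with another writer.

```csharp
using System.Collections.Concurrent;

// Hypothetical in-memory stand-in for a keyed collection with an atomic TryAdd.
public sealed class InMemorySource<T>
{
    private readonly ConcurrentDictionary<string, T> _items =
        new ConcurrentDictionary<string, T>();

    // Returns true if the item was added, false if the key was already present.
    // An existing item is never modified.
    public bool TryAdd(string key, T item) => _items.TryAdd(key, item);

    // Returns true if an item was removed.
    public bool Remove(string key) => _items.TryRemove(key, out _);

    // Returns the stored item, or the default value if the key is absent.
    public T Get(string key) => _items.TryGetValue(key, out var value) ? value : default;
}
```

Used as a simple lock, the first client that calls TryAdd("report-job", "client-A") gets true and owns the resource; any other client calling TryAdd with the same key gets false until the owner calls Remove.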
The second use case is especially useful for, but not limited to, the implementation of “optimistic synchronization”. The UpdateIf method on the DataSource class implements it.
If we need to be sure that nobody else modified an object while we were editing it (manually or algorithmically), there are two possibilities:
- Lock the object during the edit operation. This is not the best option for a modern distributed system. A distributed lock is not suitable for massively parallel processing and if it is not released automatically (due to client or network failure), manual intervention by an administrator is required.
- Use “optimistic synchronization”: do not lock, but require that, when saving the modified object, the one in the database has not changed since it was loaded. Otherwise, the operation fails, and we must retry (load + edit + save). This can be achieved in different ways:
- Having a version on an object. When we save version n, we require that the object in the database is still at version n-1. In Cachalot DB, the syntax is items.UpdateIf(item, i=> i.Version == n-1)
- Having a timestamp on an object. When we save a modified object, we require that the timestamp of the version in the database is identical to the one of the object before our update.
var oldTimestamp = item.Timestamp;
item.Timestamp = DateTime.Now;
items.UpdateIf(item, i=> i.Timestamp == oldTimestamp);
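The load + edit + save retry cycle can be sketched with an in-memory stand-in. The InMemoryItems class below is hypothetical and only mimics the idea of a conditional update. In particular, signaling a failed condition by returning false is an assumption of this sketch; check the Cachalot DB documentation for how UpdateIf reports failure.

```csharp
using System;
using System.Collections.Generic;

public class Item
{
    public int Id { get; set; }
    public int Version { get; set; }
    public int Quantity { get; set; }
}

// Hypothetical in-memory stand-in; not the Cachalot DB DataSource.
public sealed class InMemoryItems
{
    private readonly object _sync = new object();
    private readonly Dictionary<int, Item> _byId = new Dictionary<int, Item>();

    public void Put(Item item) { lock (_sync) { _byId[item.Id] = Clone(item); } }

    public Item Get(int id) { lock (_sync) { return Clone(_byId[id]); } }

    // The condition check and the write happen under one lock, so they are atomic.
    public bool UpdateIf(Item item, Func<Item, bool> condition)
    {
        lock (_sync)
        {
            if (!_byId.TryGetValue(item.Id, out var current) || !condition(current))
                return false;
            _byId[item.Id] = Clone(item);
            return true;
        }
    }

    private static Item Clone(Item i) =>
        new Item { Id = i.Id, Version = i.Version, Quantity = i.Quantity };
}

public static class OptimisticUpdate
{
    // Retry load + edit + save until the conditional update succeeds.
    public static bool AddQuantity(InMemoryItems items, int id, int delta, int maxAttempts = 5)
    {
        for (var attempt = 0; attempt < maxAttempts; attempt++)
        {
            var item = items.Get(id);              // load
            var loadedVersion = item.Version;
            item.Quantity += delta;                // edit
            item.Version = loadedVersion + 1;
            if (items.UpdateIf(item, i => i.Version == loadedVersion))
                return true;                       // save succeeded
        }
        return false;                              // too much contention, give up
    }
}
```

The loop never holds a lock while the object is being edited; a concurrent modification is detected only at save time, and the edit is then redone on a fresh copy.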
This can be even more useful when committing multiple object modifications in a transaction. If a condition is not satisfied on one object, roll back the whole transaction.
More on transactions in Part 3.
The fully open source code is available at:
Precompiled binaries and full documentation are available at:
The client code is available as a NuGet package at nuget.org.
To install: Install-Package Cachalot.Client