Scaleable Objects Persistence (SOP)

Gerardo Recinto

25 Jan 2024 | MIT | 5 min read

Scaleable objects persistence, ACID transaction adapter

This is a brief introduction to SOP, a library that enables objects persistence and ACID transactions on top of backend databases such as Cassandra.

Download full source code from GitHub

Introduction

Objects persistence and ACID transactions are two of the most coveted ingredients in software development. Even with RDBMS and SQL available, architects and developers tend to prefer designing solutions around the object-oriented paradigm, and ACID transactions remain highly sought after.

We have ORMs, i.e., object-relational mappers, such as JPA in Java and Entity Framework in .NET. But these solutions provide "mapping", not persistence of the actual object, and they impose both the mapping and the SQL middleware as overhead. That overhead becomes significant and apparent when you are authoring high-performance, highly concurrent systems, where fine-grained control of concurrency and parallelism across machines benefits from being "aligned" with data persistence, i.e., highly efficient distributed processing.

Objects persistence (via SOP) solves this by removing the mapping and SQL middleware layers and by providing best-fit features such as ACID transactions and built-in caching.

Background

SOP originated in .NET. The first attempt was somewhat successful but never reached a production release: the implementation got bogged down as features piled on top of features, making the component complicated and hard to advance. It was, in effect, a prototype of this new concept.

So it was redesigned: features were balanced against schedule, the design leaned more firmly on established patterns (e.g., the Repository pattern), and the code was ported to Golang. The current version, V2, is that Golang port, and it achieves the goals as advertised.

Using the Code

SOP has a B-Tree interface (CRUD methods) and a transaction API for managing transaction sessions.

Sample code is as follows:

import (
    "context"

    "github.com/SharedCode/sop/in_red_ck"
    "github.com/SharedCode/sop/in_red_ck/cassandra"
    "github.com/SharedCode/sop/in_red_ck/redis"
)

// Cassandra cluster config.
var cassConfig = cassandra.Config{
    ClusterHosts: []string{"localhost:9042"},
    Keyspace:     "btree",
}
// Redis config.
var redisConfig = redis.Options{
    Address:                  "localhost:6379",
    Password:                 "", // no password set
    DB:                       0,  // use default DB
    DefaultDurationInSeconds: 24 * 60 * 60,
}

// Initialize Cassandra & Redis.
func init() {
    in_red_ck.Initialize(cassConfig, redisConfig)
}

var ctx = context.Background()
...

func main() {
    // Create a transaction session & begin it.
    trans, _ := in_red_ck.NewTransaction(true, -1)
    trans.Begin()

    // Create/instantiate a new B-Tree named "fooStore" w/ 500 slots & other parameters
    // including the "transaction" that it will participate in.
    //
    // Key is of type "int" & Value is of type "string".
    b3, _ := in_red_ck.NewBtree[int, string](
        ctx, "fooStore", 500, false, false, true, "", trans)

    // Add an item with key 1 and value "hello world".
    b3.Add(ctx, 1, "hello world")

    ...

    // Once you are done with the management, 
    // call transaction commit to finalize changes, save to backend.
    trans.Commit(ctx)
}

Here is another sample, which illustrates managing objects as key & value pairs:

// Sample Key struct.
type PersonKey struct {
    Firstname string
    Lastname  string
}

// Sample Value struct.
type Person struct {
    Gender string
    Email  string
    Phone  string
    SSN    string
}

// Helper function to create Key & Value pair.
func newPerson(fname string, lname string, gender string, 
               email string, phone string, ssn string) (PersonKey, Person) {
    return PersonKey{fname, lname}, Person{gender, email, phone, ssn}
}

// The Comparer function that defines sort order.
// It uses the standard library "cmp" package (Go 1.21+), which must be imported.
func (x PersonKey) Compare(other interface{}) int {
    y := other.(PersonKey)

    // Sort by Lastname followed by Firstname.
    i := cmp.Compare[string](x.Lastname, y.Lastname)
    if i != 0 {
        return i
    }
    return cmp.Compare[string](x.Firstname, y.Firstname)
}

func main() {

    // Create and start a transaction session.
    trans, err := in_red_ck.NewTransaction(true, -1)
    if err != nil {
        panic(err)
    }
    trans.Begin()

    // Create the B-Tree (store) instance.
    b3, err := in_red_ck.NewBtree[PersonKey, Person](
        ctx, "persondb", 500, false, false, false, "", trans)
    if err != nil {
        panic(err)
    }

    // Add a person record w/ details.
    pk, p := newPerson("joe", "krueger", "male", "email", "phone", "mySSN123")
    b3.Add(ctx, pk, p)

    ...
    // To illustrate the Find & Get Value methods.
    if ok, _ := b3.FindOne(ctx, pk, false); ok {
        v, _ := b3.GetCurrentValue(ctx)
        // Do whatever with the fetched value, "v".
        ...
    }

    // And lastly, to commit the changes done within the transaction.
    trans.Commit(ctx)
}
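
To see the sort order that a comparer like PersonKey.Compare produces, here is a minimal standalone sketch that sorts a few keys with the standard library. It does not use SOP at all; sortKeys is a hypothetical helper introduced only for illustration, and the sketch assumes Go 1.21 or later for the cmp and slices packages.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// PersonKey mirrors the sample key struct from the article.
type PersonKey struct {
	Firstname string
	Lastname  string
}

// Compare orders keys by Lastname, then by Firstname.
func (x PersonKey) Compare(other interface{}) int {
	y := other.(PersonKey)
	if i := cmp.Compare(x.Lastname, y.Lastname); i != 0 {
		return i
	}
	return cmp.Compare(x.Firstname, y.Firstname)
}

// sortKeys is a hypothetical helper (not part of SOP). It returns a sorted
// copy of keys, ordered by the Compare method above.
func sortKeys(keys []PersonKey) []PersonKey {
	out := slices.Clone(keys)
	slices.SortFunc(out, func(a, b PersonKey) int { return a.Compare(b) })
	return out
}

func main() {
	keys := []PersonKey{
		{"joe", "krueger"},
		{"amy", "krueger"},
		{"zed", "adams"},
	}
	for _, k := range sortKeys(keys) {
		fmt.Println(k.Lastname, k.Firstname)
	}
}
```

Keys with the same Lastname are ordered by Firstname, so "amy krueger" sorts before "joe krueger", and both sort after "zed adams".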

The method set is complete enough for the basic CRUD operations (e.g., Add as shown above, Update, Remove, GetCurrentValue, GetCurrentKey, GetCurrentItem), plus a few more that support range queries and range updates, such as:

FindOne - to search for an item given key
Next - to navigate to the next key relative to the current "cursor" position
Previous - to navigate to the previous key relative to the current "cursor" position
First - to move to the first record as per key ordering sequence
Last - to move to the last record as per key ordering sequence

You can manage multiple B-Tree stores within a transaction, and run multiple transactions one after another or concurrently.
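
The cursor-style navigation listed above (FindOne, First, Next) can be pictured with a small in-memory stand-in. This is not SOP's implementation, nor its exact method signatures: sortedStore, its methods and rangeValues are hypothetical names built over a sorted slice, meant only to show how a range query walks keys in order from a starting position.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedStore is a hypothetical in-memory stand-in for a store with a cursor.
type sortedStore struct {
	keys   []int          // always kept in ascending order
	values map[int]string // key -> value
	pos    int            // cursor position; -1 means "not positioned"
}

func newSortedStore() *sortedStore {
	return &sortedStore{values: map[int]string{}, pos: -1}
}

// Add inserts a key/value pair in sorted position and resets the cursor.
// It returns false if the key already exists.
func (s *sortedStore) Add(k int, v string) bool {
	if _, exists := s.values[k]; exists {
		return false
	}
	i := sort.SearchInts(s.keys, k)
	s.keys = append(s.keys, 0)
	copy(s.keys[i+1:], s.keys[i:])
	s.keys[i] = k
	s.values[k] = v
	s.pos = -1
	return true
}

// FindOne positions the cursor on k and reports whether k exists.
func (s *sortedStore) FindOne(k int) bool {
	i := sort.SearchInts(s.keys, k)
	if i < len(s.keys) && s.keys[i] == k {
		s.pos = i
		return true
	}
	return false
}

// First moves the cursor to the smallest key, if any.
func (s *sortedStore) First() bool {
	if len(s.keys) == 0 {
		return false
	}
	s.pos = 0
	return true
}

// Next advances the cursor; it returns false at the end of the store.
func (s *sortedStore) Next() bool {
	if s.pos < 0 || s.pos+1 >= len(s.keys) {
		return false
	}
	s.pos++
	return true
}

// Current returns the key and value under the cursor.
// The caller must have positioned the cursor first.
func (s *sortedStore) Current() (int, string) {
	k := s.keys[s.pos]
	return k, s.values[k]
}

// rangeValues collects the values for keys in [from, to].
// It requires that the key "from" exists in the store.
func rangeValues(s *sortedStore, from, to int) []string {
	var out []string
	for ok := s.FindOne(from); ok; ok = s.Next() {
		k, v := s.Current()
		if k > to {
			break
		}
		out = append(out, v)
	}
	return out
}

func main() {
	s := newSortedStore()
	for _, k := range []int{30, 10, 20, 40} {
		s.Add(k, fmt.Sprintf("item-%d", k))
	}
	fmt.Println(rangeValues(s, 20, 30)) // prints [item-20 item-30]
}
```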

Wiki Excerpt: A High-Level Description of SOP

SOP V2 is a Golang code library that provides Scaleable Objects Persistence. It is a general-purpose database engine sporting ACID transactions and two-phase commit for seamless application integration with third-party databases.
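
For readers unfamiliar with two-phase commit, the following is a generic sketch of the protocol. It is not SOP's implementation or API: Participant, memParticipant and commitAll are hypothetical names, and a real coordinator would also need durable logging and recovery when phase 2 fails partway, which is omitted here.

```go
package main

import (
	"errors"
	"fmt"
)

// Participant is a hypothetical interface for a resource that takes part
// in a two-phase commit (e.g., an object store and a third-party database).
type Participant interface {
	Prepare() error // phase 1: validate and stage changes
	Commit() error  // phase 2: make staged changes permanent
	Rollback()      // discard staged changes
}

// memParticipant is a toy participant that only records its state.
type memParticipant struct {
	name        string
	failPrepare bool
	state       string
}

func (p *memParticipant) Prepare() error {
	if p.failPrepare {
		return errors.New(p.name + ": prepare failed")
	}
	p.state = "prepared"
	return nil
}

func (p *memParticipant) Commit() error {
	p.state = "committed"
	return nil
}

func (p *memParticipant) Rollback() {
	p.state = "rolled back"
}

// commitAll runs phase 1 on every participant and moves on to phase 2
// only if all of them prepared successfully.
func commitAll(ps ...Participant) error {
	// Phase 1: every participant must vote yes.
	for i, p := range ps {
		if err := p.Prepare(); err != nil {
			// Undo the failed participant and all that prepared before it.
			for _, q := range ps[:i+1] {
				q.Rollback()
			}
			return fmt.Errorf("phase 1 aborted: %w", err)
		}
	}
	// Phase 2: all voted yes, so commit everywhere.
	for _, p := range ps {
		if err := p.Commit(); err != nil {
			return fmt.Errorf("phase 2 failed: %w", err)
		}
	}
	return nil
}

func main() {
	store := &memParticipant{name: "store"}
	otherDB := &memParticipant{name: "otherDB", failPrepare: true}

	err := commitAll(store, otherDB)
	fmt.Println(err != nil, store.state, otherDB.state)
}
```

Because otherDB refuses to prepare, nothing is committed and store is rolled back, which is the all-or-nothing behavior the protocol exists to provide.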

Using SOP, all you need to do is author your application data definitions (structs), and SOP will take care of storage and retrieval. It offers a transaction-based API that gives you the ACID attributes of a transaction.

SOP combines the best features of Cassandra (so we don't have to reinvent the wheel) with Redis for global caching, and adds ACID transactions and objects-based data management on top. You no longer need to author tables (and SQL scripts/ORM) in Cassandra, as SOP takes care of objects persistence and very fast searches (using an M-Way Trie B-Tree) against Cassandra. You also no longer need to write Redis-based data caching, because SOP has that built in.

Since SOP turns your application (micro-service, web server, etc.) into the database server itself, the heavy SQL middleware is essentially removed from the setup, giving your application the raw power of the backend storage engine.

And by using a cool language like Golang, your data mining logic suddenly works in full unison with your app. No more impedance mismatch.

All the components or ingredients required to support ACID transactions are implemented in a very efficient manner using Redis. For example, there is no row locking; instead, equivalent locking is achieved with a Redis-based algorithm for batch-oriented logical locks.
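
To make the idea of batch-oriented logical locks more concrete, here is a minimal sketch. It is not SOP's algorithm and it does not talk to Redis: a mutex-guarded map stands in for the shared Redis keyspace, and lockTable, TryLockBatch and UnlockBatch are hypothetical names. It shows one common all-or-nothing approach, in which a transaction either claims every key in its batch or claims none.

```go
package main

import (
	"fmt"
	"sync"
)

// lockTable is a hypothetical stand-in for a shared lock keyspace.
type lockTable struct {
	mu     sync.Mutex
	owners map[string]string // item key -> ID of the transaction holding it
}

func newLockTable() *lockTable {
	return &lockTable{owners: map[string]string{}}
}

// TryLockBatch claims all keys for txID, or none of them if any key is
// already held by a different transaction.
func (t *lockTable) TryLockBatch(txID string, keys []string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	// First pass: detect conflicts without changing anything.
	for _, k := range keys {
		if owner, held := t.owners[k]; held && owner != txID {
			return false
		}
	}
	// Second pass: no conflicts, so claim the whole batch.
	for _, k := range keys {
		t.owners[k] = txID
	}
	return true
}

// UnlockBatch releases only the keys that txID actually owns.
func (t *lockTable) UnlockBatch(txID string, keys []string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	for _, k := range keys {
		if t.owners[k] == txID {
			delete(t.owners, k)
		}
	}
}

func main() {
	locks := newLockTable()
	fmt.Println(locks.TryLockBatch("tx1", []string{"node-a", "node-b"})) // true
	fmt.Println(locks.TryLockBatch("tx2", []string{"node-b", "node-c"})) // false: node-b is held by tx1
	locks.UnlockBatch("tx1", []string{"node-a", "node-b"})
	fmt.Println(locks.TryLockBatch("tx2", []string{"node-b", "node-c"})) // true
}
```

In this sketch the mutex makes the check-and-claim step atomic. Against a real Redis server, that atomicity would have to come from Redis itself, which is assumed away here.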

SOP provides a general-purpose, domain-"flexible" solution to data management that is designed to scale, because the M-Way Trie is a commodity (or core feature) of the objects "container" (SOP calls it a "store", which is analogous to a table or a set of tables in an RDBMS). In one case it can be the database system for your payroll and inventory; in another, it can be your AI database storing "vectors" and the like.

SOP, as a framework, is very portable and highly adaptive. In V2, it combines Cassandra and Redis and enhances the solution with the features listed above. It also comes with an "in-memory" version that can be used like a sorted map.

Future versions can integrate with other storage subsystems, or even have their own storage features, if needed and requested.

The full source code is available in the project's GitHub repository: https://github.com/SharedCode/sop

If you are interested or curious about it, and perhaps itching to join the fun, I extend a warm invitation to you. The project is a startup and will benefit from talent across the entire stack: managers, leads, architects, developers, QA, documenters and so on.

The code is solid and has passed batteries of automated tests, but the concept is new and this is the first publication or discussion about it. As of today, I am the only contributor.

In any case, I hope you had fun learning about the concept and that it sparked your interest in this avenue. I believe this project and its current implementation can really help accelerate and simplify application development, and enable simple solutions for low-latency, horizontally and thus highly scaleable systems.

Points of Interest

As it turns out ("envisioned" and proven in implementation!), M-Way Trie (B-Tree) data structures and algorithms are a great "companion" feature for enabling ACID transactions. ACID transactions are an easy feature to implement in the "controller", and storage algorithms such as the B-Tree are "controllers" by nature, given their position in the storage flow.

Caching, when built into the solution (as in this SOP setup), removes a lot of the coding otherwise required to create a scaleable system.

And lastly, using an objects-based database engine simplifies application development and makes it more fun than ever before.

History

24th January, 2024: Current version - SOP V2 Golang

License

This article, along with any associated source code and files, is licensed under The MIT License