This is a brief introduction to SOP, a library that enables Objects persistence and ACID transactions on top of backend databases such as Cassandra.
Introduction
Objects persistence and ACID transactions are two of the most coveted ingredients in software development. Even with RDBMS & SQL available, architects and developers tend to prefer designing solutions based on the Object Oriented paradigm, and ACID transactions remain highly sought after.
We have ORMs, i.e., Object-Relational Mappers, such as JPA in Java and Entity Framework in .NET. But these solutions provide "mapping", not persistence of the actual Object, and they impose the overhead of that mapping plus the SQL middleware. The overhead becomes significant & apparent when you are authoring high performance & highly concurrent systems, where fine grained control of concurrency/parallelism across machines benefits from being "aligned" with data persistence, i.e., highly efficient distributed processing.
Objects persistence (via SOP) solves this by removing the mapping and the SQL middleware layers and by providing best fit features such as ACID transactions & built-in caching.
Background
SOP originated in .NET. The first attempt was somewhat successful but never reached a production release: features piled on top of features made the component complicated and hard to advance. It was a prototype of this new concept.
So, it was re-designed: features vs. schedule were rebalanced, the design leaned more on established patterns (e.g., the Repository pattern), and the code was ported to Golang. The current version, V2, is that Golang port, and it achieves the goals as advertised.
Using the Code
SOP has a B-Tree interface (CRUD methods) & a transaction API for managing transaction sessions.
Sample code is as follows:
import (
	"context"

	"github.com/SharedCode/sop/in_red_ck"
	"github.com/SharedCode/sop/in_red_ck/cassandra"
	"github.com/SharedCode/sop/in_red_ck/redis"
)

// Cassandra cluster and keyspace that SOP will use.
var cassConfig = cassandra.Config{
	ClusterHosts: []string{"localhost:9042"},
	Keyspace:     "btree",
}

// Redis connection settings.
var redisConfig = redis.Options{
	Address:                  "localhost:6379",
	Password:                 "",
	DB:                       0,
	DefaultDurationInSeconds: 24 * 60 * 60, // one day, in seconds
}

// Initialize SOP once with both backend configurations.
func init() {
	in_red_ck.Initialize(cassConfig, redisConfig)
}

var ctx = context.Background()
...
func main() {
	// Error returns are ignored here for brevity; check them in real code.
	trans, _ := in_red_ck.NewTransaction(true, -1)
	trans.Begin()
	b3, _ := in_red_ck.NewBtree[int, string](ctx, "fooStore", 500, false, false, true, "", trans)
	b3.Add(ctx, 1, "hello world")
	...
	trans.Commit(ctx)
}
And here is another sample, illustrating how to manage objects as key & value pairs:
// Besides the imports shown earlier, this sample needs the standard "cmp" package (Go 1.21+).

// PersonKey is the key part of the record.
type PersonKey struct {
	Firstname string
	Lastname  string
}

// Person is the value part of the record.
type Person struct {
	Gender string
	Email  string
	Phone  string
	SSN    string
}

func newPerson(fname string, lname string, gender string,
	email string, phone string, ssn string) (PersonKey, Person) {
	return PersonKey{fname, lname}, Person{gender, email, phone, ssn}
}

// Compare orders keys by Lastname first, then by Firstname.
func (x PersonKey) Compare(other interface{}) int {
	y := other.(PersonKey)
	i := cmp.Compare[string](x.Lastname, y.Lastname)
	if i != 0 {
		return i
	}
	return cmp.Compare[string](x.Firstname, y.Firstname)
}

func main() {
	trans, err := in_red_ck.NewTransaction(true, -1)
	if err != nil {
		panic(err)
	}
	trans.Begin()
	b3, err := in_red_ck.NewBtree[PersonKey, Person](ctx, "persondb", 500, false, false, false, "", trans)
	if err != nil {
		panic(err)
	}
	pk, p := newPerson("joe", "krueger", "male", "email", "phone", "mySSN123")
	b3.Add(ctx, pk, p)
	...
	if ok, _ := b3.FindOne(ctx, pk, false); ok {
		v, _ := b3.GetCurrentValue(ctx)
		...
	}
	trans.Commit(ctx)
}
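To see what ordering a Compare method like the one above produces, here is a minimal standalone sketch that sorts a few keys by Lastname and then Firstname. It uses only the Go standard library (Go 1.21+ for cmp and slices). The helpers comparePersonKeys and sortedKeys and the sample names are made up for illustration; they are not part of SOP.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// PersonKey mirrors the key type from the sample above.
type PersonKey struct {
	Firstname string
	Lastname  string
}

// comparePersonKeys is a hypothetical helper (not an SOP function) that
// orders keys by Lastname first and Firstname second.
func comparePersonKeys(x, y PersonKey) int {
	if i := cmp.Compare(x.Lastname, y.Lastname); i != 0 {
		return i
	}
	return cmp.Compare(x.Firstname, y.Firstname)
}

// sortedKeys returns a sorted copy of keys, leaving the input untouched.
func sortedKeys(keys []PersonKey) []PersonKey {
	out := slices.Clone(keys)
	slices.SortFunc(out, comparePersonKeys)
	return out
}

func main() {
	keys := []PersonKey{
		{"joe", "krueger"},
		{"ann", "krueger"},
		{"zed", "adams"},
	}
	// Prints one key per line in Lastname, Firstname order.
	for _, k := range sortedKeys(keys) {
		fmt.Println(k.Lastname, k.Firstname)
	}
}
```

With these three keys, "adams zed" comes first, followed by "krueger ann" and "krueger joe", which is the order a range scan over such keys would be expected to follow.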
Methods are complete so you can do the basic CRUD operations (e.g., Add as shown above, Update, Remove, GetCurrentValue, GetCurrentKey, GetCurrentItem), plus a few more to support range queries and range updates such as:
- FindOne: to search for an item given its key
- Next: to navigate to the next key relative to the current "cursor" position
- Previous: to navigate to the previous key relative to the current "cursor" position
- First: to move to the first record as per key ordering sequence
- Last: to move to the last record as per key ordering sequence
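The cursor-style navigation listed above can be pictured with a small toy model. The sketch below is not SOP's B-Tree and does not use SOP's method signatures (no context, no error returns); cursorStore is a hypothetical in-memory type backed by a sorted slice, meant only to show how FindOne plus Next gives you a range query.

```go
package main

import (
	"fmt"
	"slices"
)

// cursorStore is a toy, in-memory stand-in for a B-Tree store. It is NOT
// SOP's implementation; it only mimics the cursor-style navigation.
type cursorStore struct {
	keys   []int          // always kept sorted in ascending order
	values map[int]string // value for each key
	pos    int            // index of the current item; -1 means none
}

func newCursorStore() *cursorStore {
	return &cursorStore{values: map[int]string{}, pos: -1}
}

// Add inserts the pair at its sorted position and moves the cursor to it.
// It returns false, and changes nothing, if the key already exists.
func (s *cursorStore) Add(key int, value string) bool {
	i, found := slices.BinarySearch(s.keys, key)
	if found {
		return false
	}
	s.keys = slices.Insert(s.keys, i, key)
	s.values[key] = value
	s.pos = i
	return true
}

// FindOne moves the cursor to key if it exists; otherwise the cursor stays put.
func (s *cursorStore) FindOne(key int) bool {
	i, found := slices.BinarySearch(s.keys, key)
	if found {
		s.pos = i
	}
	return found
}

// moveTo places the cursor at index i when i is within range.
func (s *cursorStore) moveTo(i int) bool {
	if i < 0 || i >= len(s.keys) {
		return false
	}
	s.pos = i
	return true
}

func (s *cursorStore) First() bool    { return s.moveTo(0) }
func (s *cursorStore) Last() bool     { return s.moveTo(len(s.keys) - 1) }
func (s *cursorStore) Next() bool     { return s.pos >= 0 && s.moveTo(s.pos+1) }
func (s *cursorStore) Previous() bool { return s.pos >= 0 && s.moveTo(s.pos-1) }

// Current returns the item under the cursor; ok is false if there is none.
func (s *cursorStore) Current() (key int, value string, ok bool) {
	if s.pos < 0 {
		return 0, "", false
	}
	k := s.keys[s.pos]
	return k, s.values[k], true
}

func main() {
	s := newCursorStore()
	for _, k := range []int{30, 10, 20, 40} {
		s.Add(k, fmt.Sprintf("item-%d", k))
	}
	// Range query: position on key 20, then walk forward to the end.
	if s.FindOne(20) {
		for ok := true; ok; ok = s.Next() {
			k, v, _ := s.Current()
			fmt.Println(k, v)
		}
	}
}
```

Running it prints the items for keys 20, 30 and 40 in that order, even though they were added out of order. A range update would follow the same loop, updating the current item at each step instead of printing it.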
You can manage multiple B-Tree stores within a transaction, and run multiple transactions one after another or concurrently.
Wiki Excerpt: A High-Level Description of SOP
SOP V2 is a Golang code library that provides Scaleable Objects Persistence. It is a general purpose database engine sporting ACID transactions & two phase commit for seamless application integration to 3rd party databases.
Using SOP, all you need to do is author your application data definitions (structs), and SOP will take care of storage and retrieval. It offers a transaction based API that gives you the ACID attributes of a transaction.
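For readers who have not met two phase commit before, here is a minimal, generic sketch of the idea: every participant must first agree to commit (phase 1) before any of them actually commits (phase 2). The participant interface, twoPhaseCommit and memStore are hypothetical names invented for this illustration; they are not SOP's transaction API, and SOP's internals may work quite differently.

```go
package main

import (
	"errors"
	"fmt"
)

// participant is a hypothetical interface for anything that takes part in a
// two phase commit. It is not SOP's transaction API.
type participant interface {
	Prepare() error // phase 1: validate and stage the changes
	Commit() error  // phase 2: make the staged changes permanent
	Rollback()      // discard the staged changes
}

// twoPhaseCommit commits all participants only if every one of them prepares
// successfully. A failure during phase 2 needs recovery logic in a real
// system; this sketch simply reports it.
func twoPhaseCommit(ps ...participant) error {
	for i, p := range ps {
		if err := p.Prepare(); err != nil {
			// Roll back everyone that prepared, plus the one that failed.
			for _, q := range ps[:i+1] {
				q.Rollback()
			}
			return fmt.Errorf("prepare failed: %w", err)
		}
	}
	for _, p := range ps {
		if err := p.Commit(); err != nil {
			return fmt.Errorf("commit failed: %w", err)
		}
	}
	return nil
}

// memStore is a toy participant holding one staged and one committed value.
type memStore struct {
	name        string
	staged      string
	committed   string
	failPrepare bool // set to true to simulate a participant that says "no"
}

func (m *memStore) Prepare() error {
	if m.failPrepare {
		return errors.New(m.name + " cannot prepare")
	}
	return nil
}

func (m *memStore) Commit() error {
	m.committed, m.staged = m.staged, ""
	return nil
}

func (m *memStore) Rollback() { m.staged = "" }

func main() {
	a := &memStore{name: "A", staged: "x"}
	b := &memStore{name: "B", staged: "y"}
	fmt.Println(twoPhaseCommit(a, b), a.committed, b.committed)

	c := &memStore{name: "C", staged: "x"}
	d := &memStore{name: "D", staged: "y", failPrepare: true}
	fmt.Println(twoPhaseCommit(c, d), c.committed == "", d.committed == "")
}
```

In the first call both stores commit their staged values. In the second, store D refuses to prepare, so nothing is committed on either store and the staged values are discarded.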
SOP is able to combine all the best features of Cassandra (so we don't have to reinvent the wheel) & Redis for global caching & add on top of it, ACID transactions & Objects based data management. You no longer need to author tables (& SQL scripts/ORM) in Cassandra, as SOP takes care of Objects persistence & very fast searches (using M-Way Trie B-Tree) to Cassandra. Also, you no longer need to write Redis based data caching because SOP has that built-in.
Since SOP turns your application (micro-service, web server, etc.) into the database server itself, the heavy SQL middleware is essentially removed from the setup, giving your application the raw power of the backend storage engine.
And by using a cool language like Golang, your data mining logic is suddenly in full unison with your app. No more impedance mismatch.
All the components or ingredients required to support ACID transactions have been implemented very efficiently using Redis. For example, there is no row locking; instead, equivalent locking is achieved with a Redis based algorithm for batch oriented logical locks.
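The article does not spell out the locking algorithm, so the following is only one plausible reading of "batch oriented logical locks": a transaction tries to claim a whole batch of keys at once and backs out completely if any key is already taken. Everything here is an assumption for illustration. lockTable is an in-memory stand-in for Redis (a real setup would rely on an atomic "set if absent" operation with an expiry), and lockBatch/unlockBatch are hypothetical names, not SOP functions.

```go
package main

import (
	"fmt"
	"sync"
)

// lockTable is an in-memory stand-in for a shared key/value service such as
// Redis. Assumption: this sketch never talks to a real Redis server.
type lockTable struct {
	mu     sync.Mutex
	owners map[string]string // lock key -> ID of the transaction holding it
}

func newLockTable() *lockTable {
	return &lockTable{owners: map[string]string{}}
}

// setIfAbsent claims key for owner only when nobody holds it yet.
func (t *lockTable) setIfAbsent(key, owner string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if _, held := t.owners[key]; held {
		return false
	}
	t.owners[key] = owner
	return true
}

// release frees key only if it is currently held by owner.
func (t *lockTable) release(key, owner string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.owners[key] == owner {
		delete(t.owners, key)
	}
}

// lockBatch tries to claim every key for owner (keys are assumed distinct).
// It is all-or-nothing: on the first conflict it releases the keys claimed
// so far and reports failure, so the caller can retry later.
func lockBatch(t *lockTable, owner string, keys []string) bool {
	for i, k := range keys {
		if !t.setIfAbsent(k, owner) {
			for _, claimed := range keys[:i] {
				t.release(claimed, owner)
			}
			return false
		}
	}
	return true
}

// unlockBatch releases every key of the batch that owner still holds.
func unlockBatch(t *lockTable, owner string, keys []string) {
	for _, k := range keys {
		t.release(k, owner)
	}
}

func main() {
	t := newLockTable()
	fmt.Println(lockBatch(t, "tx1", []string{"item:1", "item:2"})) // true
	fmt.Println(lockBatch(t, "tx2", []string{"item:3", "item:2"})) // false: item:2 is held by tx1
	unlockBatch(t, "tx1", []string{"item:1", "item:2"})
	fmt.Println(lockBatch(t, "tx2", []string{"item:3", "item:2"})) // true
}
```

The design point is that locks are logical entries keyed by item, taken per batch, so no database row is ever locked. The failed attempt by tx2 leaves no stray lock on item:3 behind.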
SOP is able to provide a general purpose, domain "flexible" solution to data management that scales, because the M-Way Trie is a commodity (or core feature) of the Objects "container" (SOP calls it a "store", analogous to a table or a set of tables in an RDBMS). In one case, it can be the database system for your payroll & inventory; in another, it can be your AI database storing "vectors" and such.
SOP, as a framework, is very portable & highly adaptive. In V2, it combines Cassandra & Redis and enhances the solution, adding the cool features listed above. It also comes with an "in-memory" version that can be used like a sorted map.
Future versions can integrate with other storage sub-systems and such, or even have their own storage features, if needed and requested.
Here is the link to the project on GitHub, with the full source code: https://github.com/SharedCode/sop
In case you are interested or curious about it and perhaps itching to join the fun, I extend my warm invitation to you. The project is a startup and will benefit from talent across the entire stack: Managers, Leads, Architects, Developers, QA, Documenters and such.
The code is solid and has passed batteries of automated tests, but the concept is new and this is the first publication or discussion about it. Thus, I am the only contributor so far.
In any case, I hope you had fun learning about the concept and got your interests going in this avenue. I do believe this project and the current implementation can really help accelerate and simplify applications development, and enable simple solutions for low latency, horizontally (and thus highly) scaleable systems.
Points of Interest
Apparently ("envisioned" and proven in implementation!), M-Way Trie (B-Tree) data structures & algorithms are a great "companion" feature for enabling ACID transactions. ACID transactions are an easy feature to implement in the "controller", and storage algorithms such as B-Tree are "controllers" by nature, given their position in the storage flow.
Caching, when built into the solution (as in this SOP setup), removes a lot of the coding otherwise required to create a scaleable system.
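To make the caching point concrete, here is a minimal sketch of the kind of read-through/write-through code an application typically has to write by hand when caching is not built in. All names (backend, cachedStore) are hypothetical, and plain maps stand in for Cassandra and Redis. This is not how SOP implements its cache, and the sketch is not safe for concurrent use.

```go
package main

import "fmt"

// backend is a stand-in for the slower persistent store.
type backend struct {
	data  map[string]string
	reads int // how many times the backend was actually read
}

func (b *backend) get(key string) (string, bool) {
	b.reads++
	v, ok := b.data[key]
	return v, ok
}

// cachedStore is a hypothetical wrapper that puts a cache in front of the
// backend. The map stands in for a shared cache service.
type cachedStore struct {
	cache map[string]string
	db    *backend
}

// Get serves from the cache when possible and otherwise falls back to the
// backend, remembering the result for later calls (read-through).
func (s *cachedStore) Get(key string) (string, bool) {
	if v, ok := s.cache[key]; ok {
		return v, true
	}
	v, ok := s.db.get(key)
	if ok {
		s.cache[key] = v
	}
	return v, ok
}

// Put writes to the backend and refreshes the cached copy (write-through).
func (s *cachedStore) Put(key, value string) {
	s.db.data[key] = value
	s.cache[key] = value
}

func main() {
	db := &backend{data: map[string]string{"user:1": "joe"}}
	s := &cachedStore{cache: map[string]string{}, db: db}

	s.Get("user:1") // miss: goes to the backend
	s.Get("user:1") // hit: served from the cache
	fmt.Println("backend reads:", db.reads) // backend reads: 1
}
```

Even this toy version needs decisions about invalidation, expiry and concurrency, which is the code an engine with built-in caching saves you from writing.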
And lastly, using an Objects based db engine simplifies applications development and makes it more fun (and simpler) than ever before.
History
- 24th January, 2024: Current version - SOP V2 Golang