Multi-Model Databases

April 15, 2017

NoSQL databases offer alternatives for situations where the data being processed doesn’t fit neatly into a relational database. Different types of systems have evolved to meet different needs. Some teams have adopted polyglot persistence, using multiple database systems to meet specialized needs within one application. However, such a setup greatly increases the complexity of programming and maintenance.
Some relational systems have added features to deal with alternate data structures. Postgres supports the JSON data type, which would allow you to replicate features of a document database, while SQL Server has supported XML for a while, and looks to be moving toward supporting JSON as well.
One solution is a multi-model database, which combines several database types in one system. This gives you a single system to install and maintain. In addition, one query can access data from the various models, and one transaction can ensure consistency across them.
Two systems that I’ve looked at are ArangoDB and OrientDB.
Both systems support the Document, Key/Value, and Graph database types in one engine. OrientDB supports SQL, while ArangoDB has a SQL-like query language (AQL). For selects, though, AQL seems to resemble LINQ, in that the table/collection is specified first.
Both can operate as a single node or as a cluster, and both have built-in replication and sharding.
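To make the idea of several models in one engine concrete, here is a minimal in-memory sketch in Python. It is not ArangoDB or OrientDB code; the class `TinyMultiModelStore` and its method names are hypothetical, invented only for illustration. Documents stored by key cover the key/value and document models, and a list of labeled edges between keys covers the graph model, so a single call can traverse a relationship and filter on document fields.

```python
from collections import defaultdict


class TinyMultiModelStore:
    """Hypothetical toy store: documents by key plus labeled edges between keys."""

    def __init__(self):
        self.documents = {}             # key -> document (key/value + document models)
        self.edges = defaultdict(list)  # source key -> [(label, target key)] (graph model)

    def put(self, key, document):
        self.documents[key] = document

    def get(self, key):
        return self.documents.get(key)

    def connect(self, source, label, target):
        self.edges[source].append((label, target))

    def neighbors(self, source, label, **filters):
        """Follow edges with the given label, then filter on document fields."""
        results = []
        for edge_label, target in self.edges[source]:
            doc = self.documents.get(target)
            if edge_label != label or doc is None:
                continue
            if all(doc.get(field) == value for field, value in filters.items()):
                results.append(doc)
        return results


store = TinyMultiModelStore()
store.put("coach", {"name": "Mike Smith", "role": "Head Coach"})
store.put("qb", {"name": "Matt Ryan", "role": "QB"})
store.put("wr", {"name": "Roddy White", "role": "WR"})
store.connect("coach", "coaches", "qb")
store.connect("coach", "coaches", "wr")

# Key/value access, then a graph traversal filtered by a document field.
print(store.get("qb")["name"])
print([d["name"] for d in store.neighbors("coach", "coaches", role="WR")])
```

A real multi-model engine adds indexing, a query language, and transactions on top of this; the sketch only shows how the three models can share one set of records.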
The multi-model system is an interesting idea and seems quite useful. I’ll follow up with a post that takes a deeper dive into one of these systems. It is interesting that neither system implements a column family model; perhaps that falls more into an analytical system that would be separate from the transactional system anyway.

Links:
10 Reasons To Consider A Multi-model Database
Data Modeling With Multi-model Databases
Datastax Believes Multi-model Databases Are The Future


Neo4j Introduction

September 7, 2013

Neo4j is the leading open source graph database project. Graph databases are a type of NoSQL database, and they excel at storing information about relationships between entities. A graph database stores nodes (entities), which can have properties. Edges (relationships) connect nodes, and can also have properties that describe the relationship.
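The node/edge/property model described above can be sketched in a few lines of Python. This is a toy in-memory structure, not Neo4j or Neo4jClient code; the names `Node`, `Edge`, `PropertyGraph`, and the `unit` property are assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: int      # id of the node the relationship starts from
    target: int      # id of the node the relationship points to
    rel_type: str    # type of relationship, e.g. "COACHES"
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical toy graph: nodes by id plus a flat list of edges."""

    def __init__(self):
        self.nodes = {}
        self.edges = []
        self._next_id = 1

    def add_node(self, **properties):
        node = Node(self._next_id, properties)
        self.nodes[node.node_id] = node
        self._next_id += 1
        return node

    def add_edge(self, source, target, rel_type, **properties):
        edge = Edge(source.node_id, target.node_id, rel_type, properties)
        self.edges.append(edge)
        return edge

    def outgoing(self, node, rel_type):
        """Nodes reachable from `node` over edges of the given type."""
        return [self.nodes[e.target] for e in self.edges
                if e.source == node.node_id and e.rel_type == rel_type]


graph = PropertyGraph()
coach = graph.add_node(name="Mike Smith", position="Head Coach")
qb = graph.add_node(name="Matt Ryan", position="QB")
wr = graph.add_node(name="Roddy White", position="WR")
graph.add_edge(coach, qb, "COACHES", unit="offense")
graph.add_edge(coach, wr, "COACHES", unit="offense")
graph.add_edge(qb, wr, "PLAYS_WITH")

coached = [n.properties["name"] for n in graph.outgoing(coach, "COACHES")]
print(coached)
```

Scanning a flat edge list is fine for a demonstration; a graph database instead stores adjacency directly so that traversals do not depend on the total number of edges.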

Installation
Neo4j can run on Windows. The free Community edition is available at Download Neo4j (you need the Java JRE installed before running it).
Extract the downloaded ZIP, and run the bin/Neo4j.bat file to start the database service.

The service runs on port 7474 by default. Navigating to http://localhost:7474/webadmin/ will open the Web Admin tool, which includes a data browser and a query console. The console can query the database using the Cypher query language.

For my experimentation, I downloaded a C# client, Neo4jClient, using NuGet.

Documentation:
An online Neo4j Manual is available. It can also be downloaded as a PDF, which requires filling out a short form with contact information on the site.

There is also a free Graph Databases e-book from O’Reilly available to download, which requires some contact information as well.

Code:
Here is a short program that creates nodes and edges, then deletes them.

using System;

using Neo4jClient;

// https://bitbucket.org/Readify/neo4jclient/wiki/Home for code samples and more info on Neo4jClient
namespace Neo4jDemo
{
    public class Program
    {
        // Define our entity
        public class FootballEntity
        {
            public string FirstName { get; set; }
            public string LastName { get; set; }
            public string Position { get; set; }
        }

        // Define relationships
        public class Coaches : Relationship,
            IRelationshipAllowingSourceNode<FootballEntity>,
            IRelationshipAllowingTargetNode<FootballEntity>
        {
            public Coaches(NodeReference targetNode) : base(targetNode){}

            public const string TypeKey = "Coaches";

            public override string RelationshipTypeKey
            {
                get { return TypeKey; }
            }
        }

        public class Teammates : Relationship,
            IRelationshipAllowingSourceNode<FootballEntity>,
            IRelationshipAllowingTargetNode<FootballEntity>
        {
            public Teammates(NodeReference targetNode) : base(targetNode){}

            public const string TypeKey = "Plays with";

            public override string RelationshipTypeKey
            {
                get { return TypeKey; }
            }
        }

        static void Main(string[] args)
        {
            // Connect
            var client = new GraphClient(new Uri("http://localhost:7474/db/data"));
            client.Connect();

            // Add nodes
            var coachReference = client.Create(new FootballEntity { FirstName = "Mike", LastName = "Smith", Position = "Head Coach" });
            var coachId = coachReference.Id;

            var qbReference = client.Create(new FootballEntity { FirstName = "Matt", LastName = "Ryan", Position = "QB" });
            var qbId = qbReference.Id;

            var wrReference = client.Create(new FootballEntity { FirstName = "Roddy", LastName = "White", Position = "WR" });
            var wrId = wrReference.Id;

            var rbReference = client.Create(new FootballEntity { FirstName = "Steven", LastName = "Jackson", Position = "RB" });
            var rbId = rbReference.Id;

            // Display Node IDs
            Console.WriteLine("Coach ID = " + coachId.ToString());
            Console.WriteLine("QB ID = " + qbId.ToString());
            Console.WriteLine("WR ID = " + wrId.ToString());
            Console.WriteLine("RB ID = " + rbId.ToString());

            // Retrieve a node
            var retrievedNode = client.Get<FootballEntity>(coachReference);

            // Create relationships
            var relationship = client.CreateRelationship(coachReference, new Coaches(qbReference));
            relationship = client.CreateRelationship(coachReference, new Coaches(wrReference));
            relationship = client.CreateRelationship(coachReference, new Coaches(rbReference));
            relationship = client.CreateRelationship(qbReference, new Teammates(wrReference));
            relationship = client.CreateRelationship(qbReference, new Teammates(rbReference));
            relationship = client.CreateRelationship(rbReference, new Teammates(wrReference));

            // Pause to examine values
            Console.WriteLine("Pause");
            Console.Read();

            // Cleanup
            client.Delete(coachReference, DeleteMode.NodeAndRelationships);
            client.Delete(qbReference, DeleteMode.NodeAndRelationships);
            client.Delete(wrReference, DeleteMode.NodeAndRelationships);
            client.Delete(rbReference, DeleteMode.NodeAndRelationships);

            // End
            Console.WriteLine("Completed");
            Console.Read();
        }
    }
}

MongoDB Introduction

September 5, 2013

MongoDB is an open source NoSQL database, and one of the most prominent of the document-oriented databases. These databases build on the Key-Value model, allowing the user to store complex documents as the value (in MongoDB’s case, they are stored as Binary JSON, or BSON, documents) with an ObjectId (a 12-byte identifier) as the key.
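A minimal Python sketch of that idea: a key-value map whose values are whole documents, which can then be looked up by key or queried by content. This is not MongoDB driver code. `TinyDocumentStore` is a hypothetical name, the 12 random bytes only stand in for an ObjectId (a real ObjectId has a defined internal layout), and the standard library's JSON encoding stands in for BSON.

```python
import json
import os


class TinyDocumentStore:
    """Hypothetical toy store: 12-byte key -> serialized document."""

    def __init__(self):
        self._data = {}

    def insert(self, document):
        doc_id = os.urandom(12)  # stand-in for an ObjectId: random bytes only
        self._data[doc_id] = json.dumps(document)  # JSON stands in for BSON
        return doc_id

    def find_one(self, doc_id):
        raw = self._data.get(doc_id)
        return None if raw is None else json.loads(raw)

    def find(self, **criteria):
        """Return documents whose top-level fields match all criteria."""
        docs = (json.loads(raw) for raw in self._data.values())
        return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


players = TinyDocumentStore()
first_id = players.insert({
    "TeamName": "Falcons",
    "LastName": "Ryan",
    "Position": "QB",
    "Stats": {"JerseyNumber": 2},  # nested values are part of the same document
})
players.insert({"TeamName": "Falcons", "LastName": "White", "Position": "WR"})

print(len(first_id))
print(players.find_one(first_id)["Stats"]["JerseyNumber"])
print([d["LastName"] for d in players.find(Position="WR")])
```

Note that the two documents do not share the same set of fields, which is the flexibility a document store offers over a fixed table schema.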
MongoDB can run on Windows, with 32- and 64-bit editions available. There is also a Chocolatey package available.
You’ll probably want a client as well; in my case I got the C# driver from NuGet.
There is a good C# tutorial available on the MongoDB site, with plenty of code examples.
In my experimentation, I created a C# data object class and simply passed it to the C# driver, which took care of serializing the data to BSON. You can also use LINQ to query your data.
There are also several third-party programs that provide a GUI for MongoDB. I checked out MongoVUE, installed through Chocolatey, which comes as a free 14-day trial.

To start MongoDB, I ran this command from a BAT file:
"C:\Program Files\gb.MongoDB\bin\mongod.exe" --dbpath "C:\Program Files\gb.MongoDB\data"

The default port number is 27017.
Navigating to http://localhost:28017/ will open a web-based admin tool.

Here is an example, borrowing from the code samples on the MongoDB site. First, you’ll need to add references to the MongoDB.Bson and MongoDB.Driver DLLs.

using System;
using System.Linq;

using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.Builders;
using MongoDB.Driver.Linq;

namespace Mongo
{
    class Program
    {
        public class FootballPlayer
        {
            public ObjectId Id { get; set; }
            public string TeamName { get; set; }
            public string FirstName { get; set; }
            public string LastName { get; set; }
            public string Position { get; set; }
            public int JerseyNumber { get; set; }
        }

        static void Main(string[] args)
        {
            Console.WriteLine("Connecting");
            var connectionString = "mongodb://localhost";
            var client = new MongoClient(connectionString);
            var server = client.GetServer();
            var database = server.GetDatabase("test");
            var collection = database.GetCollection<FootballPlayer>("FootballPlayers");

            // Insert 1st entity
            Console.WriteLine("Inserting");

            var entityMR = new FootballPlayer();
            entityMR.TeamName = "Falcons";
            entityMR.FirstName = "Matt";
            entityMR.LastName = "Ryan";
            entityMR.Position = "QB";
            entityMR.JerseyNumber = 3;

            collection.Insert(entityMR);
            var firstId = entityMR.Id;

            // Insert 2nd entity
            var entityRW = new FootballPlayer();
            entityRW.TeamName = "Falcons";
            entityRW.FirstName = "Roddy";
            entityRW.LastName = "White";
            entityRW.Position = "WR";
            entityRW.JerseyNumber = 84;

            collection.Insert(entityRW);
            var secondId = entityRW.Id;

            // Retrieve 1st entity
            Console.WriteLine("Querying One");
            var query = Query.EQ("_id", firstId);
            var retrievedEntity = collection.FindOne(query);

            // Update - Saves, but only sends the changed data across
            Console.WriteLine("Update");
            var update = Update.Set("JerseyNumber", 2);
            collection.Update(query, update);

            // List all
            Console.WriteLine("Querying All");
            var queryAll = from e in collection.AsQueryable<FootballPlayer>() select e;
            foreach (var player in queryAll)
            {
                Console.WriteLine("ID = " + player.Id.ToString());
                Console.WriteLine("Name = " + player.FirstName + " " + player.LastName);
                Console.WriteLine("Position = " + player.Position);
                Console.WriteLine("Jersey Number = " + player.JerseyNumber.ToString());
                Console.WriteLine("");
            }

            // Remove
            Console.WriteLine("Delete All");
            collection.RemoveAll();

            Console.WriteLine("Completed");
            Console.ReadLine();
        }
    }
}

Redis Introduction

August 31, 2013

I’ve recently spent a little time working with Redis, which is a key-value store NoSQL database. Redis (REmote DIctionary Server) is an in-memory database, but the data can also be persisted to disk.
With a key-value store, you submit a record with a unique string key, and store a value for that key. In Redis the value can be a String, List, Set, Hash or Sorted Set. A record is retrieved by its key; there aren’t joins like there would be in a relational database.
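As a rough illustration of those value types, the Python sketch below uses plain containers to stand in for each one. It is not a Redis client and does not talk to a server; the key names and the helper `members_by_score` are made up for the example.

```python
# Plain Python containers standing in for Redis value types (not a Redis client).
store = {}

store["QB"] = "Matt Ryan"                                   # String
store["history:RB"] = ["Michael Turner", "Steven Jackson"]  # List: ordered values
store["positions"] = {"QB", "RB", "DE"}                     # Set: unique members
store["player:2"] = {"first": "Matt", "last": "Ryan"}       # Hash: field -> value
store["jerseys"] = {"Roddy White": 84, "Matt Ryan": 2}      # Sorted set: member -> score


def members_by_score(key):
    """Return the members of a sorted set ordered by ascending score."""
    scores = store[key]
    return sorted(scores, key=scores.get)


# No joins: to follow a relationship, store the related key and look it up.
store["starter:QB"] = "player:2"
starter = store[store["starter:QB"]]

print(store["QB"])
print(members_by_score("jerseys"))
print(starter["last"])
```

The second lookup at the end is the key-value substitute for a join: the application, not the database, follows the reference.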
There isn’t an official Redis download for Windows, but I was able to download it using Chocolatey, which installed it as a Windows service.
Next I downloaded a C# client, opting for Sider. The Sider page includes some sample code, with more available in the repository.
Below is some sample code where I instantiate the Client, save some values, update a value and delete a value before exiting.

using System;

using Sider;

namespace RedisDemo
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create client
            RedisClient client = new RedisClient("localhost", 6379);

            // Set values
            client.Set("QB", "Matt Ryan");
            client.Set("HeadCoach", "Mike Smith");
            client.Set("Owner", "Arthur Blank");
            client.Set("RB", "Michael Turner");
            client.Set("DE", "John Abraham");

            // Display Values
            Console.WriteLine("QB = " + client.Get("QB"));
            Console.WriteLine("Head Coach = " + client.Get("HeadCoach"));
            Console.WriteLine("Owner = " + client.Get("Owner"));
            Console.WriteLine("RB = " + client.Get("RB"));
            Console.WriteLine("DE = " + client.Get("DE"));

            // Set a new Running Back
            client.Set("RB", "Steven Jackson");
            Console.WriteLine("New RB = " + client.Get("RB"));

            // Remove player record
            client.Del("DE");
            Console.WriteLine("Removed DE");

            // Close connection
            client.Dispose();

            Console.WriteLine("Press any key to Exit");
            Console.ReadLine();
        }
    }
}

Cassandra Resources

July 14, 2013

The Apache Cassandra project has established itself as one of the leading column store / column family databases. As a NoSQL database, it is schema-free and is scalable to multiple nodes, with no single point of failure.

DataStax has made an MSI available to install Cassandra on Windows.
The downloads are available at Planet Cassandra, which also has other resources for Cassandra, such as additional documentation, a blog and event listings.

The package also includes the community edition of DataStax OpsCenter, a browser-based tool to manage a Cassandra cluster. The tool includes dashboards to monitor a cluster, as well as features for viewing data and defining the schema.

Two other tools come with the install:
Cassandra CLI Utility (CLI = Command Line Interface)
Cassandra CQL Shell (CQL = Cassandra Query Language)

Either tool can be used to interact with the cluster; the query shell is the newer of the two.

Terms and Concepts:
Keyspace – Similar to a schema or a namespace.
Column Family – Equivalent to a table. The CQL tool actually refers to these structures as tables.

Cassandra uses a key-value pair to store a column name and the column value. It will also store a timestamp for that pair of values. This key-value pair is referred to as a column.
The value can also store key-value pairs in a nested fashion. In this case, the outer key-value pair is called a super column.
The name and the value are stored as byte arrays.
The column family stores rows, which are a key and a container for the set of columns.
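The nesting described above (keyspace, column family, row key, then columns made of a name, a value and a timestamp) can be sketched in Python as follows. This is an in-memory illustration only, not Cassandra code or a driver API; the `Column` and `ColumnFamily` classes, the sample data, and the microsecond timestamp unit are assumptions of the sketch.

```python
import time
from dataclasses import dataclass


@dataclass
class Column:
    name: bytes      # column name, stored as bytes
    value: bytes     # column value, stored as bytes
    timestamp: int   # write time; microseconds is an assumption of this sketch


class ColumnFamily:
    """Hypothetical toy column family: row key -> {column name -> Column}."""

    def __init__(self, name):
        self.name = name
        self.rows = {}

    def insert(self, row_key, column_name, value, timestamp=None):
        if timestamp is None:
            timestamp = time.time_ns() // 1000
        row = self.rows.setdefault(row_key, {})
        row[column_name] = Column(column_name, value, timestamp)

    def get(self, row_key, column_name):
        return self.rows[row_key][column_name]


# A keyspace groups column families, much like a schema groups tables.
keyspace = {"Players": ColumnFamily("Players")}
players = keyspace["Players"]

players.insert(b"ryan", b"position", b"QB", timestamp=1)
players.insert(b"ryan", b"team", b"Falcons", timestamp=1)
players.insert(b"white", b"position", b"WR", timestamp=2)

print(players.get(b"ryan", b"position").value)
print(sorted(players.rows[b"ryan"]))
print(sorted(players.rows[b"white"]))
```

In this sketch each row is just a container of columns, so the two rows hold different sets of column names.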

Usage:
A TTL (Time To Live) property can be set for a column, so that the column is removed after a defined number of seconds has elapsed.
In a cluster, there is no master; each node is a peer.
Transactions aren’t natively supported.
Writes are recorded to a transaction log. The consistency level is tunable (ONE, QUORUM, ALL); the stricter the level, the more nodes must acknowledge a write, and the lower the write performance.
Table joins and subqueries are not supported.
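
Two of the points above, TTL expiry and tunable consistency, can be sketched in Python. This is a simplified model rather than Cassandra code: `required_acks` and `ExpiringColumn` are hypothetical names, and the replication factor (the number of nodes holding a copy of each row) is a concept this post has not introduced, so treat it as an assumption of the example. QUORUM is taken to mean a majority of replicas.

```python
def required_acks(level, replication_factor):
    """Replica acknowledgements needed before a write is reported successful."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1  # a majority of the replicas
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")


class ExpiringColumn:
    """Hypothetical column value that stops being readable once its TTL has elapsed."""

    def __init__(self, value, ttl_seconds, written_at):
        self.value = value
        self.expires_at = written_at + ttl_seconds

    def read(self, now):
        return self.value if now < self.expires_at else None


# With three replicas, stricter levels wait on more nodes per write.
acks = {level: required_acks(level, 3) for level in ("ONE", "QUORUM", "ALL")}
print(acks)

# Times are plain seconds supplied by the caller, to keep the example deterministic.
column = ExpiringColumn("QB", ttl_seconds=60, written_at=0)
print(column.read(now=59))
print(column.read(now=60))
```

Waiting on more acknowledgements is where the write-performance cost of the stricter levels comes from.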

Other Links:
Cassandra Home

Getting Started – Datastax

Intro to Cassandra data model

Slides – Intro To Cassandra