SQL Graph

September 19, 2017

One of the new feature in SQL Server 2017 (I’m using RC2) is the addition of graph database capabilities.

What is a Graph Database?
The Neo4j site has a good basic explanation on what a graph database is. Neo4j is the leading Graph database system available. The main concepts are nodes and edges. A node represents an entity and can have properties stored along with it. Edges represent relationships between nodes, and edges can also have properties as well.

Table Creation:
SQL Server implements Nodes and Edges as relational tables. As an example, I’ll create a Player table to store data on players for a football team. The only thing new here is the ‘As Node’ statement at the end of the table creation.

drop table if exists dbo.Player;

create table dbo.Player(
Team char(3) not null,
Number tinyint not null,
FirstName varchar(50),
LastName varchar(50),
Position char(2),
primary key(Team, Number)
) as Node;

insert into dbo.Player values ('ATL', 2, 'Matt', 'Ryan', 'QB');
insert into dbo.Player values ('ATL', 11, 'Julio', 'Jones', 'WR');
insert into dbo.Player values ('ATL', 21, 'Desmond', 'Trufant', 'CB');
insert into dbo.Player values ('ATL', 44, 'Vic', 'Beasley', 'LB');

select * from dbo.Player;

When we selection from this Node table, notice that a ‘Node Id’ column has been added. This stores JSON data with some information on each record, including a numeric Node Id for each record.

{"type":"node","schema":"dbo","table":"Player","id":0}

Edges are created the same way as a Node, with a ‘As Edge’ statement.

drop table if exists dbo.PlatoonsWith;

create table dbo.PlatoonsWith(
PlatoonName varchar(10) not null
) as Edge;

select * from dbo.PlatoonsWith;

As an Edge table, there are attributes added to the table when created. There is the Edge ID, From Id and To Id. The From and To hold the node IDs that are being related.
Once the tables have been created, they can be viewed in SSMS under ‘Tables’, under a ‘Graph Tables’ directory that is new for SQL Server 2017.

Creating Relationships:
We can create relationships between nodes by inserting records into the Edge table. We’ll insert the IDs for the two Nodes that we want to relate, plus any Edge properties. In this example, we’ll relate the two offensive players (Ryan and Jones).

declare @Node1 nvarchar(1000);
declare @Node2 nvarchar(1000);

select @Node1 = [$node_id_4FE94CB5420E440DA5CE8FE7781FA404]
from dbo.Player
where Team = 'ATL' and Number = 2;

select @Node2 = [$node_id_4FE94CB5420E440DA5CE8FE7781FA404]
from dbo.Player
where Team = 'ATL' and Number = 11;

insert into dbo.PlatoonsWith([$from_id_64FB3EBE4E3449D5A59D79FB0F8FA857], [$to_id_6A42816479484605BCD86836DFE8122B], PlatoonName)
values (@Node1, @Node2, 'Offense');

select * from dbo.PlatoonsWith;

Querying Graph Data:
The MATCH keyword was added to T-SQL to aid in querying graph data. The Node table can be joined to the Edge table without the usual JOIN..ON clause, instead using the old style Table1, Table2 syntax.
For an example, we’ll find the offensive player joined to Matt Ryan.

select p1.FirstName + ' ' + p1.LastName as Player1, p1.Position as Player1Position,
	pw.PlatoonName, p2.FirstName + ' ' + p2.LastName as Player2, p2.Position as Player2Position
from dbo.Player as p1, dbo.PlatoonsWith as pw, dbo.Player as p2
where match(p1-(pw)->p2)
	and p1.Number = 2
	and p2.Team = 'ATL';

Deleting Nodes:
There isn’t any sort of protection against deleting nodes. Even if an Edge record refers to a Node, you’re still allowed to delete that Node. The Edge record with the orphaned Node will be left in place.

Wrap Up:
A script with all of the SQL from this post is posted on GitHub.
The Graph functions are definitely useful, but hopefully some improvements will be made. I’m not sure why the auto-generated columns include the extra characters appended to the column names, that does make it a little harder to query, since you have to go retrieve the column name after the table is created. It would also be useful if there were some sort of visualization tool to view the Nodes and their relations.

Links:
SQL Graph Overview – Microsoft
SQL Graph – Niko Neugebauer

Advertisements

SQL Server 2017 – New T-SQL Functions

August 14, 2017

Here are some new T-SQL functions added to SQL Server 2017. I’m running the RC 2 version.

TRIM:
SQL Server has separate functions for removing spaces at the start (LTRIM) and end (RTRIM) of a string. 2017 adds the TRIM function to remove leading and trailing spaces together.

declare @Test varchar(20) = '  Test string   ';
select @Test as OriginalString, trim(@Test) as TrimmedString;

Result:
OriginalString TrimmedString
Test string Test string

TRIM

CONCAT_WS:
CONCAT_WS stands for concatenate with separator. The first argument is a string that will be used as the separator. Any number of string arguments can follow and these string will be combined into one string, with the designated separator between each value.

select concat_ws(',', 'First', 'Second', 'Third', 'Fourth');

Result:
First,Second,Third,Fourth

CONCAT_WS

TRANSLATE:
TRANSLATE can be thought of as an extension of REPLACE. We can define a list of characters that we want replaced in a string, and then a list of the characters to replace them with.

select translate('[Test string]', '[]', '()');

Result: (Test string)
TRANSLATE

STRING_AGG:
STRING_AGG is similar to CONCAT_WS, except it will concatenate values from a column instead of a list of values. The second argument is the string to use as the value separator. We can also use the WITHIN GROUP clause to specify the order of the items in the delimited list.

drop table if exists #Test;
create table #Test(Team varchar(20) not null primary key);
insert into #Test values('Falcons'), ('Saints'), ('Panthers'), ('Buccaneers');
select string_agg(Team, ':') from #Test;
select string_agg(Team, ':') within group(order by Team desc) from #Test;

Result:
Buccaneers:Falcons:Panthers:Saints
Saints:Panthers:Falcons:Buccaneers

STRING_AGG

Links:
SQL Server 2017 New Features
SQL Shack – New 2017 String Functions