SQL Server And JSON

February 11, 2018

SQL Server 2016 added support for working with JSON. Although there isn’t a JSON datatype, there is still the ability to output query results to JSON, and to break down JSON into rows and columns.
This post will run through some basic examples of working with JSON. I’m using SQL Server 2017 although 2016 can be used as well.

First I’ll create some test data.

drop table if exists dbo.JsonTest;

create table dbo.JsonTest(
FirstName varchar(20) null,
LastName varchar(20) not null,
StatusCode varchar(10) not null
);
insert into dbo.JsonTest(FirstName, LastName, StatusCode) values ('Mike', 'Smith', 'Inactive');
insert into dbo.JsonTest(FirstName, LastName, StatusCode) values ('Jane', 'Doe', 'Active');
insert into dbo.JsonTest(FirstName, LastName, StatusCode) values (NULL, 'Jones', 'Pending');

Next is returning relational data as JSON. Much like the FOR XML clause returns XML, the FOR JSON clause returns the selected data as a JSON string. AUTO returns a default structure, while PATH allows more control over the output, such as naming the root node and returning NULL values instead of omitting them.

select FirstName, LastName, StatusCode FROM dbo.JsonTest FOR JSON AUTO;
select FirstName, LastName, StatusCode, 'Atlanta' as [Address.City], 'GA' as [Address.State] FROM dbo.JsonTest FOR JSON PATH, ROOT('People');
select FirstName, LastName, StatusCode FROM dbo.JsonTest FOR JSON PATH, INCLUDE_NULL_VALUES, WITHOUT_ARRAY_WRAPPER;

The second select also shows how to nest data, in this case in an Address node.
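For reference, the first (AUTO) query returns a JSON array like the following, with the NULL FirstName omitted from the last object:

```json
[
  {"FirstName":"Mike","LastName":"Smith","StatusCode":"Inactive"},
  {"FirstName":"Jane","LastName":"Doe","StatusCode":"Active"},
  {"LastName":"Jones","StatusCode":"Pending"}
]
```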

OPENJSON will return one row for each node in a JSON string.

declare @JSON as varchar(4000) = '[{"FirstName":"Mike","LastName":"Smith","StatusCode":"Inactive"},{"FirstName":"Jane","LastName":"Doe","StatusCode":"Active"},{"LastName":"Jones","StatusCode":"Pending"}]';

SELECT * FROM OPENJSON(@JSON);

With OPENJSON, we can also parse JSON into relational rows and columns, provided that the column name matches the JSON attribute name. If the names don’t match then NULLs are returned.

declare @JSON as varchar(4000) = '[{"FirstName":"Mike","LastName":"Smith","StatusCode":"Inactive"},{"FirstName":"Jane","LastName":"Doe","StatusCode":"Active"},{"LastName":"Jones","StatusCode":"Pending"}]';

SELECT * FROM OPENJSON(@JSON)
WITH (
FirstName varchar(20),
LastName varchar(20),
StatusCode varchar(10)
);

It is also possible to map a JSON attribute name to a different column name in the output; in that case, we need to specify the JSON path to match on.


declare @JSON as varchar(4000) = '[{"FirstName":"Mike","LastName":"Smith","StatusCode":"Inactive"},{"FirstName":"Jane","LastName":"Doe","StatusCode":"Active"},{"LastName":"Jones","StatusCode":"Pending"}]';

SELECT * FROM OPENJSON(@JSON)
WITH (
GivenName varchar(20) '$.FirstName',
Surname varchar(20) '$.LastName',
StatusCode varchar(10)
);

There are also a few JSON Functions available.
ISJSON will determine if a string is valid JSON or not.
JSON_VALUE will extract a scalar value.
JSON_QUERY will return a JSON fragment or an array of values.
By default, the JSON functions operate in lax mode, which means an invalid operation won’t raise an error; a NULL value is returned instead. Strict mode can be specified, in which case an invalid operation raises an error.


declare @JSON as varchar(4000) = '[{"FirstName":"Mike","LastName":"Smith","StatusCode":"Inactive", "Language":["English","Spanish"]},{"FirstName":"Jane","LastName":"Doe","StatusCode":"Active"}]';
declare @NotJson as varchar(4000) = 'Not a JSON string';

-- Return bit to determine if a string is valid JSON or not
SELECT ISJSON(@JSON);
SELECT ISJSON(@NotJson);

-- Get scalar value - 0 based array
SELECT JSON_VALUE(@JSON, '$[1].FirstName');

-- Return JSON fragment or array of values
SELECT JSON_QUERY(@JSON, '$[0].Language');

-- Default is lax mode - returns NULL on error
-- Strict will raise error

SELECT JSON_QUERY(@JSON, 'lax $[1].Language');
SELECT JSON_QUERY(@JSON, 'strict $[1].Language');

All of the SQL is also posted on GitHub.

Links:
Simple Talk has a good introduction to JSON functionality.
Microsoft Docs – OPENJSON
Microsoft Docs – JSON Functions


Decimal vs Numeric

January 22, 2018

When storing exact numeric values with decimal values in SQL Server, there are two data types to choose from: Decimal and Numeric. Both types allow precision and scale to be defined. Precision is the maximum number of total digits allowed, and scale is the number of digits to the right of the decimal point. Both are part of the ANSI standard. Both types appear to be used the same way in SQL Server, the Microsoft documentation states that the two are synonyms. So why are there two data types for the same purpose?
In Joe Celko’s book “SQL For Smarties”, he mentions a slight distinction between the two.
“NUMERIC (p,s) specifies the exact precision and scale to be used. DECIMAL(p,s) specifies the exact scale, but the precision is implementation-defined to be equal or greater than the specified value.”
So there’s a slight difference in the ANSI Standard, but no difference in the SQL Server implementation.
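A quick check (a sketch, just to illustrate) shows the two types behaving identically in SQL Server:

```sql
declare @d decimal(5,2) = 123.45;
declare @n numeric(5,2) = 123.45;

-- Same precision, scale, and value; the comparison returns 1 (equal)
select @d as DecimalValue, @n as NumericValue,
    iif(@d = @n, 1, 0) as AreEqual;
```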

I did find it interesting that when an error was raised assigning a value to a decimal variable, the error message referred to a numeric data type.

declare @Value decimal(2,1);
set @Value = 123.456;

Msg 8115, Level 16, State 8, Line 25
Arithmetic overflow error converting numeric to data type numeric.

I use decimal, since that term is more specific than numeric, which sounds like it would be any kind of number, even integers.


SQL Server 2017 – In-Database Python Setup

January 15, 2018

With SQL Server 2016, the ability to execute R code in SQL Server was added. SQL Server 2017 added functionality to run Python code as well. This functionality for either language can be installed as part of the SQL Server installation, or added to an existing instance.

Issues:
My initial attempt at Python installation didn’t go smoothly. Multiple attempts at install would hang up and not progress. However, once I removed the Python 3 installation from the machine, the install was able to complete.
The Launchpad service gets installed as part of the Python/R package as well. This service needs to be running in order to execute Python or R code. At first I created a new user for the service, but I ran into errors trying to execute code.
Msg 39021, Level 16, State 1, Line 55
Unable to launch runtime for ‘Python’ script. Please check the configuration of the ‘Python’ runtime.
Msg 39019, Level 16, State 2, Line 55
An external script error occurred:
Unable to launch the runtime. ErrorCode 0x80070490: 1168(Element not found.).

I ended up giving the service an administrator account to run under, which cleared up this issue.

Configuration:
The ‘External Scripts Enabled’ setting should be set to true in order to run Python/R code.

EXEC sp_configure 'external scripts enabled', 1;
GO
RECONFIGURE;
GO

Execution:
I’ll create a small table to have some test data.

drop table if exists dbo.Team;
create table dbo.Team(Team varchar(20) not null primary key);
insert into dbo.Team values('Falcons'), ('Saints'), ('Panthers'), ('Buccaneers');

And then run a simple loop to read the team names from the table and write them to the results pane.

exec sp_execute_external_script
	@language = N'Python',
	@script = N'for x in InputDataSet.Team: print(x)',
	@input_data_1 = N'select Team from dbo.Team';

InputDataSet is the default name for the data set read in from the database. The sp_execute_external_script proc has a parameter to set a different name, if so desired.
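For example, the @input_data_1_name parameter renames the input data set; here is a sketch of the same loop with the data set renamed to Teams (the name Teams is just an illustration):

```sql
exec sp_execute_external_script
	@language = N'Python',
	@script = N'for x in Teams.Team: print(x)',
	@input_data_1 = N'select Team from dbo.Team',
	@input_data_1_name = N'Teams';
```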

Additional Links:
sp_execute_external_script
Set up Python Machine Learning Services


SQL Server Tools List

December 31, 2017

SQL Server Tools
Here is a collection of SQL Server tools I’ve run across. These are all free tools, some from Microsoft and some from the SQL Server community.

SQL Cop: Runs checks on a database to find potential issues, like tables missing primary keys, use of deprecated data types, etc.

Microsoft Assessment and Planning (MAP) Toolkit for SQL Server: Lists all SQL Server instances in an environment, including version and edition, as well as information on the host (OS, number of cores, etc.).

SQL Query Stress: Developed by Adam Machanic, this is a tool to load test queries. It can create up to 200 threads to create a test load.

SQL Server Diagnostics: An add-in for SSMS (2016 or later). It can analyze memory dumps and give recommendations for issues. It also recommends Cumulative Updates for fixes for specific features being used in a specific instance.

SQL Server Migration Assistant: Tools to move data into SQL Server from Access, Oracle, MySQL and other databases.

Database Experimentation Assistant: Testing program to identify potential issues when upgrading to a newer edition of SQL Server.

Tiger Team Github: Various tools from the SQL Server engineering team. The SQL Performance dashboard reports in particular are very useful.

DiskSpd: Tool for testing storage. Takes the place of the SQLIO utility.


Running Totals

November 28, 2017

The guys from the SQL Server Radio podcast posted a blog post from their episode on calculating running totals in SQL Server. Their script (link in the post) covers several different ways to calculate a running total in a query (the value in the current row added to the values of all rows that came before it). The best method, in my view, is using the SUM function with the OVER clause; most of the other methods involve a cursor or some other messy approach.
I imagine most developers familiar with T-SQL have used the OVER clause with ROW_NUMBER or some other windowing function, but the OVER clause can also be used with aggregate functions, like COUNT or SUM or AVG.
An example:

create table [dbo].[RunningTotal](
	RecordId int identity(1,1) NOT NULL,
	GroupId tinyint NOT NULL,
	Amount decimal(6, 2) NOT NULL
);

insert into [dbo].[RunningTotal](GroupId, Amount) values (1, 12.34);
insert into [dbo].[RunningTotal](GroupId, Amount) values (1, 56.78);
insert into [dbo].[RunningTotal](GroupId, Amount) values (1, 55.66);
insert into [dbo].[RunningTotal](GroupId, Amount) values (2, 33.33);
insert into [dbo].[RunningTotal](GroupId, Amount) values (2, 22.10);
insert into [dbo].[RunningTotal](GroupId, Amount) values (2, 12.34);
insert into [dbo].[RunningTotal](GroupId, Amount) values (3, 98.76);

select GroupId, Amount, 
	sum(Amount) over(order by RecordId) as RunningSum,
	sum(Amount) over(partition by GroupId order by RecordId) as RunningSumByGroup
from dbo.RunningTotal;

go

This example gives a running total over the whole dataset, plus a running total by Group (GroupId).
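One note: without an explicit frame, SUM with OVER(ORDER BY ...) defaults to RANGE UNBOUNDED PRECEDING, which can spool to tempdb; specifying a ROWS frame explicitly usually performs better. A sketch against the same table:

```sql
-- Explicit ROWS frame; same results as the default RANGE frame
-- here (RecordId is unique), but typically cheaper to execute
select GroupId, Amount,
	sum(Amount) over(order by RecordId
		rows between unbounded preceding and current row) as RunningSum
from dbo.RunningTotal;
```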


Microsoft SQL Operations Studio

November 27, 2017

Microsoft has released a new database management tool called Microsoft SQL Operations Studio. The new studio runs on Windows, Linux and macOS, and is free and open source. It can manage on-premises SQL Servers as well as Azure instances. There’s no installer; you just unzip the files and run the EXE to start the studio.
SSMS users will be familiar with most features: Object Explorer, T-SQL code window with Intellisense, backup and restore, and so on.
There is also built-in support for Git source control. You can also create dashboard widgets to display performance and maintenance data, which they call Insight Widgets. Two examples are provided: One to display the slowest queries and one to show the space used by tables. The performance widget runs off of the Query Store data, so that feature must be enabled.
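Enabling Query Store for the current database is a one-liner:

```sql
-- Query Store must be on for the slow-query widget to have data
alter database current set query_store = on;
```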
The most interesting feature is the Integrated Terminal, which allows running command-line tasks from within the Studio, including the command prompt, PowerShell, and BCP.
Operations Studio is still well short of SSMS in features, although I did like the dashboards and the Integrated Terminal. The basics seem to be covered, and I would expect more features to be added over time. I imagine the main use case is for non-Windows users, although a non-admin Windows user who just needs to query data may also find the simpler tool useful.

Links:
GitHub
Download SQL Operations Studio
Insight Widgets with examples
Integrated Terminal


SQL Graph

September 19, 2017

One of the new features in SQL Server 2017 (I’m using RC2) is the addition of graph database capabilities.

What is a Graph Database?
The Neo4j site has a good basic explanation of what a graph database is (Neo4j is the leading graph database system available). The main concepts are nodes and edges: a node represents an entity and can have properties stored along with it, while edges represent relationships between nodes and can have properties as well.

Table Creation:
SQL Server implements Nodes and Edges as relational tables. As an example, I’ll create a Player table to store data on players for a football team. The only thing new here is the ‘As Node’ statement at the end of the table creation.

drop table if exists dbo.Player;

create table dbo.Player(
Team char(3) not null,
Number tinyint not null,
FirstName varchar(50),
LastName varchar(50),
Position char(2),
primary key(Team, Number)
) as Node;

insert into dbo.Player values ('ATL', 2, 'Matt', 'Ryan', 'QB');
insert into dbo.Player values ('ATL', 11, 'Julio', 'Jones', 'WR');
insert into dbo.Player values ('ATL', 21, 'Desmond', 'Trufant', 'CB');
insert into dbo.Player values ('ATL', 44, 'Vic', 'Beasley', 'LB');

select * from dbo.Player;

When we select from this Node table, notice that a ‘Node Id’ column has been added. This stores JSON data with some information on each record, including a numeric Node Id for each record.

{"type":"node","schema":"dbo","table":"Player","id":0}

Edges are created the same way as a Node, with a ‘As Edge’ statement.

drop table if exists dbo.PlatoonsWith;

create table dbo.PlatoonsWith(
PlatoonName varchar(10) not null
) as Edge;

select * from dbo.PlatoonsWith;

As an Edge table, there are columns added automatically when the table is created: an Edge Id, a From Id and a To Id. The From and To columns hold the node IDs that are being related.
Once the tables have been created, they can be viewed in SSMS under ‘Tables’, under a ‘Graph Tables’ directory that is new for SQL Server 2017.

Creating Relationships:
We can create relationships between nodes by inserting records into the Edge table. We’ll insert the IDs for the two Nodes that we want to relate, plus any Edge properties. In this example, we’ll relate the two offensive players (Ryan and Jones).

declare @Node1 nvarchar(1000);
declare @Node2 nvarchar(1000);

select @Node1 = [$node_id_4FE94CB5420E440DA5CE8FE7781FA404]
from dbo.Player
where Team = 'ATL' and Number = 2;

select @Node2 = [$node_id_4FE94CB5420E440DA5CE8FE7781FA404]
from dbo.Player
where Team = 'ATL' and Number = 11;

insert into dbo.PlatoonsWith([$from_id_64FB3EBE4E3449D5A59D79FB0F8FA857], [$to_id_6A42816479484605BCD86836DFE8122B], PlatoonName)
values (@Node1, @Node2, 'Offense');

select * from dbo.PlatoonsWith;
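As an aside, the $node_id, $from_id and $to_id pseudo-column names can be used without the GUID suffix, which avoids having to look up the generated column names. A sketch relating the two defensive players (Trufant and Beasley) this way:

```sql
-- The implicit column order for an edge insert is $from_id, $to_id,
-- followed by the edge's own columns
insert into dbo.PlatoonsWith
values ((select $node_id from dbo.Player where Team = 'ATL' and Number = 21),
        (select $node_id from dbo.Player where Team = 'ATL' and Number = 44),
        'Defense');
```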

Querying Graph Data:
The MATCH keyword was added to T-SQL to aid in querying graph data. The Node table can be joined to the Edge table without the usual JOIN..ON clause, instead using the old style Table1, Table2 syntax.
For an example, we’ll find the offensive player joined to Matt Ryan.

select p1.FirstName + ' ' + p1.LastName as Player1, p1.Position as Player1Position,
	pw.PlatoonName, p2.FirstName + ' ' + p2.LastName as Player2, p2.Position as Player2Position
from dbo.Player as p1, dbo.PlatoonsWith as pw, dbo.Player as p2
where match(p1-(pw)->p2)
	and p1.Number = 2
	and p2.Team = 'ATL';

Deleting Nodes:
There isn’t any sort of protection against deleting nodes. Even if an Edge record refers to a Node, you’re still allowed to delete that Node. The Edge record with the orphaned Node will be left in place.
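A periodic cleanup could remove orphaned Edge records manually; a sketch, assuming the $from_id and $to_id pseudo-columns can be compared directly to the Node table’s $node_id:

```sql
-- Delete edges whose from or to node no longer exists
delete pw
from dbo.PlatoonsWith as pw
where not exists (select 1 from dbo.Player as p where p.$node_id = pw.$from_id)
   or not exists (select 1 from dbo.Player as p where p.$node_id = pw.$to_id);
```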

Wrap Up:
A script with all of the SQL from this post is posted on GitHub.
The Graph functions are definitely useful, but hopefully some improvements will be made. I’m not sure why the auto-generated columns include the extra characters appended to the column names, that does make it a little harder to query, since you have to go retrieve the column name after the table is created. It would also be useful if there were some sort of visualization tool to view the Nodes and their relations.

Links:
SQL Graph Overview – Microsoft
SQL Graph – Niko Neugebauer