Running Totals

May 28, 2018

Displaying a running total in SQL Server is very straightforward, thanks to window functions and the OVER clause.
We’ll set up an example with some records for a payment amount over several days.

drop table if exists #Test;

create table #Test (
PaymentDate date not null primary key,
Amount int not null

insert into #Test(PaymentDate, Amount) 
values ('2018-05-29', 25), ('2018-05-30',10), ('2018-05-31',20), 
	('2018-06-01',30), ('2018-06-02',40), ('2018-06-03', 15);

select * from #Test;

Adding a running total is just a matter of SUM with an OVER clause.

select PaymentDate, Amount,
	sum(Amount) over (order by PaymentDate) as RunningTotal
from #Test
order by PaymentDate;

The RunningTotal column will show the total of all payments made on that day and any earlier date.

We can also use PARTITION to display a running total for just that month as well.

select PaymentDate, Amount, datename(month, PaymentDate) as MonthName,
	sum(Amount) over (partition by month(PaymentDate) order by PaymentDate) as MonthRunningTotal
from #Test
order by PaymentDate;

In the first example we used default settings for the OVER clause, which is for the current row plus all rows before it. Here is the first query with the default explicitly given.

select PaymentDate, Amount,
	sum(Amount) over (order by PaymentDate rows unbounded preceding) as RunningTotal
from #Test
order by PaymentDate;

We can change this setting to return other combinations. In the final example, we’ll return the running total as the payment amount for the current row plus the amount from the day before.

select PaymentDate, Amount,
	sum(Amount) over (order by PaymentDate rows 1 preceding) as RunningTotal
from #Test
order by PaymentDate;

Window Functions



April 22, 2018

When assigning values to a variable, we can use SET or SELECT. SET is the ANSI standard method. If we have a set value we want to assign to a variable, either method works just as well as the other. When we are retrieving values from a table or view, then there are some differences in behavior.
In order to run through some examples, we’ll set up a temp table with 1 column and 2 rows of data.

drop table if exists #Test;
create table #Test (RecordId int not null primary key);
insert into #Test(RecordId) values (1), (2);

When selecting by the primary key, we know we’ll get one value, so things are straightforward.

declare @VarSelect int;
declare @VarSet int;

select @VarSelect = RecordId from #Test where RecordId = 1;
set @VarSet = (select RecordId from #Test where RecordId = 1);

select @VarSelect as VarSelect, @VarSet as VarSet;

Let’s try an example of retrieving a value that doesn’t exist in the table. We’ll set 0 as the default value for the variables.

declare @VarSelect int = 0;
declare @VarSet int = 0;

select @VarSelect = RecordId from #Test where RecordId = 3;
set @VarSet = (select RecordId from #Test where RecordId = 3);

select @VarSelect as VarSelect, @VarSet as VarSet;

Using the SET method will return NULL, where the SELECT will leave the default value in place. If we’re using the variable one time, this behavior may not cause issues, but reusing the variable could cause confusion.
In the following example, the first select will return 1. The second select won’t find a value, so it will also return 1.

declare @VarSelect int = 0;

select @VarSelect = RecordId from #Test where RecordId = 1;
select @VarSelect as VarSelect;

select @VarSelect = RecordId from #Test where RecordId = 3;
select @VarSelect as VarSelect;

When returning multiple values, a SET will raise this error:
“Msg 512, Level 16, State 1, Line x
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, , >= or when the subquery is used as an expression.”
The SELECT will return one of the values from the result set.

declare @VarSelect2 int = 0;
declare @VarSet2 int = 0;

select @VarSelect2 = RecordId from #Test;
set @VarSet2 = (select RecordId from #Test);

select @VarSelect2 as VarSelect, @VarSet2 as VarSet;

In this case, I would rather an error be thrown that to just return one of the result set values.
Overall, it seems like SET it the better choice. We’ll get an error if multiple values are returned, and if no value is returned then we’ll get a NULL.

Converting Float Values To Strings

March 26, 2018

A coworker and I were looking at a request where we needed to write float values to a file, but have the values be of a uniform length (In this case, the float column was storing a monetary value, which isn’t a good idea, but that’s what we had to work with). In this case, all values were out to 4 decimal places, so it was a matter of padding the value to the left of the decimal to get everything to an equal length. The first thought was to convert the value to a string, pad the from with zeros, and from that take a string of the specified length.
To my surprise, it turns out that casting a float to a string data type (a varchar in this case) ends up rounding the value in some cases, and converting to scientific notation in others. With a little Googling we found out that using the STR function is the recommended way to make this conversion. STR takes has length and decimal value parameters to control how many digits are output.
I had used the FORMAT function for dates before, but it also handles numbers as well. This turned out to be the approach that was used, since in the FORMAT function we can specify padding to a certain length.
Here’s SQL for some test data long with the different approaches.

create table dbo.FloatTest(
FloatValue float not null

insert into dbo.FloatTest values (123.45678);
insert into dbo.FloatTest values (2.66993256);
insert into dbo.FloatTest values (0.00001);
insert into dbo.FloatTest values (55555.84);
insert into dbo.FloatTest values (321.0987654);

select  FloatValue as OriginalValue, 
        cast(FloatValue as varchar(15)) as CastValue,
	str(FloatValue, 20, 10) as StrValue, 
	format(FloatValue, 'F9') as FormatValue,
	format(FloatValue, '0000000000.##########') as CustomFormatValue
from dbo.FloatTest;


With FORMAT, we can specify a standard format for floats with F, followed by the number of decimal digits. Or we can create a custom format. Using a 0 for a required digit and a # for an optional digit allows us greater control over what is returned.

SQL Server And JSON

February 11, 2018

SQL Server 2016 added support for working with JSON. Although there isn’t a JSON datatype, there is still the ability to output query results to JSON, and to break down JSON into rows and columns.
This post will run through some basic examples of working with JSON. I’m using SQL Server 2017 although 2016 can be used as well.

First I’ll create some test data.

drop table if exists dbo.JsonTest;

create table dbo.JsonTest(
FirstName varchar(20) null,
LastName varchar(20) not null,
StatusCode varchar(10) not null
insert into dbo.JsonTest(FirstName, LastName, StatusCode) values ('Mike', 'Smith', 'Inactive');
insert into dbo.JsonTest(FirstName, LastName, StatusCode) values ('Jane', 'Doe', 'Active');
insert into dbo.JsonTest(FirstName, LastName, StatusCode) values (NULL, 'Jones', 'Pending');

Next is returning relational data as JSON. Much like the FOR XML clause returns XML, the FOR JSON clause returns the selected data in a JSON string. AUTO will return a default structure, where using PATH will allow more control on the output, like naming the root node and returning NULL values instead of omitting them.

select FirstName, LastName, StatusCode FROM dbo.JsonTest FOR JSON AUTO;
select FirstName, LastName, StatusCode, 'Atlanta' as [Address.City], 'GA' as [Address.State] FROM dbo.JsonTest FOR JSON PATH, ROOT('People');

The second select also shows how to nest data, in this case in an Address node.

OPENJSON will return one row for each node in a JSON string.

declare @JSON as varchar(4000) = '[{"FirstName":"Mike","LastName":"Smith","StatusCode":"Inactive"},{"FirstName":"Jane","LastName":"Doe","StatusCode":"Active"},{"LastName":"Jones","StatusCode":"Pending"}]';


With OPENJSON, we can also parse JSON into relational rows and columns, provided that the column name matches the JSON attribute name. If the names don’t match then NULLs are returned.

declare @JSON as varchar(4000) = '[{"FirstName":"Mike","LastName":"Smith","StatusCode":"Inactive"},{"FirstName":"Jane","LastName":"Doe","StatusCode":"Active"},{"LastName":"Jones","StatusCode":"Pending"}]';

FirstName varchar(20),
LastName varchar(20),
StatusCode varchar(10)

It is also possible to map a JSON attribute name to a different name for the output, we’ll need to specify which value to match to.

declare @JSON as varchar(4000) = '[{"FirstName":"Mike","LastName":"Smith","StatusCode":"Inactive"},{"FirstName":"Jane","LastName":"Doe","StatusCode":"Active"},{"LastName":"Jones","StatusCode":"Pending"}]';

GivenName varchar(20) '$.FirstName',
Surname varchar(20) '$.LastName',
StatusCode varchar(10)

There are also a few JSON Functions available.
ISJSON will determine if a string is valid JSON or not.
JSON_VALUE will extract a scalar value.
JSON_QUERY will return a JSON fragment or an array of values.
By default, the JSON functions are in Lax mode, which means that an error won’t be raised with an invalid operation, a NULL values will be returned instead. Strict mode can be specified, in which case an error will be raised with an invalid operation.

declare @JSON as varchar(4000) = '[{"FirstName":"Mike","LastName":"Smith","StatusCode":"Inactive", "Language":["English","Spanish"]},{"FirstName":"Jane","LastName":"Doe","StatusCode":"Active"}]';
declare @NotJson as varchar(4000) = 'Not a JSON string';

-- Return bit to determine if a string is valid JSON or not

-- Get scalar value - 0 based array
SELECT JSON_VALUE(@JSON, '$[1].FirstName');

-- Return JSON fragment or array of values
SELECT JSON_QUERY(@JSON, '$[0].Language');

-- Default is lax mode - returns NULL on error
-- Strict will raise error

SELECT JSON_QUERY(@JSON, 'lax $[1].Language');
SELECT JSON_QUERY(@JSON, 'strict $[1].Language');

All of the SQL is also posted on GitHub.

Simple Talk has a good introduction to JSON functionality.
Microsoft Docs – OPENJSON
Microsoft Docs – JSON Functions

Decimal vs Numeric

January 22, 2018

When storing exact numeric values with decimal values in SQL Server, there are two data types to choose from: Decimal and Numeric. Both types allow precision and scale to be defined. Precision is the maximum number of total digits allowed, and scale is the number of digits to the right of the decimal point. Both are part of the ANSI standard. Both types appear to be used the same way in SQL Server, the Microsoft documentation states that the two are synonyms. So why are there two data types for the same purpose?
In Joe Celkos’s book “SQL For Smarties”, he mentioned a sight distinction between the two.
“NUMERIC (p,s) specifies the exact precision and scale to be used. DECIMAL(p,s) specifies the exact scale, but the precision is implementation-defined to be equal or greater than the specified value.”
So there’s a slight difference in the ANSI Standard, but no difference in the SQL Server implementation.

I did find it interesting that when that when an error was raised assigning a value to a decimal variable, the error message referred to a numeric data type.

declare @Value decimal(2,1);
set @Value = 123.456;

Msg 8115, Level 16, State 8, Line 25
Arithmetic overflow error converting numeric to data type numeric.

I use decimal, since that term is more specific than numeric, which sounds like it would be any kind of number, even integers.

SQL Server 2017 – In-Database Python Setup

January 15, 2018

With SQL Server 2016, the ability to execute R code in SQL Server was added. SQL Server 2017 added functionality to run Python code as well. This functionality for either language can be installed as part of the SQL Server installation, or added to an existing instance.

My initial attempt at Python installation didn’t go smoothly. Multiple attempts at install would hang up and not progress. However, once I removed the Python 3 installation from the machine, the install was able to complete.
The Launchpad service gets installed as part of the Python/R package as well. This service needs to be running in order to execute Python or R code. At first I created a new user for the service, but I ran into errors trying to execute code.
Msg 39021, Level 16, State 1, Line 55
Unable to launch runtime for ‘Python’ script. Please check the configuration of the ‘Python’ runtime.
Msg 39019, Level 16, State 2, Line 55
An external script error occurred:
Unable to launch the runtime. ErrorCode 0x80070490: 1168(Element not found.).

I ended up giving the service an administrator account to run under, which cleared up this issue.

The ‘External Scripts Enabled’ setting should be set to true in order to run Python/R code.

EXEC sp_configure 'external scripts enabled', 1;

I’ll create a small table to have some test data.

drop table if exists dbo.Team;
create table dbo.Team(Team varchar(20) not null primary key);
insert into dbo.Team values('Falcons'), ('Saints'), ('Panthers'), ('Buccaneers');

And then run a simple loop to read the team names from the table and write them to the results pane.

exec sp_execute_external_script @language = N'Python', @script = N'for x in InputDataSet.Team: print(x)', @input_data_1 = N'select Team from dbo.Team';

InputDataSet is the default name for the data set read in from the database. There is a parameter for the sp_execute_external_script proc to set a different name, if so desired.

Additional Links:
Set up Python Machine Learning Services

SQL Server Tools List

December 31, 2017

SQL Server Tools
Here is a collection of SQL Server tools I’ve run across. These are all free tools, some from Microsoft and some from the SQL Server community.

SQL Cop: Runs checks on a database to find potential issues, like tables missing primary keys, use of deprecated data types, etc.

Microsoft Assessment and Planning (MAP) Toolkit for SQL Server: Lists all SQL Server instances in an environment,including version, edition as well as information on the host(OS, # of cores, etc).

SQL Query Stress: Developed by Adam Machanic, this is a tool to load test queries. It can create up to 200 threads to create a test load.

SQL Server Diagnostics: An add-in for SSMS (2016 or later). It can analyze memory dumps and give recommendations for issues. It also recommends Cumulative Updates for fixes for specific features being used in a specific instance.

SQL Server Migration Assistant: Tools to move data into SQL Server from Access, Oracle, MySQL and other databases.

Database Experimentation Assistant: Testing program to identify potential issues when upgrading to a newer edition of SQL Server.

Tiger Team Github: Various tools from the SQL Server engineering team. The SQL Performance dashboard reports in particular are very useful.

DiskSpd: Tool for testing storage. Takes the place of the SQLIO utility.