April 22, 2018

When assigning values to a variable, we can use SET or SELECT. SET is the ANSI standard method. If we have a set value we want to assign to a variable, either method works just as well as the other. When we are retrieving values from a table or view, then there are some differences in behavior.
In order to run through some examples, we’ll set up a temp table with 1 column and 2 rows of data.

drop table if exists #Test;
create table #Test (RecordId int not null primary key);
insert into #Test(RecordId) values (1), (2);

When selecting by the primary key, we know we’ll get one value, so things are straightforward.

declare @VarSelect int;
declare @VarSet int;

select @VarSelect = RecordId from #Test where RecordId = 1;
set @VarSet = (select RecordId from #Test where RecordId = 1);

select @VarSelect as VarSelect, @VarSet as VarSet;

Let’s try an example of retrieving a value that doesn’t exist in the table. We’ll set 0 as the default value for the variables.

declare @VarSelect int = 0;
declare @VarSet int = 0;

select @VarSelect = RecordId from #Test where RecordId = 3;
set @VarSet = (select RecordId from #Test where RecordId = 3);

select @VarSelect as VarSelect, @VarSet as VarSet;

Using the SET method will return NULL, where the SELECT will leave the default value in place. If we’re using the variable one time, this behavior may not cause issues, but reusing the variable could cause confusion.
In the following example, the first select will return 1. The second select won’t find a value, so it will also return 1.

declare @VarSelect int = 0;

select @VarSelect = RecordId from #Test where RecordId = 1;
select @VarSelect as VarSelect;

select @VarSelect = RecordId from #Test where RecordId = 3;
select @VarSelect as VarSelect;

When returning multiple values, a SET will raise this error:
“Msg 512, Level 16, State 1, Line x
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, , >= or when the subquery is used as an expression.”
The SELECT will return one of the values from the result set.

declare @VarSelect2 int = 0;
declare @VarSet2 int = 0;

select @VarSelect2 = RecordId from #Test;
set @VarSet2 = (select RecordId from #Test);

select @VarSelect2 as VarSelect, @VarSet2 as VarSet;

In this case, I would rather an error be thrown that to just return one of the result set values.
Overall, it seems like SET it the better choice. We’ll get an error if multiple values are returned, and if no value is returned then we’ll get a NULL.


Converting Float Values To Strings

March 26, 2018

A coworker and I were looking at a request where we needed to write float values to a file, but have the values be of a uniform length (In this case, the float column was storing a monetary value, which isn’t a good idea, but that’s what we had to work with). In this case, all values were out to 4 decimal places, so it was a matter of padding the value to the left of the decimal to get everything to an equal length. The first thought was to convert the value to a string, pad the from with zeros, and from that take a string of the specified length.
To my surprise, it turns out that casting a float to a string data type (a varchar in this case) ends up rounding the value in some cases, and converting to scientific notation in others. With a little Googling we found out that using the STR function is the recommended way to make this conversion. STR takes has length and decimal value parameters to control how many digits are output.
I had used the FORMAT function for dates before, but it also handles numbers as well. This turned out to be the approach that was used, since in the FORMAT function we can specify padding to a certain length.
Here’s SQL for some test data long with the different approaches.

create table dbo.FloatTest(
FloatValue float not null

insert into dbo.FloatTest values (123.45678);
insert into dbo.FloatTest values (2.66993256);
insert into dbo.FloatTest values (0.00001);
insert into dbo.FloatTest values (55555.84);
insert into dbo.FloatTest values (321.0987654);

select  FloatValue as OriginalValue, 
        cast(FloatValue as varchar(15)) as CastValue,
	str(FloatValue, 20, 10) as StrValue, 
	format(FloatValue, 'F9') as FormatValue,
	format(FloatValue, '0000000000.##########') as CustomFormatValue
from dbo.FloatTest;


With FORMAT, we can specify a standard format for floats with F, followed by the number of decimal digits. Or we can create a custom format. Using a 0 for a required digit and a # for an optional digit allows us greater control over what is returned.

SQL Server And JSON

February 11, 2018

SQL Server 2016 added support for working with JSON. Although there isn’t a JSON datatype, there is still the ability to output query results to JSON, and to break down JSON into rows and columns.
This post will run through some basic examples of working with JSON. I’m using SQL Server 2017 although 2016 can be used as well.

First I’ll create some test data.

drop table if exists dbo.JsonTest;

create table dbo.JsonTest(
FirstName varchar(20) null,
LastName varchar(20) not null,
StatusCode varchar(10) not null
insert into dbo.JsonTest(FirstName, LastName, StatusCode) values ('Mike', 'Smith', 'Inactive');
insert into dbo.JsonTest(FirstName, LastName, StatusCode) values ('Jane', 'Doe', 'Active');
insert into dbo.JsonTest(FirstName, LastName, StatusCode) values (NULL, 'Jones', 'Pending');

Next is returning relational data as JSON. Much like the FOR XML clause returns XML, the FOR JSON clause returns the selected data in a JSON string. AUTO will return a default structure, where using PATH will allow more control on the output, like naming the root node and returning NULL values instead of omitting them.

select FirstName, LastName, StatusCode FROM dbo.JsonTest FOR JSON AUTO;
select FirstName, LastName, StatusCode, 'Atlanta' as [Address.City], 'GA' as [Address.State] FROM dbo.JsonTest FOR JSON PATH, ROOT('People');

The second select also shows how to nest data, in this case in an Address node.

OPENJSON will return one row for each node in a JSON string.

declare @JSON as varchar(4000) = '[{"FirstName":"Mike","LastName":"Smith","StatusCode":"Inactive"},{"FirstName":"Jane","LastName":"Doe","StatusCode":"Active"},{"LastName":"Jones","StatusCode":"Pending"}]';


With OPENJSON, we can also parse JSON into relational rows and columns, provided that the column name matches the JSON attribute name. If the names don’t match then NULLs are returned.

declare @JSON as varchar(4000) = '[{"FirstName":"Mike","LastName":"Smith","StatusCode":"Inactive"},{"FirstName":"Jane","LastName":"Doe","StatusCode":"Active"},{"LastName":"Jones","StatusCode":"Pending"}]';

FirstName varchar(20),
LastName varchar(20),
StatusCode varchar(10)

It is also possible to map a JSON attribute name to a different name for the output, we’ll need to specify which value to match to.

declare @JSON as varchar(4000) = '[{"FirstName":"Mike","LastName":"Smith","StatusCode":"Inactive"},{"FirstName":"Jane","LastName":"Doe","StatusCode":"Active"},{"LastName":"Jones","StatusCode":"Pending"}]';

GivenName varchar(20) '$.FirstName',
Surname varchar(20) '$.LastName',
StatusCode varchar(10)

There are also a few JSON Functions available.
ISJSON will determine if a string is valid JSON or not.
JSON_VALUE will extract a scalar value.
JSON_QUERY will return a JSON fragment or an array of values.
By default, the JSON functions are in Lax mode, which means that an error won’t be raised with an invalid operation, a NULL values will be returned instead. Strict mode can be specified, in which case an error will be raised with an invalid operation.

declare @JSON as varchar(4000) = '[{"FirstName":"Mike","LastName":"Smith","StatusCode":"Inactive", "Language":["English","Spanish"]},{"FirstName":"Jane","LastName":"Doe","StatusCode":"Active"}]';
declare @NotJson as varchar(4000) = 'Not a JSON string';

-- Return bit to determine if a string is valid JSON or not

-- Get scalar value - 0 based array
SELECT JSON_VALUE(@JSON, '$[1].FirstName');

-- Return JSON fragment or array of values
SELECT JSON_QUERY(@JSON, '$[0].Language');

-- Default is lax mode - returns NULL on error
-- Strict will raise error

SELECT JSON_QUERY(@JSON, 'lax $[1].Language');
SELECT JSON_QUERY(@JSON, 'strict $[1].Language');

All of the SQL is also posted on GitHub.

Simple Talk has a good introduction to JSON functionality.
Microsoft Docs – OPENJSON
Microsoft Docs – JSON Functions

Running Totals

November 28, 2017

The guys from the SQL Server Radio podcast posted a blog post from their episode on calculating running totals in SQL Server. Their script (link in the post) covers several different ways to calculate a running total in a query (A total of a value in the current row added to each row that came before it). The best method (in my view) is using the SUM function with the OVER clause. A lot of the other methods involved a cursor or some other messy method.
I imagine most developers familiar with T-SQL have used the OVER clause with ROW_NUMBER or some other windowing function, but the OVER clause can also be used with aggregate functions, like COUNT or SUM or AVG.
An example:

create table [dbo].[RunningTotal](
	RecordId int identity(1,1) NOT NULL,
	GroupId tinyint NOT NULL,
	Amount decimal(6, 2) NOT NULL

insert into [dbo].[RunningTotal](GroupId, Amount) values (1, 12.34);
insert into [dbo].[RunningTotal](GroupId, Amount) values (1, 56.78);
insert into [dbo].[RunningTotal](GroupId, Amount) values (1, 55.66);
insert into [dbo].[RunningTotal](GroupId, Amount) values (2, 33.33);
insert into [dbo].[RunningTotal](GroupId, Amount) values (2, 22.10);
insert into [dbo].[RunningTotal](GroupId, Amount) values (2, 12.34);
insert into [dbo].[RunningTotal](GroupId, Amount) values (3, 98.76);

select GroupId, Amount, 
	sum(Amount) over(order by RecordId) as RunningSum,
	sum(Amount) over(partition by GroupId order by RecordId) as RunningSumByGroup
from dbo.RunningTotal;


This example gives a running total over the whole dataset, plus a running total by Group (GroupId).

Number To Text

June 25, 2017

I ran across a programming challenge question on Reddit, asking to create a program that will take a number and convert it to its text representation, so 11 to Eleven, 103 to One Hundred Three, etc.

I’ve posted a script on Github to accomplish this in T-SQL.

Previously, I had created a similar script to convert a Roman numeral to a number.

Calendar Table And Marking Holidays

March 15, 2017

A lot of Data Marts or reporting databases will include a Calendar or Dates table to list all days within a certain range, along with data on those days (Day of the week, quarter, etc.).
I’ve posted my script on GitHub that will generate these records.
At a previous job, we wanted to track days the office was closed so that we could measure how many business days certain functions took. Determining weekends was easy, but calculating holidays took a little more effort. You could just manually enter the holidays for each year, but it is possible to calculate the holidays for each year in one pass.
Some holidays are the same date every year, like Christmas.

UPDATE dbo.Calendar SET IsHoliday = 1 WHERE [Month] = 12 AND [Day] = 25;

Others are on a specific day of the week, like Labor Day being the first Monday in September (This script is marking US holidays).

SET IsHoliday = 1
FROM dbo.Calendar as c
	SELECT DateKey,
	FROM dbo.Calendar
	WHERE [Month] = 9
		AND [WeekDay] = 2
) AS z
ON z.DateKey = c.DateKey
	AND z.RowNumber = 1;

This statement will find the first Monday of each September for the range of dates in the table, and mark it as a holiday.
Another method I saw used was to look for a Monday in September where the day was between 1 and 7, since the first Monday of the month will always be in that range.
Of course, the hardest holiday to calculate is Easter. Luckily, I found an algorithm on the US Naval Observatory site to determine the date. I’ve included a function in the Calendar script to calculate the date.
Another thing to consider (which I didn’t include in this script) is if a holiday falls on a weekend, a business may choose the nearest weekday to observe that holiday.


March 6, 2017

While writing a query recently, I made use of the BETWEEN operator, which will match all values within a specified range. Without thinking, I put the greater value first:
WHERE RecordId BETWEEN 100 and 90

I was somewhat surprised that no records were returned, I knew there should be matches. Reversing the order gave me the results I expected:
WHERE RecordID BETWEEN 90 and 100.

So running the query and getting the execution plan, I saw that the WHERE clause was transformed to:
WHERE RecordID >= 90 and RecordID <= 100

Seeing this, it becomes obvious why the order would matter.