Links – Database Theory

December 31, 2016

A few interesting links of database information I’ve run across:
Armstrong’s Axioms: Rules on functional dependencies.
Logic For Serious Database Folks: Chapters from a work in progress on Logic as it pertains to relational databases, no release date set for the complete work.
A Simple Guide to Five Normal Forms in Relational Database Theory: Covering the normal forms, most resources don’t go past Third Normal Form.
Comparison of different SQL implementations: Very comprehensive list of how several relational database products adhere or stray from the ANSI SQL standard.

Links – Computer Science Basics

December 31, 2016

Now that I’ve completed my Bachelor’s (non Computer Science) degree, I’m going back to fill in some gaps in my knowledge of the basics of Computer Science. Here are some interesting links I’ve run across. A lot of these are lists of items that a good programmer should know, and some actually cover some basic material.

Oregon State U – Post-Baccalaureate: Oregon State has a one year online program for students to earn a 2nd Bachelor’s degree. A bit pricey at 28K, but still interesting to see the curriculum.
Programmer Competency Matrix: A good list of what a good programmer should know at various levels of their career.
Interview Cake: The ‘Glossary’ link at the top of the page has links to good explainations of basic terms in Computer Science.
aGupieWare – Bachelor’s Curriculum: A list of the general areas that should be covered by a basic Computer Science education, with links to recommended resources to cover those areas.
Syncfusion: Data Structures I and Data Structures II: Two free e-Books from Syncfusion that cover the basic data structures (Registration required).
Steve Yegge – Five Essential Phone Screen Questions: A good list of the skills an interviewer expects during an interview.
GitHub Gist: Technical Interview Cheet Sheet: A short list to cover basic data structures and algorithms.
Computer Science From The Bottom Up: Good online book (PDF and ePub available) on Computer Science basics.
Rob Conery – The Imposter’s Handbook: A great book that covers Computer Science basics, like Big O notation, data structures, and so on. The e-book is $30, a printed version isn’t yet available.

Degree Completion

November 30, 2016

This post won’t be a technical one, but I recently went back and completed my bachelor’s degree, 29 years after I first entered college. I wanted to put together a write up of the path I took to complete, in case this information was useful to anyone else trying to complete their degree.

My Educational Background
Out of high school, I attended two different schools for 2 years each, so I had plenty of credits built up. I studied political science at the first school and philosophy at the second. I also had some accounting classes from a local tech school as well. To be honest, I was looking for the shortest route to completion, as I was working full time and was already deep into my career. Most schools have a residency requirement, meaning that you need to complete the last 1-2 years or so of your education at that school in order to graduate.

Selecting A School
Reputable US schools will have a regional accreditation. Georgia is covered by the Southern Association of Colleges and Schools. Most schools will require a school to be regionally accredited in order to accept transfer credits from them, or to enter a graduate program. So I was looking for a school with regional accreditation.
I also hoped to find a school that didn’t have a residency requirement, so that I would be able to use as many of my previously earned credits as possible, as well as shorten that amount of time I would spend in completion.
I ended up finding two schools that met these requirements and applied to both of those:
Excelsior College
Thomas Edison State University (TESU)

When I applied, the schools gave me a rough idea of which classes would be accepted as transfer, as well as which category they would be credited as. Once I enrolled at the school, I got an official evaluation of my credits.
Thomas Edison offered to transfer more credits so I enrolled with them.
I later learned of a third school that only requires two classes to be taken at their institution: Charter Oak State College

Once I enrolled at TESU, I met with a guidance counselor to go over my requirements. Everything laid out in my initial evaluation was approved. I had to pay an enrollment fee good for one year, as well as a fee for each class taken.

Selecting A Major
My credits were spread across several areas in the liberal arts, so after checking Thomas Edison’s requirements, I was the closest to earning a Liberal Studies degree.

Earning Credits
There was one Capstone class that I had to take at TESU, but I was free to earn the rest of the credits any way that I could, I didn’t necessarily have to take them from the school.
I had most of my pre-requisites completed, but one good resource for these classes is Georgia’s ECore site.
I ended up taking several classes from a third party company, Straighterline. Here you pay a monthly subscription fee ($99 when I took classes) plus a fee per class($50 or so). You take classes at your own pace, taking as long as you want or working through as quickly as you like. Each class used a textbook, which were easily bought used online. Readings in the text were assigned, and each module had a quiz that was administered online. Most classes had a midterm as well. Each class had a final exam, which was proctored by a third party company. You would take a final exam at home, but the proctor would monitor you via webcam as you tested, as well as monitor your desktop while you tested. You would schedule a time with the proctor, setup with your assigned proctor, and then take the test. Everything was graded immediately. Their site posted a list of all of their classes that TESU would accept, along with the class that it would be credited as, which was very useful in planning. Once you pass a class, you can have StraighterLine send a transcript to the school. Even though you are given a letter grade for the class, the credits will be transferred in as pass/fail. I found this the easiest and cheapest way to earn credits.
There are also several ways to earn credits by exam. For these, you’ll travel to a testing center to take the exam. The testing centers are mostly colleges that open their testing center to outside students (not all schools allow that). I took exams from two agencies:
DSST(formerly known as DANTES)

Both sets of exams were under $100 an exam. As long as you receive a passing grade, you will receive credit. Since the exams are pass/fail you don’t get a letter grade, it will just show up as credit on your transcript.
TESU publishes lists of the exams that they accepted for credit, and the class that you would get credit for.
To study for the exams, I used a site Instacert to prepare for the exams. They are a subscription site ($20 a month when I used them) that have practice questions with answers, I found them very useful. The site Free CLEP Prep also had links to other resources as well.
The exams were tough, but with a good bit of studying I was able to pass every attempt. I stuck to areas that I already knew very well, like history.
TESU and Exclesior also offer their own sets of exams for credit, but I didn’t take any of those.
Most exams are credited for lower level (100 or 200 level) classes. I did find a few that would transfer in for upper level credit. The History of the Vietnam War DSST exam was one, as well as the Introduction to Religion StraighterLine class.

After filling in my requirements, I was left with one class to take at TESU, their capstone class. Here, the sole activity of the class was to write a research paper, kind of a mini-thesis. We were free to select any topic in our major area. Computer Science fell under the Liberal Studies umbrella here, so I was free to select a programming-related topic. Each class has a professor assigned, who will help to formulate a topic. Every two weeks I would submit a chapter and would receive feedback on it. At the end of the class we would submit the full paper. It wasn’t a typical research paper, a lot of the writing was about my research methods as well. The paper had to be at least 25 pages, not including the bibliography and other supporting writings. A minimum grade of ‘C’ was needed.

Before the class was over, I was eligible to go ahead and apply for graduation, of course pending on getting a high enough grade. Once I completed the class, I was eligible to graduate in September of 2016.

SQL Server 2016 SP1: T-SQL Enhancement – CREATE OR ALTER

November 27, 2016

In the Service Pack 1 for SQL Server 2016 (Download here), a new T-SQL enhancement has been added:

This command can be used with stored procedures, functions, triggers and views. So it doesn’t matter if an object already exists or not, we can issue the same command for either case. No more checking for the existence of an object or dropping and recreating an object.
For example:

create or alter procedure dbo.GetPosition
	@PositionCode char(2)

select PositionCode, PositionDescription 
from dbo.Position
where PositionCode = @PositionCode;

This will definitely simplify any deployment process. I’ve never liked dropping and re-creating an object, since you lose any permissions that have been granted to for an object.


Query Performance: PIVOT vs CASE

October 10, 2016

A discussion came up at my job on the most efficient way to return subtotals and totals in one row, something like:

Total A Total B Total C Grand Total
1 2 3 6

I had assumed that PIVOT would be the most straightforward method, but others preferred using CASE statements. So which method is more efficient?

I posted a script on GitHub to build a table with 6 million records to run both kinds of queries against. The results of each will be:

Red Blue White Total
3000000 2000000 1000000 6000000

The PIVOT query:

select [Red], [Blue], [White], ([Red] + [Blue] + [White]) as Total
from (
select RecordColor from dbo.TestPivotCase
) as a
pivot (
count(RecordColor) for RecordColor in ([Red], [White], [Blue])
)as p;

The CASE query:

  sum(case when RecordColor = 'Red' then 1 else 0 end) as [Red],
  sum(case when RecordColor = 'Blue' then 1 else 0 end) as [Blue],
  sum(case when RecordColor = 'White' then 1 else 0 end) as [White],
  count(*) as [Total]
from dbo.TestPivotCase;

The plans were almost exactly similar in cost, with the PIVOT with a slightly less cost (49.7 % vs. 50.3%).
The execution plans were very similar.
Case Plan:

Pivot Plan:

So both methods have a similar execution plan and run time, with a slight edge to the PIVOT method.

SQL Server Plan Comparison Tool

September 30, 2016

MSSQL Tiger Team – Plan Comparison Tool
I ran across the above link detailing the execution plan comparison tool included as part of SSMS(SQL Server Management Studio). This tool gives you a way to view two execution plans at the same time, along with displaying statistics from each plan.
Once you produce the first execution plan, you can right-click and select ‘Save Execution Plan As…’ and save the plan as a .sqlplan file. Once the second execution plan is generated, you can right click on the plan and select the ‘Compare Showplan’ option. The dialog opened will allow you to select the file with the first plan, and will open the comparison tool to display the two plans.

On the topic of execution plans, SQL Sentry announced recently that their Plan Explorer tool is now free, If you haven’t used this tool, it is a great improvement on viewing SQL Server execution plans.
SQL Sentry Plan Explorer

Inner Join vs Outer Join Performance

August 29, 2016

At work, a colleague and I discussed the performance of inner joins and against outer joins, particularly in the case where both types of joins would return the same number of rows. So if it turned out that you always had a match for each inner record, would you pay a penalty for having a left join? In most cases an outer join would return more records, since you would usually have unmatched records.
I posted a script on GitHub to create two tables that would be joined on RecordId.

create table dbo.Test1(
RecordId int not null identity(1,1) primary key clustered,
CreatedDate datetime not null default getdate()

create table dbo.Test2(
RecordId int null,
ItemId tinyint null,
CreatedDate datetime not null default getdate()

I created two million records in Test1, and then 5 Test2 records for each one in Test1. The Test2.RecordId column was left nullable, so we wouldn’t give the optimizer any hints as far as unmatched records.
So next we return all records in two queries, one with an inner join and one with an outer join.

select *
from dbo.Test1 as a
join dbo.Test2 as b
	on b.RecordId = a.RecordId;

select *
from dbo.Test1 as a
left join dbo.Test2 as b
	on b.RecordId = a.RecordId;

The two queries run in the same amount of time with virtually the same execution plan. Since Test1 has a clustered index, the key column is already sorted, so a Merge Join is used. So in this case, we’re running down the clustered index and finding a match for each record in the Test2 table, so it makes sense that the same amount of work would be done.
Dropping the clustered primary key on Test1 results in a different execution plan, where a Hash join is used instead of the Merge join, but the run time is still the same for both queries.
Deleting every 4th row from the detail table resulted in the inner join running slightly faster, but it was a 52-49 split in the execution plan.
I was a little surprised at first there wouldn’t be a difference in query performance between the two join types, but now it makes sense that the optimizer would go through the same plan for both types of join in this case. Inner vs Outer join does make a lot of difference logically in the records returned, but if both joins will return the same number of records then the plans turn out to be the same and the execution time is the same as well.