Data on Morten Ankerstjerne

Repeat last known value

Fri, 18 Jul 2025 00:00:00 +0000

Demo scripts below.

Disclaimer: This technique is by no means something I have come up with, I’ve just needed it enough times that I figured I might as well put it into my own words, for future reference.

Background

Suppose you have a collection of time series data, recording some status or other measurement at a point in time.

In an OLTP system we probably don’t want to repeat the same value every time we take a measurement, or we may only record changes made to a ledger when they occur.

Stop auto-shrinking your staging database

Thu, 05 Jun 2025 00:00:00 +0000

Background

If you’ve worked with data warehouse development on Microsoft SQL Server, you have probably had a run-in with your SAN-admin at some point, complaining about the size of your databases, and asking if you can do some cleanup to free up disk space.

What often ends up happening is, you take look at the database files and realize there’s a lot of free space that could be released, especially if you have a database dedicated to staging transformed data, before loading it into data marts.

Ambiguity in daylight saving time

Mon, 28 Apr 2025 00:00:00 +0000

Demo scripts below.

Background

Most data professionals will have to deal with dates at one point or another. Most often, we don’t have the luxury of choosing the data types in the source systems we work with, so some interesting challenges may come up.

One such challenge is daylight saving time (or summer time, in Europe), when people turn their clocks back or forward an hour to have more hours of sunlight during our daily life (this is not universal, but is observed throughout most of Europe and North America, among other places).

Column order matters in Clustered Indexes

Sun, 23 Mar 2025 00:00:00 +0000

Demo scripts below.

I recently worked on a problem for a customer who experienced some performance issues with a job on their SQL Server database.

Because they use SQL Server Standard Edition, their solution included a home-rolled version of table partitioning, where two tables were created every week:

one contained 10-second aggregate values, written in microbatches every 10 seconds
the other similarly contained 5-minute values, written from the 10 second table every 5 minutes

All of these aggregates were based on raw measurements from their various systems, one measurement per device.