Archive for the ‘DBA’ Category

Denormalization again

November 11, 2007

Just realized that there is one more benefit in denormalizing  way described in previous post:  you can easily move Orders records to another database server because you don’t really need foreign keys to products database. Helps with the scalability.


One More Reason for “Denormalization”

November 11, 2007

Everyone knows what is database normalization for and everyone tries to keep their database normalized at maximum possible level. Guaranteed data integrity, impossibility of insert/update/delete anomalies – all that cool things, you know…

Most of developers also know what is database denormalization for – for improving performance in most cases.

Btw because of cheap hardware and relatively expensive development services for most applications maximum performance is not the main goal. So, seems like no reason for denormalizaition?

Nope. There’s no reason for denormalization only if the data never changes.

Let’s see the simple example. We have a shop and we’re selling blankets with custom art printed on them. There are several steps from order submit to shipping – the art should at least be rendered, approved by quality control team and printed. What will happen if we will change some properties of the product when order is already submitted but not yet rendered? The size of printable area for example (we’re going to buy some new printing equipment with smaller printable area and don’t want to allow users to use old large area anymore).

With perfectly normalized database structure we will take the size of the printable area from the product properties and we will finally render smaller image. Will it be cropped or resized – doesn’t matter because it will differ from what user expected to get and what he saw on the preview anyway.

There are several possible approaches to avoid this problem. First is to make a clone of the product record on every change, so all old orders will point to unchanged data. In some cases it is acceptable way – if product info changed rarely we will not have too much garbage in the database, but if it is changed relatively frequently, or we expect the system to keep data for several years, then the second way is better I think.

The second way is to keep all information necessary for printing within the order record even if it will duplicate some product’s properties. It is actually not the ‘denormalization’ if we’ll call that ‘extra’ fields in order record not the “product’s editable area size” but “image size this order was submitted with”. This simple trick allows us to consider the whole database still perfectly normalized.