Posts Tagged ‘DB’

Denormalization again

November 11, 2007

Just realized that there is one more benefit in denormalizing  way described in previous post:  you can easily move Orders records to another database server because you don’t really need foreign keys to products database. Helps with the scalability.

Advertisements

One More Reason for “Denormalization”

November 11, 2007

Everyone knows what is database normalization for and everyone tries to keep their database normalized at maximum possible level. Guaranteed data integrity, impossibility of insert/update/delete anomalies – all that cool things, you know…

Most of developers also know what is database denormalization for – for improving performance in most cases.

Btw because of cheap hardware and relatively expensive development services for most applications maximum performance is not the main goal. So, seems like no reason for denormalizaition?

Nope. There’s no reason for denormalization only if the data never changes.

Let’s see the simple example. We have a shop and we’re selling blankets with custom art printed on them. There are several steps from order submit to shipping – the art should at least be rendered, approved by quality control team and printed. What will happen if we will change some properties of the product when order is already submitted but not yet rendered? The size of printable area for example (we’re going to buy some new printing equipment with smaller printable area and don’t want to allow users to use old large area anymore).

With perfectly normalized database structure we will take the size of the printable area from the product properties and we will finally render smaller image. Will it be cropped or resized – doesn’t matter because it will differ from what user expected to get and what he saw on the preview anyway.

There are several possible approaches to avoid this problem. First is to make a clone of the product record on every change, so all old orders will point to unchanged data. In some cases it is acceptable way – if product info changed rarely we will not have too much garbage in the database, but if it is changed relatively frequently, or we expect the system to keep data for several years, then the second way is better I think.

The second way is to keep all information necessary for printing within the order record even if it will duplicate some product’s properties. It is actually not the ‘denormalization’ if we’ll call that ‘extra’ fields in order record not the “product’s editable area size” but “image size this order was submitted with”. This simple trick allows us to consider the whole database still perfectly normalized.

DB, files, and unique names

November 7, 2007

Almost everyone faced the problem of generation of unique filenames to store on server. For user-uploaded images or something like that. There are several ways to generate an unique name for the file and most common one is to use auto-increment key from database table where the image information stored. Probably it is the best way, but…

You know… there are usually the field for filename in that table. Why? I don’t know, but saw this in a lot of projects.

If you don’t see the problem, I’ll explain:

When just got the file and going to insert the row you can’t know what to put into the filename field because you don’t have the ID yet.

The right way to avoid this problem obviously is to not use the filename field at all. You don’t need need it because filename can be generated from ID anytime.

It’s so simple, right? The main question is why I see filename field again and again and again. I don’t know. Do you?