denormalized vs. normalized data model

How do I decide whether to go with a normalized vs. denormalized data structure for my application?

How do I decide whether to go with a normalized vs. denormalized data structure for my application?

First off, an example of a normalized model:

 <model name="asset" label="Asset">
        <field name="name" label="Name" type="text" />
        <field name="customer_name" label="Customer Name" type="text:name" />
        <field name="customer_address" label="Customer Address" type="text:name"/>
        <field name="model_name" label="Model Name" type="text:name" />
        <field name="model_number" label="Model Number" type="text:number" />
        <display>{name}</display>
    </model>

    <model name="customer" label="Customer">
        <field name="name" label="Name" type="text:name" />
        <field name="address" label="Address" type="text:address" />
        <display>{name}</display>
    </model>

    <model name="model" label="MSTR: Model">
        <field name="name" label="Name" type="text:name" />
        <field name="number" label="Number" type="text:number" />
        <display>{name} {number}</display>
    </model>

Now, the same model, denormalized:

   <model name="asset" label="Asset">
        <field name="name" label="Name" type="text" />
        <belongs-to model="model" />
        <belongs-to model="customer" />
        <display>Model: {model} | Customer: {name}</display>
    </model>

    <model name="customer" label="Customer">
        <field name="name" label="Name" type="text:name" />
        <display>{name}</display>
    </model>

    <model name="model" label="MSTR: Model">
        <field name="name" label="Name" type="text:name" />
        <field name="number" label="Number" type="text:number" />
        <display>{name} {number}</display>
    </model>

In the normalized model, the user likely selects a customer and a model from dropdown lists, and the developer copies the relevant fields from each onto the asset field. This means data between the asset and the related models are not tied together. If, for instance, a customer address is changed, the developer will have to manually update the text fields on the asset, likely by manually running a script in production to retain data integrity.

In the denormalized model, the integrity of the data between asset, model and customer is maintained. In the same scenario as described above, if the customer address is updated, the developer does not have to do any work to update the customers address on the asset object. The address can simply be retrieved via asset.customer().address.

A good rule of thumb is to denormalize the model if you are going to re-use the data (e.g. customer has-many assets). This will ensure data integrity, which is crucial for any operational database.

1 Like

There are cases where normalized data models might be useful, e.g. for report objects where you care about the state of an object at some point in time (i.e. what the asset and related belongs-to models were set to at the time of creating a PDF report, or something similar)

@MachielReyneke There is definitely a case for normalized models, especially when using objects for e.g. reporting, thanks for the comment!