How do we create a reliable database through data modelling?

The essentials of Data Modelling!

Data vs Model

When we talk about data modelling, we are talking about two words: model and data. Data is a collection of facts, while a model is a form of representation. So, what is data modelling? Data modelling is the process of creating a representation of the data structures required by a database. I would say that an effective and efficient database depends on the data modelling process.

Imagine that you are going to build a house. What will you do? Will you just buy the materials and start building? I don’t think so. Yup, you should first make a “blueprint” as a master plan that specifies, for example, how many rooms the house has, where the plumbing lines go, and what the building framework looks like. Data modelling is the same: we make a representation of the data structures that a database requires to support our business process.

The advantages of data modelling!

Anyone reading this article will probably ask what the advantages of data modelling are, so here are several of them:

  • Insight into the attributes of each table
  • A clear understanding of the relationships between tables
  • Clearly defined business process rules

It may seem trivial and easy, but in fact data modelling helps us design a database. So, let’s go deeper! Now that we know what data modelling is, let’s look at the types of data modelling commonly used in industry.

Types of Data Modelling

Data Modelling Pyramid
  • Conceptual Data Model:

Defines WHAT the system contains. It is used by business stakeholders. The purpose is to organize, scope and define business concepts and rules. This step is often considered a high-level, abstract design phase.

  • Logical Data Model:

Defines HOW the system should be implemented regardless of the DBMS. It is used by data architects and business analysts. The purpose is to develop a technical map of rules and data structures. In this second step, the data items, relationships and constraints are all expressed using the concepts provided by the high-level data model. Because these concepts do not include implementation details, the result is a (semi-)formal representation of the database structure.

  • Physical Data Model:

Describes HOW the system will be implemented using a specific DBMS. It is used by DBAs and developers. The purpose is the actual implementation of the database.
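
To make the step from a logical model to a physical model more concrete, here is a minimal sketch in Python. The `customer` entity, its attributes, the type mapping and the choice of SQLite as the target DBMS are all assumptions made for illustration; they do not come from the figures in this article.

```python
import sqlite3

# Logical model: an entity, its attributes and its key, with no DBMS details.
# The "customer" entity and its attributes are hypothetical.
LOGICAL_CUSTOMER = {
    "entity": "customer",
    "attributes": {"customer_id": "identifier", "name": "text", "email": "text"},
    "primary_key": "customer_id",
}

# Physical model: the same entity mapped onto one specific DBMS (SQLite here).
# This type mapping is an assumption for the sketch, not a standard.
SQLITE_TYPES = {"identifier": "INTEGER", "text": "TEXT"}


def to_sqlite_ddl(model):
    """Turn a logical entity description into a SQLite CREATE TABLE statement."""
    columns = []
    for name, logical_type in model["attributes"].items():
        column = f"{name} {SQLITE_TYPES[logical_type]}"
        if name == model["primary_key"]:
            column += " PRIMARY KEY"
        columns.append(column)
    return f"CREATE TABLE {model['entity']} ({', '.join(columns)})"


ddl = to_sqlite_ddl(LOGICAL_CUSTOMER)

# Create the table in an in-memory database and read the column names back.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
table_columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]

print(ddl)
print(table_columns)
```

The logical description stays the same if we move to another DBMS; only the type mapping and the generated DDL would change.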

Data Modelling Stage

Relationships between tables? How?

Data Cardinality

In database design, cardinality describes the relationship between tables, that is, how many rows in one table can relate to rows in another (one-to-one, one-to-many or many-to-many). Which cardinality applies depends on the business process of each company. Let me give you a simple example of cardinality based on the model in the figure above.
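
Cardinality can also be enforced by the database itself. The sketch below uses hypothetical `customer`, `customer_order` and `customer_profile` tables (they are not taken from the figure) to show one way a one-to-many and a one-to-one relationship might be expressed in SQLite through Python's `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
-- One-to-many: many orders may point at the same customer.
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
-- One-to-one: UNIQUE allows at most one profile per customer.
CREATE TABLE customer_profile (
    profile_id  INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL UNIQUE REFERENCES customer(customer_id)
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'Alice')")
conn.executemany("INSERT INTO customer_order VALUES (?, ?)", [(10, 1), (11, 1)])
conn.execute("INSERT INTO customer_profile VALUES (100, 1)")

# One customer can have several orders.
orders_for_alice = conn.execute(
    "SELECT COUNT(*) FROM customer_order WHERE customer_id = 1"
).fetchone()[0]

# A second profile for the same customer violates the one-to-one rule.
try:
    conn.execute("INSERT INTO customer_profile VALUES (101, 1)")
    second_profile_allowed = True
except sqlite3.IntegrityError:
    second_profile_allowed = False

# An order for a customer that does not exist violates the foreign key.
try:
    conn.execute("INSERT INTO customer_order VALUES (12, 999)")
    orphan_order_allowed = True
except sqlite3.IntegrityError:
    orphan_order_allowed = False

print(orders_for_alice, second_profile_allowed, orphan_order_allowed)
```

A many-to-many relationship would typically need a third, linking table that holds one foreign key to each side.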

Cardinality

In this article I also want to explain dimensional modelling, which is commonly used in industry!

Dimensional modelling aims to optimize data retrieval and focuses on arranging data around a business process. Sometimes the data is arranged by business dimension. Dimensional modelling has two important concepts: “Measurement” and “Context”. To better understand them, let’s take a look at the figure below.
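
As a rough illustration of the idea, the sketch below splits a few hypothetical sales records into measurements (the numbers we want to add up) and context (the descriptive fields we group by). The records, the field names and the choice of which fields count as measurements are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical sales events, flattened as they might arrive from a source system.
sales = [
    {"date": "2010-03-15", "store": "Jakarta", "product": "Laptop", "quantity": 2, "revenue": 1500.0},
    {"date": "2010-03-15", "store": "Bandung", "product": "Laptop", "quantity": 1, "revenue": 750.0},
    {"date": "2010-03-16", "store": "Jakarta", "product": "Mouse", "quantity": 4, "revenue": 80.0},
]

# Assumption: these fields are the measurements; every other field is context.
MEASUREMENT_FIELDS = {"quantity", "revenue"}


def split_record(record):
    """Separate one record into its measurements and its context."""
    measurements = {k: v for k, v in record.items() if k in MEASUREMENT_FIELDS}
    context = {k: v for k, v in record.items() if k not in MEASUREMENT_FIELDS}
    return measurements, context


def total_by(records, context_field, measurement_field):
    """Sum one measurement, grouped by one context field."""
    totals = defaultdict(float)
    for record in records:
        measurements, context = split_record(record)
        totals[context[context_field]] += measurements[measurement_field]
    return dict(totals)


revenue_by_product = total_by(sales, "product", "revenue")
quantity_by_store = total_by(sales, "store", "quantity")

print(revenue_by_product)
print(quantity_by_store)
```

In a dimensional model the measurements end up in a fact table and the context ends up in dimension tables.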

Measurement vs Context

Dimensional modelling has two common schemas: the star schema and the snowflake schema. In a data warehouse, a star schema has one fact table at the center of the star and a number of associated dimension tables around it. It is called a star schema because its structure resembles a star. The star schema is the simplest type of data warehouse schema. It is also known as a star join schema and is optimized for querying large data sets.

Star Schema

For instance, the following query would allow us to calculate the total revenue by product in the year 2010:

SELECT
    p.Product_Name AS product,
    SUM(r.Revenue) AS total_revenue
FROM
    Revenue r
JOIN
    Product p ON (r.Model_ID = p.Model_ID)
JOIN
    DateDim d ON (r.Date_ID = d.Date_ID)
WHERE
    d.Year = 2010
GROUP BY
    -- every non-aggregated column in SELECT must also appear here
    p.Product_ID,
    p.Product_Name
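
To try a query of this shape end to end, here is a small sketch that builds the star schema in an in-memory SQLite database from Python. The table and column names follow the query above, but the column types, the `Date_ID` format and all sample rows are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables (sample rows are hypothetical).
CREATE TABLE Product (Model_ID INTEGER PRIMARY KEY, Product_ID INTEGER, Product_Name TEXT);
CREATE TABLE DateDim (Date_ID INTEGER PRIMARY KEY, Year INTEGER);

-- Fact table: one numeric measure plus keys pointing at the dimensions.
CREATE TABLE Revenue (
    Model_ID INTEGER REFERENCES Product(Model_ID),
    Date_ID  INTEGER REFERENCES DateDim(Date_ID),
    Revenue  REAL
);

INSERT INTO Product VALUES (1, 100, 'Laptop'), (2, 200, 'Mouse');
INSERT INTO DateDim VALUES (20100315, 2010), (20110101, 2011);
INSERT INTO Revenue VALUES (1, 20100315, 1500.0), (1, 20100315, 750.0),
                           (2, 20100315, 80.0), (2, 20110101, 40.0);
""")

# Total revenue by product for 2010; the 2011 row is filtered out.
rows = conn.execute("""
SELECT p.Product_Name AS product, SUM(r.Revenue) AS total_revenue
FROM Revenue r
JOIN Product p ON r.Model_ID = p.Model_ID
JOIN DateDim d ON r.Date_ID = d.Date_ID
WHERE d.Year = 2010
GROUP BY p.Product_ID, p.Product_Name
ORDER BY p.Product_Name
""").fetchall()

for product, total_revenue in rows:
    print(product, total_revenue)
```

Each dimension is exactly one join away from the fact table, which is what keeps star schema queries simple.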
Snowflake Schema VS Star Schema

In a data warehouse, a snowflake schema is a logical arrangement of tables in a multidimensional database such that the ER diagram resembles a snowflake. A snowflake schema is an extension of a star schema that adds additional dimension tables: the dimension tables are normalized, which splits their data into further tables.
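
The sketch below shows what that normalization might look like in practice. The `Category` table, the column names and the sample rows are hypothetical; the point is only that the product dimension is split in two, so a query by category needs one more join than it would in a star schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The product dimension is normalized: category details live in their own table.
CREATE TABLE Category (Category_ID INTEGER PRIMARY KEY, Category_Name TEXT);
CREATE TABLE Product (
    Product_ID   INTEGER PRIMARY KEY,
    Product_Name TEXT,
    Category_ID  INTEGER REFERENCES Category(Category_ID)
);

-- Fact table, kept minimal for this sketch.
CREATE TABLE Revenue (
    Product_ID INTEGER REFERENCES Product(Product_ID),
    Revenue    REAL
);

INSERT INTO Category VALUES (1, 'Computers'), (2, 'Accessories');
INSERT INTO Product VALUES (100, 'Laptop', 1), (101, 'Desktop', 1), (200, 'Mouse', 2);
INSERT INTO Revenue VALUES (100, 1500.0), (101, 900.0), (200, 80.0);
""")

# Revenue by category: fact -> Product -> Category (two hops instead of one).
rows = conn.execute("""
SELECT c.Category_Name, SUM(r.Revenue) AS total_revenue
FROM Revenue r
JOIN Product p ON r.Product_ID = p.Product_ID
JOIN Category c ON p.Category_ID = c.Category_ID
GROUP BY c.Category_ID, c.Category_Name
ORDER BY c.Category_Name
""").fetchall()

for category, total_revenue in rows:
    print(category, total_revenue)
```

The trade-off is less repeated data in the dimensions in exchange for more joins at query time.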

Snowflake Schema

Star and snowflake schemas consist of two types of tables:

  1. Facts: Metrics of a business process. These are generally numeric (quantitative) and additive (e.g. the amount of an invoice or the number of invoices). The fact table also contains keys pointing to the relevant dimension tables. There is just one fact table at the center of a star schema.
  2. Dimensions: The where, when, what, etc. (e.g. date/time, locations, goods sold). These typically contain qualitative information. There are multiple dimension tables in the schema, all of which are related to the fact table.

I love Data Science!
