Pages

Popular Posts

Wednesday, April 10, 2013

MariaDB Dynamic Columns

MariaDB  Dynamic columns


Very interesting new feature included in MariaDB 5.3 onwards and I hope it will get enough attention for continued development and improvement - enter dynamic columns. What does it mean - dynamic column? In the simple terms this feature will allow you to create a column for any row in the table. The actual data will be stored in blobs and this feature also comes with cool set of functions to manipulate the data. If this does not make any sense at this point - stick around as the example for practical use will make it easy to grasp.

For example, you have typical analytics scenario  - you have a single large fact table (fact or measures tables are usually the large ones) and a collection of smaller dimension tables (aka attributes), all and all classical star schema that contains:

Say you are a marketing company to the mobile phone industry and you have an app that captures the data in the following tables:

call log table -       the fact table that stored all calls
device table -         dimension table that stores all smart phone devices (HTC, iPhone etc...)
region table -         country etc...
carrier table -         stores the carrier (AT&T, Verizon etc....)
os table       -         stores all operating systems (iOS, Android etc...)


All and all five tables to get the data in and five tables to join when you want to report on the data. Not too bad you'd say and I normally would agree, but how about the possibility that those attributes (or dimension) could be stored in the single table along with the fact table? Yes you can try to de normalize the data by cramming the attributes into the fact table, but there is a better way.

Enter dynamic columns in MariaDB - so now you can store those attributes in the same table thus avoiding the table joints.  Those who know can bring the enum and set data types that have the similar function, but enum and set are limited by the volume of the data that you can store in them - enum can have 65,535  elements stored in it and set has 64 max members that can be stored in it.

The dynamic columns are free from set and enum data types limitations - one can store up to 1 GB in a single column. Pretty cool huh? 

Now to the practical usage - that is of very simple and elegant design.

Creating tables with dynamic columns

Creating tables with dynamic columns is easy:



CREATE TABLE call_log (
  call_id int primary key,
  incoming_number varchar(11) DEFAULT NULL,
  outgoing_number  varchar(11) DEFAULT NULL,
  dynamic_cols blob
) ENGINE=InnoDB


The above statement created a table with dynamic column. Now we can store the data in the dynamic column we created.


insert into call_log values (1,'8888888888', '9999999999', COLUMN_CREATE('carrier', 'AT&T'));
insert into call_log values (2,'8888888888', '9999999999', COLUMN_CREATE('2', 'Verizon'));        
insert into call_log values (3,'8888888888', '9999999999', COLUMN_CREATE('carrier', 'Tmobile'));
   


Now will insert the device values:


insert into call_log values (4,'8888888888', '9999999999', COLUMN_CREATE('device', 'iPhone'));
insert into call_log values (5,'8458888888', '9999779999', COLUMN_CREATE('device', 'Blackberry'));


Now lets insert some operating system values into our call log table


insert into call_log values (6,'8458888888', '9999779999', COLUMN_CREATE('os', 'iOS'));

insert into call_log values (7,'8458888888', '9999779999', COLUMN_CREATE('os', 'Android'));



Now lets insert region data:


insert into call_log values (8,'8458888888', '9999779999', COLUMN_CREATE('region', 'North West'));


insert into call_log values (9,'8458888888', '9999779999', COLUMN_CREATE('region', 'South East'));


Now we can insert a series of values into the dynamic column:

insert into call_log values (7,'8458888888', '9999779999', COLUMN_CREATE('1', 'Android', '2', 'iPhone', '3', 'Android', '4', 'South East'));

Cool huh?

So how do we query for the data then? Very easy using functions - 



select call_id, COLUMN_GET(dynamic_cols, '3' as char) as color from call_log;
+---------+---------+
| call_id | color      |
+---------+---------+
|       1 | Android    |
|       2 | iOS           |
|       3 | Symbian   |
|       4 | iOS           |
|       5 | NULL       |
|       6 | iOS           |
|       7 | Android    |
|       8 | NULL       |
|       9 | NULL       |
|      10 | Android   |
+---------+---------+
10 rows in set (0.00 sec)

The above selected the call_id and all operating systems (3).

Important Note: MariaDB 5.3 and MariaDB 5.5 only support number as a column name MariaDB 10.x supports characters as columns name.

Limitations.

Presently there is not support for indexing dynamic columns, but as usage will increase so will the support for indexing.

Hope you found it interesting, for more visit MariaDB 


Be sure to visit accelerationdb.com





2 comments:

Nem said...

FYI... you have a few errors in your SQL statements.

The 2nd insert should have 'carrier' instead of '2'.

The last insert should have 10 for a key instead of 7.

Also, shouldn't the select statement that is showing all the awesomeness of this select 'os' instead of 3?

If not, how the heck are you supposed to remember/know that 3 is related to the os? Also, wouldn't you label it as 'os' instead of 'color'?

Joe Privett said...

Great post - thanks! I'm excited about this feature of MariaDB because it bring a no-sql dimension to sql tables - nice.

Do you know if it is possible to do a join between a dynamic column and another column (either standard or dynamic) n a different table? If so, are there any performance comparison against standard SQL joins and any notes on scaling/sharding efficiency of this type of table?