Thursday, August 22, 2013

NoSQL - types and use cases



To explain what NoSQL is and why it's needed, let's first look at what came before it entered the big game. Before it we had the RDBMS, which gave us the main concept: ACID.
Atomicity - actions are transactional. If we have a transaction that contains actions A, B and C, all of those actions should succeed. If one of them fails, we should roll back the previous ones to the initial state.
Consistency - a transaction takes the database from one valid state to another. If transaction A has to do two things, 1. increase the balance of account a1 by $200 and 2. decrease the balance of account b1 by $100, then once the transaction has completed successfully both balances are updated. We don't care what happens while the transaction is executing; consistency is about what we have at the end.
Isolation - transactions that are executed in parallel don't impact each other.
Durability - we don't care if the electricity goes out in the whole area or the machine goes down completely. Once a transaction is done, its result is kept: when the system comes back up, the result of the transaction is still there.
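
To make atomicity a bit more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the account names, the helper functions and the "no negative balance" rule are my assumptions for illustration only; the point is just that when one step fails, everything done before it in the same transaction is rolled back.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a1", 100), ("b1", 500)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single transaction.

    Returns True if the transaction was committed, False if it was rolled back.
    """
    try:
        # The connection context manager commits when the block succeeds
        # and rolls back when an exception leaves the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Assumed business rule for this sketch: no negative balances.
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    """Return all balances as a dict, e.g. for checking the final state."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

Calling transfer(conn, "a1", "b1", 1000) on a fresh database applies both updates first and only then fails the check, yet balances(conn) afterwards shows the original numbers, because the whole transaction was rolled back.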

Great! The concept is clear, and here the main key players come in...
Oracle, SQL Server, MySQL, DB2, etc.
Everything went fine... until the data started growing at huge speed. Then it became clear that setting up one single machine is not enough anymore. In addition, Oracle licensing is based on the number of CPUs on the machine. Doing some simple math shows us that something totally different should come and replace the RDBMS... Not yet! Let's fire one last bullet, the last nail in the coffin of the RDBMS...
CAP theorem (Brewer's theorem)
It says that in distributed computing it is impossible to provide Consistency, Availability and Partition tolerance all at the same time.
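
Here is a toy sketch of what that trade-off looks like when the network between two machines breaks. It is not a real database: the class names, the "CP"/"AP" mode strings and the single client writing through replica A are all assumptions I made up to illustrate the idea. In "CP" mode the store refuses writes during a partition (it stays consistent but is not available); in "AP" mode it accepts them (it stays available but the replicas disagree).

```python
class Replica:
    """One machine holding a copy of the data."""

    def __init__(self):
        self.data = {}


class TwoNodeStore:
    """Toy store with two replicas, A and B. `mode` is "CP" or "AP"."""

    def __init__(self, mode):
        self.mode = mode
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False  # True means A and B cannot talk to each other

    def write(self, key, value):
        """The client writes through replica A."""
        if self.partitioned:
            if self.mode == "CP":
                # Keep the replicas consistent by giving up availability.
                raise RuntimeError("write rejected: replica B is unreachable")
            # AP: stay available, accept the write on A only.
            self.a.data[key] = value
            return
        # No partition: the write reaches both replicas.
        self.a.data[key] = value
        self.b.data[key] = value

    def read_from_b(self, key):
        """Another client reads through replica B."""
        return self.b.data.get(key)
```

With a partition in place, an "AP" store happily takes a new value on A while read_from_b still returns the old one, and a "CP" store raises an error instead of letting the two copies drift apart.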
So let's summarize:
We have ACID (Oracle) and we want to add more machines to get scalability. That takes us into distributed computing. And then the CAP theorem comes along and kicks our a**.
So what now?
We want ACID and we are ready to pay for licenses, but we can't afford them?
So one option is to use an open source RDBMS and save money...
But what if I say that you are still in trouble? A system based on an RDBMS can't be infinitely scalable...
Once you reach a petabyte, you will be soooooo slow that your business will crash.
Then NoSQL comes into the game...
And what do you have there?
* Open source
* Highly scalable
* Not relational
* Distributed
systems that have proved themselves at different companies over the last 10 years.
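We have 4 families: key-value stores, column stores, document stores and graph databases.
Each family has its advantages and disadvantages (the relational database is listed for comparison):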
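Data Model          | Performance | Scalability     | Flexibility | Complexity | Functionality
--------------------+-------------+-----------------+-------------+------------+--------------------
Key-Value Stores    | high        | high            | high        | none       | variable (none)
Column Store        | high        | high            | moderate    | low        | minimal
Document Store      | high        | variable (high) | high        | low        | variable (low)
Graph Database      | variable    | variable        | high        | high       | graph theory
Relational Database | variable    | variable        | low         | moderate   | relational algebra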
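
To get a feeling for how the four data models differ, here is a rough sketch of the same kind of data in each shape, using only plain Python structures. Nothing here is the API of a real product; the keys, field names and the two helper functions are invented for illustration.

```python
# Key-value: the store sees only a key and an opaque value.
kv_store = {"user:42": b'{"name": "Ann"}'}

# Column store: row key -> column family -> column -> value.
column_store = {
    "42": {
        "profile": {"name": "Ann", "city": "Haifa"},
        "stats": {"logins": 7},
    }
}

# Document store: nested, self-describing documents that can be queried by field.
documents = [
    {"_id": 42, "name": "Ann", "orders": [{"sku": "book", "qty": 2}]},
    {"_id": 7, "name": "Bob", "orders": []},
]

# Graph: nodes plus the relationships between them.
graph = {42: {"follows": [7]}, 7: {"follows": [9]}, 9: {"follows": []}}


def docs_with_sku(docs, sku):
    """Document-style query: ids of documents with an order for `sku`."""
    return [
        d["_id"]
        for d in docs
        if any(order["sku"] == sku for order in d.get("orders", []))
    ]


def reachable(g, start):
    """Graph-style query: every node reachable from `start` via 'follows'."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(g.get(node, {}).get("follows", []))
    return seen - {start}
```

The key-value store can only answer "give me user:42", the document store can look inside the value, and the graph can follow relationships several steps away, which is roughly the Functionality column of the table.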

So you have different databases, and then you definitely ask: so which one do I pick?
That's what I asked the first time I saw it. Sometimes I needed a graph, sometimes relational and sometimes key-value...
So the answer is simple: you can have all of them... and combine them...

Sources:
http://en.wikipedia.org/wiki/NoSQL
http://db-engines.com/en/ranking
http://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/


PS: if you look at the db-engines link you can see the ranking of the top DB engines. Sure! Oracle is number 1. But I'm almost sure that NoSQL and other technologies will do to Oracle what Linux did to Windows. Sure! Oracle will stay forever... but the world around it will change...
