Create an account

Very important

  • To access the important data of the forums, you must be active in each forum and especially in the leaks and database leaks section, send data and after sending the data and activity, data and important content will be opened and visible for you.
  • You will only see chat messages from people who are at or below your level.
  • More than 500,000 database leaks and millions of account leaks are waiting for you, so access and view with more activity.
  • Many important data are inactive and inaccessible for you, so open them with activity. (This will be done automatically)


Thread Rating:
  • 224 Vote(s) - 3.68 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Difference between scaling horizontally and vertically for databases

#1
I have come across many NoSQL databases and SQL databases. There are varying parameters to measure the strength and weaknesses of these databases and scalability is one of them. What is the difference between horizontally and vertically scaling these databases?
Reply

#2
Yes scaling horizontally means adding more machines, but it also implies that the machines are equal in the cluster. MySQL can scale horizontally in terms of Reading data, through the use of replicas, but once it reaches capacity of the server mem/disk, you have to begin sharding data across servers. This becomes increasingly more complex. Often keeping data consistent across replicas is a problem as replication rates are often too slow to keep up with data change rates.

Couchbase is also a fantastic NoSQL Horizontal Scaling database, used in many commercial high availability applications and games and arguably the highest performer in the category. It partitions data automatically across cluster, adding nodes is simple, and you can use commodity hardware, cheaper vm instances (using Large instead of High Mem, High Disk machines at AWS for instance). It is built off the Membase (Memcached) but adds persistence. Also, in the case of Couchbase, every node can do reads and writes, and are equals in the cluster, with only failover replication (not full dataset replication across all servers like in mySQL).

Performance-wise, you can see an excellent Cisco benchmark:

[To see links please register here]


Here is a great blog post about Couchbase Architecture:

[To see links please register here]

Reply

#3
There is an additional architecture that wasn't mentioned - SQL-based database services that enable horizontal scaling without the complexity of manual sharding. These services do the sharding in the background, so they enable you to run a traditional SQL database and scale out like you would with NoSQL engines like MongoDB or CouchDB. Two services I am familiar with are <a href="http://www.enterprisedb.com">EnterpriseDB</a> for PostgreSQL and <a href="http://www.xeround.com">Xeround</a> for MySQL. I saw an in-depth <a href="http://xeround.com/blog/2011/07/scaling-mysql-database-in-the-cloud">post</a> by Xeround which explains why scale-out on SQL databases is difficult and how they do it differently - treat this with a grain of salt as it is a vendor post. Also check out Wikipedia's <a href="http://en.wikipedia.org/wiki/Cloud_database">Cloud Database entry</a>, there is a nice explanation of SQL vs. NoSQL and service vs. self-hosted, a list of vendors and scaling options for each combination. ;)
Reply

#4
SQL databases like Oracle, db2 also support Horizontal scaling through Shared disk cluster. For example Oracle RAC, IBM DB2 purescale or Sybase ASE Cluster edition. New node can be added to Oracle RAC system or DB2 purescale system to achieve horizontal scaling.

But the approach is different from noSQL databases (like mongodb, CouchDB or IBM Cloudant) is that the data sharding is not part of Horizontal scaling. In noSQL databases data is shraded during horizontal scaling.
Reply

#5
Adding lots of load balancers creates extra overhead and latency and that is the drawback for scaling out horizontally in nosql databases. It is like the question why people say RPC is not recommended since it is not robust.

I think in a real system we should use both sql and nosql databases to utilize both multicore and cloud computing capabilities of today's systems.

On the other hand, complex transactional queries has high performance if sql databases such as oracle being used. NoSql could be used for bigdata and horizontal scalability by sharding.
Reply

#6
Let's start with the need for scaling that is increasing resources so that your system can now handle more requests than it earlier could.

When you realise your system is getting slow and is unable to handle the current number of requests, you need to scale the system.

This provides you with two options. Either you increase the resources in the server which you are using currently, i.e, increase the amount of RAM, CPU, GPU and other resources. This is known as vertical scaling.

Vertical scaling is typically costly.
It does not make the system fault tolerant, i.e if you are scaling application running with single server, if that server goes down, your system will go down.
Also the amount of threads remains the same in vertical scaling.
Vertical scaling may require your system to go down for a moment when process takes place. Increasing resources on a server requires a restart and put your system down.

Another solution to this problem is increasing the amount of servers present in the system. This solution is highly used in the tech industry.
This will eventually decrease the request per second rate in each server.
If you need to scale the system, just add another server, and you are done. You would not be required to restart the system.
Number of threads in each system decreases leading to high throughput.
To segregate the requests, equally to each of the application server, you need to add load balancer which would act as reverse proxy to the web servers. This whole system can be called as a single cluster.
Your system may contain a large number of requests which would require more amount of clusters like this.

Hope you get the whole concept of introducing scaling to the system.
Reply

#7
Traditional relational databases were designed as client/server database systems. They can be scaled horizontally but the process to do so tends to be complex and error prone. NewSQL databases like NuoDB are memory-centric distributed database systems designed to scale out horizontally while maintaining the SQL/ACID properties of traditional RDBMS.

For more information on NuoDB, read their [technical white paper]( ).
Reply

#8
You have a company and there is only 1 worker but you got 1 new project at that time you hire new candidate -- this is horizontal scaling. where new candidate is new machines and project is new traffic/calls to your api's.

Where as 1 project with an IIT/NIT guy handling all request to your api/traffic. If any time more request to your api's then fire him and replacing him with a high IQ NIT/IIT guy -- this is vertical scaling.

Reply

#9
**Horizontal scaling means that you scale by adding more machines** into your pool of resources whereas **Vertical scaling means that you scale by adding more power (CPU, RAM) to an existing machine**.

An easy way to remember this is to think of a machine on a server rack, we add more machines across the **horizontal** direction and add more resources to a machine in the **vertical** direction.

                  [![Horizontal Scaling/Vertical Scaling Visualisation][1]][1]


In the database world, horizontal-scaling is often based on the partitioning of the data i.e. each node contains only part of the data, in vertical-scaling the data resides on a single node and scaling is done through multi-core i.e. spreading the load between the CPU and RAM resources of that machine.

With horizontal-scaling it is often easier to scale dynamically by adding more machines into the existing pool - Vertical-scaling is often limited to the capacity of a single machine, scaling beyond that capacity often involves downtime and comes with an upper limit.

Good examples of horizontal scaling are Cassandra, MongoDB, [Google Cloud Spanner][2] .. and a good example of vertical scaling is MySQL - Amazon RDS (The cloud version of MySQL). It provides an easy way to scale vertically by switching from small to bigger machines. This process often involves downtime.

In-Memory Data Grids such as [GigaSpaces XAP][3], [Coherence][4] etc.. are often optimized for both horizontal and vertical scaling simply because they're not bound to disk. Horizontal-scaling through partitioning and vertical-scaling through multi-core support.

You can read more on this subject in my earlier posts:
[Scale-out vs Scale-up][5] and [The Common Principles Behind the NOSQL Alternatives][6]





[1]:

[2]:

[To see links please register here]

[3]:

[To see links please register here]

[4]:

[To see links please register here]

[5]:

[To see links please register here]

[6]:

[To see links please register here]

Reply

#10
**Scaling horizontally** ===> Thousands of minions will do the work together for you.

**Scaling vertically** ===> One big hulk will do all the work for you.

[![enter image description here][1]][1]


[1]:
Reply



Forum Jump:


Users browsing this thread:
1 Guest(s)

©0Day  2016 - 2023 | All Rights Reserved.  Made with    for the community. Connected through