Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What are horizontal and vertical partitions in database and what is the difference?

Tags:

database

I read that

SELECT is a horizontal partition of the relation into two set of tuples.

and

PROJECT is a vertical partition of the relation into two relations.

However, I don't understand what that means. Can you explain it in layman's terms?

like image 940
John Snowden Avatar asked Aug 18 '13 19:08

John Snowden


People also ask

What are horizontal partitions?

Horizontal partitioning involves putting different rows into different tables. For example, customers with ZIP codes less than 50000 are stored in CustomersEast, while customers with ZIP codes greater than or equal to 50000 are stored in CustomersWest.

What is difference between horizontal and vertical partitioning does MySQL support both horizontal and vertical partitioning?

Horizontal vs. Horizontal partitioning means that all rows matching the partitioning function will be assigned to different physical partitions. Vertical partitioning allows different table columns to be split into different physical partitions. Currently, MySQL supports horizontal partitioning but not vertical.

How do horizontal and vertical partitioning help with database performance?

Like horizontal partitioning, vertical partitioning lets queries scan less data. This increases query performance. For example, a table that contains seven columns of which only the first four are generally referenced may benefit from splitting the last three columns into a separate table.


3 Answers

Not a complete answer to the question but it answers what is asked in the question title. So the general meaning of horizontal and vertical database partitioning is:

Horizontal partitioning involves putting different rows into different tables. Perhaps customers with ZIP codes less than 50000 are stored in CustomersEast, while customers with ZIP codes greater than or equal to 50000 are stored in CustomersWest. The two partition tables are then CustomersEast and CustomersWest, while a view with a union might be created over both of them to provide a complete view of all customers.

Vertical partitioning involves creating tables with fewer columns and using additional tables to store the remaining columns. Normalization also involves this splitting of columns across tables, but vertical partitioning goes beyond that and partitions columns even when already normalized.

See more details here.

like image 106
Kuldeep Jain Avatar answered Oct 09 '22 08:10

Kuldeep Jain


A projection creates a subset of attributes in a relation hence a "vertical partition"

A selection creates a subset of the tuples in a relation hence a "horizontal partition"

Given a table (r) as

a : b : c : d : e
-----------------
1 : 2 : 3 : 4 : 5
1 : 2 : 3 : 4 : 5
2 : 2 : 3 : 4 : 5
2 : 2 : 3 : 4 : 5

An expression such as

PROJECT a, b (SELECT a=1 (r))

-- SELECT a, b FROM r WHERE a=1

Would "do"

a : b | c : d : e
-----------------
1 : 2 | 3 : 4 : 5
1 : 2 | 3 : 4 : 5
=================    <  -- horizontal partition (by SELECTION)
2 : 2 | 3 : 4 : 5
2 : 2 | 3 : 4 : 5

      ^  -- vertical partition (by PROJECTION)

Resulting in

a : b
------
1 : 2 
1 : 2 
like image 22
T I Avatar answered Oct 09 '22 06:10

T I


Necromancing.
I think the existing answers are too abstract.

So here my attempts at a more practical explanation:

Partitioning form a developer's point of view is all about performance.
More exactly, it's about what happens when you have large amounts of data in your tables, and you still want to query the data fast.

Here some excerpts from slides by Bill Karwin about what exactly horizontal partitioning is all about:

Horrible

The above is bad, because:

Problem 1

Problem 2

The solution:

HORIZONTAL PARTITONING

Horizontal partitioning divides a table into multiple tables. Each table then contains the same number of columns, but fewer rows.

solution horizontal

The difference: Query Performance and simplicity

Kirk Quote




Now, on the difference between horizontal and vertical partitioning:

"Tribbles" can also accumulate in columns. Example: Column Tribble

The solution to that problem is VERTICAL PARTITIONING
Proper normalization is ONE form of vertical partitioning

To quote technet

Vertical partitioning divides a table into multiple tables that contain fewer columns.

The two types of vertical partitioning are normalization and row splitting:

Normalization is the standard database process of removing redundant columns from a table and putting them in secondary tables that are linked to the primary table by primary key and foreign key relationships.

Row splitting divides the original table vertically into tables with fewer columns. Each logical row in a split table matches the same logical row in the other tables as identified by a UNIQUE KEY column that is identical in all of the partitioned tables. For example, joining the row with ID 712 from each split table re-creates the original row. Like horizontal partitioning, vertical partitioning lets queries scan less data. This increases query performance. For example, a table that contains seven columns of which only the first four are generally referenced may benefit from splitting the last three columns into a separate table. Vertical partitioning should be considered carefully, because analyzing data from multiple partitions requires queries that join the tables.

Vertical partitioning also could affect performance if partitions are very large.

That sums it up nicely.

Proper normalization

Now on SELECT vs. PROJECT:

This SO post describes the difference as such:

Select Operation : This operation is used to select rows from a table (relation) that specifies a given logic, which is called as a predicate. The predicate is a user defined condition to select rows of user's choice.

Project Operation : If the user is interested in selecting the values of a few attributes, rather than selection all attributes of the Table (Relation), then one should go for PROJECT Operation.

SELECT is an actual SQL operation (statement), while PROJECT is a term used in relational algebra.

Judging from you posting this on SO and not on MathOverflow, I would suggest you don't read relational algebra books if you just want to learn SQL for developing applications.

If you are in dire need of a recommendation for a good book about (advanced) SQL, here is one

SQL Antipatterns: Avoiding the Pitfalls of Database Programming
Bill Karwin
ISBN-13: 978-1934356555
ISBN-10: 1934356557

That's the one book about SQL worth reading.
Most other books about SQL that I've seen out there can be summed up by this cynical statement about photoshop books:

There are more books about photoshop than people actually using photoshop.

like image 21
Stefan Steiger Avatar answered Oct 09 '22 07:10

Stefan Steiger