How much data do I need to have to make use of Presto? The web site states that it can query data sizes from gigabytes to petabytes. I understand how it is used to query very large datasets, but is anyone using it for hundreds of gigabytes?
Currently, Presto is most useful if you already have an existing Hive installation. If you are using Hive, you should definitely try Presto. If all your data fits in a relational database like PostgreSQL or MySQL on a single machine, and you are happy with the performance, then keep using that.
However, Presto should be much faster than either of those databases on a single machine for analytic queries because it executes a query in parallel. Neither of those databases parallelize the execution of individual queries. At the moment, using Presto requires setting up HDFS and Hive (even on a single machine), so getting started will be more work than if you already have an existing Hive installation.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With