I have a Spark Thrift Server. I connect to the Thrift Server and read data from a Hive table. If I query the same table again, Spark loads the files into memory again and re-executes the query.
Is there any way to cache the table data using Spark Thrift Server? If yes, please let me know how to do it.
Two things:
1. Use CACHE LAZY TABLE, as described in these answers: "Spark SQL: how to cache sql query result without using rdd.cache()" and "cache tables in apache spark sql".
2. Set spark.sql.hive.thriftServer.singleSession=true so that other clients can use this cached table.
Remember that this caching is lazy, so the table is only cached the first time it is computed.
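To make the two steps concrete, here is a minimal sketch that only assembles the server start command and the SQL statements a client would send. It does not connect to anything. The install path /opt/spark, the table name my_db.my_table, and the helper function names are all hypothetical; the COUNT(*) query is just one assumed way to trigger the first computation so that the lazy cache gets populated.

```python
import shlex

# Setting from the answer above: run all connections in one shared session.
SINGLE_SESSION_CONF = "spark.sql.hive.thriftServer.singleSession=true"


def thrift_server_command(spark_home):
    """Build the argument list for starting the Thrift Server.

    spark_home is a hypothetical install directory; sbin/start-thriftserver.sh
    is the start script shipped with Spark and accepts --conf key=value.
    """
    return [f"{spark_home}/sbin/start-thriftserver.sh", "--conf", SINGLE_SESSION_CONF]


def cache_statements(table):
    """Return the SQL statements to run once from any client (beeline, JDBC).

    The first statement only marks the table for caching (lazy).
    The second is an assumed warm-up query that scans the table so the
    cache is actually filled before other clients query it.
    """
    return [
        f"CACHE LAZY TABLE {table}",
        f"SELECT COUNT(*) FROM {table}",
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(shlex.join(thrift_server_command("/opt/spark")))
    for statement in cache_statements("my_db.my_table"):
        print(statement + ";")
```

The printed statements can be pasted into beeline or sent through any JDBC/ODBC connection to the Thrift Server. When the cached data is no longer needed, UNCACHE TABLE with the same table name releases it.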