I've developed a reporting application in PHP. It is built with HTML, CSS, JavaScript libraries, a charting library (Highcharts), and MySQL to store the data. The user chooses some options in the front end and clicks a "Submit" button. The PHP layer then executes the required SQL queries and sends the JSON result back to the UI, where the charts and data tables are drawn.
The requirement now is to plug a big data solution, Apache Spark, into the existing application. For the last 2 weeks I've been researching whether I can connect the PHP application to a Spark SQL server, through a REST API or some sort of Spark SQL driver, and execute the same set of SQL queries that I have now on Spark SQL. I haven't found a solution yet. I've now started looking at Java-based technologies such as Spring, and at others such as AngularJS, Node.js, and other MVC frameworks, to rewrite the project from scratch. I'm not a big fan of Java development, as I'm not a hardcore developer (I build handy tools to get things done).
I did read this: https://cwiki.apache.org/confluence/display/Hive/HiveClient#HiveClient-PHP, but it looks like it's for a standalone Spark installation. In my case I'm dealing with a huge cluster.
I'd highly appreciate any direction here please.
Yes, it can be done by using a Hive context and the Spark SQL Thrift server inside your Spark application.
You run your Spark application and do all the processing there. After processing, if your result is a DataFrame, you just have to register it as a temporary table.
You can then start a Thrift server from within the same Spark application.
Once the Thrift server is running, you can query the temporary table and fetch the results from PHP using a suitable JDBC or ODBC driver.
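The steps above could be sketched roughly as follows in PySpark. This is only an illustration under several assumptions: the input path, the table name `report_data`, the application name, and both function names are hypothetical; the Spark build must include the Hive Thrift server classes; and because `HiveThriftServer2.startWithContext` is a Scala/JVM API, the sketch reaches it through PySpark's internal `_jvm` and `_jsparkSession` attributes, which is a workaround rather than a supported Python interface. The `spark.sql.hive.thriftServer.singleSession` setting is included on the assumption that temporary views would otherwise not be visible to new JDBC sessions.

```python
"""Sketch: expose a processed DataFrame through the Spark SQL Thrift server."""


def thrift_jdbc_url(host: str, port: int = 10000, database: str = "default") -> str:
    """Build the HiveServer2-style URL that JDBC clients use to reach the Thrift server."""
    return f"jdbc:hive2://{host}:{port}/{database}"


def serve_report_table(input_path: str, table_name: str = "report_data"):
    """Register a DataFrame as a temporary view and start a Thrift server on it."""
    # Imported lazily so thrift_jdbc_url() works even where pyspark is not installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("report-thrift-server")  # hypothetical application name
        # Assumption: share one session so JDBC clients can see the temp view.
        .config("spark.sql.hive.thriftServer.singleSession", "true")
        .enableHiveSupport()
        .getOrCreate()
    )

    # Step 1: do the processing (here just a read from a hypothetical Parquet path).
    df = spark.read.parquet(input_path)

    # Step 2: register the result as a temporary table/view.
    df.createOrReplaceTempView(table_name)

    # Step 3: start the Thrift server inside this application.
    # Workaround: the API is JVM-only, so it is called through py4j internals.
    jvm = spark.sparkContext._jvm
    jvm.org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.startWithContext(
        spark._jsparkSession.sqlContext()
    )
    return spark  # the caller must keep the process alive while clients query
```

A client would then connect to something like `thrift_jdbc_url("spark-master.example.com")` (a placeholder host) and run `SELECT ... FROM report_data`. In Scala the same call is available directly, without the private-attribute workaround.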
Refer to the link below for more details: https://medium.com/@anicolaspp/apache-spark-as-a-distributed-sql-engine-4373e254e0f9#.ekc3cs28u