Big Data Developer, Cloudera Certified Developer for Hadoop, Java/J2EE Developer
Favourite skills: Core Java, Python, Apache Spark 1.5.x, 1.6.x and 2.x, Hadoop 2.x, Hive 1.0, 1.2 and 2.x, HBase, Sqoop
My daily activities include the following:
— Designing and architecting business modules based on requirements.
— Developing efficient Scala code to run on Cloudera clusters.
— Writing Oozie workflows that use fork and join nodes to run Spark actions concurrently.
— Managing YARN queues with the Fair Scheduler to make the best use of resources across the DEV, UAT and PROD environments.
— Currently developing a Spark Streaming application that reads from a JMS queue, processes the messages in real time and sends the output to another JMS queue.
— Developing utility modules in Scala and registering them as Hive UDFs wherever needed.
— Tuning Spark applications for the best possible performance.
— Designing and maintaining a data lake that receives data from upstream sources such as RDBMSs, online streaming data and web application data from servers. I have implemented data profiling and data security using different tools and concepts to provide transformed, useful data to business end users and app consumers.