Software engineer for about 14 years, mainly focused on OO languages such as C#, C++, PHP, and Java. Since 2017, interested in distributed systems, machine learning, and big data overall. Currently eager to learn more about Spark, Scala, Python, Kafka, and ElasticSearch.
Does pyspark changes order of instructions for optimization?
Spark copying dataframe columns best practice in Python/PySpark?
databricks partitioning w/ relation to predicate pushdown
Regex match with dataframe column values
Scala Spark: Flatten Array of Key/Value structs
Custom sorting based on the content of an external array with Scala/Java API
Apache Spark: Get the first and last row of each partition
Get column of list of ratings with 0's in place for missing ratings in Pyspark
Pyspark - Looping through structType and ArrayType to do typecasting in the structfield
Compare two dataframes Pyspark