
How to dump generated Java code to stdout?

Using DataFrames on Apache Spark 2.+, is there a way to get the underlying RDDs and dump the generated Java code to the console?

Midiparse asked Nov 23 '25 12:11


1 Answer

This can be done with QueryExecution.debug.codegen, which prints the generated code to stdout. The QueryExecution is accessible on a DataFrame/Dataset via .queryExecution (a "Developer API", that is, unstable and subject to breakage, so it should only be used for debugging). This works on Spark 2.4.0 and, judging from the source, should work from at least 2.0.0:

scala> val df = spark.range(1000)
df: org.apache.spark.sql.Dataset[Long] = [id: bigint]

scala> df.queryExecution.debug.codegen
Found 1 WholeStageCodegen subtrees.
== Subtree 1 / 1 ==
*(1) Range (0, 1000, step=1, splits=12)

Generated code:
/* 001 */ public Object generate(Object[] references) {
/* 002 */   return new GeneratedIteratorForCodegenStage1(references);
/* 003 */ }
/* 004 */
/* 005 */ // codegenStageId=1
/* 006 */ final class GeneratedIteratorForCodegenStage1 extends org.apache.spark.sql.execution.BufferedRowIterator {
/* 007 */   private Object[] references;
/* 008 */   private scala.collection.Iterator[] inputs;
/* 009 */   private boolean range_initRange_0;
/* 010 */   private long range_number_0;
/* 011 */   private TaskContext range_taskContext_0;
/* 012 */   private InputMetrics range_inputMetrics_0;
/* 013 */   private long range_batchEnd_0;
/* 014 */   private long range_numElementsTodo_0;
/* 015 */   private org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter[] range_mutableStateArray_0 = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter[1];

...

/* 104 */       ((org.apache.spark.sql.execution.metric.SQLMetric) references[0] /* numOutputRows */).add(range_nextBatchTodo_0);
/* 105 */       range_inputMetrics_0.incRecordsRead(range_nextBatchTodo_0);
/* 106 */
/* 107 */       range_batchEnd_0 += range_nextBatchTodo_0 * 1L;
/* 108 */     }
/* 109 */   }
/* 110 */
/* 111 */ }
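
If you want the dump as a String rather than printed, or want to call it outside the shell, the org.apache.spark.sql.execution.debug package exposes the same machinery. The following is only a sketch under a few assumptions: the object name CodegenDump and the helper codegenOf are made up for illustration, the local[*] session is just there to make it self-contained, and this package is internal debugging code, so its members may change between Spark versions.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
// Brings the implicit debugCodegen() method and codegenString into scope.
import org.apache.spark.sql.execution.debug._

// Hypothetical wrapper object, not part of Spark.
object CodegenDump {

  // Hypothetical helper: returns the whole-stage codegen dump as a String
  // instead of printing it.
  def codegenOf(df: Dataset[_]): String =
    codegenString(df.queryExecution.executedPlan)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("codegen-dump")
      .getOrCreate()

    val df = spark.range(1000)

    // Prints the same kind of dump as df.queryExecution.debug.codegen.
    df.debugCodegen()

    // Capture the dump and print only its first few lines.
    val code = codegenOf(df)
    println(code.split("\n").take(5).mkString("\n"))

    spark.stop()
  }
}
```

Having the text as a String makes it easy to write to a file or diff between two query plans. On Spark 3.0 and later, df.explain("codegen") is another way to print the generated code.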
huon answered Nov 25 '25 11:11