Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to show full column content in a Spark Dataframe?

I am using spark-csv to load data into a DataFrame. I want to do a simple query and display the content:

val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load("my.csv")
df.registerTempTable("tasks")
results = sqlContext.sql("select col from tasks");
results.show()

The col seems truncated:

scala> results.show();
+--------------------+
|                 col|
+--------------------+
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-06 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:21:...|
|2015-11-16 07:21:...|
|2015-11-16 07:21:...|
+--------------------+

How do I show the full content of the column?

like image 805
tracer Avatar asked Oct 02 '22 12:10

tracer


People also ask

How do you show full column content in a PySpark DataFrame?

The only way to show the full column content we are using show() function. show(): Function is used to show the Dataframe. n: Number of rows to display. truncate: Through this parameter we can tell the Output sink to display the full column content by setting truncate option to false, by default this value is true.

How do you display the content of a DataFrame in Spark SQL?

Spark show() – Display DataFrame Contents in Table. Spark/PySpark DataFrame show() is used to display the contents of the DataFrame in a Table Row & Column Format. By default it shows only 20 Rows and the column values are truncated at 20 characters.

How do I show columns in Spark?

You can get the all columns of a Spark DataFrame by using df. columns , it returns an array of column names as Array[Stirng] .

How do I see column values in PySpark?

You can find all column names & data types (DataType) of PySpark DataFrame by using df. dtypes and df. schema and you can also retrieve the data type of a specific column name using df. schema["name"].


1 Answers

results.show(20, false) will not truncate. Check the source

20 is the default number of rows displayed when show() is called without any arguments.

like image 530
TomTom101 Avatar answered Oct 19 '22 17:10

TomTom101