I have a DataFrame in Spark (Scala) with a column that I need to split.
    scala> test.show
    +-------------+
    |columnToSplit|
    +-------------+
    |        a.b.c|
    |        d.e.f|
    +-------------+
I need this column split out to look like this:
    +----+----+----+
    |col1|col2|col3|
    +----+----+----+
    |   a|   b|   c|
    |   d|   e|   f|
    +----+----+----+
I'm using Spark 2.0.0
Thanks
Spark SQL provides a split() function (in org.apache.spark.sql.functions) for exactly this: it converts a delimiter-separated string column into an array column (StringType to ArrayType). The string column can be split on any delimiter, such as a space, comma, pipe, or, as here, a dot.
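For instance, a minimal sketch against the test DataFrame from the question (the parts column name is just illustrative) shows the intermediate ArrayType column:

    import org.apache.spark.sql.functions.{col, split}

    // "\\." escapes the dot, since split() takes a regex
    val arrayDf = test.withColumn("parts", split(col("columnToSplit"), "\\."))
    arrayDf.printSchema()
    // root
    //  |-- columnToSplit: string (nullable = true)
    //  |-- parts: array (nullable = true)
    //  |    |-- element: string (containsNull = true)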
Once you have the array, there are several ways to turn it into multiple columns on the DataFrame. For a known, fixed set of columns, chaining withColumn() or projecting them in a single select() is easiest. If you need to add the columns after applying some transformations, or the number of columns isn't fixed up front, you can use map() or foldLeft() instead, as sketched below.
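As a hedged sketch of the foldLeft() variant (numParts and result are illustrative names, not from the original answer):

    import org.apache.spark.sql.functions.{col, split}

    // Split once into a temporary array column, then fold over the
    // indices, adding one output column per element. Out-of-range
    // indices simply produce null.
    val numParts = 3 // illustrative; derive from your data if unknown
    val result = (0 until numParts)
      .foldLeft(test.withColumn("_tmp", split(col("columnToSplit"), "\\."))) {
        (df, i) => df.withColumn(s"col${i + 1}", col("_tmp").getItem(i))
      }
      .drop("_tmp")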
Try:
    import sparkObject.implicits._
    import org.apache.spark.sql.functions.split

    // _tmp holds the array; selecting only the three named
    // columns drops it from the result
    df.withColumn("_tmp", split($"columnToSplit", "\\."))
      .select(
        $"_tmp".getItem(0).as("col1"),
        $"_tmp".getItem(1).as("col2"),
        $"_tmp".getItem(2).as("col3")
      )
The important point to note here is that sparkObject is the SparkSession you have already initialized, so the implicits import has to be placed inline in the code after that session exists, not at the top of the file before the class definition.
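For illustration, a hypothetical skeleton (SplitExample and the builder settings are assumptions, not from the original) showing where the import can legally sit:

    import org.apache.spark.sql.SparkSession

    object SplitExample {
      def main(args: Array[String]): Unit = {
        val sparkObject = SparkSession.builder()
          .appName("split-example")
          .master("local[*]")
          .getOrCreate()

        // Import from the live session instance; this line cannot
        // move above the point where sparkObject is created.
        import sparkObject.implicits._

        val df = Seq("a.b.c", "d.e.f").toDF("columnToSplit")
        // ... apply the split shown above ...
      }
    }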