
Split 1 column into 3 columns in spark scala

I have a DataFrame in Spark (using Scala) with a column that I need to split.

    scala> test.show
    +-------------+
    |columnToSplit|
    +-------------+
    |        a.b.c|
    |        d.e.f|
    +-------------+

I need this column split out to look like this:

    +--------------+
    |col1|col2|col3|
    |   a|   b|   c|
    |   d|   e|   f|
    +--------------+

I'm using Spark 2.0.0

Thanks

Asked by Matt Maurer, Aug 31 '16 17:08



People also ask

How do I split one column into multiple columns in spark?

pyspark.sql.functions provides a split() function, which splits a DataFrame string column into an array whose elements can then be selected as separate columns.

How do I split a column in Scala spark?

Spark SQL provides the split() function to convert a delimiter-separated string column into an array column (StringType to ArrayType) on a DataFrame. The delimiter can be a space, comma, pipe, and so on.
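One detail worth illustrating: the delimiter passed to Spark's split() is interpreted as a regular expression, as it is in Java's String.split, so a literal dot has to be escaped. The sketch below uses plain Scala strings (no Spark) to show the difference. The object and method names are hypothetical and chosen only for this illustration.

```scala
object SplitDelimiterDemo {
  // Escaped dot: split on the literal "." character.
  def splitOnDot(s: String): Array[String] = s.split("\\.")

  // Unescaped dot: "." is a regex that matches any character, so every
  // character is treated as a delimiter and no pieces remain.
  def splitOnAnyChar(s: String): Array[String] = s.split(".")

  def main(args: Array[String]): Unit = {
    println(splitOnDot("a.b.c").mkString(","))  // a,b,c
    println(splitOnAnyChar("a.b.c").length)     // 0
  }
}
```

The same escaping applies to other regex metacharacters such as the pipe.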

How do you create multiple columns in Scala?

You can add multiple columns to a Spark DataFrame in several ways. If you want to add a known set of columns, you can chain withColumn() calls or use select(). However, if you need to add multiple columns after applying some transformations, you can use either map() or foldLeft().
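As a rough sketch of the foldLeft() approach applied to this question, the helper below adds one column per target name by folding over the list of names. The object name, the helper addSplitColumns, the local master setting, and the app name are all assumptions made for the example, not part of the original question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, split}

object FoldLeftColumns {
  // Hypothetical helper: adds one column per name, taking the i-th piece
  // of the dot-separated source column, then drops the source column.
  def addSplitColumns(df: DataFrame, source: String, names: Seq[String]): DataFrame = {
    val parts = split(col(source), "\\.")
    names.zipWithIndex
      .foldLeft(df) { case (acc, (name, i)) => acc.withColumn(name, parts.getItem(i)) }
      .drop(source)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, only for trying the sketch out.
    val spark = SparkSession.builder().master("local[*]").appName("fold-left-columns").getOrCreate()
    import spark.implicits._

    val test = Seq("a.b.c", "d.e.f").toDF("columnToSplit")
    addSplitColumns(test, "columnToSplit", Seq("col1", "col2", "col3")).show()

    spark.stop()
  }
}
```

For a fixed set of three columns a single select() is simpler. The fold is mainly useful when the list of target names is only known at runtime.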


1 Answer

Try:

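    // sparkObject is your existing SparkSession; its implicits enable the $"..." syntax
    import sparkObject.implicits._
    import org.apache.spark.sql.functions.split

    // Split on a literal dot, then pick the array elements out as columns
    df.withColumn("_tmp", split($"columnToSplit", "\\.")).select(
      $"_tmp".getItem(0).as("col1"),
      $"_tmp".getItem(1).as("col2"),
      $"_tmp".getItem(2).as("col3")
    )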

The important point to note here is that sparkObject is the SparkSession object you have already initialized. Because the implicits are members of that instance, the first import statement has to be placed inline within the code, after the session has been created, not at the top of the file before the class definition.
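For reference, here is a minimal self-contained sketch that puts the pieces together, with the implicits import placed after the session is created. The session is named spark here rather than sparkObject, and the object name, local master setting, and app name are assumptions for the example. It rebuilds the two-row DataFrame from the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.split

object SplitColumnExample {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session, only for trying the sketch out.
    val spark = SparkSession.builder().master("local[*]").appName("split-column-example").getOrCreate()

    // This import needs the session instance, so it comes after `val spark`.
    import spark.implicits._

    // Recreate the sample data from the question.
    val test = Seq("a.b.c", "d.e.f").toDF("columnToSplit")

    val result = test
      .withColumn("_tmp", split($"columnToSplit", "\\."))
      .select(
        $"_tmp".getItem(0).as("col1"),
        $"_tmp".getItem(1).as("col2"),
        $"_tmp".getItem(2).as("col3")
      )

    result.show()
    spark.stop()
  }
}
```

Note that the session must be held in a val, since Scala only allows imports from stable identifiers.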

Answered by user6022341 (4 revs, 3 users), Oct 07 '22 09:10
