
Usage of spark DataFrame "as" method

I am looking at the spark.sql.DataFrame documentation.

There is

def as(alias: String): DataFrame
    Returns a new DataFrame with an alias set.
    Since 1.3.0

What is the purpose of this method? How is it used? Can there be an example?

I have not found anything about this method online, and the documentation is pretty much nonexistent. I have not managed to create any kind of alias using it.

asked Jul 21 '15 by Prikso NAI


1 Answer

Spark <= 1.5

It is more or less equivalent to SQL table aliases:

SELECT *
FROM table AS alias;

Example usage adapted from PySpark alias documentation:

import org.apache.spark.sql.functions.col

case class Person(name: String, age: Int)

val df = sqlContext.createDataFrame(
    Person("Alice", 2) :: Person("Bob", 5) :: Nil)

// Alias the same DataFrame twice so the two sides of
// a self-join can be told apart.
val df_as1 = df.as("df1")
val df_as2 = df.as("df2")
val joined_df = df_as1.join(
    df_as2, col("df1.name") === col("df2.name"), "inner")

// Columns can now be referenced unambiguously through the aliases.
joined_df.select(
    col("df1.name"), col("df2.name"), col("df2.age")).show

Output:

+-----+-----+---+
| name| name|age|
+-----+-----+---+
|Alice|Alice|  2|
|  Bob|  Bob|  5|
+-----+-----+---+

Same thing using SQL query:

df.registerTempTable("df")
sqlContext.sql("""SELECT df1.name, df2.name, df2.age
                  FROM df AS df1 JOIN df AS df2
                  ON df1.name == df2.name""")

What is the purpose of this method?

Pretty much avoiding ambiguous column references.
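
To see the ambiguity an alias resolves, here is a minimal sketch (reusing the df defined above) of a self-join without aliases. Both sides expose a column literally named name, so a bare reference to it cannot be resolved; Spark may even warn that the join condition is trivially true, since both column references resolve to the same thing:

// Self-join without aliases: both sides carry a column named "name".
val joined = df.join(df, df("name") === df("name"))

// Selecting by bare name now fails with an AnalysisException along
// the lines of "Reference 'name' is ambiguous":
// joined.select(col("name"))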

Spark 1.6+

There is also a new as[U](implicit arg0: Encoder[U]): Dataset[U] method, which converts a DataFrame to a Dataset of a given type. For example:

df.as[Person]
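
As a minimal sketch (assuming the Person case class from above and a Spark 1.6 SQLContext), the required Encoder is supplied by the implicits import:

import sqlContext.implicits._  // provides Encoder[Person]

val ds: org.apache.spark.sql.Dataset[Person] = df.as[Person]
// Fields are now accessed in a typed, compile-checked way:
ds.filter(_.age > 3).show()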
answered Oct 05 '22 by zero323