 

Can pyspark.sql.functions be used in a udf?

I define a function like

getDate = udf(lambda x : to_date(x))

When I use it in

df.select(getDate("time")).show()

I get

File ".../pyspark/sql/functions.py", in to_date
return Column(sc._jvm.functions.to_date(_to_java_column(col)))
AttributeError: 'NoneType' object has no attribute '_jvm'

Does that mean that I cannot use pyspark.sql.functions in my own udf?

This is not just about this specific case; I want to understand why this happens.

asked Feb 05 '23 22:02 by chener

2 Answers

Functions from pyspark.sql.functions are wrappers for JVM functions and are designed to operate on pyspark.sql.Column. You cannot use them:

  • To transform local Python objects. They take a Column and return a Column.
  • Inside a udf on the workers, because there is no SparkContext there in which they can be evaluated.
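As a sketch of what the worker side of a udf can do instead: the per-row logic must be plain Python (standard library only), not JVM-backed helpers like to_date. The timestamp format below is an assumption about what the "time" column contains, and parse_date is a hypothetical name:

```python
from datetime import datetime

# Inside a udf, only plain Python runs on each row value on the workers,
# so use the standard library instead of pyspark.sql.functions.
# Assumes input strings look like "2023-02-05 22:10:00" (hypothetical format).
def parse_date(s):
    return datetime.strptime(s, "%Y-%m-%d %H:%M:%S").date()

print(parse_date("2023-02-05 22:10:00"))  # -> 2023-02-05
```

A function like this could then be wrapped with udf(parse_date, DateType()) and applied in a select, since it never touches the JVM.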
answered Feb 12 '23 09:02 by zero323


Looking at the error, the problem is with sc: 'NoneType' object has no attribute '_jvm' means that sc is None at the point where to_date is called.

And there is no need to write a udf for this; you can use to_date directly:

import pyspark.sql.functions as F
df.select(F.to_date(df.time)).show()
answered Feb 12 '23 08:02 by Rakesh Kumar