How to pass a parameter to a User-Defined Function?

I have a user-defined function:

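import math
from pyspark.sql.functions import col, udf
from pyspark.sql.types import FloatType

param1 = "A"

def calculate(type, pos):
    # param1 is currently read from the enclosing (global) scope
    if param1 == "A":
        a, b = 0.05, -0.06
    else:
        a, b = 0.15, -0.16
    return a * math.pow(type, b) * max(pos, 1)

calc = udf(calculate, FloatType())

result = df.withColumn('col1', calc(col('type'), col('pos'))).groupBy('pk').sum('events')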

I need to pass a parameter param1 to this udf. How can I do it?

asked Nov 13 '17 09:11

Dinosaurius

1 Answer

You can pass a constant to your udf as a literal column, using lit (or typedLit, which is available in Scala), like this:

In Python:

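from pyspark.sql.functions import udf, col, lit
from pyspark.sql.types import LongType
# Without an explicit return type, udf defaults to StringType
mult = udf(lambda value, multiplier: value * multiplier, LongType())
df = spark.sparkContext.parallelize([(1,), (2,), (3,)]).toDF()
# lit(3) wraps the Python constant in a column the udf can accept
df.select(mult(col("_1"), lit(3)))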
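
Applied to the question's calculate function, the same idea might look like the sketch below. It assumes calculate is changed to take param1 as a third argument (a change that is not in the original question) and that df has the type, pos, pk and events columns used there. The build_result helper is a hypothetical name introduced only for illustration; it imports pyspark lazily so the plain function can be tried without a Spark session.

```python
import math


def calculate(type, pos, param1):
    # param1 now arrives as an ordinary argument instead of a global.
    if param1 == "A":
        a, b = 0.05, -0.06
    else:
        a, b = 0.15, -0.16
    return a * math.pow(type, b) * max(pos, 1)


def build_result(df, param1):
    # Hypothetical helper: df is assumed to have the columns
    # 'type', 'pos', 'pk' and 'events' from the question.
    from pyspark.sql.functions import col, lit, udf
    from pyspark.sql.types import FloatType

    calc = udf(calculate, FloatType())
    # lit(param1) turns the Python value into a constant column,
    # so the UDF receives it on every row as its third argument.
    with_col = df.withColumn(
        "col1", calc(col("type"), col("pos"), lit(param1))
    )
    return with_col.groupBy("pk").sum("events")
```

With a DataFrame at hand, result = build_result(df, "A") would correspond to the result expression in the question.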

In Scala:

import org.apache.spark.sql.functions.{udf, col, lit}
import spark.implicits._ // needed for toDF; assumes a SparkSession named spark
val mult = udf((value: Double, multiplier: Double) => value * multiplier)
val df = spark.sparkContext.parallelize(1 to 10).toDF
df.select(mult(col("value"), lit(3)))
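
Back in Python, an alternative that does not go through lit, and is not part of the answer above, is to bind the parameter in a closure before wrapping the function as a udf. A minimal sketch, assuming the hypothetical helper names make_calculate and make_calc_udf:

```python
import math


def make_calculate(param1):
    # Pick the coefficients once, outside the per-row function.
    a, b = (0.05, -0.06) if param1 == "A" else (0.15, -0.16)

    def calculate(type, pos):
        return a * math.pow(type, b) * max(pos, 1)

    return calculate


def make_calc_udf(param1):
    # Imported lazily so make_calculate can be used without Spark.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import FloatType

    return udf(make_calculate(param1), FloatType())
```

Here calc = make_calc_udf("A") could then be called as calc(col('type'), col('pos')), as in the question. This suits a value that is fixed for the whole query; lit is probably the better fit when the value should appear as a column expression.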

answered Sep 23 '22 08:09

Paul V

Tags: python, apache-spark, pyspark