 

How to pass arguments dynamically to filter function in Apache Spark?

I have an employees file which has data as below:

Name:   Age:
David   25
Jag     32
Paul    33
Sam     18

I loaded it into a DataFrame in Apache Spark and I am filtering the values as below:

employee_rdd=sc.textFile("employee.txt")
employee_df=employee_rdd.toDF()
employee_data = employee_df.filter("Name = 'David'").collect() 
+-----------------+-------+
|            Name:|   Age:|
+-----------------+-------+
|David            |25     |
+-----------------+-------+

But when I try something like this:

emp_Name='Sam'

and pass this name to the filter like below:

employee_data = employee_df.filter("Name = 'emp_Name'").collect()

it gives me an empty list.

asked Sep 03 '25 by YRK

2 Answers

This can be done in Scala; you can change it to Python:

import org.apache.spark.sql.functions.col

val emp_name = "Sam"

val employee_data = employee_df.filter(col("Name") === emp_name)
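
The same approach in PySpark, as a minimal sketch assuming the employee_df DataFrame from the question, would be:

from pyspark.sql.functions import col

emp_name = "Sam"

# Pass the Python variable to the column comparison instead of
# hard-coding the value inside a string literal
employee_data = employee_df.filter(col("Name") == emp_name).collect()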

Hope this helps!

answered Sep 05 '25 by koiralo


Try the following:

emp_Name='Sam'
employee_data = employee_df.filter(employee_df["Name"] == emp_Name).collect()
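
If you prefer the SQL-expression string used in the question, the value has to be interpolated into the string rather than written literally; a sketch, assuming the same emp_Name variable:

emp_Name = 'Sam'
# Build the filter expression so it contains the actual value, e.g. "Name = 'Sam'"
employee_data = employee_df.filter("Name = '{}'".format(emp_Name)).collect()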
answered Sep 04 '25 by Alex Choy