PySpark: add a new field to a data frame Row element

I have the following element:

a = Row(ts=1465326926253, myid=u'1234567', mytype=u'good') 

a is an instance of the Spark DataFrame Row class. I want to append a new field to a, so that it becomes:

a = Row(ts=1465326926253, myid=u'1234567', mytype=u'good', name = u'john') 
asked Oct 01 '16 by Edamame

People also ask

How do I add a new column to a DataFrame in PySpark?

In PySpark, to add a new column to a DataFrame, use the lit() function: from pyspark.sql.functions import lit. lit() takes a constant value and returns a Column type; to add a NULL/None column, use lit(None).

How do I update my Spark DataFrame?

You can update a PySpark DataFrame column using withColumn(), select(), or sql(). Since DataFrames are distributed, immutable collections, you can't really change column values in place; when you change a value using withColumn() or any other approach, PySpark returns a new DataFrame with the updated values.


1 Answer

Here is an updated answer that works. First convert the Row to a dictionary, update the dict, then build a new pyspark Row from it.

Code is as follows:

from pyspark.sql import Row

# Create the pyspark Row
row = Row(field1=12345, field2=0.0123, field3=u'Last Field')

# Convert to a python dict
temp = row.asDict()

# Modify the dict however you like, e.g. add a new field
temp["field4"] = "it worked!"

# Build a new Row from the updated dict (Row objects are immutable)
output = Row(**temp)

# How it looks
output

Out[1]: Row(field1=12345, field2=0.0123, field3=u'Last Field', field4='it worked!')
answered Oct 10 '22 by Ish Mitch