
Converting a list of rows to a PySpark dataframe

I have the following list of rows that I want to convert to a PySpark df:

data = [Row(id=u'1', probability=0.0, thresh=10, prob_opt=0.45),
 Row(id=u'2', probability=0.4444444444444444, thresh=60, prob_opt=0.45),
 Row(id=u'3', probability=0.0, thresh=10, prob_opt=0.45),
 Row(id=u'80000000808', probability=0.0, thresh=100, prob_opt=0.45)]

I need to convert it to a PySpark DF.

I have tried data.toDF(), but it fails with:

AttributeError: 'list' object has no attribute 'toDF'

Marcela Bejarano asked Aug 19 '19

People also ask

How do I add rows to a DataFrame in PySpark?

To append a row to a DataFrame, one can also use the collect() method: collect() converts the DataFrame to a Python list of Rows, so you can append data to that list and then convert the list back to a DataFrame.


1 Answer

This seems to work:

spark.createDataFrame(data)

Test results:

from pyspark.sql import SparkSession, Row

spark = SparkSession.builder.getOrCreate()

data = [Row(id=u'1', probability=0.0, thresh=10, prob_opt=0.45),
        Row(id=u'2', probability=0.4444444444444444, thresh=60, prob_opt=0.45),
        Row(id=u'3', probability=0.0, thresh=10, prob_opt=0.45),
        Row(id=u'80000000808', probability=0.0, thresh=100, prob_opt=0.45)]

df = spark.createDataFrame(data)
df.show()
#  +-----------+------------------+------+--------+
#  |         id|       probability|thresh|prob_opt|
#  +-----------+------------------+------+--------+
#  |          1|               0.0|    10|    0.45|
#  |          2|0.4444444444444444|    60|    0.45|
#  |          3|               0.0|    10|    0.45|
#  |80000000808|               0.0|   100|    0.45|
#  +-----------+------------------+------+--------+

ZygD answered Oct 05 '22