
pyspark.sql.types.Row to list

Tags:

python

pyspark

My initial data set is:

{'ID': [Row(userid=17562323, gross_merchandise_value=6072210944, country=u'ID'), Row(userid=29989283, gross_merchandise_value=4931252224, country=u'ID')]}

Each dict value is a list of pyspark.sql.types.Row objects.

How can I convert the dict into a list of the userid values, like below?

[17562323, 29989283]

I only need the userid list.

Frank asked May 09 '18 14:05



1 Answer

Thank you all, the problem is solved. I used row_ele.asDict()['userid'] for each row_ele in old_row_list to build new_userid_list.
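For anyone landing here, this is a minimal sketch of that approach, not a definitive implementation. It assumes old_row_list is the list stored under the 'ID' key of the dict from the question. The names data and userids_by_attr are hypothetical, added only for illustration. The fallback Row class is also an assumption: it is a small stand-in so the snippet runs without PySpark installed, and it mimics only keyword construction, attribute access, and asDict().

```python
try:
    from pyspark.sql import Row
except ImportError:
    # Hypothetical stand-in (not the real PySpark class) so the sketch
    # runs without PySpark; supports only what is used below.
    class Row(dict):
        def __getattr__(self, name):
            try:
                return self[name]
            except KeyError:
                raise AttributeError(name)

        def asDict(self):
            return dict(self)

# The dict from the question ("data" is a hypothetical name).
data = {
    'ID': [
        Row(userid=17562323, gross_merchandise_value=6072210944, country=u'ID'),
        Row(userid=29989283, gross_merchandise_value=4931252224, country=u'ID'),
    ]
}

old_row_list = data['ID']

# Approach from the answer: convert each Row to a dict, then pick the field.
new_userid_list = [row_ele.asDict()['userid'] for row_ele in old_row_list]

# Alternative: Row fields can also be read directly as attributes.
userids_by_attr = [row_ele.userid for row_ele in old_row_list]

print(new_userid_list)  # [17562323, 29989283]
```

Attribute access avoids building an intermediate dict for every row, which can matter for long lists; asDict() is handy when several fields are needed at once.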

Frank answered Oct 21 '22 18:10