
Access to WrappedArray elements

I have a Spark DataFrame with the following schema:

|-- eid: long (nullable = true)
|-- age: long (nullable = true)
|-- sex: long (nullable = true)
|-- father: array (nullable = true)
|    |-- element: array (containsNull = true)
|    |    |-- element: long (containsNull = true)

and a sample of rows:

df.select(df['father']).show()
+--------------------+
|              father|
+--------------------+
|[WrappedArray(-17...|
|[WrappedArray(-11...|
|[WrappedArray(13,...|
+--------------------+

and the type is

DataFrame[father: array<array<bigint>>]

How can I access each element of the inner array, for example -17 in the first row? I tried different things, such as df.select(df['father'])(0)(0).show(), but had no luck.

MTT asked Feb 05 '23 07:02


1 Answer

If I'm not mistaken, in Python you index the column itself, inside select, rather than the DataFrame that select returns:

df.select(df['father'][0][0]).show()

or

df.select(df['father'].getItem(0).getItem(0)).show()

See some examples here: http://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=column#pyspark.sql.Column

Raphael Roth answered Feb 13 '23 20:02