I have data like this

I want to pivot by year and show the total only from 2020

How do I achieve this?
You can achieve this as below using PIVOT on year and aggregate SUM on amount and further use reduce and add to generate total
df = pd.DataFrame(
{"id": [1]*5,
"type": ['A','A','A','B','B'],
"year": [2019,2020,2021,2019,2021],
"amount":[50,75,100,25,75]
})
sparkDF = sql.createDataFrame(df)
sparkDF.show()
+---+----+----+------+
| id|type|year|amount|
+---+----+----+------+
| 1| A|2019| 50|
| 1| A|2020| 75|
| 1| A|2021| 100|
| 1| B|2019| 25|
| 1| B|2021| 75|
+---+----+----+------+
from operator import add
sparkDF_pivot = sparkDF.groupBy("id","type")\
.pivot("year")\
.agg(F.sum(F.col('amount')))\
.fillna(0)
sparkDF_pivot.withColumn('total',reduce(add, [F.col(x) for x in sparkDF_pivot.columns if x not in ['id','type']]))\
.show()
+---+----+----+----+----+-----+
| id|type|2019|2020|2021|total|
+---+----+----+----+----+-----+
| 1| B| 25| 0| 75| 100|
| 1| A| 50| 75| 100| 225|
+---+----+----+----+----+-----+
from operator import add
sparkDF_pivot = sparkDF.groupBy("id","type")\
.pivot("year")\
.agg(F.sum(F.col('amount')))\
.fillna(0)
sparkDF_pivot.withColumn('total',reduce(add, [F.col(x)
for x in sparkDF_pivot.columns if x not in ['id','type',2019]]))\
.show()
+---+----+----+----+----+-----+
| id|type|2019|2020|2021|total|
+---+----+----+----+----+-----+
| 1| B| 25| 0| 75| 75|
| 1| A| 50| 75| 100| 175|
+---+----+----+----+----+-----+
Use slice. In this case I have a list [2019, 2020, 2021,...] using slice(array, startpostion, length) I can leave out 2019 by starting at position 2 and going the length of the size of the array.
s =df.groupby('id').pivot('year').agg(sum('amount'))#Pivot
(s.withColumn('x', array(*[x for x in s.columns if x!='id']))#create array
.withColumn('x', expr("aggregate(slice(x, 2, size(x)),cast(0 as double),(c,i)-> c+coalesce(i,0))"))#sum
).show()
data
df = spark.createDataFrame(pd.DataFrame(
{"id": [1]*5,
"type": ['A','A','A','B','B'],
"year": [2019,2020,2021,2019,2021],
"amount":[50,75,100,25,75]
}))
solution
+---+----+----+----+-----+
| id|2019|2020|2021| x|
+---+----+----+----+-----+
| 1| 75| 75| 175|250.0|
+---+----+----+----+-----+
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With