I'm looking for a way to get the last character from a string in a dataframe column and place it into another column.
I have a Spark dataframe that looks like this:
animal
======
cat
mouse
snake
I want something like this:
lastchar
========
t
e
e
Right now I can do this with a UDF that looks like:
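from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

def get_last_letter(animal):
    # Return the final character of the string.
    return animal[-1]

get_last_letter_udf = udf(get_last_letter, StringType())
df.select(get_last_letter_udf("animal").alias("lastchar")).show()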
I'm mainly curious if there's a better way to do this without a UDF. Thanks!
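Just use the substring function. A negative start position counts from the end of the string, so position -1 with length 1 gives the last character:
from pyspark.sql.functions import col, substring
df.withColumn("lastchar", substring(col("animal"), -1, 1)).show()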
Another way to do this would be with the "expr" function:
from pyspark.sql.functions import expr
df.withColumn("lastchar", expr('RIGHT(animal, 1)')).show()