I have a PySpark RDD with a text column that I want to use as a filter, so I have the following code:
table2 = table1.filter(lambda x: x[12] == "*TEXT*")
The problem is that, as you can see, I'm using * to try to tell it to interpret the string as a wildcard pattern, but with no success.
Does anyone have a suggestion for this?
The lambda function is pure Python, so == compares the two strings literally and * is not treated as a wildcard. A substring test like the one below would work:
table2 = table1.filter(lambda x: "TEXT" in x[12])
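If you do want shell-style wildcards such as * and ?, one option is the standard-library fnmatch module. The following is only a minimal sketch: the helper name matches_wildcard and the sample rows are made up for illustration, and a plain Python list stands in for the RDD so the snippet runs without Spark.

```python
import fnmatch

def matches_wildcard(value, pattern="*TEXT*"):
    """Return True if value matches a shell-style wildcard pattern (case-sensitive)."""
    # Guard against missing values, which a plain "in" test would fail on.
    return value is not None and fnmatch.fnmatchcase(str(value), pattern)

# Hypothetical rows; the text column is at index 1 here (index 12 in the question).
rows = [("a", "some TEXT here"), ("b", "nothing"), ("c", None), ("d", "TEXT")]

# Same predicate, applied to a list instead of an RDD.
kept = [r for r in rows if matches_wildcard(r[1])]

# Assumed usage with the RDD from the question:
# table2 = table1.filter(lambda x: matches_wildcard(x[12]))
```

For a simple "contains" check, the in operator above is simpler and probably faster; fnmatch is mainly useful when the pattern is more involved, for example "TEXT*" or "A?C".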