I have a list of 10k words in a text file like so:
G15 KDN C30A Action Standard Air Brush Air Dilution
I am trying to convert them into lowercase tokens for subsequent processing with GenSim, using this code:
data = [line.strip() for line in open("C:\corpus\TermList.txt", 'r')]
texts = [[word for word in data.lower().split()] for word in data]
and I get the following traceback:
AttributeError                            Traceback (most recent call last)
<ipython-input-84-33bbe380449e> in <module>()
1 data = [line.strip() for line in open("C:\corpus\TermList.txt", 'r')]
----> 2 texts = [[word for word in data.lower().split()] for word in data]
3
AttributeError: 'list' object has no attribute 'lower'
Any suggestions on what I am doing wrong and how to correct it would be greatly appreciated!!! Thank you!!
Try this instead:
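# Raw string so the backslashes in the Windows path are taken literally
with open(r"C:\corpus\TermList.txt", 'r') as f:
    data = [line.strip() for line in f]
# Lowercase each word of each line, not the list as a whole
texts = [[word.lower() for word in text.split()] for text in data]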
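You were calling .lower() on data, which is a list. .lower() is a string method, so it has to be called on each string (here, each word) rather than on the list that holds them. The loop variables in your inner comprehension were also mixed up: the outer loop should yield each line, and the inner loop should split that line into words.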
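To see the difference without touching the file, here is a minimal sketch that uses a small in-memory list in place of TermList.txt. The name sample_lines and the way the terms are split across lines are assumptions made for illustration only; the real file may be laid out differently.

```python
# Hypothetical stand-in for the lines read from TermList.txt (assumed layout)
sample_lines = ["G15 KDN C30A\n", "Action Standard\n", "Air Brush Air Dilution\n"]

# Same first step as in the question: strip the trailing newline from each line
data = [line.strip() for line in sample_lines]

# Calling .lower() on the list itself reproduces the error from the question
try:
    data.lower()
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

# Calling .lower() on each word works, because each word is a str
texts = [[word.lower() for word in text.split()] for text in data]
print(texts)
# [['g15', 'kdn', 'c30a'], ['action', 'standard'], ['air', 'brush', 'air', 'dilution']]
```

The result is a list of token lists, one per line. As far as I know that is the shape GenSim's corpus tools usually expect, but check against the version you have installed.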