I have a really long string of stock tickers, and I'm only showing a portion of it here.
"\n'OPHT'\n'GALE'\n'CEMP'\n'TKAI'\n'ANTH'\n'ADPT'\n\n\n'CYTR'\n'NVAX'\n'MRTX'\n'IMGN'\n'OVAS'\n'AVGR'\n'DVAX'\n'INFI'\n'TNDM'\n'FNBC'\n\n\'CVRS'\n'CLDX'\n'CIE'\n'ARWR'\n'CYH'\n'RGLS'\n'VSLR'\n'IMDZ'\n'ITCI'\n\n\n\n\n'MDCA'
I want to get rid of the very first \n (newline), which precedes 'OPHT', and reduce each group of consecutive newlines to a single newline.
As you can see, the number of newlines in each group isn't constant; it varies from two to five (even more in the original string).
I can't just use the simple `str.replace('\n', '')` method, as that would get rid of all the newlines and clump all the stock tickers together (I want one newline after each ticker). I looked through the string docs but couldn't find a str method that would let me do what I want cleanly.
Any suggestions?
Thank you.
Okay, I got it:
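If x is my string variable, then:
import re
x = re.sub(r'\n{2,10}', '', x)   # \n is a newline; {2,10} is the range of occurrences of the newline that I'm searching for (no space inside the braces, otherwise Python matches them literally instead of as a quantifier).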
This fixed the problem, with one exception. Because the replacement is an empty string, it removed every newline in a matched group, even though I wanted to keep one. This caused certain tickers to be bunched together like this:
'GALE''CEMP'
So I used another regular expression to fix this problem:
import re
x = re.sub(r"''", "'\n'", x)
Everything looks good now, for the most part.
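For what it's worth, the two substitutions can probably be folded into a single pass by replacing each run of newlines with one newline rather than with an empty string. The sketch below is only an illustration under that assumption: the function names and the shortened `sample` string are made up for the example, not taken from the original data. It also shows a variant that uses only str methods.

```python
import re


def collapse_newlines(text: str) -> str:
    """Collapse each run of newlines to one and drop any leading newlines."""
    # \n+ matches a run of one or more newlines; replacing the run with a
    # single '\n' keeps exactly one separator between tickers.
    collapsed = re.sub(r'\n+', '\n', text)
    # lstrip('\n') removes the newline(s) at the very start of the string.
    return collapsed.lstrip('\n')


def collapse_newlines_no_regex(text: str) -> str:
    """Same idea using only str methods (hypothetical alternative)."""
    # splitlines() yields an empty string for every blank line; filter those
    # out and rejoin what is left with single newlines.
    return '\n'.join(line for line in text.splitlines() if line)


# Shortened, made-up sample in the same shape as the string in the question.
sample = "\n'OPHT'\n'GALE'\n\n\n'CYTR'\n\n\n\n\n'MDCA'"
print(collapse_newlines(sample))
```

One difference to keep in mind: the regex version keeps a single trailing newline if the string ends with one, while the splitlines() version drops it, and splitlines() also splits on other line boundaries such as \r.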