
How to remove \n and \r from a string

I am currently trying to get the code from this website: http://netherkingdom.netai.net/pycake.html. A Python script then parses out all the code inside the HTML div tags and writes the text between the div tags to a file. The problem is that it adds a bunch of \r and \n to the file. How can I either avoid this or remove the \r and \n? Here is my code:

import urllib.request
from html.parser import HTMLParser
import re
page = urllib.request.urlopen('http://netherkingdom.netai.net/pycake.html')
t = page.read()
class MyHTMLParser(HTMLParser):
    def handle_data(self, data):
        print(data)
        f = open('/Users/austinhitt/Desktop/Test.py', 'r')
        t = f.read()
        f = open('/Users/austinhitt/Desktop/Test.py', 'w')
        f.write(t + '\n' + data)
        f.close()
parser = MyHTMLParser()
t = t.decode()
parser.feed(t)

And here is the resulting file it makes:

b'
import time as t\r\n
from os import path\r\n
import os\r\n
\r\n
\r\n
\r\n
\r\n
\r\n'

Preferably I would also like to have the beginning b' and last ' removed. I am using Python 3.5.1 on a Mac.

HittmanA asked Mar 06 '16 18:03



2 Answers

A simple solution is to strip trailing whitespace:

with open('gash.txt', 'r') as var:
    for line in var:
        line = line.rstrip()
        print(line)

The advantage of rstrip() over a [:-2] slice is that it is also safe for UNIX-style files, whose lines end in a single \n rather than \r\n.
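
To illustrate the difference, here is a minimal sketch comparing the two approaches on made-up sample lines (the variable names and sample strings are assumptions for illustration, not taken from the question):

```python
# Two sample lines: one with Windows (CRLF) and one with UNIX (LF) endings.
windows_line = "import os\r\n"
unix_line = "import os\n"

# rstrip() removes whatever trailing whitespace is present.
stripped_windows = windows_line.rstrip()
stripped_unix = unix_line.rstrip()

# A fixed [:-2] slice assumes exactly two trailing characters,
# so it cuts into the text of the UNIX-style line.
sliced_unix = unix_line[:-2]

print(repr(stripped_windows))
print(repr(stripped_unix))
print(repr(sliced_unix))
```

Note that rstrip() with no arguments also removes trailing spaces and tabs; rstrip('\r\n') limits it to line-ending characters.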

However, if you only want to get rid of \r and they might not be at the end-of-line, then str.replace() is your friend:

line = line.replace('\r', '')

If you have a bytes object (that's the leading b'), then you can convert it to a native Python 3 string using:

line = line.decode()
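
Putting the pieces together, the following is one possible sketch of decoding downloaded bytes and normalizing the line endings before writing them out. The helper name clean_text, the sample bytes, and the UTF-8 encoding are assumptions for illustration; the real page may use a different encoding.

```python
def clean_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode raw bytes and return text with LF-only line endings."""
    text = raw.decode(encoding)
    # splitlines() splits on \r\n, \r and \n (among other line
    # boundaries) and drops the terminators.
    lines = text.splitlines()
    return "\n".join(lines)


# Hypothetical sample resembling the output shown in the question.
sample = b"import time as t\r\nfrom os import path\r\nimport os\r\n"
cleaned = clean_text(sample)
print(cleaned)
```

Because the result is a str rather than bytes, writing it to a file opened in text mode does not produce the b'...' wrapper.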
cdarke answered Oct 08 '22 16:10


To remove carriage returns:

  • line = line.replace('\r', '')

To remove tabs:

  • line = line.replace('\t', '')
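
If several characters need to go at once, the chained replace() calls above can be combined into a single pass with str.translate(). This is an alternative sketch, not part of the original answer; the names REMOVE_TABLE and the sample string are made up for illustration.

```python
# Translation table that deletes carriage returns and tabs.
# The third argument to str.maketrans lists characters to remove.
REMOVE_TABLE = str.maketrans("", "", "\r\t")

line = "a\tb\r\n"
cleaned_line = line.translate(REMOVE_TABLE)
print(repr(cleaned_line))
```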
Vikram Mahapatra answered Oct 08 '22 18:10