
Why is my def function in Python not working?

I'm trying to save the data from an HTML table to a CSV file.

import requests
import csv
from bs4 import BeautifulSoup

#Main function
def getContent(link):
    #Request content
    result1 = requests.get(link)

    #Save source in var
    src1 = result1.content

    #Activate soup
    soup = BeautifulSoup(src1,'lxml')

    #Look for table
    table = soup.find('table')

    #Save in csv
    with open('averageheight.csv','w',newline='') as f:
        writer = csv.writer(f)
        for tr in table('tr'):
            row = [t.get_text(strip=True) for t in tr(['td', 'th'])]
            writer.writerow(row)


#LINKS
getContent('https://en.wikipedia.org/wiki/Average_human_height_by_country')

The error I'm getting:

  File "c:/Users/Agent 1/Desktop/Datapackages/Average Height/process.py", line 31, in <module>
    getContent('https://en.wikipedia.org/wiki/Average_human_height_by_country')
  File "c:/Users/Agent 1/Desktop/Datapackages/Average Height/process.py", line 27, in getContent
    writer.writerow(row)
  File "C:\Users\Agent 1\AppData\Local\Programs\Python\Python38-32\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2044' in position 24: character maps to <undefined>
Rob Grootjen asked Nov 22 '19 19:11


2 Answers

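I ran your code on my machine and got no error, most likely because my default file encoding differs from yours. Your traceback shows that Python is writing the file with the Windows default encoding, cp1252, which has no mapping for the character '\u2044' (fraction slash). Pass encoding='utf-8' to open() so the file is written as UTF-8: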

import requests
import csv
from bs4 import BeautifulSoup

#Main function
def getContent(link):
    #Request content
    result1 = requests.get(link)

    #Save source in var
    src1 = result1.content

    #Activate soup
    soup = BeautifulSoup(src1,'lxml')

    #Look for table
    table = soup.find('table')

    #Save in csv
    with open('averageheight.csv','w',newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        for tr in table('tr'):
            row = [t.get_text(strip=True) for t in tr(['td', 'th'])]
            writer.writerow(row)


#LINKS
getContent('https://en.wikipedia.org/wiki/Average_human_height_by_country')
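
To check the cause without hitting the network, here is a minimal sketch that tries the character from the traceback against both encodings. The helper names can_encode and rows_to_csv_bytes are my own, not part of the question's code, and the sample row is made up for illustration.

```python
import csv
import io

# The character named in the traceback (U+2044, fraction slash).
FRACTION_SLASH = "\u2044"


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding` (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def rows_to_csv_bytes(rows, encoding):
    """Format rows with csv.writer in memory, then encode the result (hypothetical helper)."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue().encode(encoding)


if __name__ == "__main__":
    print(can_encode(FRACTION_SLASH, "cp1252"))  # cp1252 has no slot for U+2044
    print(can_encode(FRACTION_SLASH, "utf-8"))   # UTF-8 can represent any code point
    # Made-up sample row containing the problematic character.
    print(rows_to_csv_bytes([["height", "5 ft 7 1" + FRACTION_SLASH + "2 in"]], "utf-8"))
```

If the first check fails on your machine and the second succeeds, the encoding passed to open() is what needs to change, not the scraping code.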
Daniele Cappuccio answered Nov 05 '22 01:11


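The scraped text is Unicode, and some of its characters cannot be written with the default cp1252 encoding. One option is to encode each cell to UTF-8 yourself with the modified line below. This only suits Python 2: in Python 3, csv.writer writes the repr of a bytes object (b'...') into the file, so passing encoding='utf-8' to open() is the better fix there.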

row = [(t.get_text(strip=True)).encode('utf-8') for t in tr(['td','th'])]
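
For Python 3 it is worth seeing what csv.writer does with an encoded cell before relying on this line. The sketch below is only an illustration: csv_line is a hypothetical helper and the cell value is made up.

```python
import csv
import io


def csv_line(row):
    """Return the text csv.writer produces for one row (hypothetical helper)."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    cell = "1\u20442"  # made-up cell containing the fraction slash

    # Bytes are not written as raw bytes: the writer stores str(value),
    # which for a bytes object is its b'...' representation.
    print(repr(csv_line([cell.encode("utf-8")])))

    # A str cell is written as-is; the file's encoding handles the rest.
    print(repr(csv_line([cell])))
```

On Python 3 the first line ends up in the CSV as text starting with b', which is rarely what you want in the output file.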
Subhrajyoti Das answered Nov 05 '22 02:11