
Here is the code I'm using to extract, as a first step, the time shown at the upper left.
import qgrid
import webbrowser
import requests
import pandas as pd
from bs4 import BeautifulSoup

# Send the request and fetch the HTML page.
page = requests.get('http://www.meteo.gr/cf.cfm?city_id=14')
# Create a BeautifulSoup object from the HTML.
soup = BeautifulSoup(page.content, 'html.parser')
# Narrow down to the (outer) section I want to focus on.
four_days = soup.find(id="prognoseis")
# Select specific elements, using four_days as the base.
periods = [p.get_text() for p in four_days.select(".perhour-rowmargin .innerTableCell-fulltime")]
# Create a DataFrame via pandas to print it table-like.
weather = pd.DataFrame({"period ": periods})
print(weather)
I followed a good tutorial to get the hang of it. The four_days object holds the part of the HTML contained in 'prognoseis', which is where the info I want is. Then, for the periods object, I select the elements that contain that info, and with the second part of the selector I specify exactly which text I want to extract.
The code runs but gives me an empty result.
You are adding dashes between class names where no such dashes exist. The <tr> element you are selecting has two classes, perhour and rowmargin, but you are selecting on a non-existent class, perhour-rowmargin. The same applies to the td elements; they have the separate classes fulltime and innerTableCell.
Just pick one class or the other for each element; the following returns the cells you want:
four_days.select(".perhour .fulltime")
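To see how the three selector forms behave, here is a minimal sketch run against a hand-written fragment. The markup is an assumption that only mimics the class layout described above; it is not the actual page source.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment: mimics the class layout described above, not the real page.
html = """
<table id="prognoseis">
  <tr class="perhour rowmargin">
    <td class="innerTableCell fulltime">03:00</td>
  </tr>
  <tr class="perhour rowmargin">
    <td class="innerTableCell fulltime">06:00</td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find(id="prognoseis")

# A dash produces one long class name, which no element carries.
dashed = table.select(".perhour-rowmargin .innerTableCell-fulltime")

# Chaining class selectors with no space requires both classes on the same element.
compound = table.select(".perhour.rowmargin .innerTableCell.fulltime")

# Selecting on a single class per element is enough here.
single = table.select(".perhour .fulltime")

print(len(dashed), len(compound), len(single))  # 0 2 2
```

If you ever do need to require both classes on one element, the chained form (.perhour.rowmargin) is the one to use; the space between the two groups is the descendant combinator.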
You probably also want to remove the extra newlines around each cell's text; add strip=True to the get_text() calls:
[p.get_text(strip=True) for p in four_days.select(".perhour .fulltime")]
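As a rough illustration of what strip=True changes, the sketch below compares both calls on a single cell. The padded markup is made up for the example and assumes the real cells carry similar surrounding whitespace.

```python
from bs4 import BeautifulSoup

# Hypothetical cell with whitespace padding around its text.
html = '<table><tr class="perhour"><td class="fulltime">\n   03:00\n  </td></tr></table>'
soup = BeautifulSoup(html, "html.parser")
cell = soup.select_one(".perhour .fulltime")

raw = cell.get_text()              # keeps the surrounding newlines and spaces
clean = cell.get_text(strip=True)  # strips whitespace from each text fragment

print(repr(raw))    # '\n   03:00\n  '
print(repr(clean))  # '03:00'
```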