I am trying to extract the meta description from fetched web pages, but I am running into BeautifulSoup's case sensitivity: some pages have <code><meta name="Description</code> and others have <code><meta name="description</code>. My problem is very similar to another question on Stack Overflow; the only difference is that I can't use lxml. I have to stick with BeautifulSoup.

Is it possible for BeautifulSoup to work in a case-insensitive manner?

1 Answer

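You can give BeautifulSoup a regular expression to match attributes against. Something like

soup.findAll('meta', attrs={'name': re.compile("^description$", re.I)})

might do the trick (remember to import re). The filter goes through attrs because name is already findAll's first positional argument, the tag name, so it cannot also be passed as a keyword. Cribbed from the BeautifulSoup docs.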
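
For anyone who wants to try this end to end, here is a minimal sketch of the same idea. It assumes BeautifulSoup 4 (the bs4 package) with the built-in "html.parser"; the helper name get_meta_description and the sample pages are made up for illustration and are not from the question.

```python
import re
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed

# Anchored, case-insensitive pattern: matches "description" and "Description",
# but not longer values such as "twitter:description".
DESCRIPTION_RE = re.compile(r"^description$", re.I)


def get_meta_description(html):
    """Return the content of the first <meta name="description"> tag, or None.

    Hypothetical helper; the name attribute is compared case-insensitively.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Pass the filter through attrs, because `name` is already the
    # first positional parameter of find()/find_all() (the tag name).
    tag = soup.find("meta", attrs={"name": DESCRIPTION_RE})
    return tag.get("content") if tag is not None else None


# Made-up sample pages covering both spellings plus a near miss.
pages = [
    '<html><head><meta name="Description" content="Upper-case D"></head></html>',
    '<html><head><meta name="description" content="Lower-case d"></head></html>',
    '<html><head><meta name="twitter:description" content="Other tag"></head></html>',
]

for page in pages:
    print(get_meta_description(page))
```

The first two pages should give back their content strings, and the third should give None, since the ^ and $ anchors keep the pattern from matching inside a longer attribute value.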

184

answered Dec 24 '22 20:12

Will McCutchen

Nitin
