
TypeError: Type str doesn't support the buffer API # find method?

Here is my input:

<!DOCTYPE html>
..........
<div class="content">
      <div class="stream-item-header">
          <a class="account-group js-account-group js-action-profile js-user-profile-link js-nav" href="https://twitter.com/jimcramer" data-user-id="14216123">
    <img class="avatar js-action-profile-avatar" src="Twitter%20_%20Search%20-%20%23tsla_files/988b4c2369623b634782f4c0469ec38f_normal.jpg" alt="">
    <strong class="fullname js-action-profile-name show-popup-with-id">Jim Cramer</strong>
    <span>‏</span><span class="username js-action-profile-name"><s>@</s><b>jimcramer</b></span>
  </a>
       <small class="time">
    <a href="https://twitter.com/jimcramer/status/405348028417994752" class="tweet-timestamp js-permalink js-nav js-tooltip" title="3:51 PM - 26 Nov 13"><span class="_timestamp js-short-timestamp " data-time="1385477475" data-long-form="true">26 Nov</span></a>
</small>
      </div>
      <p class="js-tweet-text tweet-text">Love this spirited &amp; rigorous <a href="https://twitter.com/search?q=%24TSLA&amp;src=ctag" data-query-source="cashtag_click" class="twitter-cashtag pretty-link js-nav" dir="ltr"><s>$</s><b>TSLA</b></a> defense ! RT <a href="https://twitter.com/InfennonLabs" class="twitter-atreply pretty-link" dir="ltr"><s>@</s><b>InfennonLabs</b></a>: Why are these idiots selling <a href="https://twitter.com/search?q=%23tsla&amp;src=hash" data-query-source="hashtag_click" class="twitter-hashtag pretty-link js-nav" dir="ltr"><s>#</s><b><strong>tsla</strong></b></a> are they that blind? <a href="https://twitter.com/jimcramer" class="twitter-atreply pretty-link" dir="ltr"><s>@</s><b>jimcramer</b></a></p>
      <div class="stream-item-footer">
<div class="context">
      <span class="metadata with-icn">
        <i class=" badge-top"></i>Favorited 5 times</span>
</div>
...........
</html>

For instance, this HTML is stored in my input variable.

Here is my code:

  start_link = input.find(' <p class="js-tweet-text tweet-text" ')

If I run it, I get the following error:

  start_link = input.find('<p class="js-tweet-text tweet-text" ')
TypeError: Type str doesn't support the buffer API

How can I fix this?

NOTE: the type of my input variable is <class 'bytes'>.

Michael asked Nov 30 '13 16:11


2 Answers

You can't use bytes.find() to look for a str object inside a bytes object: they are different types, so a str can never be contained in bytes.
You can, however, look for a bytes object inside it. This should work:

start_link = input.find(b' <p class="js-tweet-text tweet-text" ')
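To make the difference concrete, here is a minimal sketch on Python 3. The `html_bytes` value is a hypothetical, shortened stand-in for the real page, and the pattern omits the leading and trailing spaces, since in the markup shown in the question the closing quote is followed directly by `>`.

```python
# Hypothetical sample standing in for the bytes held in the question's `input` variable.
html_bytes = b'<div><p class="js-tweet-text tweet-text">Love this</p></div>'

# Searching bytes with a str pattern raises TypeError on Python 3.
try:
    html_bytes.find('<p class="js-tweet-text tweet-text"')
    raised = False
except TypeError:
    raised = True

# Option 1: search with a bytes pattern (note the b prefix).
pos_bytes = html_bytes.find(b'<p class="js-tweet-text tweet-text"')

# Option 2: decode to str first (assumes the data is UTF-8), then use a str pattern.
pos_str = html_bytes.decode("utf-8").find('<p class="js-tweet-text tweet-text"')

print(raised, pos_bytes, pos_str)
```

Either option works; the point is that the haystack and the needle must be the same type.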

By the way, you should use an HTML parser if you're parsing HTML.
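As a rough illustration of the parser approach, the sketch below uses the standard library's html.parser module. The `TweetTextParser` class and the `sample` bytes are hypothetical names made up for this example, and the sketch assumes the tweet text lives in `<p>` elements whose class list includes `tweet-text`, as in the question's markup.

```python
from html.parser import HTMLParser


class TweetTextParser(HTMLParser):
    """Collects the text inside <p> elements whose class list includes 'tweet-text'."""

    def __init__(self):
        super().__init__()
        self.inside_tweet = False  # True while we are inside a matching <p>
        self.tweets = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "p" and "tweet-text" in classes:
            self.inside_tweet = True
            self.tweets.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.inside_tweet = False

    def handle_data(self, data):
        # Text from nested tags such as <a> or <b> is appended too.
        if self.inside_tweet:
            self.tweets[-1] += data


# Hypothetical, shortened sample; the parser expects str, so decode the bytes first.
sample = b'<div><p class="js-tweet-text tweet-text">Love this &amp; <b>TSLA</b> defense</p><p class="other">skip</p></div>'
parser = TweetTextParser()
parser.feed(sample.decode("utf-8"))
print(parser.tweets)
```

Unlike a plain find(), this does not depend on the exact spacing or attribute order in the tag, and character references such as `&amp;` are converted for you.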

stranac answered Nov 13 '22 17:11


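You can also turn the data coming from your input into a str object by decoding it, like this:

from urllib import request

url = "http://www.google.com"
req = request.Request(url)
response = request.urlopen(req)
# Use the charset from the response headers, falling back to UTF-8 (an assumption)
charset = response.headers.get_content_charset() or "utf-8"
page = response.read().decode(charset)  # decode the bytes into a str object
print(page[page.find('id='):])  # now you don't need a b in front of your string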
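
One caveat worth checking: calling str() on a bytes object without an encoding does not decode it; it returns the printable representation, including the `b'...'` wrapper and escaped non-ASCII bytes. A small sketch with a hypothetical `raw` value, assuming the data is UTF-8:

```python
# Hypothetical bytes, similar to what response.read() returns.
raw = b'<a id="x">caf\xc3\xa9</a>'

as_repr = str(raw)             # the repr of the bytes object, not decoded text
as_text = raw.decode("utf-8")  # the actual text; str(raw, "utf-8") is equivalent

print(as_repr.startswith("b'"))  # the b'...' wrapper is part of the string
print(as_text.find('id='))       # index into the real text
```

Searching the repr can still appear to work for plain ASCII patterns, but the offsets are shifted and non-ASCII characters stay escaped, so decoding is the safer choice.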

mimoralea answered Nov 13 '22 16:11