
Twitter website doesn't have open graph tags?

I'm trying to get URL previews (for websites that support them) to show up in a project I'm working on. I recently noticed that Twitter URLs don't have Open Graph meta tags anymore. I was expecting og:title, og:description and so on, which, if I remember correctly, used to exist for all Twitter links.

For example, when I view the page source for this link: twitter.com/DalaiLama/status/1274998376338124800

I don't see any og metadata apart from og:site_name. I also don't see twitter:title or any of the related tags. What am I missing?

Update: it turns out that "view source" doesn't show og:title, but I do see it under Chrome's "inspect" menu. Does that mean the tag is added by JavaScript and is not in the served HTML? Also, only og:title shows up there, not the other fields. Is that expected?

xcoder asked Jun 23 '20 02:06




2 Answers

Twitter uses client-side-rendering (CSR) to generate HTML in the browser

Viewing the page source directly will not show the relevant <meta> tags or the actual page content, because everything is generated dynamically in the client's browser by JavaScript (React). In fact, the HTML source has a stub containing "We've detected that JavaScript is disabled in your browser. Would you like to proceed to legacy Twitter?". You can verify this by watching the "Elements" tab in developer tools while the page loads and renders, or by downloading the page with a client that does not execute JavaScript.

However, to improve Search Engine Optimization (SEO), Twitter returns server-side-rendered (SSR) HTML, which does contain the <meta> tags, to various prominent web crawlers. Those crawlers can then read the raw HTML without having to emulate JavaScript. Twitter recognizes crawlers by the User-Agent HTTP header they send. Server-side rendering is generally more expensive for the server than offloading HTML rendering to the client, which may be one reason Twitter uses client-side rendering by default.

Bypassing the User-Agent whitelist to receive server-side-rendered (SSR) HTML

Various prominent web crawlers are whitelisted by Twitter to receive server-side-rendered HTML. By spoofing the User-Agent HTTP header in your own request, you can bypass the whitelist and receive server-side-rendered HTML containing the relevant <meta> tags (whether or not this is recommended is a totally different subject). For programmatic HTTP requests, check whether your HTTP library lets you change the User-Agent header; most non-trivial libraries do.

whatismybrowser.com has a list of well-known web-crawler User-Agent headers; some of these crawlers are whitelisted, but not necessarily all of them. At the time of writing, these user agents work:

  • Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
  • Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
  • facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
  • Mozilla/5.0 (compatible; Discordbot/2.0; +https://discordapp.com)
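As a rough illustration of setting the header programmatically, here is a minimal Python sketch using only the standard library. The helper names build_request and fetch_html are hypothetical, the User-Agent string is copied from the list above, and there is no guarantee that Twitter still serves SSR HTML for it.

```python
import urllib.request

# Copied from the list above; an assumption that it is still whitelisted.
CRAWLER_UA = "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"


def build_request(url, user_agent=CRAWLER_UA):
    """Return a Request object that carries the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, user_agent=CRAWLER_UA, timeout=10):
    """Download the page and decode the body as text (hypothetical helper)."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling fetch_html("https://twitter.com/DalaiLama/status/1274998376338124800") would then return whatever HTML the server chooses to send for that user agent.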
concision answered Oct 29 '22 14:10



It looks like Twitter allows the Facebook crawler to view its Open Graph tags. If you set your user agent as described in the Troubleshooting section of the Facebook crawler documentation, the full set of tags appears.

$ curl -s --compressed -H "Range: bytes=0-524288" -H "Connection: close" -A "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)" "https://twitter.com/sharifshameem/status/1284095222939451393" | grep -i 'og:'
    <meta  property="og:type" content="video">
    <meta  property="og:url" content="https://twitter.com/sharifshameem/status/1284095222939451393">
    <meta  property="og:title" content="Sharif Shameem on Twitter">
    <meta  property="og:image" content="https://pbs.twimg.com/ext_tw_video_thumb/1284094287383166977/pu/img/LsArMNT3djA7xg53.jpg">
    <meta  property="og:description" content="“I just built a *functioning* React app by describing what I wanted to GPT-3. &#10;&#10;I&#39;m still in awe. https://someurl”">
    <meta  property="og:site_name" content="Twitter">
    <meta  property="og:video:url" content="https://twitter.com/i/videos/1284095222939451393?embed_source=facebook">
    <meta  property="og:video:secure_url" content="https://twitter.com/i/videos/1284095222939451393?embed_source=facebook">
    <meta  property="og:video:type" content="text/html">
    <meta  property="og:video:width" content="1200">
    <meta  property="og:video:height" content="696">

Without specifying the user agent:

$ curl -s "https://twitter.com/sharifshameem/status/1284095222939451393" | grep -i 'og:'
  <meta property="og:site_name" content="Twitter" />
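If you need the tags inside a program rather than from grep, something like the following Python sketch could pull them out of HTML you have already downloaded. The class and function names are hypothetical, and it assumes the tags look like the output above (a property attribute starting with "og:" plus a content attribute). The standard library parser also decodes entities such as &#39; in attribute values.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> pairs (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        content = attributes.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value if a property is repeated.
            self.tags.setdefault(prop, content)


def extract_og_tags(html):
    """Return a dict mapping og:* property names to their content values."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags
```

Running extract_og_tags on the two responses above should make the difference easy to compare: one dictionary with many keys, and one with only og:site_name.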
fantapop answered Oct 29 '22 13:10
