What application does Google use to show PDF attachments in Gmail?

I watched the traffic when Google displays PDF attachments in Gmail in a new window. The content is served as a PNG image for each PDF page, yet the text can still be selected. What does Google use on the server side to generate a PNG file for a particular page of a PDF file? And how does text selection on a PNG image work? Any ideas?

385

asked Apr 25 '09 18:04

varun

1 Answer

By default, attachments are viewed securely at https://docs.google.com/gview; however, it turns out you are allowed to request files over plain HTTP. This makes it a little easier to figure out what is going on using Wireshark.

As you indicated, it was already clear that the PDF is converted on the server side to PNG images (ImageMagick is indeed a reasonable solution for this purpose). The obvious reason is to preserve the exact layout while still letting the user view the file without requiring a PDF viewer.
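As an illustration of the ImageMagick route, here is a minimal Python sketch that rasterizes a single PDF page to PNG. It is not what Google is known to run; it only shows that the conversion is a one-liner. It assumes ImageMagick's legacy `convert` command (with Ghostscript installed for PDF support) is on the PATH. The function names, file names, and the 150 DPI default are all hypothetical.

```python
import subprocess


def build_render_command(pdf_path, page_index, png_path, dpi=150):
    """Build an ImageMagick argv that rasterizes one PDF page to a PNG.

    Assumption: the legacy `convert` binary is available; newer
    ImageMagick installs may expose the same options via `magick`.
    """
    return [
        "convert",
        "-density", str(dpi),            # must precede the input to set render resolution
        f"{pdf_path}[{page_index}]",     # "[n]" selects the zero-based page
        png_path,
    ]


def render_page(pdf_path, page_index, png_path, dpi=150):
    """Run the conversion; raises CalledProcessError if ImageMagick fails."""
    subprocess.run(build_render_command(pdf_path, page_index, png_path, dpi), check=True)
```

Calling `render_page("document.pdf", 0, "page-0.png")` would write the first page as a PNG, provided the tools above are installed.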

However, from looking at the traffic I found that the entire PDF is also converted to a custom XML format when /gview?a=gt&docid=&chan=&thid= is called (this happens as soon as you request the document). Because I couldn't use Wireshark to copy the XML, I resorted to the Firefox extension Live HTTP Headers. Here's an excerpt:

<pdf2xml>
    <meta name="Author" content="Bruce van der Kooij"/>
    <meta name="Creator" content="Writer"/>
    <meta name="Producer" content="OpenOffice.org 3.0"/>
    <meta name="CreationDate" content="20090218171300+01'00'"/>
    <page t="0" l="0" w="595" h="842">
        <text l="188" t="99" w="213" h="27" p="188,213">Programmabureau</text>
        <text l="85" t="127" w="425" h="27" p="85,117,209,61,277,21,305,124,436,75">Nederland Open in Verbinding (NOiV)</text>
    </page>
</pdf2xml>

I'm not quite sure yet what all the attributes on the text element stand for (with the exception of w and h), but they are obviously the coordinates of the text and possibly its length. Because the JavaScript Google uses is minified (or possibly obfuscated, though that is unlikely), figuring out precisely how the client-side selection function works is not that easy. Most likely it uses this XML file to work out which text the user is selecting and then copies that text to the user's clipboard.
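To make the idea concrete, here is a rough Python sketch of how a client could map a pointer position on the page image back to text using the XML excerpt above. It rests on guesses, not on anything Google has documented: that l, t, w and h are left, top, width and height in page units, and that p holds (left, width) pairs, one per whitespace-separated word. The numbers in the excerpt are consistent with that reading, but it remains an assumption. The function names are hypothetical, and any scaling between PNG pixels and page units is ignored.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the excerpt captured from /gview (meta elements omitted).
SAMPLE = """<pdf2xml>
    <page t="0" l="0" w="595" h="842">
        <text l="188" t="99" w="213" h="27" p="188,213">Programmabureau</text>
        <text l="85" t="127" w="425" h="27" p="85,117,209,61,277,21,305,124,436,75">Nederland Open in Verbinding (NOiV)</text>
    </page>
</pdf2xml>"""


def load_words(xml_text):
    """Return one dict per word with its page index, bounding box and text."""
    words = []
    root = ET.fromstring(xml_text)
    for page_index, page in enumerate(root.iter("page")):
        for node in page.iter("text"):
            top, height = int(node.get("t")), int(node.get("h"))
            tokens = (node.text or "").split()
            numbers = [int(n) for n in node.get("p", "").split(",") if n]
            # Assumed meaning of "p": alternating left offset and width per word.
            pairs = list(zip(numbers[0::2], numbers[1::2]))
            if len(pairs) != len(tokens):
                # Guess does not fit this run: fall back to the whole-run box.
                tokens = [node.text or ""]
                pairs = [(int(node.get("l")), int(node.get("w")))]
            for token, (left, width) in zip(tokens, pairs):
                words.append({"page": page_index, "left": left, "top": top,
                              "width": width, "height": height, "text": token})
    return words


def word_at(words, page, x, y):
    """Hit-test a point (in page units) against the word boxes."""
    for w in words:
        if (w["page"] == page
                and w["left"] <= x < w["left"] + w["width"]
                and w["top"] <= y < w["top"] + w["height"]):
            return w["text"]
    return None
```

A selection could then be built by hit-testing the drag start and end points and joining the words in between; the real viewer may well do something different.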

Note that there is an open source (GPL licensed) tool called pdf2xml which has similar but not quite the same output. Here's the example from their homepage:

<?xml version="1.0" encoding="utf-8" ?>
<pdf2xml pages="3">
  <title>My Title</title>
  <page width="780" height="1152">
    <font size="10" face="MHCJMH+FuturaT-Bold" color="#FF0000">
      <text x="324" y="37" width="132" height="10">Friday, September 27, 2002</text>
      <img x="324" y="232" width="277" height="340" src="text_pic0001.png"/>
      <link x="324" y="232" width="277" height="340" dest_page="2" dest_x="141" dest_y="187"/>
    </font>
    <font size="12" face="AGaramond-Regular" italic="true" bold="true">
      <text x="509" y="68" width="121" height="12">This is a test PDF file</text>
      <link x="509" y="68" width="121" height="12" href="www.mobipocket.com"/>
    </font>
  </page>
</pdf2xml>
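For comparison, the open source format can be read in much the same way. The sketch below (Python, standard library only) flattens a trimmed copy of the homepage example into tuples of page number, font face, position, size and text. The function name `text_runs` and the tuple layout are hypothetical choices for illustration; only the element and attribute names come from the example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the pdf2xml homepage example (img and link elements omitted).
PDF2XML_SAMPLE = """<pdf2xml pages="3">
  <page width="780" height="1152">
    <font size="10" face="MHCJMH+FuturaT-Bold" color="#FF0000">
      <text x="324" y="37" width="132" height="10">Friday, September 27, 2002</text>
    </font>
    <font size="12" face="AGaramond-Regular" italic="true" bold="true">
      <text x="509" y="68" width="121" height="12">This is a test PDF file</text>
    </font>
  </page>
</pdf2xml>"""


def text_runs(xml_text):
    """Return (page, face, x, y, width, height, text) for every text element."""
    runs = []
    root = ET.fromstring(xml_text)
    for page_number, page in enumerate(root.iter("page"), start=1):
        for font in page.iter("font"):
            # Text elements inherit their font from the enclosing <font> element.
            for node in font.iter("text"):
                runs.append((page_number, font.get("face"),
                             int(node.get("x")), int(node.get("y")),
                             int(node.get("width")), int(node.get("height")),
                             node.text))
    return runs
```

The main differences from the Google excerpt are the longer attribute names, the font grouping, and the absence of anything resembling the p attribute.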

I hope this information is useful in some way. However, as one of the other posters mentioned, the only way to be sure what Google does is to ask them. It's a shame Google doesn't have an official IRC channel, but they do have a forum for Google Docs support questions.

Good luck.

197

answered Oct 11 '22 17:10

Bruce van der Kooij
