
Why does Pygments.rb not highlight <code> tags within <pre class="lang"> properly, i.e. Google Prettify-friendly tags?

I am calling it in my view like this:

<%= markdown question.body %>

This is what my ApplicationHelper looks like:

module ApplicationHelper
  # Redcarpet renderer that hands code blocks to Pygments.
  class HTMLwithPygments < Redcarpet::Render::HTML
    # Called by Redcarpet for each Markdown code block; `language` is the
    # name given after the opening fence.
    def block_code(code, language)
      Pygments.highlight(code, lexer: language)
    end
  end

  def markdown(text)
    renderer = HTMLwithPygments.new(hard_wrap: true)
    options = {
      autolink: true,
      no_intra_emphasis: true,
      fenced_code_blocks: true,
      lax_html_blocks: true,
      strikethrough: true,
      superscript: true
    }
    Redcarpet::Markdown.new(renderer, options).render(text).html_safe
  end
end

But, when it encounters tags like this:

<pre class="lang-cpp prettyprint-override">

It doesn't apply the color highlights to that code. Why is that?

P.S. This is generated, for instance, by Stack Overflow by doing this: <!-- language: lang-cpp -->

Edit 1

Or more specifically, it seems that it won't format the <code> tags that are within <pre> tags. Once <code> is not within <pre> it seems to format it fine. How do I remedy that?

Edit 2

The problem seems to be the data that Pygments.rb is acting on. It is HTML, as can be seen in this gist: https://gist.github.com/marcamillion/14fa121cf3557d38c1a8. So what I want is to have Pygments properly format the code returned in the body attribute of the object in that gist.

How do I do that?

Edit 3

This is the HTML code that I would like Pygments.rb and Redcarpet to perform syntax highlighting on:

<p>Here is a piece of C++ code that shows some very peculiar performance. For some strange reason, sorting the data miraculously speeds up the code by almost 6x:</p>

<pre class="lang-cpp prettyprint-override"><code>#include &lt;algorithm&gt;
#include &lt;ctime&gt;
#include &lt;iostream&gt;

int main()
{
    // Generate data
    const unsigned arraySize = 32768;
    int data[arraySize];

    for (unsigned c = 0; c &lt; arraySize; ++c)
        data[c] = std::rand() % 256;

    // !!! With this, the next loop runs faster
    std::sort(data, data + arraySize);

    // Test
    clock_t start = clock();
    long long sum = 0;

    for (unsigned i = 0; i &lt; 100000; ++i)
    {
        // Primary loop
        for (unsigned c = 0; c &lt; arraySize; ++c)
        {
            if (data[c] &gt;= 128)
                sum += data[c];
        }
    }

    double elapsedTime = static_cast&lt;double&gt;(clock() - start) / CLOCKS_PER_SEC;

    std::cout &lt;&lt; elapsedTime &lt;&lt; std::endl;
    std::cout &lt;&lt; "sum = " &lt;&lt; sum &lt;&lt; std::endl;
}
</code></pre>

<ul>
<li>Without <code>std::sort(data, data + arraySize);</code>, the code runs in <strong>11.54</strong> seconds.</li>
<li>With the sorted data, the code runs in <strong>1.93</strong> seconds.</li>
</ul>

<hr>

<p>Initially I thought this might be just a language or compiler anomaly. So I tried it in Java:</p>

<pre class="lang-java prettyprint-override"><code>import java.util.Arrays;
import java.util.Random;

public class Main
{
    public static void main(String[] args)
    {
        // Generate data
        int arraySize = 32768;
        int data[] = new int[arraySize];

        Random rnd = new Random(0);
        for (int c = 0; c &lt; arraySize; ++c)
            data[c] = rnd.nextInt() % 256;

        // !!! With this, the next loop runs faster
        Arrays.sort(data);

        // Test
        long start = System.nanoTime();
        long sum = 0;

        for (int i = 0; i &lt; 100000; ++i)
        {
            // Primary loop
            for (int c = 0; c &lt; arraySize; ++c)
            {
                if (data[c] &gt;= 128)
                    sum += data[c];
            }
        }

        System.out.println((System.nanoTime() - start) / 1000000000.0);
        System.out.println("sum = " + sum);
    }
}
</code></pre>

<p>with a similar but less extreme result.</p>

<hr>

<p>My first thought was that sorting brings the data into cache, but my next thought was how silly that is because the array was just generated.</p>

<p>What is going on? Why is a sorted array faster than an unsorted array? The code is summing up some independent terms, the order should not matter.</p>

You can see the current way that this particular question is being rendered at: http://boso.herokuapp.com

It is the most popular question on that site, the first one that you see. You will notice that the code simply has a grey background and is indented. There is no pretty highlighting like Pygments.rb promises and delivers on other code snippets (similar to what @rorra has illustrated in other examples in his answer).

I can't strip out the HTML, because I want to parse it properly (i.e. make sure the spacing, etc. is preserved). The only difference that I want is to get syntax highlighting on the code in the body of the question.

marcamillion asked Mar 28 '13 11:03


1 Answer

Is there something else you can add in order to reproduce the issue? Like the content of question.body?

If I do something like this on the controller:

class HomeController < ApplicationController
  def index
    @data = <<EOF
~~~ cpp
#include <fstream.h>

int main (int argc, char *argv[]) {
return(0);
}
~~~
EOF
  end
end

and then this in the view:

<pre class="lang-cpp prettyprint-override">
  <%= markdown @data %>
</pre>

it works totally fine; I can see the highlighted code without any problem. What's the content of question.body? Can you also save the rendered page from your browser and put it in a gist so we can debug?

Thx


Regarding your last comment, it's a simple CSS issue. In your stylesheet, you can add:

.code {
  color: #DD1144 !important;
}

and it will work. The problem is that you have a CSS rule written like:

pre .code {
  color: inherit;
}

and that makes the code use the color #333333 inherited from the body rule.


Here's a screenshot of how it looks with the CSS updated:

[Screenshot: the page with the updated CSS]


The sample app with your code runs totally fine. I would need a sample app, or some sample code, where we can reproduce the issue you are having (the formatted code not getting the right CSS/stylesheets).

This is an example of how the sample app looks:

[Two screenshots of the sample app]


Final edit: the problem is not the library, and it's not the way you are rendering the question; it's the content you are rendering. Check the body of your questions. This is one of the questions I got; its body is rendered exactly as the library should render it, just not the way you are expecting :)

@data = <<EOF
    <p>I've been messing around with <a href="http://en.wikipedia.org/wiki/JSON">JSON</a> for some time, just pushing it out as text and it hasn't hurt anybody (that I know of), but I'd like to start doing things properly.</p>

    <p>I have seen <em>so</em> many purported "standards" for the JSON content type:</p>

    <pre><code>application/json
    application/x-javascript
    text/javascript
    text/x-javascript
    text/x-json
    </code></pre>

    <p>But which is correct, or best? I gather that there are security and browser support issues varying between them.</p>

    <p>I know there's a similar question, <em><a href="http://stackoverflow.com/questions/404470/what-mime-type-if-json-is-being-returned-by-a-rest-api">What MIME type if JSON is being returned by a REST API?</a></em>, but I'd like a slightly more targeted answer.</p>
EOF

And this is another one I just copied and pasted from Stack Overflow, which renders with all the syntax highlighted. Do you notice the difference? The first body is already-rendered HTML, while this one is the original Markdown. So update your crawler to get the questions in the right format and it will work.

@data = <<EOF
Here is a piece of C++ code that shows some very peculiar performance. For some strange reason, sorting the data miraculously speeds up the code by almost 6x:

<!-- language: lang-cpp -->

    #include <algorithm>
    #include <ctime>
    #include <iostream>

    int main()
    {
        // Generate data
        const unsigned arraySize = 32768;
        int data[arraySize];

        for (unsigned c = 0; c < arraySize; ++c)
            data[c] = std::rand() % 256;

        // !!! With this, the next loop runs faster
        std::sort(data, data + arraySize);

        // Test
        clock_t start = clock();
        long long sum = 0;

        for (unsigned i = 0; i < 100000; ++i)
        {
            // Primary loop
            for (unsigned c = 0; c < arraySize; ++c)
            {
                if (data[c] >= 128)
                    sum += data[c];
            }
        }

        double elapsedTime = static_cast<double>(clock() - start) / CLOCKS_PER_SEC;

        std::cout << elapsedTime << std::endl;
        std::cout << "sum = " << sum << std::endl;
    }

 - Without `std::sort(data, data + arraySize);`, the code runs in **11.54** seconds.
 - With the sorted data, the code runs in **1.93** seconds.

----------

Initially I thought this might be just a language or compiler anomaly. So I tried it in Java:

<!-- language: lang-java -->

    import java.util.Arrays;
    import java.util.Random;

    public class Main
    {
        public static void main(String[] args)
        {
            // Generate data
            int arraySize = 32768;
            int data[] = new int[arraySize];

            Random rnd = new Random(0);
            for (int c = 0; c < arraySize; ++c)
                data[c] = rnd.nextInt() % 256;

            // !!! With this, the next loop runs faster
            Arrays.sort(data);

            // Test
            long start = System.nanoTime();
            long sum = 0;

            for (int i = 0; i < 100000; ++i)
            {
                // Primary loop
                for (int c = 0; c < arraySize; ++c)
                {
                    if (data[c] >= 128)
                        sum += data[c];
                }
            }

            System.out.println((System.nanoTime() - start) / 1000000000.0);
            System.out.println("sum = " + sum);
        }
    }

with a similar but less extreme result.

----------

My first thought was that sorting brings the data into cache, but my next thought was how silly that is because the array was just generated.

What is going on? Why is a sorted array faster than an unsorted array? The code is summing up some independent terms, the order should not matter.

EOF
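
If re-crawling is not an option, one possible workaround is to turn the language-tagged <pre><code> blocks in the stored HTML back into fenced blocks before calling the markdown helper, so that Redcarpet passes them to block_code with a language. The following is only a minimal sketch using the Ruby standard library. The helper name html_code_to_fenced and the regular expression are my own assumptions (they are not part of Redcarpet or Pygments.rb), and the pattern only covers the exact <pre class="lang-..."><code> shape shown in the question:

```ruby
require "cgi"

# Matches <pre class="lang-xxx ..."><code>...</code></pre> in the shape shown
# in the question. Group 1 is the language, group 2 the entity-escaped code.
PRE_CODE_BLOCK = %r{<pre class="lang-([\w+-]+)[^"]*"><code>(.*?)</code></pre>}m

# Hypothetical helper (an assumption, not a library API): rewrites each
# language-tagged <pre><code> block as a ~~~ fenced Markdown block.
# Blocks without a lang- class are left untouched.
def html_code_to_fenced(html)
  html.gsub(PRE_CODE_BLOCK) do
    language = Regexp.last_match(1)
    # Turn &lt; &gt; &amp; back into real characters for the lexer.
    code = CGI.unescapeHTML(Regexp.last_match(2)).chomp
    "\n~~~ #{language}\n#{code}\n~~~\n"
  end
end
```

In the view helper this would be used as markdown(html_code_to_fenced(question.body)). A regular expression is a blunt tool for HTML, and a snippet that itself contains a line of ~~~ would end the fence early, so treat this as a starting point rather than a finished fix.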
rorra answered Nov 02 '22 23:11