
Word Count with Ruby

I am trying to figure out a way to count the words in a particular string that contains HTML.

Example String:

<p>Hello World</p>

Is there a way in Ruby to count the words in between the p tags? Or any tag for that matter?

Examples:

<p>Hello World</p>
<h2>Hello World</h2>
<li>Hello World</li>

Thanks in advance!

Edit (here is my working code)

Controller:

class DashboardController < ApplicationController
  def index
    @pages = Page.find(:all)
    @word_count = []
  end

end

View:

<% @pages.each do |page| %>
  <% page.current_state.elements.each do |el| %>
    <% @count = Hpricot(el.description).inner_text.split.uniq.size %>
    <% @word_count << @count %>
  <% end %>

  <li><strong>Page Name: <%= page.slug %> (Word Count: <%= @word_count.inject(0) { |sum, n| sum + n } %>)</strong></li>
<% end %>
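
Note that @word_count is created once in the controller and appended to for every page, so each page's total in the view above also includes the words of the pages before it. A minimal sketch of a per-page total follows. The Element and Page structs are hypothetical stand-ins for the real models (the page.current_state step is left out), the word_count helper is an assumed name, and a regex tag strip stands in for Hpricot so the sketch runs without any gem. It counts every word rather than calling uniq.

```ruby
# Hypothetical stand-ins for the Page model and its elements,
# so the sketch runs outside Rails.
Element = Struct.new(:description)
Page    = Struct.new(:slug, :elements)

# Assumption: regex tag stripping stands in for Hpricot(html).inner_text.
def word_count(html)
  html.gsub(/<[^>]*>/, ' ').split.size
end

pages = [
  Page.new('home',  [Element.new('<p>Hello World</p>'), Element.new('<li>Hello</li>')]),
  Page.new('about', [Element.new('<h2>Hello World</h2>')])
]

# Build the total inside the loop so each page starts from zero.
counts = pages.map do |page|
  total = page.elements.inject(0) { |sum, el| sum + word_count(el.description) }
  [page.slug, total]
end

counts.each { |slug, count| puts "Page Name: #{slug} (Word Count: #{count})" }
```

In a Rails app the same idea would probably live in a model or helper method rather than in the view, but that is a design choice and not something the code above depends on.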
dennismonsewicz asked Feb 26 '23 00:02



1 Answer

Here's how you can do it:

require 'hpricot'
content = "<p>Hello World...."
doc = Hpricot(content)
doc.inner_text.split.uniq

This gives you the list of unique words (append .size to get the count):

[
  [0] "Hello",
  [1] "World"
]

(Side note: the output is formatted with awesome_print, which I warmly recommend.)
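
One thing to watch: uniq drops repeated words, so the size of the list above is the number of distinct words, not the total. A small sketch of the difference, using the example tags from the question. The words_in helper is a hypothetical name, and it strips tags with a regex only so the sketch runs without Hpricot installed. A real parser is the more robust choice for anything beyond simple markup.

```ruby
# Hypothetical helper: regex tag stripping as a stand-in for
# Hpricot(html).inner_text, so no gem is required to run this.
def words_in(html)
  html.gsub(/<[^>]*>/, ' ').split
end

html = "<p>Hello World</p><h2>Hello World</h2><li>Hello World</li>"

words = words_in(html)
total_words  = words.size       # every occurrence counts
unique_words = words.uniq.size  # each distinct word counts once

puts total_words   # 6
puts unique_words  # 2
```

If you want the total word count, leave out uniq and call size on the split result directly.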

Thibaut Barrère answered Mar 06 '23 17:03
