What's the fastest way to check if a word from one string is in another string?

Tags: performance, regex, ruby, ruby-on-rails

I have a string of words; let's call them bad:

bad = "foo bar baz"

I can keep this string as a whitespace separated string, or as a list:

bad = bad.split(" ");

If I have another string, like so:

str = "This is my first foo string"

What's the fastest way to check whether any word from the bad string appears in my comparison string, and what's the fastest way to remove that word if it's found?

#Find if a word is there
bad.split(" ").each do |word|
  found = str.include?(word)
end

#Remove the word
bad.split(" ").each do |word|
  str.gsub!(/#{word}/, "")
end


asked Mar 31 '10 02:03

Mike Trpcic

2 Answers

If the list of bad words gets huge, a hash lookup is a lot faster than one large regex alternation:

    require 'benchmark'

    bad = ('aaa'..'zzz').to_a    # 17576 words
    str = "What's the fasted way to check if any word from the bad string is within my "
    str += "comparison string, and what's the fastest way to remove said word if it's "
    str += "found"
    str *= 10

    # One regex with all 17576 words as alternatives.
    badex = /\b(#{bad.join('|')})\b/i

    # Hash used as a lookup table: one key per bad word.
    bad_hash = {}
    bad.each { |w| bad_hash[w] = true }

    n = 10
    Benchmark.bm(10) do |x|
      x.report('regex:') do
        n.times { str.gsub(badex, '').squeeze(' ') }
      end

      x.report('hash:') do
        # Note: unlike the /i regex above, this lookup is case-sensitive.
        n.times { str.gsub(/\b\w+\b/) { |word| bad_hash[word] ? '' : word }.squeeze(' ') }
      end
    end

                    user     system      total        real
    regex:     10.485000   0.000000  10.485000 ( 13.312500)
    hash:       0.000000   0.000000   0.000000 (  0.000000)
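
The same lookup idea can also cover the "is any bad word present?" half of the question. What follows is a minimal sketch, not part of the original answer: it assumes a `Set` instead of a hash, treats matching as case-insensitive, and the names `BAD_WORDS`, `contains_bad_word?`, and `remove_bad_words` are made up for illustration.

```ruby
require 'set'

# Hypothetical names for illustration; not from the original answer.
BAD_WORDS = Set.new(%w[foo bar baz])

# True if any whole word in str is in the bad-word set (case-insensitive).
def contains_bad_word?(str, bad = BAD_WORDS)
  str.scan(/\w+/).any? { |word| bad.include?(word.downcase) }
end

# Returns a copy of str with bad words removed and runs of spaces collapsed.
def remove_bad_words(str, bad = BAD_WORDS)
  str.gsub(/\w+/) { |word| bad.include?(word.downcase) ? '' : word }.squeeze(' ').strip
end
```

Because `scan(/\w+/)` yields whole words, a string such as "football" is not flagged even though it contains "foo", which is where this differs from `str.include?(word)` in the question.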


answered Sep 22 '22 23:09

steenslag

    bad = "foo bar baz"
    => "foo bar baz"
    str = "This is my first foo string"
    => "This is my first foo string"
    (str.split(' ') - bad.split(' ')).join(' ')
    => "This is my first string"
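
Array difference compares whole whitespace-separated tokens, so it is case-sensitive and will not match a word that has punctuation attached. The sketch below spells that out; the `&` intersection for the presence check is an addition that the answer itself does not show, and the variable names are illustrative.

```ruby
bad = "foo bar baz".split
str = "This is my first foo string"

words = str.split

# Presence check: a non-empty intersection means at least one bad word occurs.
found = !(words & bad).empty?

# Removal: array difference drops every token equal to a bad word.
cleaned = (words - bad).join(' ')

# Limitation: "foo," (with trailing punctuation) is a different token from "foo".
with_punctuation = ("first foo, string".split - bad).join(' ')
```

If the input may contain punctuation or mixed case, normalizing each token before comparing (for example with `downcase` and a `\w+` scan) is probably needed.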

answered Sep 23 '22 23:09

jeem
