Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm used in Ruby for "String#include?"

Is anyone able to pinpoint which algorithm is used for the include? method in Ruby? For example

"helloworld".include?("hello")
like image 416
John Rich Avatar asked Oct 17 '11 14:10

John Rich


People also ask

How do you create a string in Ruby?

A string in Ruby is an object (like most things in Ruby). You can create a string with either String::new or as literal (i.e. with the double quotes "" ). But you can also create string with the special %() syntax With the percent sign syntax, the delimiters can be any special character.

What is the sequence of Ruby string?

In Ruby, string is a sequence of one or more characters. It may consist of numbers, letters, or symbols. Here strings are the objects, and apart from other languages, strings are mutable, i.e. strings can be changed in place instead of creating new strings.

Can you index a string in Ruby?

index is a String class method in Ruby which is used to returns the index of the first occurrence of the given substring or pattern (regexp) in the given string. It specifies the position in the string to begin the search if the second parameter is present. It will return nil if not found.


2 Answers

As emboss states in his answer, String#include calls rb_str_index. This function in turn calls rb_memsearch, which implements the Rabin-Karp string search algorithm, according to this post on ruby-forum.com.

like image 131
rdvdijk Avatar answered Oct 12 '22 23:10

rdvdijk


The Ruby Language Specification doesn't prescribe any particular algorithm. Every implementation can use whatever algorithm they want.

For example, in Rubinius, String#include? calls String#find_string:

def include?(needle)
  if needle.kind_of? Fixnum
    needle = needle % 256
    str_needle = needle.chr
  else
    str_needle = StringValue(needle)
  end

  !!find_string(str_needle, 0)
end

String#find_string in turn is implemented via the string_index primitive:

def find_string(pattern, start)
  Rubinius.primitive :string_index
  raise PrimitiveFailure, "String#find_string failed"
end

The string_index primitive is implemented by the rubinius::String::index function:

// Rubinius.primitive :string_index
Fixnum* index(STATE, String* pattern, Fixnum* start);

rubinius::String::index:

Fixnum* String::index(STATE, String* pattern, Fixnum* start) {
  native_int total = size();
  native_int match_size = pattern->size();

  if(start->to_native() < 0) {
    Exception::argument_error(state, "negative start given");
  }

  switch(match_size) {
  case 0:
    return start;
  case 1:
    {
      uint8_t* buf = byte_address();
      uint8_t matcher = pattern->byte_address()[0];

      for(native_int pos = start->to_native(); pos < total; pos++) {
        if(buf[pos] == matcher) return Fixnum::from(pos);
      }
    }
    return nil<Fixnum>();
  default:
    {
      uint8_t* buf = byte_address();
      uint8_t* matcher = pattern->byte_address();

      uint8_t* last = buf + (total - match_size);
      uint8_t* pos = buf + start->to_native();

      while(pos <= last) {
        // Checking *pos directly then also checking memcmp is an
        // optimization. It's about 10x faster than just calling memcmp
        // everytime.
        if(*pos == *matcher &&
            memcmp(pos, matcher, match_size) == 0) {
          return Fixnum::from(pos - buf);
        }
        pos++;
      }
    }
    return nil<Fixnum>();
  }
}
like image 44
Jörg W Mittag Avatar answered Oct 13 '22 01:10

Jörg W Mittag