
How to find index of a substring?

Tags: elixir

Looking for an Elixir equivalent of Ruby's:

"[email protected]".index("@")         # => 9
"[email protected]".index("domain")    # => 10
Asked by qertoip, Feb 22 '16 10:02




1 Answer

TL;DR: String.index/2 is intentionally missing because smarter alternatives exist. Very often String.split/2 will solve the underlying problem, and with much better performance.
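As a minimal sketch of what "solving the underlying problem" can look like, assume the real goal is to separate the local part of an address from its domain rather than to know a position. The address below is a made-up placeholder, not the one from the question.

```elixir
# Hypothetical address, used only for illustration.
address = "user@example.com"

# parts: 2 stops splitting after the first "@", so the remainder stays in one piece.
result =
  case String.split(address, "@", parts: 2) do
    [local, domain] -> {:ok, local, domain}
    [_no_separator] -> :error
  end

IO.inspect(result)
# => {:ok, "user", "example.com"}
```

No index is computed at any point; the split hands back the two pieces directly.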

  • I assume we are talking about UTF-8 strings here and expect to deal cleanly with non-ASCII characters.

  • Elixir encourages fast code. It turns out that the problems we usually try to solve with String.index/2 can be solved in a much smarter way, vastly improving performance without degrading code readability.

  • The smarter solution is to use String.split/2 and/or other similar String module functions. String.split/2 works at the byte level while still handling graphemes correctly. It can't go wrong because both arguments are strings! A String.index/2 would have to work at the grapheme level, slowly seeking through the string.

  • For that reason String.index/2 is unlikely to be added to the language unless very compelling use cases come up that cannot be cleanly solved by existing functions.

  • See also the elixir-lang-core discussion on that matter: https://groups.google.com/forum/#!topic/elixir-lang-core/S0yrDxlJCss

  • On a side note, Elixir is pretty unique in its mature Unicode support. While most languages work at the codepoint level (colloquially "characters"), Elixir works with the higher-level concept of graphemes. Graphemes are what users perceive as one character (let's say it's a more practical understanding of a "character"). A grapheme can contain more than one codepoint, and a codepoint can in turn be encoded as more than one byte.
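The grapheme, codepoint and byte levels mentioned in the last point can be seen side by side in a short sketch. The sample text is an assumption chosen for illustration: the letter "e" followed by the combining acute accent U+0301, which a reader perceives as a single character.

```elixir
# "e" + U+0301 (combining acute accent): one perceived character.
text = "e\u0301"

IO.inspect(String.length(text))              # graphemes  => 1
IO.inspect(length(String.codepoints(text)))  # codepoints => 2
IO.inspect(byte_size(text))                  # bytes      => 3
```

An index into this string means something different at each level, which is why a hypothetical String.index/2 would have to commit to one of them.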

Finally, if we really need the index:

case String.split("[email protected]", "domain", parts: 2) do
  [left, _] -> String.length(left)
  [_] -> nil
end
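
The snippet above can be wrapped into a reusable function. The sketch below is only an illustration: the module and function names are hypothetical and not part of Elixir's standard library, and the pattern is assumed to be non-empty. It also shows Erlang's :binary.match/2 as one possible way to get a byte offset instead of a grapheme index, in case that is what the caller actually needs.

```elixir
defmodule SubstringIndex do
  # Hypothetical helper module; nothing here ships with Elixir.

  # Grapheme index of the first occurrence of `pattern`, or nil if absent.
  def grapheme_index(string, pattern) do
    case String.split(string, pattern, parts: 2) do
      [left, _rest] -> String.length(left)
      [_whole] -> nil
    end
  end

  # Byte offset of the first occurrence of `pattern`, or nil if absent.
  def byte_index(string, pattern) do
    case :binary.match(string, pattern) do
      {offset, _length} -> offset
      :nomatch -> nil
    end
  end
end

# "h\u00E9llo" holds 5 graphemes but 6 bytes, because U+00E9 takes 2 bytes in UTF-8.
IO.inspect(SubstringIndex.grapheme_index("h\u00E9llo@example.com", "@"))  # => 5
IO.inspect(SubstringIndex.byte_index("h\u00E9llo@example.com", "@"))      # => 6
```

The two results differ as soon as the text before the match contains a non-ASCII character, so which one counts as "the index" depends on what the caller does with it next.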
Answered by qertoip, Oct 15 '22 12:10