Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do you find a needle in a haystack?

People also ask

Can you actually find a needle in a haystack?

The phrase “like finding a needle in a haystack” is a similie — sometimes, the thing we're in search of is going to be so hard to find, it's functionally impossible to do so.

How do you answer a needle in a haystack?

There are at least three ways. You could jump around in the hay until you get pricked by the needle. You could spread the hay out into a thin layer and then make several passes over it with a large magnet. Or you could burn it all and then sift through the ashes.

How would you find a needle in a haystack interview?

The point is to shed a different light on the problem to find a new solution. John Lees has more straightforward suggestions: "If the needle is made of steel, a magnet will do the trick. Or you could simply burn the hay off and the needle will remain."

Why is it hard to find a needle in a haystack?

When something is very difficult to find it is like looking for a needle in a haystack. Especially because the area you have to search is too large and because of everything around it. We also say trying to find a needle in a haystack. Hay is grass which is cut, dried and used as animal food or as covering material.


Of the three, I prefer option #3.

The Single Responsibility Principle makes me not want to put searching capabilities on my DTOs or models. Their responsibility is to be data, not to find themselves, nor should needles need to know about haystacks, nor haystacks know about needles.

For what it's worth, I think it takes most OO practitioners a LONG time to understand why #3 is the best choice. I did OO for a decade, probably, before I really grokked it.

@wilhelmtell, C++ is one of the very few languages with template specialization that make such a system actually work. For most languages, a general purpose "find" method would be a HORRIBLE idea.


Usually actions should be applied to what you are doing the action on... in this case the haystack, so I think option 2 is the most appropriate.

You also have a fourth alternative that I think would be better than alternative 3:

haystack.find(needle, searcher)

In this case, it allows you to provide the manner in which you want to search as part of the action, and so you can keep the action with the object that is being operated on.


There is another alternative, which is the approach utilized by the STL of C++:

find(haystack.begin(), haystack.end(), needle)

I think it's a great example of C++ shouting "in your face!" to OOP. The idea is that OOP is not a silver bullet of any kind; sometimes things are best described in terms of actions, sometimes in terms of objects, sometimes neither and sometimes both.

Bjarne Stroustrup said in TC++PL that when you design a system you should strive to reflect reality under the constraints of effective and efficient code. For me, this means you should never follow anything blindly. Think about the things at hand (haystack, needle) and the context we're in (searching, that's what the expression is about).

If the emphasis is about the searching, then using an algorithm (action) that emphasizes searching (i.e. is flexibly to fit haystacks, oceans, deserts, linked lists). If the emphasis is about the haystack, encapsulate the find method inside the haystack object, and so on.

That said, sometimes you're in doubt and have hard times making a choice. In this case, be object oriented. If you change your mind later, I think it is easier to extract an action from an object then to split an action to objects and classes.

Follow these guidelines, and your code will be clearer and, well, more beautiful.


I would say that option 1 is completely out. The code should read in a way that tells you what it does. Option 1 makes me think that this needle is going to go find me a haystack.

Option 2 looks good if a haystack is meant to contain needles. ListCollections are always going to contain ListItems, so doing collection.find(item) is natural and expressive.

I think the introduction of a helper object is approproiate when:

  1. You don't control the implementation of the objects in question
    IE: search.find(ObsecureOSObject, file)
  2. There isn't a regular or sensible relationship between the objects
    IE: nameMatcher.find(houses,trees.name)

I am with Brad on this one. The more I work on immensely complex systems, the more I see the need to truly decouple objects. He's right. It's obvious that a needle shouldn't know anything about haystack, so 1 is definitely out. But, a haystack should know nothing about a needle.

If I were modeling a haystack, I might implement it as a collection -- but as a collection of hay or straw -- not a collection of needles! However, I would take into consideration that stuff does get lost in a haystack, but I know nothing about what exactly that stuff. I think it's better to not make the haystack look for items in itself (how smart is a haystack anyway). The right approach to me is to have the haystack present a collection of things that are in it, but are not straw or hay or whatever gives a haystack its essence.

class Haystack : ISearchableThingsOnAFarm {
   ICollection<Hay> myHay;
   ICollection<IStuffSmallEnoughToBeLostInAHaystack> stuffLostInMe;

   public ICollection<Hay> Hay {
      get {
         return myHay;
      }
   }

   public ICollection<IStuffSmallEnoughToBeLostInAHayStack> LostAndFound {
      get {
        return stuffLostInMe;
      }
   }
}

class Needle : IStuffSmallEnoughToBeLostInAHaystack {
}

class Farmer {
  Search(Haystack haystack, 
                 IStuffSmallEnoughToBeLostInAHaystack itemToFind)
}

There's actually more I was going to type and abstract into interfaces and then I realized how crazy I was getting. Felt like I was in a CS class in college... :P

You get the idea. I think going as loosely coupled as possible is a good thing, but maybe I was getting a bit carried away! :)


If both Needle and Haystack are DAOs, then options 1 and 2 are out of the question.

The reason for this is that DAOs should only be responsible for holding properties of the real world objects they are modeling, and only have getter and setter methods (or just direct property access). This makes serializing the DAOs to a file, or creating methods for a generic compare / generic copy easier to write, as the code wouldn't contain a whole bunch of "if" statements to skip these helper methods.

This just leaves option 3, which most would agree to be correct behaviour.

Option 3 has a few advantages, with the biggest advantage being unit testing. This is because both Needle and Haystack objects can be easily mocked up now, whereas if option 1 or 2 were used, the internal state of either Needle or Haystack would have to be modified before a search could be performed.

Secondly, with the searcher now in a separate class, all search code can be held in one place, including common search code. Whereas if the search code was put into the DAO, common search code would either be stored in a complicated class hierarchy, or with a Searcher Helper class anyway.