Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fastest way to bruteforce a string using a DOS wildcard

This problem is similar to blind SQL injections. The goal is to determine the exact value of a string, and the only test you can do is to see if a DOS-style wildcard (? = any character, * = any number of any characters) you specify is matched by the string. (So practically you only have access to a bool DoesWildcardMatch(string wildcard) function).

The straight-forward way is to test against a*, b*, c*... until you find the first letter, then repeat. Some optimizations I can think of:

  • search for *a*, *b* etc. to determine the character set
  • when a match on *x* is found, perform divide-et-impera (*a*x*, *b*x*, ...)
like image 649
Vladimir Panteleev Avatar asked May 14 '09 19:05

Vladimir Panteleev


People also ask

How do you brute force a string?

The simplest algorithm for string matching is a brute force algorithm, where we simply try to match the first character of the pattern with the first character of the text, and if we succeed, try to match the second character, and so on; if we hit a failure point, slide the pattern over one character and try again.


1 Answers

A first thought. You can determin the length n of the string in O(log2(n)).

  • Check Z* where Z represents k question marks starting with 0, then 1, and then doubling the number of question marks with every check until no match occurs. n must be between k / 2 and k
  • Find the exact length using the same pattern changing k in the same way as binary search does.

Knowing the exact length might help to perform a kind of divide-et-impera in the spatial domain.

UPDATE

If you know the length, you can use the same pattern to correctly locate a symbol.

Example:

    ..X. ..XX (spaces added for readability)

                              + symbol may be X
                              - symbol is not X
                              X symbol is X

    *X*         => MATCH      ++++ ++++
    *X*   ????  => MATCH      ++++ ++++
    *X*?? ????  => NO MATCH   --++ ++++
    ??X?  ????  => MATCH      --X+ ++++
    ??XX  ????  => NO MATCH   --X- ++++
    ??X?  *X*?? => NO MATCH   --X- --++
    ??X?  ??X?  => MATCH      --X- --X+
    ??X?  ??XX  => MATCH      --X- --XX

For string length n and alphabet size m this will take about O(log2(n)) to find the length of the string, about O(n • log2(n)) to correctly place n symbols, and O(m) to find the used symbols - summing all together yields O(n • log2(n) + m).

I could imagine that it is possible to speed this up by merging several steps - maybe test for used symbols while determining the string length or simultaneously locating two (or even more?) symbols in the first and second half of the string. This will require to recheck the merged steps in isolation if the check fails in order to determine which check faild. But as long as the merged check succeeds, you gain information on both.

Maybe I will calculate that tomorrow in order to see if it will really speed the thing up.

like image 90
Daniel Brückner Avatar answered Sep 17 '22 12:09

Daniel Brückner