Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Which algorithm is used to implement search for files in Visual Studio Code / Google Chrome Developer / Sublime (Ctrl+p or Cmd+p)?

I am very impressed with the powerful search feature of Sublime which we later found in Visual Studio Code and Google Chrome Developer.

A very basic algorithm for searching may use Trie, I guess but this search for files of Sublime etc seems like some sort of a multi-direction Trie (if there is such a thing!) i.e. if you have a file name like:

"I-am-a-very-big-beautifully-created-file-and-something-else.js"

and you search for "created file", "file created", "something-beautifully", "else big", "big else" or any other combination of strings from that file's name, Sublime and Visual Studio code will find it and other files with similar names, in no time. (Google Chrome Developer version isn't very powerful though but that's not the point here).

So, I dug through the source code of Visual Studio for a little bit but still couldn't figure out how the search is implemented and which algorithm is used. I am not looking for its code. Just need to understand a high level theory of how this powerful feature, that saves us developers a lot of time, is implemented.

like image 275
Usman Avatar asked Mar 06 '18 19:03

Usman


3 Answers

The VS Code functionality you're looking for can be tracked by looking into what happens when this function is called:

https://github.com/microsoft/vscode/blob/master/src/vs/workbench/contrib/search/browser/anythingQuickAccess.ts#L365

an interesting part here (filter on the first term only , then on the whole filter):

https://github.com/microsoft/vscode/blob/d1220da07267242e9b58193da0ded43bbe7f2aa5/src/vs/workbench/contrib/search/browser/anythingQuickAccess.ts#L570

If you dig from there you will find that most of the useful stuff is regrouped in this file:

https://github.com/microsoft/vscode/blob/master/src/vs/base/common/fuzzyScorer.ts

Some notes:

The fuzzy search is pretty strict, the filter will not match if a single letter is not found or if the letters are not found in the right order. A key advantage is that you can highlight the matching part in the results and it makes pretty clear what's happening to your users.

The fuzzy scorer scores results based on a matching label/description[/path] (for a file, label = filename, description = path without name).

like image 88
Guillaume86 Avatar answered Dec 01 '22 22:12

Guillaume86


This looks like the code that DevTools uses for its fuzzy search in the Command Menu.

https://cs.chromium.org/chromium/src/third_party/WebKit/Source/devtools/front_end/quick_open/CommandMenu.js?l=174

And the underlying diff algorithm is:

https://cs.chromium.org/chromium/src/third_party/WebKit/Source/devtools/front_end/diff/Diff.js?q=Diff.Diff&dr=CSs&l=4

like image 32
Kayce Basques Avatar answered Dec 01 '22 21:12

Kayce Basques


I think the name is "fuzzy search" lol. Here's a nice github repo where some kind person released an open source JS version that is a lot easier to include in your code than the VS Code version because it's on npm and it's just one file:

https://github.com/farzher/fuzzysort

like image 27
user3413723 Avatar answered Dec 01 '22 22:12

user3413723