
How to remove all sub paths?

Tags:

c++

I am new to C++ (and I think there should be a much shorter way to do this), but here is what I need to do:

I have a vector of paths to folders:

C:\\Sessions\\MyFolder 
C:\\Sessions\\Calib 
C:\\Sessions\\Calib\\2020_04_30_18_02 
C:\\Sessions\\Calib\\2020_04_30_18_03 
C:\\Sessions\\Calib\\2020_04_30_18_02\\test 
C:\\Sessions\\Calib\\777\\folder 

I need to remove every path that is a sub path of a longer path in the list. That means I should end up with this result:

C:\\Sessions\\MyFolder 
C:\\Sessions\\Calib\\2020_04_30_18_03 
C:\\Sessions\\Calib\\2020_04_30_18_02\\test 
C:\\Sessions\\Calib\\777\\folder 

These paths were removed:

C:\\Sessions\\Calib\\2020_04_30_18_02 
C:\\Sessions\\Calib

This is because the path C:\\Sessions\\Calib is a sub path of C:\\Sessions\\Calib\\2020_04_30_18_02, and the path C:\\Sessions\\Calib\\2020_04_30_18_02 is in turn a sub path of C:\\Sessions\\Calib\\2020_04_30_18_02\\test.

The path C:\\Sessions\\Calib\\2020_04_30_18_02\\test is not a sub path of any other path in the list, so we keep it.

I wrote this method:

/*static*/ std::vector<std::string> Utils::remove_sub_folders(std::vector<std::string> folderPaths_in)
    {
        std::vector<std::string> result;
        std::vector<std::string> executed_paths;
        std::vector<std::string> folder_paths = folderPaths_in;
        std::sort(folder_paths.begin(), folder_paths.end());

        for each (auto & path in folder_paths)
        {
            std::vector<std::string> path_split = Utils::split(path, "\\");

            for each (auto & compare_path in folder_paths)
            {
                bool is_have_alredy_processed = false;

                for each (auto & processed_path in executed_paths)
                {
                    if (compare_path == processed_path)
                    {
                        is_have_alredy_processed = true;
                        break;
                    }
                }

                if (!is_have_alredy_processed)
                {
                    std::vector<std::string> compare_path_split = Utils::split(compare_path, "\\");

                    int path_split_size = static_cast<int>(path_split.size());
                    int path_compare_size = static_cast<int>(compare_path_split.size());

                    if (path != compare_path && path_compare_size >= path_split_size)
                    {
                        int min_size = min(path_split_size, path_compare_size);

                        bool is_equal_begin = true;

                        for (int i = 0; i < min_size; i++)
                        {
                            std::string path_word = path_split[i];
                            std::string compare_word = compare_path_split[i];

                            if (path_word != compare_word)
                            {
                                is_equal_begin = false;
                                break;
                            }
                        }

                        if ((!is_equal_begin && path_split_size == path_compare_size))
                        {
                            result.push_back(path);
                        }
                    }
                }
            }

            executed_paths.push_back(path);
        }

        return result;
    }

I can't understand what I am missing, but the actual result I get is:

C:\Sessions\Calib
C:\Sessions\Calib\2020_04_30_18_02
C:\Sessions\Calib\2020_04_30_18_02\test

This is how I call this method:

std::vector<std::string> all_dirs_by_path{
            "C:\\Sessions\\MyFolder",
            "C:\\Sessions\\Calib",
            "C:\\Sessions\\Calib\\2020_04_30_18_02",
            "C:\\Sessions\\Calib\\2020_04_30_18_03",
            "C:\\Sessions\\Calib\\2020_04_30_18_02\\test",
            "C:\\Sessions\\Calib\\777\\folder",
        };
std::vector<std::string> final_dirs_by_path = Utils::remove_sub_folders(all_dirs_by_path);

What am I doing wrong?

Aleksey Timoshchenko asked Feb 27 '26 11:02



1 Answer

You start by sorting the given paths. However, the resulting order is probably not what you expect. The reason is that sorting std::strings does not treat the directory separator \ in any special way: it is compared like any other character. In ASCII, \ sorts later than digits and upper case characters, but earlier than lower case characters, so a sibling folder can end up between a parent and its child. What you want is a comparison function that is a bit smarter about paths and sorts them by looking at each path component separately.
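To make the ordering problem concrete, here is a minimal sketch. The helper names and the three sample folders are hypothetical and not taken from the question; they are chosen so that an upper case letter competes with the separator at the same position.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sorts the paths as plain strings, character by character, which is
// what std::sort does by default for std::string.
std::vector<std::string> sort_as_strings(std::vector<std::string> paths) {
    std::sort(paths.begin(), paths.end());
    return paths;
}

// Hypothetical sample input: a folder, its child, and an unrelated sibling
// whose name merely starts with the same letters.
std::vector<std::string> sample_paths() {
    return {"C:\\Foo\\Bar", "C:\\FooBar", "C:\\Foo"};
}
```

Calling sort_as_strings(sample_paths()) places C:\FooBar between C:\Foo and C:\Foo\Bar, because 'B' compares lower than '\'. The parent and its child are no longer adjacent, which is what breaks any algorithm that relies on adjacency.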

If you can use C++17's filesystem library (or the Boost filesystem library), then you should create a vector of std::filesystem::path objects. These have an operator<() that compares paths component by component, which is the order you expect. Once you sort those, you get a nice property: every parent path comes before its child paths, and nothing unrelated can end up between a path and its descendants:

C:\Foo
C:\Foo\Bar
C:\Foo\Baz
C:\Quux
...
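The ordering property can be checked with a small sketch. The function name and the sample folders are hypothetical. The sample uses forward slashes on the assumption that the code should behave the same on POSIX systems, where std::filesystem does not treat a backslash as a separator; on Windows, backslashes work as well.

```cpp
#include <algorithm>
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

// Sorts with fs::path's own operator<, which compares the paths
// component by component instead of character by character.
std::vector<fs::path> sort_as_paths(std::vector<fs::path> paths) {
    std::sort(paths.begin(), paths.end());
    return paths;
}

// Hypothetical sample input. Sorted as plain strings, "C:/Foo.bak" would
// land between the other two, because '.' compares lower than '/'.
std::vector<fs::path> sample_paths() {
    return {"C:/Foo/Bar", "C:/Foo.bak", "C:/Foo"};
}
```

With sort_as_paths(sample_paths()), C:/Foo is followed directly by C:/Foo/Bar, and C:/Foo.bak comes last, so the parent and child stay adjacent.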

After that it's quite easy: traverse the sorted list and, for each path, check whether the next path in the list is a child (or deeper descendant) of the current one. If so, you can discard the current path.

#include <algorithm>
#include <filesystem>
#include <iterator>
#include <vector>

namespace fs = std::filesystem; // or use boost::filesystem here

// True if every component of `parent` matches the start of `child`
// and `child` has at least one more component.
static bool is_parent(const fs::path &parent, const fs::path &child) {
    auto parent_size = std::distance(parent.begin(), parent.end());
    auto child_size = std::distance(child.begin(), child.end());
    return child_size > parent_size && std::equal(parent.begin(), parent.end(), child.begin());
}

// Keeps only the paths that are not a parent of another path in the list.
std::vector<fs::path> remove_sub_folders(std::vector<fs::path> paths) {
   std::sort(paths.begin(), paths.end());
   std::vector<fs::path> result;

   for(auto cur = paths.begin(); cur != paths.end(); ++cur) {
       auto next = std::next(cur);
       // After sorting, a descendant of *cur (if any) directly follows it.
       if (next == paths.end() || !is_parent(*cur, *next))
           result.push_back(*cur);
   }

   return result;
}

If you want to do it with std::string, then you have to write a custom comparison function that does what std::filesystem::path::operator<() does, and pass that to std::sort(). To check if one is a parent of the other, ensure you have a trailing directory separator in the parent path, then just check if the child path starts with the parent path.
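The std::string approach could be sketched as follows. This is only one possible way to do it, under the assumption that the separator is always a backslash; split_components and path_less are hypothetical helper names, and the comparison is a simplified stand-in for what std::filesystem::path does, not a full reimplementation.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: splits a path into its components on '\'.
std::vector<std::string> split_components(const std::string &path) {
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (start <= path.size()) {
        auto end = path.find('\\', start);
        if (end == std::string::npos) end = path.size();
        if (end > start) parts.push_back(path.substr(start, end - start));
        start = end + 1;
    }
    return parts;
}

// Compares two paths component by component rather than character by character.
bool path_less(const std::string &a, const std::string &b) {
    return split_components(a) < split_components(b);
}

// Appends a trailing separator to the parent if needed, then checks
// whether the child starts with it.
bool is_parent(std::string parent, const std::string &child) {
    if (parent.empty() || parent.back() != '\\') parent += '\\';
    return child.compare(0, parent.size(), parent) == 0;
}

std::vector<std::string> remove_sub_folders(std::vector<std::string> paths) {
    std::sort(paths.begin(), paths.end(), path_less);
    std::vector<std::string> result;
    for (auto cur = paths.begin(); cur != paths.end(); ++cur) {
        auto next = std::next(cur);
        if (next == paths.end() || !is_parent(*cur, *next))
            result.push_back(*cur);
    }
    return result;
}
```

For the six paths from the question, this keeps the four expected ones, returned in sorted order rather than in input order. Splitting inside the comparison is simple but repeats work on every comparison; for long lists it would be cheaper to split each path once before sorting.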

G. Sliepen answered Mar 02 '26 00:03



