Parsing binary file too slow in C++ using memory-mapped files

I'm trying to parse a binary file integer by integer in order to check whether each value fulfills a certain condition, but the loop is very slow.

I read that memory-mapped files are the fastest way to get a file into memory, hence I'm using the following Boost-based code:

unsigned long long int get_file_size(const char *file_path) {
    const filesystem::path file{file_path};
    const auto generic_path = file.generic_path();
    return filesystem::file_size(generic_path);
}

boost::iostreams::mapped_file_source read_bytes(const char *file_path,
                                         const unsigned long long int offset,
                                         const unsigned long long int length) {
    boost::iostreams::mapped_file_params parameters;
    parameters.path = file_path;
    parameters.length = static_cast<size_t>(length);
    parameters.flags = boost::iostreams::mapped_file::mapmode::readonly;
    parameters.offset = static_cast<boost::iostreams::stream_offset>(offset);

    boost::iostreams::mapped_file_source file;

    file.open(parameters);
    return file;
}

boost::iostreams::mapped_file_source read_bytes(const char *file_path) {
    const auto file_size = get_file_size(file_path);
    const auto mapped_file_source = read_bytes(file_path, 0, file_size);
    return mapped_file_source;
}

My test case roughly looks as follows:

inline auto test_parsing_binary_file_performance() {
    const auto start_time = get_time();
    const std::filesystem::path input_file_path = "...";
    const auto mapped_file_source = read_bytes(input_file_path.string().c_str());
    const auto file_buffer = mapped_file_source.data();
    const auto file_buffer_size = mapped_file_source.size();
    LOG_S(INFO) << "File buffer size: " << file_buffer_size;
    auto printed_lap = (long) (file_buffer_size / (double) 1000);
    printed_lap = round_to_nearest_multiple(printed_lap, sizeof(int));
    LOG_S(INFO) << "Printed lap: " << printed_lap;
    std::vector<int> values;
    values.reserve(file_buffer_size / sizeof(int)); // Pre-allocate a large enough vector
    // Iterate over every complete integer in the buffer
    for (std::size_t file_buffer_index = 0; file_buffer_index + sizeof(int) <= file_buffer_size; file_buffer_index += sizeof(int)) {
        int value;
        std::memcpy(&value, &file_buffer[file_buffer_index], sizeof(value)); // Avoids the aliasing-violating pointer cast
        if (value >= 0x30000000 && value < 0x49000000 - static_cast<int>(sizeof(int)) + 1) {
            values.push_back(value);
        }

        if (file_buffer_index % printed_lap == 0) {
            LOG_S(INFO) << std::setprecision(4) << file_buffer_index / (double) file_buffer_size * 100 << "%";
        }
    }

    LOG_S(INFO) << "Values found count: " << values.size();

    print_time_taken(start_time, false, "Parsing binary file");
}

Mapping the file finishes almost instantly, as expected, but iterating over it integer by integer is far too slow on my machine despite excellent hardware (SSD etc.):

2020-12-20 13:04:35.124 (   0.019s) [main thread     ]Tests.hpp:387   INFO| File buffer size: 419430400
2020-12-20 13:04:35.124 (   0.019s) [main thread     ]Tests.hpp:390   INFO| Printed lap: 419432
2020-12-20 13:04:35.135 (   0.029s) [main thread     ]Tests.hpp:405   INFO| 0%
2020-12-20 13:04:35.171 (   0.065s) [main thread     ]Tests.hpp:405   INFO| 0.1%
2020-12-20 13:04:35.196 (   0.091s) [main thread     ]Tests.hpp:405   INFO| 0.2%
2020-12-20 13:04:35.216 (   0.111s) [main thread     ]Tests.hpp:405   INFO| 0.3%
2020-12-20 13:04:35.241 (   0.136s) [main thread     ]Tests.hpp:405   INFO| 0.4%
2020-12-20 13:04:35.272 (   0.167s) [main thread     ]Tests.hpp:405   INFO| 0.5%
2020-12-20 13:04:35.293 (   0.188s) [main thread     ]Tests.hpp:405   INFO| 0.6%
2020-12-20 13:04:35.314 (   0.209s) [main thread     ]Tests.hpp:405   INFO| 0.7%
2020-12-20 13:04:35.343 (   0.237s) [main thread     ]Tests.hpp:405   INFO| 0.8%
2020-12-20 13:04:35.366 (   0.261s) [main thread     ]Tests.hpp:405   INFO| 0.9%
2020-12-20 13:04:35.399 (   0.293s) [main thread     ]Tests.hpp:405   INFO| 1%
2020-12-20 13:04:35.421 (   0.315s) [main thread     ]Tests.hpp:405   INFO| 1.1%
2020-12-20 13:04:35.447 (   0.341s) [main thread     ]Tests.hpp:405   INFO| 1.2%
2020-12-20 13:04:35.468 (   0.362s) [main thread     ]Tests.hpp:405   INFO| 1.3%
2020-12-20 13:04:35.487 (   0.382s) [main thread     ]Tests.hpp:405   INFO| 1.4%
2020-12-20 13:04:35.520 (   0.414s) [main thread     ]Tests.hpp:405   INFO| 1.5%
2020-12-20 13:04:35.540 (   0.435s) [main thread     ]Tests.hpp:405   INFO| 1.6%
2020-12-20 13:04:35.564 (   0.458s) [main thread     ]Tests.hpp:405   INFO| 1.7%
2020-12-20 13:04:35.586 (   0.480s) [main thread     ]Tests.hpp:405   INFO| 1.8%
2020-12-20 13:04:35.608 (   0.503s) [main thread     ]Tests.hpp:405   INFO| 1.9%
2020-12-20 13:04:35.636 (   0.531s) [main thread     ]Tests.hpp:405   INFO| 2%
2020-12-20 13:04:35.658 (   0.552s) [main thread     ]Tests.hpp:405   INFO| 2.1%
2020-12-20 13:04:35.679 (   0.574s) [main thread     ]Tests.hpp:405   INFO| 2.2%
2020-12-20 13:04:35.702 (   0.597s) [main thread     ]Tests.hpp:405   INFO| 2.3%
2020-12-20 13:04:35.727 (   0.622s) [main thread     ]Tests.hpp:405   INFO| 2.4%
2020-12-20 13:04:35.769 (   0.664s) [main thread     ]Tests.hpp:405   INFO| 2.5%
2020-12-20 13:04:35.802 (   0.697s) [main thread     ]Tests.hpp:405   INFO| 2.6%
2020-12-20 13:04:35.831 (   0.726s) [main thread     ]Tests.hpp:405   INFO| 2.7%
2020-12-20 13:04:35.860 (   0.754s) [main thread     ]Tests.hpp:405   INFO| 2.8%
2020-12-20 13:04:35.887 (   0.781s) [main thread     ]Tests.hpp:405   INFO| 2.9%
2020-12-20 13:04:35.924 (   0.818s) [main thread     ]Tests.hpp:405   INFO| 3%
2020-12-20 13:04:35.956 (   0.850s) [main thread     ]Tests.hpp:405   INFO| 3.1%
2020-12-20 13:04:35.998 (   0.893s) [main thread     ]Tests.hpp:405   INFO| 3.2%
2020-12-20 13:04:36.033 (   0.928s) [main thread     ]Tests.hpp:405   INFO| 3.3%
2020-12-20 13:04:36.060 (   0.955s) [main thread     ]Tests.hpp:405   INFO| 3.4%
2020-12-20 13:04:36.102 (   0.997s) [main thread     ]Tests.hpp:405   INFO| 3.5%
2020-12-20 13:04:36.132 (   1.026s) [main thread     ]Tests.hpp:405   INFO| 3.6%
...
2020-12-20 13:05:03.456 (  28.351s) [main thread     ]Tests.hpp:410   INFO| Values found count: 10650389
2020-12-20 13:05:03.456 (  28.351s) [main thread     ]          benchmark.cpp:31    INFO| Parsing binary file took 28.341 second(s)

Parsing those 419 MB always takes around 28 to 70 seconds. Even compiling in Release mode does not really help. Is there any way to cut this time down? The operation I'm performing doesn't seem like it should be that inefficient.

Note that I'm compiling for Linux 64-bit using GCC 10.

EDIT:
As suggested in the comments, I also tried a memory-mapped file with advise(), but that does not improve the performance either:

boost::interprocess::file_mapping file_mapping(input_file_path.string().data(), boost::interprocess::read_only);
boost::interprocess::mapped_region mapped_region(file_mapping, boost::interprocess::read_only);
mapped_region.advise(boost::interprocess::mapped_region::advice_sequential);
const auto file_buffer = (char *) mapped_region.get_address();
const auto file_buffer_size = mapped_region.get_size();
...

Lessons learned so far by taking into account the comments/answers:

Using advise(boost::interprocess::mapped_region::advice_sequential) does not help
Not calling reserve() or calling it with exactly the right size can double the performance
Iterating directly on int * is a bit slower than iterating on a char *
Using a std::set is a bit slower than a std::vector for collecting the results
The progress logging is insignificant for the performance
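
To check how much time the loop itself costs, independent of page faults, the filter can be pulled out into a function that works on any buffer already in memory. This is only a minimal sketch, assuming the same range check as in the test case above; `collect_values_in_range` is a hypothetical name that does not appear in the original code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: collects every int in [lower, upper) from a raw byte buffer.
inline std::vector<int> collect_values_in_range(const char *buffer, std::size_t buffer_size,
                                                int lower, int upper) {
    assert(lower <= upper);
    std::vector<int> values;
    // Only complete integers are read; trailing bytes that do not form an int are ignored
    for (std::size_t index = 0; index + sizeof(int) <= buffer_size; index += sizeof(int)) {
        int value;
        std::memcpy(&value, buffer + index, sizeof(value)); // Well-defined even for unaligned data
        if (value >= lower && value < upper) {
            values.push_back(value);
        }
    }
    return values;
}
```

Timing this function on a buffer that is already fully resident (for example a std::vector filled beforehand) should show whether the filter or the file access dominates the runtime.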

asked Dec 20 '20 10:12

BullyWiiPlaza

1 Answer

As hinted by xanatos, the performance of memory-mapped files is deceptive: mapping a file does not read it into memory right away. Pages are loaded lazily, so every first access to a page that is not yet resident triggers a page fault and possibly a disk access during processing, which severely degrades the performance.

In this case it is more efficient to read the entire file into memory first and then iterate over that buffer:

inline std::vector<std::byte> load_file_into_memory(const std::filesystem::path &file_path) {
    std::ifstream input_stream(file_path, std::ios::binary | std::ios::ate);

    if (input_stream.fail()) {
        const auto error_message = "Opening " + file_path.string() + " failed";
        throw std::runtime_error(error_message);
    }

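    // The stream was opened with std::ios::ate, so tellg() yields the end-of-file position
    const auto end_position = input_stream.tellg();
    input_stream.seekg(0, std::ios::beg);

    // File size = end position minus start position
    const auto file_size = static_cast<std::size_t>(end_position - input_stream.tellg());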
    if (file_size == 0) {
        return {};
    }

    std::vector<std::byte> buffer(file_size);

    if (!input_stream.read((char *) buffer.data(), buffer.size())) {
        const auto error_message = "Reading from " + file_path.string() + " failed";
        throw std::runtime_error(error_message);
    }

    return buffer;
}

Now the performance is much more acceptable with roughly 3 - 15 seconds in total.
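
An alternative that keeps the memory mapping is to ask the kernel to load the pages up front instead of faulting them in one by one. The following is only a sketch, not something I have benchmarked for this case: it uses the POSIX `mmap` call directly rather than Boost and assumes Linux, since `MAP_POPULATE` is Linux-specific. `MappedFile`, `map_file_prefaulted` and `unmap_file` are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical plain holder for a read-only mapping (no RAII, see unmap_file below)
struct MappedFile {
    const char *data = nullptr;
    std::size_t size = 0;
};

inline MappedFile map_file_prefaulted(const std::string &path) {
    const int descriptor = ::open(path.c_str(), O_RDONLY);
    if (descriptor == -1) {
        throw std::runtime_error("Opening " + path + " failed");
    }

    struct stat file_status{};
    if (::fstat(descriptor, &file_status) == -1) {
        ::close(descriptor);
        throw std::runtime_error("Querying the size of " + path + " failed");
    }
    assert(file_status.st_size >= 0);

    MappedFile result;
    result.size = static_cast<std::size_t>(file_status.st_size);
    if (result.size == 0) {
        ::close(descriptor); // mmap rejects zero-length mappings
        return result;
    }

    // MAP_POPULATE asks the kernel to read the file ahead while creating the mapping
    void *address = ::mmap(nullptr, result.size, PROT_READ, MAP_PRIVATE | MAP_POPULATE, descriptor, 0);
    ::close(descriptor); // The mapping stays valid after the descriptor is closed
    if (address == MAP_FAILED) {
        throw std::runtime_error("Mapping " + path + " failed");
    }

    // Purely a hint; the return value is deliberately ignored
    ::madvise(address, result.size, MADV_SEQUENTIAL);

    result.data = static_cast<const char *>(address);
    return result;
}

inline void unmap_file(const MappedFile &file) {
    if (file.data != nullptr) {
        ::munmap(const_cast<char *>(file.data), file.size);
    }
}
```

With this approach the cost of reading the file moves into the `mmap` call itself, so the mapping no longer returns instantly, but the subsequent loop should run over pages that are already resident.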

answered Sep 19 '22 15:09

BullyWiiPlaza

Tags:

c++

optimization

micro-optimization

memory-mapped-files
