
Comparing two files in C++

Tags:

c++

file

windows

I have a function that compares two files to see if they are the same. It reads both files byte by byte and checks that each pair of bytes matches.
The problem I'm having now is that for big files this function takes quite a long time.

Is there a better, faster way to check whether two files are the same?

discodowney asked Feb 22 '23 07:02



2 Answers

When your files are not the same, are they likely to be of the same size? If not, you can determine the file sizes right away (fseek to the end, ftell to determine the position), and if they're different then you know they're not the same without comparing the data. If the size is the same, remember to fseek back to the beginning.
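A minimal sketch of that size check, assuming C stdio as described above. The helper names `size_of_open_file` and `same_size` are made up for illustration. Note that `ftell` returns a `long`, which is 32 bits on Windows, so very large files would need a 64-bit variant of the call.

```cpp
#include <cstdio>

// Hypothetical helper: returns the size of an open file in bytes, or -1 on
// error. Seeks back to the beginning so the caller can start reading data.
long size_of_open_file(std::FILE* f) {
    if (std::fseek(f, 0, SEEK_END) != 0) return -1;
    const long size = std::ftell(f);
    if (std::fseek(f, 0, SEEK_SET) != 0) return -1;
    return size;
}

// Hypothetical helper: true only when both files open and report equal sizes.
bool same_size(const char* path_a, const char* path_b) {
    std::FILE* a = std::fopen(path_a, "rb");
    std::FILE* b = std::fopen(path_b, "rb");
    bool result = false;
    if (a && b) {
        const long size_a = size_of_open_file(a);
        const long size_b = size_of_open_file(b);
        result = size_a >= 0 && size_a == size_b;
    }
    if (a) std::fclose(a);
    if (b) std::fclose(b);
    return result;
}
```

If `same_size` returns false you can stop there. If it returns true, the contents still have to be compared.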

If you read your files into large memory buffers and compare each buffer using memcmp(), you will improve performance. You don't have to read the entire file at once: pick a large buffer size and, on each iteration of your loop, read one block of that size from each file and compare the two blocks. A typical memcmp implementation compares a machine word or more at a time rather than single bytes, and reading in blocks avoids the per-byte overhead of the read calls.
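Here is a rough sketch of the block-wise loop, assuming C++ streams. The function name `files_equal` and the 64 KiB default buffer size are my own choices for illustration, so tune the size for your workload.

```cpp
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: compares two files block by block with memcmp.
// Returns false if either file cannot be opened. A read error is not
// distinguished from end of file in this sketch.
bool files_equal(const std::string& path_a, const std::string& path_b,
                 std::size_t buffer_size = 1 << 16) {
    if (buffer_size == 0) return false;  // a zero-sized buffer cannot make progress

    std::ifstream a(path_a, std::ios::binary);
    std::ifstream b(path_b, std::ios::binary);
    if (!a || !b) return false;

    std::vector<char> buf_a(buffer_size);
    std::vector<char> buf_b(buffer_size);
    for (;;) {
        a.read(buf_a.data(), static_cast<std::streamsize>(buffer_size));
        b.read(buf_b.data(), static_cast<std::streamsize>(buffer_size));
        const std::streamsize got_a = a.gcount();
        const std::streamsize got_b = b.gcount();

        if (got_a != got_b) return false;  // one file ended before the other
        if (got_a == 0) return true;       // both files ended with no mismatch
        if (std::memcmp(buf_a.data(), buf_b.data(),
                        static_cast<std::size_t>(got_a)) != 0) {
            return false;                  // contents differ within this block
        }
    }
}
```

Calling `files_equal("a.bin", "b.bin")` uses the default buffer. The loop stops at the first mismatching block, so files that differ early are rejected quickly.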

mah answered Mar 02 '23 01:03



If you really want a brute-force comparison of two files, memory-mapping them (mmap on POSIX systems, CreateFileMapping/MapViewOfFile on Windows) may help.

If you know the structure of the files you are reading, first read the sections that let you tell them apart quickly (e.g. a header and relevant chunks/sections). Of course, you will want to check their basic attributes, such as size, before comparing any data.

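If you compare the same files more than once, generate a hash (or some other digest) of each file once and compare those instead of re-reading the data. Differing hashes prove the files differ; matching hashes still need a byte-level comparison if you must be certain.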
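As an illustration of the hashing idea, here is a sketch that uses 64-bit FNV-1a. That is a simple non-cryptographic hash I picked only to keep the example self-contained, not something this answer prescribes. The name `hash_file` is made up. Cache the result per file and compare the cached values.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: computes a 64-bit FNV-1a hash of a file's contents,
// reading it in chunks. Returns false if the file cannot be opened;
// otherwise stores the hash in `out` and returns true.
bool hash_file(const std::string& path, std::uint64_t& out) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;

    std::uint64_t hash = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    const std::uint64_t prime = 1099511628211ULL;  // FNV-1a 64-bit prime
    std::vector<char> buffer(1 << 16);

    for (;;) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::streamsize got = in.gcount();
        if (got <= 0) break;  // end of file (read errors are not distinguished)
        for (std::streamsize i = 0; i < got; ++i) {
            // XOR in one byte, then multiply by the prime (the FNV-1a order).
            hash ^= static_cast<unsigned char>(buffer[static_cast<std::size_t>(i)]);
            hash *= prime;
        }
    }
    out = hash;
    return true;
}
```

Hashing still reads every byte, so it only pays off when each file takes part in several comparisons. For a one-off comparison, comparing the data directly is cheaper.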

justin answered Mar 01 '23 23:03
