How would you implement tail efficiently?

What is an efficient way to implement tail on *NIX? I came up with (and wrote) two simple solutions, both using a kind of circular buffer to hold the lines (an array, or a doubly linked circular list, for fun). I've seen part of an older implementation in busybox and, from what I understood, it used fseek to find EOF and then read the data "backwards". Is there anything cleaner and faster out there? I was asked this in an interview and the interviewer did not look satisfied. Thank you in advance.

814

asked Apr 15 '12 18:04

Tomas Pruzina

3 Answers

I don't think there are solutions other than "keep the latest N lines while reading the data forward" or "start from the end and go backwards until you reach the Nth line".

The point is that you'd use one or the other depending on the context.

The "go to the end and go backwards" approach is better when tail accesses a random-access file, or when the data is small enough to fit in memory. In this case the runtime is minimized, since you scan only the data that has to be output (so it's "optimal").

Your solution (keep the N latest lines) is better when tail is fed by a pipeline or when the data is huge. In this case, the other solution wastes too much memory, so it is not practical, and if the source is slower than tail (which is likely), scanning the whole input doesn't matter that much.
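To make the "keep the N latest lines" idea concrete, here is a minimal sketch of a forward-reading ring buffer. It assumes a POSIX system (it relies on `getline`), and the names `tail_stream` and `struct slot` are made up for illustration; this is not how any particular tail is written.

```c
#define _POSIX_C_SOURCE 200809L /* getline() and ssize_t are POSIX */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* One ring slot: a heap buffer filled by getline() plus the line length. */
struct slot {
    char   *buf;
    size_t  cap;
    ssize_t len;
};

/* Hypothetical helper: copy the last n lines of `in` to `out`, reading
 * forward only. Returns the number of lines written, or -1 if the ring
 * cannot be allocated. */
long tail_stream(FILE *in, FILE *out, size_t n)
{
    assert(in != NULL && out != NULL);
    if (n == 0)
        return 0;

    struct slot *ring = calloc(n, sizeof *ring); /* all slots start empty */
    if (ring == NULL)
        return -1;

    struct slot cur = { NULL, 0, 0 }; /* scratch buffer for the next line */
    size_t total = 0;                 /* lines read so far */

    while ((cur.len = getline(&cur.buf, &cur.cap, in)) != -1) {
        /* Swap the new line into the oldest slot; the evicted buffer
         * becomes the scratch buffer, so allocations are reused. */
        struct slot evicted = ring[total % n];
        ring[total % n] = cur;
        cur = evicted;
        total++;
    }

    /* The oldest surviving line sits at index (total - count) % n. */
    size_t count = total < n ? total : n;
    for (size_t i = total - count; i < total; i++)
        fwrite(ring[i % n].buf, 1, (size_t)ring[i % n].len, out);

    for (size_t i = 0; i < n; i++)
        free(ring[i].buf); /* free(NULL) is a no-op for unused slots */
    free(cur.buf);
    free(ring);
    return (long)count;
}
```

Calling `tail_stream(stdin, stdout, 10)` would behave roughly like `tail -n 10` on a pipe. Memory use is bounded by the N longest recent lines, which is why this variant suits non-seekable input.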

173

answered Oct 04 '22 21:10

akappa

Read backwards from the end of the file until N linebreaks are read or the beginning of the file is reached.

Then print what was just read.
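A rough sketch of those two steps for a seekable file, reading fixed-size blocks from the end. The helper names `tail_offset_backward` and `copy_from` and the 4096-byte block size are assumptions chosen for illustration, not taken from any existing tail.

```c
#include <assert.h>
#include <stdio.h>

enum { BLOCK = 4096 }; /* assumed chunk size; any positive value works */

/* Hypothetical helper: return the byte offset where the last n lines of a
 * seekable stream begin, or -1 on I/O error. */
long tail_offset_backward(FILE *f, size_t n)
{
    assert(f != NULL);
    char buf[BLOCK];

    if (fseek(f, 0, SEEK_END) != 0)
        return -1;
    long size = ftell(f);
    if (size < 0)
        return -1;
    if (n == 0)
        return size;

    long pos = size;  /* start of the region already scanned */
    size_t seen = 0;  /* line breaks counted so far */

    while (pos > 0) {
        long chunk = pos < BLOCK ? pos : BLOCK;
        pos -= chunk;
        if (fseek(f, pos, SEEK_SET) != 0 ||
            fread(buf, 1, (size_t)chunk, f) != (size_t)chunk)
            return -1;

        for (long i = chunk - 1; i >= 0; i--) {
            if (buf[i] != '\n')
                continue;
            if (pos + i == size - 1)
                continue; /* newline that terminates the final line */
            if (++seen == n)
                return pos + i + 1; /* byte after the nth line break */
        }
    }
    return 0; /* fewer than n lines: the whole file is the tail */
}

/* Hypothetical helper: copy everything from `offset` to EOF into `out`. */
int copy_from(FILE *f, long offset, FILE *out)
{
    char buf[BLOCK];
    size_t got;

    if (fseek(f, offset, SEEK_SET) != 0)
        return -1;
    while ((got = fread(buf, 1, sizeof buf, f)) > 0)
        fwrite(buf, 1, got, out);
    return ferror(f) ? -1 : 0;
}
```

Only the blocks that contain the requested lines are read, so the cost depends on the size of the output rather than the size of the file. This works only when `fseek` succeeds, so it does not apply to pipes.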

I don't think any fancy data structures are needed here.

Here is the source code of tail if you're interested.

answered Oct 04 '22 23:10

thumbmunkeys

First use fseek to find the end of the file, then subtract 512 and fseek to that offset, then read forward from there to the end. Count the line breaks: if there are too few, you will have to do the same with a subtracted offset of 1024, and so on, but in 99% of cases 512 will be enough.

This (1) avoids reading the whole file forward, and (2) is probably more efficient than reading backwards from the end, because reading forward is typically faster.
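One possible reading of this approach, sketched with plain stdio. The names `count_breaks` and `tail_offset_guess` are hypothetical, the window is doubled on each retry (512, 1024, 2048, ...) as one interpretation of "the same with a subtracted offset of 1024", and a newline that ends the file is not counted as a separator.

```c
#include <assert.h>
#include <stdio.h>

/* Count line breaks in [start, size), ignoring one that ends the file. */
static long count_breaks(FILE *f, long start, long size)
{
    long breaks = 0;

    if (fseek(f, start, SEEK_SET) != 0)
        return -1;
    for (long pos = start; pos < size; pos++) {
        int ch = getc(f);
        if (ch == EOF)
            return -1;
        if (ch == '\n' && pos != size - 1)
            breaks++;
    }
    return breaks;
}

/* Hypothetical helper: offset where the last n lines begin, found by
 * reading forward from a guessed position near the end; -1 on error. */
long tail_offset_guess(FILE *f, size_t n)
{
    assert(f != NULL);
    if (fseek(f, 0, SEEK_END) != 0)
        return -1;
    long size = ftell(f);
    if (size < 0)
        return -1;
    if (n == 0)
        return size;

    long back = 512; /* initial guess from the answer above */
    for (;;) {
        long start = size > back ? size - back : 0;
        long breaks = count_breaks(f, start, size);
        if (breaks < 0)
            return -1;

        if ((size_t)breaks >= n) {
            /* Enough lines in the window: skip the surplus line breaks. */
            long skip = breaks - (long)n + 1;
            long pos = start;
            if (fseek(f, start, SEEK_SET) != 0)
                return -1;
            while (skip > 0) {
                int ch = getc(f);
                if (ch == EOF)
                    return -1;
                pos++;
                if (ch == '\n')
                    skip--;
            }
            return pos;
        }
        if (start == 0)
            return 0; /* whole file scanned: fewer than n lines exist */
        back = back > size / 2 ? size : back * 2; /* widen, no overflow */
    }
}
```

When the first guess is too small, the window is re-read from its new start, so some bytes are scanned more than once. That is the price paid for always reading forward.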

answered Oct 04 '22 21:10

Bernd Elkemann

Tags:

c

linux

unix

tail