Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm for Rendering Long Text in a Text Editor

I've been thinking about writing a text editor control that can edit text that can have any arbitrary length (say, hundreds of megabytes), similar in some ways to the Scintilla editor. The goal is to lazy-read the file, so the user doesn't have to read five hundred megabytes of data just to view a small portion of it. I'm having two problems with this:

  1. It seems to me to be impossible to implement any sensible scrolling feature for such an editor, unless I pre-read the entire file once, in order to figure out line breaks. Is this really true? Or is there a way to approximate things, that I'm not thinking of?

  2. Because of various issues with Unicode (e.g. it allows many bytes to represent just one character, not just because of variable-length encoding but also because of accents and such), it seems nearly impossible to determine exactly how much text will fit on the screen -- I'd have to use TextOut() or something to draw one character, measure how big it was, and then draw the next character. And even then, that still doesn't say how I'd map the user's clicks back to the correct text position.

Is there anything I could read on the web regarding algorithms for handling these issues? I've searched, but I haven't found anything.

Thank you!

like image 670
user541686 Avatar asked Dec 14 '10 20:12

user541686


People also ask

How do I edit a large text file?

To be able to open such large CSV files, you need to download and use a third-party application. If all you want is to view such files, then Large Text File Viewer is the best choice for you. For actually editing them, you can try a feature-rich text editor like Emacs, or go for a premium tool like CSV Explorer.

What editor can open Large files?

SlickEdit (Windows, macOS, Linux) – Opens large files. UltraEdit (Windows, macOS, Linux) – Opens files of more than 6 GB, but the configuration must be changed for this to be practical: Menu » Advanced » Configuration » File Handling » Temporary Files » Open file without temp file...

What is a text editor buffer?

A buffer is the basic unit of text being edited. It can be any size, from zero characters to the largest item that can be manipulated on the computer system. This limit on size is usually set by such factors as address space, amount of real and/or virtual memory, and mass storage capacity.


1 Answers

You can set a "coarse" position based on data size instead of lines. The "fine" position of your text window can be based on a local scan around an arbitrary entry point.

This means you will need to write functions that can scan locally (backwards and forwards) to find line starts, count Unicode characters, and so forth. This should not be too difficult; UTF8 is designed to be easy to parse in this way.

You may want to give special consideration of what to do about extremely long lines. Since there is no upper limit on how long a line can be, this makes finding the beginning (or end) of a line an unbounded task; I believe everything else you need for a screen editor display should be local.

Finally, if you want a general text editor, you need to figure out what you're going to do when you want to save a file in which you've inserted/deleted things. The straightforward thing is to rewrite the file; however, this is obviously going to take longer with a huge file. You can expect the user to run into problems if there is not enough room for a modified copy, so at the very least, you will want to check to make sure there is enough room on the filesystem.

like image 197
comingstorm Avatar answered Sep 18 '22 13:09

comingstorm