Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Purely functional data structures for text editors

What would be good purely functional data structures for text editors? I want to be able to insert single characters into the text and delete single characters from the text with acceptable efficiency, and I would like to be able to hold on to old versions, so I can undo changes with ease.

Should I just use a list of strings and reuse the lines that don't change from version to version?

like image 671
fredoverflow Avatar asked Sep 10 '12 19:09

fredoverflow


People also ask

Which data structure is used in text editor?

The gap buffer data structure is commonly used by text editors for storing the files that are currently being edited. The data structure is simply an array that stores the contents of a file plus some extra empty space. This empty space is called the gap.

Which data structure is used with functional programming languages?

Typically tuples, lists, and partially-evaluated functions are very common data structures in functional programming languages. Mutable data structures, like arrays and (real) hash tables, are used much less because they don't fit in as well with Haskell.

What is an immutable data structure?

Immutable data structure are data structures, like lists, arrays, sets etc., which cannot be changed, meaning that the values inside them can't be added, removed, moved or swapped. Instead of changing the data structure you make a new version of the data structure which is a separate value.


1 Answers

I don't know whether this suggestion is "good" for sophisticated definitions of "good", but it's easy and fun. I often set an exercise to write the core of a text editor in Haskell, linking with rendering code that I provide. The data model is as follows.

First, I define what it is to be a cursor inside a list of x-elements, where the information available at the cursor has some type m. (The x will turn out to be Char or String.)

type Cursor x m = (Bwd x, m, [x]) 

This Bwd thing is just the backward "snoc-lists". I want to keep strong spatial intuitions, so I turn things around in my code, not in my head. The idea is that the stuff nearest the cursor is the most easily accessible. That's the spirit of The Zipper.

data Bwd x = B0 | Bwd x :< x deriving (Show, Eq) 

I provide a gratuitous singleton type to act as a readable marker for the cursor...

data Here = Here deriving Show 

...and I can thus say what it is to be somewhere in a String

type StringCursor = Cursor Char Here 

Now, to represent a buffer of multiple lines, we need Strings above and below the line with the cursor, and a StringCursor in the middle, for the line we're currently editing.

type TextCursor = Cursor String StringCursor 

This TextCursor type is all I use to represent the state of the edit buffer. It's a two layer zipper. I provide the students with code to render a viewport on the text in an ANSI-escape-enabled shell window, ensuring that the viewport contains the cursor. All they have to do is implement the code that updates the TextCursor in response to keystrokes.

handleKey :: Key -> TextCursor -> Maybe (Damage, TextCursor) 

where handleKey should return Nothing if the keystroke is meaningless, but otherwise deliver Just an updated TextCursor and a "damage report", the latter being one of

data Damage   = NoChange       -- use this if nothing at all happened   | PointChanged   -- use this if you moved the cursor but kept the text   | LineChanged    -- use this if you changed text only on the current line   | LotsChanged    -- use this if you changed text off the current line   deriving (Show, Eq, Ord) 

(If you're wondering what the difference is between returning Nothing and returning Just (NoChange, ...), consider whether you also want the editor to go beep.) The damage report tells the renderer how much work it needs to do to bring the displayed image up to date.

The Key type just gives a readable dataype representation to the possible keystrokes, abstracting away from the raw ANSI escape sequences. It's unremarkable.

I provide the students with a big clue about to go up and down with this data model by offering these pieces of kit:

deactivate :: Cursor x Here -> (Int, [x]) deactivate c = outward 0 c where   outward i (B0, Here, xs)       = (i, xs)   outward i (xz :< x, Here, xs)  = outward (i + 1) (xz, Here, x : xs) 

The deactivate function is used to shift focus out of a Cursor, giving you an ordinary list, but telling you where the cursor was. The corresponding activate function attempts to place the cursor at a given position in a list:

activate :: (Int, [x]) -> Cursor x Here activate (i, xs) = inward i (B0, Here, xs) where   inward _ c@(_, Here, [])     = c  -- we can go no further   inward 0 c                   = c  -- we should go no further   inward i (xz, Here, x : xs)  = inward (i - 1) (xz :< x, Here, xs)  -- and on! 

I offer the students a deliberately incorrect and incomplete definition of handleKey

handleKey :: Key -> TextCursor -> Maybe (Damage, TextCursor) handleKey (CharKey c)  (sz,                         (cz, Here, cs),                         ss)   = Just (LineChanged, (sz,                         (cz, Here, c : cs),                         ss)) handleKey _ _ = Nothing 

which just handles ordinary character keystrokes but makes the text come out backwards. It's easy to see that the character c appears right of Here. I invite them to fix the bug and add functionality for the arrow keys, backspace, delete, return, and so on.

It may not be the most efficient representation ever, but it's purely functional and enables the code to conform concretely to our spatial intuitions about the text that's being edited.

like image 112
pigworker Avatar answered Oct 09 '22 07:10

pigworker