Hi!
sburrow wrote:
Hello everyone!
I'm working on a text editor that runs on my next '816 SBC, and I'm running into math problems. Yes, and I'm the math professor!
Let's start with Notepad, or equivalent. You press a key, and it inserts a character between whatever other characters you had the cursor on. Already we have math problems! So in memory, am I supposed to shift the entire document down by one character, just to slip this new one in? If the document is small enough and processor speed high enough (say like on a modern PC), no problem. But let's say I have 1MB of text to shift on a machine that is in the single digit MHz, and that gets cumbersome and slow. For each key.
You press backspace, now you have to shift everything back one. You press return, and you shift everything forward one and insert \n where the cursor was. This is all a lot of shifting...
Is there something I am missing here? Is there a technique that is just alluding me?
There are various data structures to solve your problem, with different complexities for different operations. For example:
- Gap Buffer: this data structure divides the text in memory in two parts - all the data before the cursor, then all the free memory and then all the data after the cursor. You need two pointers, one points at the start of the gap and one at the end of the gap, and move the data around when the cursor moves.
- Piece Table: this data structure stores the original text in one big block of memory, and stores a list pieces (a pointer and a length) that describe the document. Originally you have one piece pointing at the start of the data and with the full length. When you insert into a piece, you divide the piece into two and add a third that points to new memory (at the end of the block) with the added data. This structure makes having an undo operation trivial, as the data is never deleted.
- Rope: this is a three representing the data, similar to the piece table, each node has a chunk of text and two childs for the text before and after the node. Insert and delete operations are done in the tree, and you need to keep the tree balanced for good performance.
For a text of length N and an operation on k characters after e editions, the three structures are O(k) for insertions and deletions. The Gap Buffer is also O(k) to move the cursor k characters, the Piece Table is O(e) and the Rope is O(ln(N)) for moving and locating the cursor.
For the small memory amounts of the 8 and 16 bit computers, the Gap Buffer is easy to implement and very fast.
A variation of the Gap Buffer is, instead of storing a big gap in the middle of the text, to store the current line being edited in a separated buffer, this means that the gap is created when a line is started to edit and deleted when the line editing ends. This is the model that the FastBasic editor uses, when the cursor moves up/down the current line is copied to the edit buffer, and all the editions happens in that buffer - this makes editing the line fast, and it only moves memory when you go to another line after the edition.
Have Fun!