
Python data structure for an indexable list of strings

I have a list of objects that look like strings but are not real strings (think of mmap'ed files). Like this:

x = [ "abc", "defgh", "ij" ]

What I want is for x to be directly indexable as if it were one big string, i.e.:

(x[4] == "e") is True

(Of course I don't want to do "".join(x), which would merge all the strings, because reading a string is too expensive in my case. Remember, these are mmap'ed files.)

This is easy if you iterate over the entire list, but that is O(n). So I've implemented __getitem__ more efficiently by building a list of (start offset, string) pairs:

x = [ (0, "abc"), (3, "defgh"), (8, "ij") ]

This lets me do a binary search in __getitem__ to find the tuple holding the right data in O(log n), and then index into its string. This works quite well.
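A minimal sketch of the lookup described above, assuming the chunks only need to support len() and integer indexing. The class name IndexedChunks and its attributes are hypothetical, and plain strings stand in for the mmap'ed objects:

```python
from bisect import bisect_right


class IndexedChunks:
    """Read-only view over string-like chunks, indexable as one big string."""

    def __init__(self, chunks):
        self._chunks = list(chunks)
        # _starts[i] is the global offset of the first character of chunk i.
        self._starts = []
        total = 0
        for chunk in self._chunks:
            self._starts.append(total)
            total += len(chunk)
        self._length = total

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("this sketch only supports integer indices")
        if index < 0:
            index += self._length
        if not 0 <= index < self._length:
            raise IndexError("index out of range")
        # Binary search for the last chunk whose start offset is <= index.
        i = bisect_right(self._starts, index) - 1
        return self._chunks[i][index - self._starts[i]]


x = IndexedChunks(["abc", "defgh", "ij"])
print(x[4])  # e
```

Keeping the offsets in a separate list, rather than inside tuples, lets bisect search them directly without building a key on every lookup.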

I see how to implement __setitem__, but it seems so tedious that I'm wondering whether something already does this.

To be more precise, this is how the data structure should honor __setitem__:

>>> x = [ "abc", "defgh", "ij" ]
>>> x[2:10] = "12345678"
>>> x
[ "ab", "12345678" ]
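
One possible way to get this slice-assignment behaviour, sketched as a standalone function rather than a full __setitem__. The name replace_slice is hypothetical, and the sketch assumes 0 <= start <= stop <= total length and that slicing a chunk is acceptable (slicing an mmap object copies the sliced bytes, so a real implementation might store offsets into the buffer instead). Under ordinary Python slice semantics, 2:10 covers every character from index 2 to the end of the ten-character sequence:

```python
def replace_slice(chunks, start, stop, replacement):
    """Return a new chunk list with the global range [start, stop) replaced."""
    total = sum(len(c) for c in chunks)
    if not 0 <= start <= stop <= total:
        raise IndexError("slice out of range")
    before, after = [], []
    offset = 0  # global offset of the current chunk's first character
    for chunk in chunks:
        n = len(chunk)
        # Number of leading characters of this chunk that lie before `start`.
        keep_left = min(max(start - offset, 0), n)
        # Local position of the first character at or after `stop`.
        keep_right = min(max(stop - offset, 0), n)
        if keep_left:
            before.append(chunk[:keep_left])
        if keep_right < n:
            after.append(chunk[keep_right:])
        offset += n
    middle = [replacement] if replacement else []
    return before + middle + after


chunks = ["abc", "defgh", "ij"]
print(replace_slice(chunks, 2, 10, "12345678"))  # ['ab', '12345678']
print(replace_slice(chunks, 4, 9, "XY"))         # ['abc', 'd', 'XY', 'j']
```

This version walks every chunk, so it is O(n) in the number of chunks; the same binary search used for lookups could locate the two boundary chunks first.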

I'd welcome any idea about an implementation of such a data structure, its name, or any other hint.

asked May 26 '11 15:05 by jd_



1 Answer

What you are describing is a special case of the rope data structure.

Unfortunately, I am not aware of any Python implementations.
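
To illustrate the idea only, here is a very small rope sketch: leaves hold the chunks, and each internal node knows the length of its left subtree, so indexing descends the tree instead of scanning a list. The class names Leaf and Concat are hypothetical, the tree is built by hand, and there is no bounds checking or rebalancing, which a real rope would need:

```python
class Leaf:
    """Holds one string-like chunk."""

    def __init__(self, text):
        self.text = text
        self.length = len(text)

    def char_at(self, i):
        return self.text[i]


class Concat:
    """Internal node: the concatenation of two sub-ropes."""

    def __init__(self, left, right):
        self.left = left
        self.right = right
        self.length = left.length + right.length

    def char_at(self, i):
        # Go left if i falls in the left subtree, otherwise shift i and go right.
        if i < self.left.length:
            return self.left.char_at(i)
        return self.right.char_at(i - self.left.length)


rope = Concat(Leaf("abc"), Concat(Leaf("defgh"), Leaf("ij")))
print(rope.char_at(4))  # e
```

The flat list of (offset, chunk) pairs from the question behaves like a rope of depth one; a tree makes edits in the middle cheaper because only the nodes along one path need updating.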

answered Oct 01 '22 10:10 by NPE