Sunday, May 18, 2008

VList data structure in C# (.NET Framework 2.0)

I've made an implementation of Phil Bagwell's VList data structure in C#, with a fairly comprehensive test suite. It comes in two flavors: VList(of T), where you normally add/remove items at the beginning of the list, and RVList(of T), to which you normally add/remove items at the end. It implements the complete IList(of T) interface plus quite a few additional members including AddRange, InsertRange, RemoveRange, Push, and Pop. Converting a VList to a RVList and vice versa is a trivial O(1) operation that appears to reverse the order of the elements. VList and RVList are value types (structs) that contain a reference to the underlying linked list of arrays (VListBlock(of T)). Small lists (0 to 2 items) are optimized with a specialized block class (VListBlockOfTwo(of T)).

License: Lesser GPL. Contact me at qwertie256, at, gmail.com if you would like the source code. Here's an example usage:

void Example()
{
VList<int> oneTwo = new VList<int>(1, 2);
VList<int> threeFour = new VList<int>(3, 4);
VList<int> list = oneTwo;
VList<int> list2 = threeFour;

ExpectList(list, 1, 2);
list.InsertRange(1, threeFour);
ExpectList(list, 1, 3, 4, 2);
list2.InsertRange(2, oneTwo);
ExpectList(list2, 3, 4, 1, 2);

// oneTwo and ThreeFour are unchanged:
ExpectList(oneTwo, 1, 2);
ExpectList(threeFour, 3, 4);
}
static void ExpectList<T>(IList<T> list, params T[] expected)
{
Assert.AreEqual(expected.Length, list.Count);
for (int i = 0; i < expected.Length; i++)
Assert.AreEqual(expected[i], list[i]);
}

I thought that I would use the RVList to implement Loyc's AST to help make it possible to take AST snapshots easily, but I now suspect it's not a good idea. I am still working on the problem.

Performance characteristics

Similarly to a persistent linked list,
  • Adding an item to the front of a VList or the end of an RVList is always O(1) in time, and often O(1) in space (though, unlike a linked list, it may be much more)
  • Removing an item from the front of a VList or the end of an RVList is O(1) in time, although space not necessarily reclaimed.
  • Adding or removing an item at the end of a VList or the front of an RVList is O(N) and requires making a copy of the entire list.
  • Inserting or removing a list of M items at the end of a VList or the front of an RVList is O(N + M).
  • Changing an item at an arbitrary position should be avoided, as it performs as poorly as inserting or removing an item at that position.
VLists, however, offer some operations that singly-linked lists cannot provide efficiently:
  • Access by index averages O(1) in ideal conditions
  • Getting the list length is typically O(log N), but O(1) in my version
  • If a sublist points somewhere within a larger list, its index within the larger list can be obtained in between O(1) and O(log N) time. Consequently, reverse enumeration is possible without creating a temporary stack or list.
Also, VLists can (in the best cases) store data almost as compactly as ordinary arrays.

3 comments:

diff.operator said...

I am thinking of using this data structure to implement a reversible debugger for #D. The expectation is that it will provide persistent storage for the stack frame at each line of execution.

And dont be disheartened that LOYC didnt work out. You might come up with something better some other day :-)

Qwertie said...

Sorry about replying slowly. I posted the VList code along with my new related "WList" at http://www.codeproject.com/KB/collections/vlist.aspx

Qwertie said...

It will be interesting to see how well this data structure performs as a persistent call stack. The list may degenerate and use a lot of RAM... or it may not.