#927 List API Changes

brian Thu 21 Jan 2010

This ticket is to capture the changes I plan to make to sys::List from the #884 discussion:

  1. Change signature of fill to fill(Range, V) to make it clearer what is occurring and allowing you to overwrite existing items.
  2. Change List.size setter to disallow creating a sparse List if the type is non-nullable
  3. Add new containsAny(L) method
  4. Add optional closure for contains, containsAll, and containsAny
  5. Review use of containsSame, containsAllSame, and indexSame in codebase to see if they it makes sense to remove them
  6. Change sort, sortr, and reverse to return a new list
  7. Add isSet field which will implement an alternate backing store to maintain Set semantics where items are enforced to be unique according to equals and hash semantics, and with hash map based performance for contains

brian Thu 21 Jan 2010

Promoted to ticket #927 and assigned to brian

lbertrand Fri 22 Jan 2010

Brian,

I was wondering what you all think about my comment in the previous discussion... If List becomes immutable, why not also tackling the change in the backing list to make it more like clojure lists so that immutable list are quite performant?

Perhpas this can be tackle after the changes you are proposing to make to the list public API.

brian Fri 22 Jan 2010

I was wondering what you all think about my comment in the previous discussion... If List becomes immutable, why not also tackling the change in the backing list

I don't like that model for the core list/map. I like my mutable collections for performance. To me the key issue is building up a collection mutably, then locking it down ro/immutable for OO encapsulation or sharing b/w thread (which we do very efficiently today).

However, if I had some performance metrics that showed building up a Clojure style list/map didn't have more than 10% hit compared to existing implementations I would have a completely different mindset. In that case I'd prefer immutability as the only option.

Either way I love the Clojure collections and would like to have implementations for Fantom as a library. It would make a great way for building up large data structures to share efficiently b/w actors. The restriction would be that those collections could only hold const types. But if anyone wishes to continue that discussion let's open a new thread.

jodastephen Fri 22 Jan 2010

I'm happy with these changes.

I would like to see the binary search proposal adopted though, perhaps as a separate parallel method. Every time I use binary search I get it wrong even after reading the docs. Fantom shouldn't limit itself to the current idiom here.

I also think that an approach to the mutating/idempotent methods is needed. Options:

  • have both with a naming convention so users aren't constrained to a specific development technique)
  • only have idempotent methods, and use list.dup.method()
  • only have mutating methods, and add replace(list)

(The options are ordered as per my preference)

brian Fri 22 Jan 2010

I would like to see the binary search proposal adopted though, perhaps as a separate parallel method

I like the way did Java did it and don't really want to change it, but I'm open to continue exploring the idea. I'd suggest we start a new topic, I'd like to keep this topic focused on items which have been fully nailed down.

I also think that an approach to the mutating/idempotent methods is needed

I don't want to pollute the List with multiple versions of transformations like sort, etc. I think the easiest solution is to introduce a method which overwrites one list with the values from another so that you can do this:

list.overwrite(list.sort)

Although you can already do that today:

list.clear.addAll(list.sort)

But might be nice to have a convenience method, but I'm not sure the proper name.

brian Sat 23 Jan 2010

Ticket resolved in 1.0.49

OK, I pushed a batch of changes to the List API. This is what I actually did:

  1. Regarding fill - I left it as is. Using a range makes it awkward since it gets into size-1 issues. This same signature was also used on Buf. So I don't think using range is a better solution.
  2. List.size will now throw ArgErr if you attempt to grow a non-nullable list. I discovered a few places in the compiler where this got me.
  3. I added a new containsAny(L) method
  4. After digging into it, I decided that optional closures for contains, containsAll, and containsAny do not make sense. Passing a value and a closure is awkward compared to just using a closure with any and all (and just enclosing the value).
  5. I removed containsSame and containsAllSame. However I left indexSame and removeSame because removing an item by reference instead of equality is something that occurs a lot.
  6. I left sort, sortr, and reverse exactly the same. After making the change and digging thru the codebase it was apparent that uses of sort with temporary lists was actually the common case. So I didn't feel it was worth taking the performance issue to create a new copy of the list. So I backed this change out and left things as is.
  7. I have not added the isSet field yet. That is a pretty big feature, and a non-breaking change so I'll probably hold off on that for a little while.
  8. In addition to the original proposal I added a moveTo method that shifts an item to a given index (typically 0 or -1). This is something that Andy and I have been meaning to add for a while now.

qualidafial Sat 23 Jan 2010

Lately I've been thinking we ought to extract the * and *Same methods out into some kind of strategy comparison object. Then you can:

  • remove all the duplicate methods,
  • add a field to configure the comparison strategy, and
  • add an optional argument to each original method that defaults to the field value

MoOm Sun 24 Jan 2010

I'm not fond of the isSet field. Why not introduce a new Set type? A list and a set are really not the same thing and many methods of a list does not make sense in a set (and the common methods could maybe be moved to a mixin).

As for the mutable vs immutable list thing, maybe we could adopt the same solution as in Objective-C: the List type would only have methods that does not modify the list (e.g. constructors, iteration, search...) and we could have a MutableList type that inherits from List and that introduces methods that can modify the list.

You can look at NSArray and NSMutableArray for more details.

DanielFath Sun 24 Jan 2010

I definitely agree about the first part. IMO Set should be separated from List because they are clearly two very different data structures. Then again brian is strictly opposed to this because he believes that every other data structure can be made out of Map and List.

As for the mutable/immutable lists, shouldn't it be other way around? I mean performance wise it is better to have Mutable lists as default.

helium Sun 24 Jan 2010

A set is not like a list. It's like a map where you only care for the keys.

andy Sun 24 Jan 2010

I've run across a few practical use cases for a Set in my work. And an actual List is always what I've wanted as the end result. So I am 100% on board that this is something built into List. Given that tho, maybe a different term than "set" should be used, like "isUnique".

tactics Sun 24 Jan 2010

I'm not against the isSet mode. But I'd like to know how the APIs would interact with them. Would methods require isSet mode when they want uniqued lists (throwing an error when it's not set)? Or would methods allow a List of any mode, handling the uniqueness itself. These are the kinds of questions I'm interested in if we're going to add a new List mode.

DanielFath Sun 24 Jan 2010

A set is not like a list. It's like a map where you only care for the keys.

Well technically speaking, everything is a map (or a table), right? Even list is a [Int:Obj?] map with different operations.

I've run across a few practical use cases for a Set in my work. And an actual List is always what I've wanted as the end result.

Wouldn't just making Set extend List have just about same if not more sense than splashing them into a single API?

helium Mon 25 Jan 2010

Well technically speaking, everything is a map (or a table), right? Even list is a [Int:Obj?] map with different operations.

That's not the question. The question is whether a set is a list. A set is unordered, a list is ordered, a set can contain an element only once, a list can contain the same element many times, ...

brian Mon 25 Jan 2010

I think for all practical purposes a set is a list with certain constraints. Just like a table in a database might have primary key constraints or not. Maybe or maybe not you need ordering, index based access, etc. I agree with Andy, that virtually all the time once I built a set (typically with Map today), that I then end up passing it around as a list. So I naturally think of a Set as a List with a uniqueness constraint.

But I'm not going to do this feature immediately, so there is no rush to decide the best model.

Login or Signup to reply.