#562 Introduce List.checked() and Map.checked()

qualidafial Fri 1 May 2009

Extracted from the discussion "object 'as' type: is this a bug or am I doing it wrong?":

Propose to add the following methods:

class List
{
  **
  ** Returns a type-checking view of this list which disallows adding
  ** elements of the wrong type.
  **
  List checked()

  **
  ** Returns whether this list performs type checking on elements before
  ** they are inserted.
  **
  Bool isChecked()
}

The fandocs need finessing but they get the point across for purposes of this discussion. :)

Similar methods could also be introduced into the Map class.

One thing I'm wondering about is how elements will be checked when of() is a nullable type e.g. Str?[].

brian Fri 1 May 2009

I'd like to propose something more generic - the ability to add a generic callback method to either list or map on add. This would be a special List constructor (since under the covers we wouldn't store the function on normal lists). I am not sure what to call it, let's say makeChecked:

List.makeChecked(Str#) |Str s| { if (s.size < 3) throw ArgErr("Str too short") }

You can easily compose checked lists with this API using either a cast or Type.fits (or we could create convenience).

Edit: although the problem with a constructor is that you don't get any type inference on the closure. So maybe an instance method on List which returns a copy of the list with the callback function:

list := Str[,].checked |s| { if (s.size < 3) throw ArgErr("Str too short") }

qualidafial Fri 1 May 2009

So maybe an instance method on List which returns a copy of the list with the callback function:

I had actually thought about something like this but I was trying to start simple.

I like the ability to specify an arbitrary check, but I think that the type-checking scenario is common enough to be supported directly the API. This could be implemented using a default argument to the checked method, or by adding a different method e.g. typeChecked.

A few other things:

  • Let's change the checked method signature to accept a |V, Int| instead of just |V| so that indices can be checked as well.
  • Does this mean it's possible to decorate a list with multiple distinct checks via nested calls? e.g.
    list.checked |v| { if (v.size) < 3) throw ArgErr("Str too short") }
        .checked |v, i| { if (i != 0) throw ArgErr("Adding is disallowed past 0") }
  • An (off topic) idea that just occured to me: could we use the same list decoration pattern to create a filtered view of a list e.g. list.filtered |v| { v is Str }

Returning back to the topic, we could use such an API to do nifty things like automatically wire up bidirection relationships like parent-child links:

Node[] children = [,].checked |v|
{
  v.parent = Node.this
}

In this scenario, checked is probably too specific a name--something along the lines of withCheck, addingWith or some such. So.. lots of possibilities to think about.

alexlamsl Fri 1 May 2009

Can't we just override List.add to provide checks when adding data to it?

qualidafial Fri 1 May 2009

Can't we just override List.add to provide checks when adding data to it?

The problem with overriding is that now you have to instantiate a specific subclass of List to add behavior. So you'd only be able to do this if you have direct control over the instantiation of the list. However if we add methods to List itself to support this then anyone can add their own specific checks.

I was thinking another option for the generic method approach would be to specify the error message as a first argument and have the func return a Bool indicating whether the element is allowed:

Str[] names = [,].check("Name too short") { it.size > 0 }

This is a little more concise and supports what I think would be the most common pattern. The tradeoff is that now you have a static error message instead of having fine-grained control over the error shown. Although you could always explicitly throw an error within the closure.

jodastephen Fri 1 May 2009

This is a specific example of a bigger issue - restricting the contents of types.

For example, I might want to define that I have a string which must have between 3 and 8 characters. Or a number that must be positive.

I'd like to define those as types, something like:

const RestrictedString {
  const Str value
  // validate in constructor
  ...
  // provide means to auto-convert to a string
  auto-convert Str toStr() {value}
}

The concept is that this class should operate just like a string (including being passed directly to a method that takes a Str), with auto-convert being called automatically.

Obviously, if you held a RestrictedString[], then all the strings within would be of the right length anyway without the list needing to do anything.

But what if you want to add behaviour to List/Map itself - creating effectively a RestrictedList class. Well, thats not really possible, as you can't subclass or wrap them and retain the benefits of the literals and other special List/Map goodness. Because of this, there will always be pressure to add "hooks" like the validation one proposed here to List/Map. Thats why I bring in the broader "auto-convert" concept, as it should apply to List/Map too, allowing a RestrictedList class to be created and still benefit from the language level treatment of lists/maps.

Overall, it is better to have a "hook" than have none at all. But we need to consider the deeper issue as to whether List/Map as they stand are too restrictive, since we can't really declare our own versions.

andy Fri 1 May 2009

I actually like both qualidafial and Brian's proposals. A type-check List/Map seems something common enough to include in the API. However, its not something I have ever wanted so far, but do I like the idea of the checked method.

Having said that, I think Brian's proposal is of much more use, and something I would probably use quite often. And since it encapsulates the type-check case, it might be good enough on its own.

Though I need to take some time and grok this whole discussion to make an educated decision.

brian Fri 1 May 2009

After sleeping on it, I am not sure this is the right path to go down. Certainly at times, I've debated about allowing you to subclass List or attach functional hooks to a List. But it is a such a core built-in type with so much special behavior regarding type parametrization and concurrency, that I've always sought other design patterns. So I guess I'm leery of adding an arbitrary hook into a List.

The design pattern I've been using today is basically to never expose a List directly outside of a class unless it is List.ro. This allows you to craft the exact type safe API you want to expose. A perfect example is fwt::Widget and how it deals with children components. But it is so easy to return a "safe list", it still lets you access all the functionality built-into list, but with controlled modifications. So I think that is the pattern we should stick with for Fan.

Regarding the type check, I do want to point out that having a "checked" list doesn't do anything to solve the problem we discussed related to as. In fact it won't work at all with casting of the entire list. Consider:

nums := Num[1, 2, 3]
ints := nums as Int[] // ints === num
ints.add(4) // type check is still for Num, not Int

KevinKelley Fri 1 May 2009

I think that the checked idea sounds neat, but I'm having a feeling that it's a little bit of a solution in search of a problem, maybe. As far as the type-checking:

nums := Num[1,2,3]      // Num[]
ints := (Int[]) nums    // still Num[], underneath
echo(ints === nums)     // true

nums[0] = 1             // okay
nums[1] = 2.0f          // okay
ints[0] = 3             // okay
ints[1] = 4.0f          // compile-time error

If it were possible to have a completely correct, strongly-typed type system, I'd probably be in favor of that. But it's never been done correctly, afaik. All the languages that try, eventually end up backing you into a corner where you aren't allowed to do something that ought to be just fine. So, imo, a more loose type system is the best trade-off between correctness and expressive power.

qualidafial Fri 1 May 2009

I think that the checked idea sounds neat, but I'm having a feeling that it's a little bit of a solution in search of a problem, maybe.

The original discussion that led to this proposal had to do with casting arrays and what we protections the language will or will not provide.

There are two different types of casts being discussed, and each scenario has different levels of safety:

  • Widening conversions e.g. Obj[] objs := Str["a","b","c"]
  • Narrowing conversions e.g. Str[] strs := Obj["a", 1]

Widening conversions are safe in the sense that all elements in the widened list are guaranteed to be instances of the widened type. They are also unsafe in the sense that runtime modifications to the widened list will allow you to insert objects of the wrong type--this would not be possible on the narrow list reference. Example:

Fan Shell v1.0.41 ('?' for help)
fansh> a := Str["a","b","c"]
[a, b, c]
fansh> Obj[] b := a
[a, b, c]
fansh> a === b // same instance
true
fansh> a[0] = 1 // wrong type
fan.fansh.Var@b2002f ?= a[0]
fan.fansh.Var@2a4983 ?= a[0]
ERROR(2): Invalid args set(sys::Int, sys::Str), not (sys::Int, sys::Int)
fansh> b[0] = 1 // valid for declared type but wrong for runtime type
fan.fansh.Var@b2002f ?= b[0]
fan.fansh.Var@2a4983 ?= b[0]
[1, b, c]
fansh> a.each { echo (it) } // now what?
sys::CastErr: java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.String
  fan.sys.List.each (List.java:509)
fansh> b.each { echo (it) } // valid for declared Obj[] type but not for runtime type
1
b
c

Narrowing conversions are safe in the sense that runtime modifications to the narrowed reference will implicitly restrict you the correct type. They are also unsafe in the sense that you are gambling that all elements in the list are instances of a more specific type than is guaranteed by the array's runtime type. Example:

Fan Shell v1.0.41 ('?' for help)
fansh> a := ["a", 1]
[a, 1]
fansh> Str[] b := a
[a, 1]
fansh> a === b
true
fansh> a.each { echo (it) }
a
1
fansh> b.each { echo (it) }
a
sys::CastErr: java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.String
  fan.sys.List.each (List.java:509)
fansh> a.add(2)
[a, 1, 2]
fansh> b.add(3)
ERROR(3): Invalid args add(sys::Str), not (sys::Int)
fansh> b.add("b")
[a, 1, 2, b]

So widening casts are safe when the list is used in read-only scenarios, whereas narrowing casts are safe when the list is used in write-only scenarios. This mismatch is, I think, what led the Java generics team to introduce all those horrible ? extends E vs ? super E wildcards.

Instead of emulating java generics, I think java arrays give us a good jumping off point:

  • Java protects against bad narrowing casts by throwing a RuntimeException whenever the cast is narrower than the array's runtime type. Fan's current policy is permissive and allows you to cast to any type that could possibly be compatible, but you are casting at your own risk. I'm okay with this trade-off but I think it needs to be documented more thoroughly in the fandoc.
  • Java always allows widening casts on arrays, however you can't put bad elements into the array because elements type-checked at runtime before they go into the array. Fan allows widening casts as well, but does not offer any runtime type-checking. So you are only protected if the declared slot/variable type is the same as the runtime type. I'm not so okay with this, but I can understand folks being worried about the performance hit. Hence the suggestion to add a checked method so the paranoid among us can sleep a little better. :)

I want to clarify that my proposal is that checked will wrap the list in a decorator that performs some sort of check on new list elements before forwarding the method call to the original list--that's it. I am not proposing that every list be burdened with runtime type-checking as a default. This is an opt-in approach to protecting data before sending it to untrusted code.

tompalmer Thu 11 Jun 2009

I think that the checked idea sounds neat, but I'm having a feeling that it's a little bit of a solution in search of a problem, maybe.

In #633, we've raised the issue that we can't ever (without breaking compatibility) optimize Float[], Int[], and Bool[] into packed value arrays unless we enforce of at runtime.

Also, there's the suggestion that if we can control the underlying JVM (or .NET) array type backing up the list, then the virtual machine might be able to do all the dynamic checking for us. Except that JavaScript would still need the extra manual check, and any null checking would also presumably have to be done manually.

Login or Signup to reply.