#1192 Formalize type inference rules with parameterized generics

vkuzkokov Tue 31 Aug 2010

The program:

mixin Parent { }
mixin Child : Parent { }
class Impl : Child
{
  static Parent parent() { Impl() }
  static Child child() { Impl() }
  static Void main() {
    echo([parent,child].typeof.toStr)
    echo([child,parent].typeof.toStr)
  }
}

results in output:

check::Parent[]
sys::Obj[]

So the type inferred on list depends on order of items. Also this can break backward compatibility while narrowing type (which happens even more often with inference). Consider refactoring:

Parent doSth() => Child doSth()
[doSth,parent]:  Parent[] => Obj[]

Now when some method was accepting Parent[] we can't pass this list without explicit type.

What I'm thinking of is to limit inference for list and map arguments to classes, because in hierarchy:

mixin A { }
mixin B { }
mixin C : B,A { }
mixin D : A,B { }

there is no unquestionable way to infer type of list of C's and D's.

vkuzkokov Tue 31 Aug 2010

Also:

class Main
{
  static const Num num := 5
  static Void main()
  {
    echo([[num:"a"],[5:"b"]].typeof)
    echo([[5:"a"],[num:"b"]].typeof)
  }
}

results in:

[sys::Num:sys::Str][]
sys::Map[]

I believe these two bugs are easier to be resolved at once.

brian Tue 31 Aug 2010

Promoted to ticket #1192 and assigned to brian

The current algorithm is coded in the CType.common method. I don't think we take mixins into account at all (so maybe there is just a bug there).

This is probably good point to discuss the formal inference rules for finding the most specific common type in a list of types. Then once we decide on something we can live with, I can change that code.

So what should the formal rules be?

I think we want to take into account mixins if there no other common class types. That would be better than just using Obj.

But should common class types always trump common mixin types?

Should public versus internal be considered?

vkuzkokov Tue 31 Aug 2010

Again with

mixin A { }
mixin B { }
mixin C : B,A { }
mixin D : A,B { }
class Impl : C, D {
  C getC() { Impl() }
  D getD() { Impl() }
  echo([getC,getD].typeof)
}

If I know that mixins are allowed to be inferred I would expect the result to be:

check::B[]

because it is the first ancestor of the first element, therefore it's more equal than A.

Historically, some issues of C were resolved in a way "It's how K&R compiler works". It breaks the principle of least surprise for people not familiar with the compiler implementation or how compilers are implemented.

vkuzkokov Tue 31 Aug 2010

Found one more test:

echo([|->Num|{5},|->Int|{5}].typeof)
echo([|->Int|{5},|->Num|{5}].typeof)
echo([|Int->|{},|Num->|{}].typeof)
echo([|Num->|{},|Int->|{}].typeof)

which results in

|->sys::Num|[]
sys::Func[]
|sys::Int->sys::Void|[]
sys::Func[]

I'm more interested in latter two. While return type should be taken as the closest ancestor (like in list and map), in functions arguments should get closest common child.

Solution I see is to check whether any of types is a descendant of all the others. Otherwise, we have generic Func. Still this solution leaves us with a curious case, when narrowing return type in uninheritable interface can break client code.

a: |Int->|  =>  |Num->|  // function still works with Ints, but more general
                         // As a library maintainer I consider it a good thing
b: |Float->|
[a,b] Func[] => |Num->|  // used to accept all functions, now doesn't

vkuzkokov Wed 1 Sep 2010

Also generic parameters don't seem to infer well:

echo([Num[5],[5]].typeof)
echo([[5],Num[5]].typeof)

results in:

sys::Num[][]
sys::List[]

vkuzkokov Wed 1 Sep 2010

There's also inference in ternary operator

[(check ? 4 : 4.0)].typeof  =>  sys::Num[]

with all the same ambiguity when used with mixins.

brian Thu 9 Sep 2010

Renamed from Type inference ambiguity with mixins to Formalize type inference rules with parameterized generics

brian Thu 9 Sep 2010

Ticket resolved in 1.0.55

The fundamental issue here was formalizing the rules for how to compute the "common" type among types which are themselves parameterized generics. This was a bit of a dark corner that needed flushing out.

In the case of Lists there is a clear mechanism to derive the common class since it is parameterized in only one dimension. In maps and functions however we have multiple dimensions of parameterization. So I think the best solution is the simplest solution - don't try to figure it out (either they are all typed exactly or else we fallback to generic Map/Func).

The new rules are now specified as follows....

This section formally defines the rules for how the compiler performs type inference on lists, maps, and the ternary operator:

// list of expressions
[v1, v2, ...]  =>
  V = common(v1, v2, ...)

// map of expressions
[k1:v1, k2:v2, ...] =>
  K = common(k1, k2, ...)
  V = common(v1, v2, ...)

// ternary operator
cond ? t1 : t2  =>
  common(t1, t2)

Type inference of collections is based on a function we call common which is used to find the most common base class among a list of types. The following algorithm is used to compute the common type:

  1. if the list of types is empty return Obj?
  2. if the list of types has only one item, return that type
  3. if any one type is nullable, then the result is nullable
  4. if none of the types is a parameterized generic, then find the most common class which all the types share; we take only classes into account, mixins are ignored
  5. if any one of the types is a parameterized generic then:
    1. if all the types are parameterized Lists, then compute the common V type
    2. if all the tyeps are parameterized Maps then:
      1. if all have the exact same signature, then use that type
      2. use sys::Map
    3. if all the types are parameterized Funcs then:
      1. if all have the exact same signature, then use that type
      2. use sys::Func
    4. if none of the above holds true, then use Obj

dmoebius Tue 14 Sep 2010

I retested the issues raised by vkuzkokov with 1.0.55. Here's what I found:

mixin Parent { }
mixin Child : Parent { }
class Impl : Child
{
  static Parent parent() { Impl() }
  static Child child() { Impl() }
  static Void main() {
    echo([parent,child].typeof.toStr)
    echo([child,parent].typeof.toStr)
  }
}

still results in the output:

check::Parent[]
sys::Obj[]

So the type inferred on a list still depends on the order of items. This should be fixed IMHO.

class Main
{
  static const Num num := 5
  static Void main()
  {
    echo([[num:"a"],[5:"b"]].typeof)
    echo([[5:"a"],[num:"b"]].typeof)
  }
}

now results in:

sys::Map[]
sys::Map[]

because rule 5.b.ii. of the common function is in effect. That's ok, I guess. Just a short check if 5.b.i. works:

echo([[num:"a"],[num:"b"]].typeof)
echo([[5:"a"],[5:"b"]].typeof)

results in:

[sys::Num:sys::Str][]
[sys::Int:sys::Str][]

Works as expected. Now this one:

mixin A { }
mixin B { }
mixin C : B,A { }
mixin D : A,B { }
class Impl : C, D {
  C getC() { Impl() }
  D getD() { Impl() }
  echo([getC,getD].typeof)
}

infers sys::Obj[] because mixins are no longer considered.

The function tests:

echo([|->Num|{5},|->Int|{5}].typeof)
echo([|->Int|{5},|->Num|{5}].typeof)
echo([|Int|{},|Num|{}].typeof)
echo([|Num|{},|Int|{}].typeof)

now all resolve to sys::Func[] because of rule 5.c.ii. This works as expected.

Generic parameters seem to be inferred better now:

echo([Num[5],[5]].typeof)
echo([[5],Num[5]].typeof)

now results in:

sys::Num[][]
sys::Num[][]

so this has clearly been fixed.

Summary: all issues have been fixed except the first one.

brian Tue 14 Sep 2010

Summary: all issues have been fixed except the first one.

dmoebius, thanks very much for taking the time to verify those cases!

I definitely missed that mixin case. I pushed a fix and beefed up the test case.

changeset

Login or Signup to reply.