#374 Value Types

brian Wed 15 Oct 2008

The next feature I plan to tackle is value types or, as Java calls them, primitives. Part of this work will include the runtime checks for coercion from nullable to non-nullable (the two topics are closely related).

In a previous post I presented a brief, but incomplete design. Contrary to my previous post, we do need to make a distinction between value types and non-nullable types. My proposal is that Fan supports three implicit value types: Bool, Int, and Float. These types map to the runtime as follows:

Fan      Java                .NET
---      ----                ------
Bool     boolean             bool
Bool?    java.lang.Boolean   bool?
Int      long                long
Int?     java.lang.Long      long?
Float    double              double
Float?   java.lang.Double    double?

The difference between Java and .NET is that nullable types in Java are reference types, but not in .NET. But both Java and .NET do require boxing to a reference type when upcasting a type like Int to a Num or Obj. So I'll probably add three new fan opcodes for Box, Unbox, and CheckNull.
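To make the Java column of the table concrete, here is a minimal sketch of how the three cases look in plain Java. The class and method names (BoxingSketch, plain, nullable, upcast, unbox) are hypothetical and are not part of Fan's runtime; the only thing relied on is standard Java boxing behavior.

```java
// Illustrative sketch only; BoxingSketch is a hypothetical name, not Fan runtime code.
final class BoxingSketch {
    // A Fan Int would be a JVM long: a primitive, no object allocated.
    static long plain() {
        return 77L;
    }

    // A Fan Int? would be a java.lang.Long reference, which may be null.
    static Long nullable(boolean present) {
        return present ? Long.valueOf(77L) : null;
    }

    // Upcasting to Obj or Num needs a reference, so the primitive is boxed.
    static Object upcast(long v) {
        return v; // autoboxes to java.lang.Long
    }

    // Going from Int? back to Int unboxes; a null reference throws NullPointerException.
    static long unbox(Long v) {
        return v.longValue();
    }

    public static void main(String[] args) {
        System.out.println(upcast(plain()).getClass().getName());
        System.out.println(nullable(false));
        System.out.println(unbox(nullable(true)));
    }
}
```

On .NET the nullable form would stay a value type (long?), so only the upcast step would box there.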

Java has a bytecode model where special opcodes are required for manipulating 64-bit types on the stack - this will actually be the most painful aspect of this process. I could just define a bunch of new fcodes for everything Int/Float related. Or probably what I will do is add type arguments to opcodes like Pop and Dup (that seems a little more elegant and future-proof). There are lots of similar cases such as selecting the right return JVM bytecode to use (in that case probably derived from the method signature's return type).
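As a rough illustration of the type-argument idea, the sketch below picks a JVM instruction from a Fan type signature. The class StackOps and its methods are hypothetical, and the assumption that sys::Int and sys::Float are the only two-slot types comes from the mapping table in this thread; the mnemonics themselves (pop, pop2, dup, dup2, ireturn, lreturn, dreturn, areturn, return) are standard JVM instructions.

```java
// Illustrative sketch only; StackOps is a hypothetical helper, not the Fan emitter.
final class StackOps {
    // True when the Fan type maps to a two-slot JVM primitive (long or double).
    // Nullable forms such as sys::Int? are references and take one slot.
    static boolean isWide(String fanType) {
        return fanType.equals("sys::Int") || fanType.equals("sys::Float");
    }

    // Pop with a type argument: wide values need pop2.
    static String pop(String fanType) {
        return isWide(fanType) ? "pop2" : "pop";
    }

    // Dup with a type argument: wide values need dup2.
    static String dup(String fanType) {
        return isWide(fanType) ? "dup2" : "dup";
    }

    // Return instruction derived from the method signature's return type.
    static String ret(String fanType) {
        switch (fanType) {
            case "sys::Void":  return "return";
            case "sys::Bool":  return "ireturn"; // boolean is an int on the JVM stack
            case "sys::Int":   return "lreturn";
            case "sys::Float": return "dreturn";
            default:           return "areturn"; // any reference, including Int?
        }
    }

    public static void main(String[] args) {
        System.out.println(pop("sys::Int") + " " + dup("sys::Str") + " " + ret("sys::Float"));
    }
}
```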

Once the basics are in place, there are lots of places where literals, comparison, and arithmetic can be optimized.

Supporting value types is almost completely a compiler/fcode/runtime change. The major change to the Fan language was nullable types which we've already done. The one effect of this change to the language will be default values: value types cannot default to null. So this means that Bool values will default to false, and Int/Float to zero.
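The defaulting rule matches what Java already does for fields of primitive type, which the following sketch shows. DefaultsSketch and its field names are hypothetical; the comments pairing each field with a Fan type assume the mapping table from earlier in this thread.

```java
// Illustrative sketch only; DefaultsSketch is a hypothetical class.
final class DefaultsSketch {
    boolean flag;    // Fan Bool:  defaults to false
    long count;      // Fan Int:   defaults to 0
    double ratio;    // Fan Float: defaults to 0.0
    Long maybeCount; // Fan Int?:  a reference, so it still defaults to null

    public static void main(String[] args) {
        DefaultsSketch d = new DefaultsSketch();
        System.out.println(d.flag + " " + d.count + " " + d.ratio + " " + d.maybeCount);
    }
}
```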

tompalmer Wed 15 Oct 2008

One fun thing to do would be to see about making Int[] and Float[] expand the values instead of storing as pointers. Might require some gymnastics but the benefit would be that you could use really huge arrays efficiently. Alternatively, requiring less gymnastics, you could have separate IntList and FloatList classes that expand in place. The other argument for separate types is that with really huge arrays, sometimes you explicitly want smaller sizes (such as 32 bits or less), and that obviously requires fancier APIs than just doing automatic gymnastics with Int[], for example.
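For a sense of what a separate IntList that expands in place might look like on the JVM, here is a minimal sketch backed by a long[] rather than an array of boxed Long references. The class IntListSketch and its methods are hypothetical; nothing here reflects an actual Fan API.

```java
import java.util.Arrays;

// Illustrative sketch only; IntListSketch is a hypothetical class, not a Fan API.
final class IntListSketch {
    private long[] data = new long[4]; // values stored inline, no boxing
    private int size;

    // Append a value, doubling the backing array when it is full.
    void add(long v) {
        if (size == data.length) {
            data = Arrays.copyOf(data, size * 2);
        }
        data[size++] = v;
    }

    // Read a value with an explicit bounds check against the logical size.
    long get(int i) {
        if (i < 0 || i >= size) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        return data[i];
    }

    // Reported as a 64-bit value to match Fan's Int, though a Java array caps it at int range.
    long size() {
        return size;
    }

    public static void main(String[] args) {
        IntListSketch list = new IntListSketch();
        for (long i = 0; i < 10; i++) list.add(i * i);
        System.out.println(list.size() + " " + list.get(9));
    }
}
```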

brian Wed 15 Oct 2008

I'm definitely thinking about that. Although I'm thinking that work will actually be part of the interop work up next. I haven't talked about interop yet, but one of the things we'll need to do is handle arrays natively. I'm kind of thinking about something like @array @int32 Int[].

We also have this problem with List - for example Fan's list type can't implement java.util.List unless we change size() to return a 32-bit int. So that might be something like @int32 Int size(). By making it an annotation we don't screw ourselves in the future where collection sizes really should be 64-bit integers.

So the fundamental problem will be how to keep Fan's type system free from all the cruft like arrays and oodles of various numeric bit-widths, yet still annotate things correctly to efficiently map into the Java and .NET type systems.

tompalmer Wed 15 Oct 2008

I vote against explicit @int32 Int size() for lists. Maybe implement it that way secretly behind the scenes on Java (due to platform limitations, etc.), but I don't think you want to guarantee the limit across all platforms generically nor have it part of the API. Just my thoughts.

I know Scala faces some pain from their List being different from java.util.List, but I'm not sure whether or not Fan would be better using Java lists. Especially from the immutability issues (part of why it also matters for Scala). So I think unifying lists (and Java arrays, too) might not be so easy.

Similar issues presumably exist for .NET, though I know the world there much less well.

JohnDG Wed 15 Oct 2008

If ever we added toXXX/fromXXX auto-casting, then it would not be strictly necessary to derive from C#/Java collections to obtain the benefits of easy interoperability, at least in the C#/Java -> Fan direction (the other way would be somewhat of a pain, but is less likely since Fan is the newer language).

As for the table above, it looks exactly like I would expect.

brian Thu 16 Oct 2008

I've gotten a little deeper into this problem, and the difference between Java and .NET is a really big issue here. Let's look at some conversions:

Conversion       Java         .NET
-------------    --------     ----------
Bool => Bool?    box          call Nullable<bool> ctor
Bool => Obj      box          box
Bool => Obj?     box          box

Bool? => Bool    unbox        call Nullable.get_Value
Bool? => Obj?    nop          box
Bool? => Obj     check null   if (null) push null else box

Obj => Bool      cast/unbox   unbox.any bool
Obj => Bool?     cast         unbox.any Nullable<bool> 

null => Bool?    nop          initobj Nullable<bool>

In Java nullable value-types are object references, but not in .NET. So I can't figure out what Fan opcodes to define to map cleanly to both scenarios. So for now I've punted and replaced Fan's Cast opcode with a Coerce opcode which provides both the from and to types. Then we'll let the Java and .NET runtime sort it all out.
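One way to picture the JVM side sorting out a Coerce is a small dispatch on the from and to types that reproduces the Java column of the conversion table. This is only a sketch: CoerceSketch and javaAction are hypothetical names, the returned strings are the table's labels rather than real bytecode, and it assumes Int and Float follow the same rows as Bool.

```java
// Illustrative sketch only; CoerceSketch is hypothetical and mirrors the Java column above.
final class CoerceSketch {
    static boolean isNullable(String t) {
        return t.endsWith("?");
    }

    // True for Bool, Int, Float and their nullable forms (assumed to behave like Bool).
    static boolean isValue(String t) {
        String base = isNullable(t) ? t.substring(0, t.length() - 1) : t;
        return base.equals("sys::Bool") || base.equals("sys::Int") || base.equals("sys::Float");
    }

    // Returns the table's label for what the JVM runtime would do for Coerce from => to.
    static String javaAction(String from, String to) {
        if (isValue(from) && !isNullable(from)) {
            return "box"; // Bool => Bool?, Obj, Obj?
        }
        if (isValue(from)) { // nullable value type, already a reference on the JVM
            if (isValue(to) && !isNullable(to)) return "unbox"; // Bool? => Bool
            return isNullable(to) ? "nop" : "check null";       // Bool? => Obj? / Obj
        }
        if (from.equals("sys::Obj") && isValue(to)) {
            return isNullable(to) ? "cast" : "cast/unbox";      // Obj => Bool? / Bool
        }
        return "not covered by the table";
    }

    public static void main(String[] args) {
        System.out.println(javaAction("sys::Int", "sys::Int?"));
        System.out.println(javaAction("sys::Int?", "sys::Int"));
        System.out.println(javaAction("sys::Int?", "sys::Obj?"));
    }
}
```

A .NET version would need a different body for nearly every branch, which is the reason for carrying both types on the opcode instead of splitting it into Box, Unbox, and CheckNull.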

Here is a snippet of Fan code:

Int? i1 := 77
Int  i2 := i1
Str? s1 := "hi"
Str  s2 := s1
Obj? o2 := i1
Obj? o3 := i2
Obj? o4 := s1

This compiles into the following fcode:

0:  LoadInt             77
3:  Coerce              sys::Int => sys::Int?
8:  StoreVar            1
11: LoadVar             1
14: Coerce              sys::Int? => sys::Int
19: StoreVar            2
22: LoadStr             hi
25: StoreVar            3
28: LoadVar             3
31: Coerce              sys::Str? => sys::Str
36: StoreVar            4
39: LoadVar             1
42: Coerce              sys::Int? => sys::Obj?
47: StoreVar            5
50: LoadVar             2
53: Coerce              sys::Int => sys::Obj?
58: StoreVar            6
61: LoadVar             3
64: StoreVar            7

The only conversion which does not generate a coerce opcode is an upcast between two non-value types with the same nullable state.

brian Sat 18 Oct 2008

I've gotten primitive booleans working for the JVM runtime. This was a pretty painful process since it uncovered all the weird places the compiler wasn't coercing correctly. I expect from this point that long and double primitives should go much smoother. The trick with long and double is to find all the places which need to change to handle the double wide stack size.

From this point onwards the tip cannot be bootstrap compiled from the released 1.0.33 build. It requires bootstrapping through changeset 7f715d0d5040. As soon as I get primitives fully implemented I'll do another clean build.

brian Mon 20 Oct 2008

I've gotten booleans, long, and double primitives all working in the JVM. I still have a couple boundary issues with switch statements and safe navigation to finish up. Then I need to optimize value-type comparisons. Overall I'm really happy with the design - although we still need to see how it works for .NET.

JohnDG Wed 22 Oct 2008

I'm eager to see the performance of this new release. I think Stephen wrote some program that heavily relied on integer math and was disappointed with the performance. It will be interesting to try that on the new compiler.

brian Wed 22 Oct 2008

"I'm eager to see the performance of this new release."

Me too :-)

I think once I get the comparison stuff optimized, that we should be able to get pretty close to Java speeds for numeric computation. Although we will be doing everything with 64-bit types. I'm hoping HotSpot will inline static methods, otherwise I can use a more sophisticated emit process.

One area where we will still have issues is boxing up arguments to functions. Of course you don't really do that today in Java so it is kind of apples to oranges. Although that is an area where invokedynamic and method handles could play an important role.
