#337 Java/C# "native" integration

brian Tue 19 Aug 2008

Continuation of:

Today integration with native Java and .NET libraries is implemented by the native/peer architecture. The current design is for efficiently binding Fan APIs to the underlying platform in a portable way. It was not designed as a general purpose way for applications to interact with native APIs (when portability is not a concern).

These are the major problems which must be solved to do integration cleanly:

  • The JVM and CLR each come with their own type system, which is different from Fan's.
  • Both JVM and CLR have primitives which Fan does not understand. Primitives are thorny (especially in Java) because they are a type system anomaly (they don't have real classes).
  • Both JVM and CLR have arrays, which are another type system anomaly.
  • Both JVM and CLR have different packaging and namespace designs
  • Both JVM and CLR support method overloading by parameters which Fan can't handle
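A few of the anomalies listed above can be observed from plain Java reflection. This is only an illustrative sketch: the class `TypeAnomalies` and its `show` overloads are made-up names, not part of Fan or of any library.

```java
import java.lang.reflect.Method;

// Hypothetical demo class; the names here are assumptions for illustration.
class TypeAnomalies {
    // Three overloads sharing one name, told apart only by parameter types.
    static String show(int x)    { return "int"; }
    static String show(long x)   { return "long"; }
    static String show(Object x) { return "Object"; }

    // Counts the methods declared on a class under a given name.
    static int overloadCount(Class<?> type, String name) {
        int n = 0;
        for (Method m : type.getDeclaredMethods()) {
            if (m.getName().equals(name)) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Primitives have a Class object but no superclass, unlike real classes.
        System.out.println(int.class.isPrimitive());      // true
        System.out.println(int.class.getSuperclass());    // null
        // Arrays get their own class, derived from the element type.
        System.out.println(int[].class.isArray());        // true
        System.out.println(int[].class.getComponentType() == int.class); // true
        // One name, several methods: the compiler picks one by argument type.
        System.out.println(overloadCount(TypeAnomalies.class, "show"));  // 3
        System.out.println(show(1));                      // int
        System.out.println(show(Integer.valueOf(1)));     // Object
    }
}
```

The last two lines show why overloading is awkward to map: the same value reaches a different method depending on whether it is a primitive or a boxed object.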

To me there are three basic designs available:

  1. Pre-stub Java/C# APIs into Fan APIs:
    1. Pros: easy, flexible, can introduce extra meta-data such as coalescing method overloading into one Fan method with defaults
    2. Cons: not as friendly to use for quick, one shot stuff
  2. Compile time: either allow Fan to access the JVM/CLR type system directly (like Scala), or allow Java/C# as an embedded language (like asm in C):
    1. Pros: easy to use, might be able to replace existing native design
    2. Cons: much, much more complicated
  3. Dynamically: map to Java APIs at runtime using "->" operator
    1. Pros: easy to use, great for prototyping like using Groovy or JRuby
    2. Cons: not as efficient, no static type checking
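To make the trade-off of the dynamic option concrete, here is a rough Java sketch of what resolving a call by name at runtime might involve. It is not how Fan implements "->"; `DynamicCall`, `invoke`, `accepts` and `box` are hypothetical names, and the matching rule (name plus boxed argument types, first match wins, no null arguments) is a simplifying assumption.

```java
import java.lang.reflect.Method;

// Hypothetical helper; a sketch of name-based dispatch, not Fan's implementation.
class DynamicCall {
    // Maps a primitive parameter type to its wrapper so boxed arguments can match.
    static Class<?> box(Class<?> t) {
        if (t == int.class) return Integer.class;
        if (t == long.class) return Long.class;
        if (t == double.class) return Double.class;
        if (t == float.class) return Float.class;
        if (t == boolean.class) return Boolean.class;
        if (t == char.class) return Character.class;
        if (t == byte.class) return Byte.class;
        if (t == short.class) return Short.class;
        return t;
    }

    // True if every argument is an instance of the (boxed) parameter type.
    static boolean accepts(Class<?>[] params, Object[] args) {
        if (params.length != args.length) return false;
        for (int i = 0; i < params.length; i++) {
            if (!box(params[i]).isInstance(args[i])) return false;
        }
        return true;
    }

    // Looks the method up by name on every call and invokes the first match.
    static Object invoke(Object target, String name, Object... args) {
        for (Method m : target.getClass().getMethods()) {
            if (m.getName().equals(name) && accepts(m.getParameterTypes(), args)) {
                try {
                    return m.invoke(target, args);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        // A misspelled name is only detected here, at runtime.
        throw new IllegalArgumentException("no matching method: " + name);
    }

    public static void main(String[] args) {
        System.out.println(invoke("hello", "length"));          // 5
        System.out.println(invoke("hello", "substring", 1, 3)); // el
    }
}
```

Both cons from the list show up directly: every call pays for a reflective lookup, and a call such as `invoke("hello", "lenght")` compiles fine and fails only when it runs.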

JohnDG Tue 19 Aug 2008

This really is an important topic, because no matter how nice Fan's libraries may be (and they certainly are nice!), they can't compete with the 1 billion lines of free Java code out there.

To thrive in this environment, I think Fan needs to be able to use all that code -- to play really well with the languages that Fan is designed to replace.

(1) is not an option for me because the reality is that Java is going to be around for a very long time, and the typical application incorporates a dozen or more third party libraries, each of which is updated one or more times a month. I can't imagine stubbing every new version of every JAR (and then hoping the stubbing is done in such a way that small JAR changes are not amplified into huge Pod changes -- this is a very real concern because there is no 1-to-1 correspondence between Fan and Java).

(3) is not an option for me because of performance reasons and developer pushback against dynamic languages. Groovy has attracted less than 1% of Java developers, and that's with perfect Java integration. Even if you can address performance issues (which might be possible -- pnuts is quite fast and Java 7 may come to the rescue), there's still the dynamic language issue. I don't know much about C# developers, but I can say that the vast majority of Java developers hate dynamic typing.

Which leaves (2). I think you can simplify the issue by not translating types. Perhaps you can do on-the-fly Pod virtualizations for a given classpath, which would represent a hybrid with (1), albeit with greater user-friendliness. Tool support should be kept in mind, too -- vendors aren't going to want to learn the translation rules, they'll want to work only in Pods.

andy Tue 19 Aug 2008

My main problem with #2 (besides the complexity of implementing it), is that it doesn't seem like it would work very well for native-heavy libraries. Take a look at the inet pod source, for example, where 90% of the code is native. And the more different the Java and C# implementations are, the worse that gets. I think #1 will always be the best option here. But I will keep thinking about it.

However, I do agree that there should be a simple "in-place" way to access native code for short or off-the-cuff cases. I think #3 makes the most sense here - if for no other reason than we skip two additional compile processes. If performance matters, you can always go back to using #1.

So right now I vote we just add #3 and keep #1 - but I'm open to be swayed.

alexlamsl Tue 19 Aug 2008

I was thinking along the lines of Java's Scripting API solution (JSR 223), but then Java / C# / VB.NET / .... won't fit that pattern so easily.

Then I wonder whether we actually should worry about introducing any features at all. Today many of us (myself included) work on projects which often end up utilising multiple languages and language platforms, and there are already so many interoperable technologies out there.

Most of these technologies are based on sockets (aren't they?), and I can't think of any need when mixing languages / platforms other than the exchange of data between them.

(I guess I am not being constructive here...)

mrmorris Tue 19 Aug 2008

#2 feels like the "right solution", no? I fear the next go-round of language impedance mismatch, i.e. an extension of the current trend in Java of embedding languages inside type-unsafe tokens, whether it be SQL in Strings or EJB-QL in annotations.

Possibly I don't fully understand the implications of #2 though and it's easy to prefer the hard one as an outsider. So take this comment with a grain of salt.

jodastephen Tue 19 Aug 2008

I support #1 and #2 but not especially #3. I don't see the dynamic aspects as what integrating with Java/.NET is really about; the static types need to be maintained. I see #1 as probably remaining for those cases where proper low-level integration is required, i.e. the current approach seems to be working OK for the current problems.

However, I'd really like to see #2 (embedded language blocks) fleshed out and working. It provides the ability to write fully formed DSLs with no artificial workarounds. Proper SQL integration, native XML blocks, XPath, native JSON blocks, whatever.

As I said, I see the implementation as being a plugin compiler that can parse the block and generate fcode (or maybe directly generate bytecode). The variables from the scope are made available for calling, with simple mapping of the basic types (Str to String, Int to Integer, etc.)

cgrinds Thu 21 Aug 2008

For mostly the same reasons Andy outlined, I like #3 and #1.

As jodastephen mentions, #2 does introduce DSL possibilities that would be interesting to explore. My gut says that #2 probably isn't needed in the near term, but long term would probably be useful to add. It certainly isn't needed for a 1.0.

katox Mon 25 Aug 2008

#3 is seemingly the least-effort solution to get to the large existing codebase

#2 as embedded language blocks sounds also interesting, but it is not really needed in the short term

#1 would be most probably impossible to manage (it can be done for selected packages, though)

brian Tue 26 Aug 2008

"#1 would be most probably impossible to manage (it can be done for selected packages, though)"

I agree that one of the big problems with this approach in Java is going to be that Java lacks modularity and has created one huge ass monolithic "rt.jar". Even worse, internally Java has circular dependencies between packages which makes organizing Fan stubs by packages difficult. But I think a manual mapping of J2SE packages to Fan pods can be done, then most jars would be a one-for-one mapping. The cool thing about that is that pods can contain Java bytecode, so you could actually use Fan pods to manage deployment of the Java bytecode too (instead of separate jars).
