#200 Not null by default

jodastephen Sat 19 Apr 2008

My final issue for now is 'not null by default', and it is by far the most important for me. I see it as the biggest design issue preventing me from saying this is a great language ;-)

In case you don't know what I mean, it's that all variables should only be able to hold non-null values by default. To hold a null you need to use an additional syntax, such as Str? or Str!.

Non-null has been researched and shown to be the most common programmer intent. The key advantage is that it prevents the vast majority of NPEs. The NPE is by far the most common runtime exception in production code, and yet is a solved problem with the not-null technique.

Unfortunately, it seems that adding not-null to Fan may be difficult, since things like local variable initialisation use null.

One solution may be to build in the Null Object Pattern.

Some random ideas here would include

  • an isNull() method as an operator overload for == null
  • default values defined in each class
  • special syntax to define the null object as a hidden subclass of every class

The next really popular language really needs to have not-null as the default - it just avoids so many errors. I hope you can consider changing Fan to this approach.

Stephen

tactics Sat 19 Apr 2008

I agree having a "nullable" higher order type is a very useful feature of a language, but it's a pretty fundamental thing in a language, and for Fan, it would probably require massive changes to the libraries. Not to mention, then, that you have to be able to "lift" nonnullable methods to nullable ones when they have to interact with non-nullable methods.

Also, you'd probably want to use a cleaner implementation and syntax. The idea works nicely in ML-based languages, but there are a lot of details to be worked out for inclusion in a Java/Ruby derivative.

brian Sat 19 Apr 2008

I've definitely thought about something like Str! to indicate a nullable type, and originally it was part of my plan. But as things have progressed I've started moving more to the dynamic end of the spectrum, and trying to keep the type system really simple. And I agree that NPEs do happen, but they tend to be spotted early and are trivial to fix (say versus something nefarious like a heisenberg race condition).

So although I'd say the issue is still open, I definitely lean against adding the extra complexity to the type system. It seems the experience would kind of be like achieving const correctness in C++ which is a pain in the butt when you don't care about it.

That said, I also think there still tends to be a fair amount of null-checking boilerplate which deserves some syntax sugar. For example something like Groovy's elvis operator or C#'s null-coalescing operator. I just haven't spent much time thinking about it yet - so ideas would be great!

jodastephen Sun 20 Apr 2008

I agree that a blind non-null implementation could be as annoying as checked exceptions. So we need to design something better :-)

Here is a use case for why it's a problem. In my development team, we have a coding standard that the javadoc of all method parameters and method results must specify whether they accept null or return null. But this isn't something that should be javadoc only - it's compiler-level knowledge.

What if we added the ! suffix to types to indicate that they may contain null:

Str a  - cannot contain null
Str! a  - can contain null

Method parameters can then be defined using these types:

Int! parseInt(Str str) {
  ...
}

Internally, the converted bytecode for this method will check that the input str is non-null before proceeding.

JAVA:

public Integer parseInt(String str) {
  checkNotNullAndThrow(str, "Str");
  ...
}

What I think we need is a way such that the calling code does not have to be checked:

{
  Str! s = null;
  Int result = parseInt(s);
}

Maybe this is something as simple as only using the ! nullable symbol for method parameters/results and fields (not other local variables, and not checked at compile time).

In addition, I think the Groovy ?. operator would be very useful.

Stephen

helium Sun 20 Apr 2008

Edit: I don't understand why my whole post is "preformatted".

Perhaps you should have a look at the Nice programming language.

Or let's look at what can be done in different languages:

Java:

Phonenumber phonenumber = null;
Department department = company.getDepartment("Foo");
if (department != null) {
    Leader leader = department.getLeader();
    if (leader != null) {
        Phone phone = leader.getPhone();
        if (phone != null)
            phonenumber = phone.getPhoneNumber();
    }
}

Very verbose. A lot of clutter that has nothing to do with the task itself.

Haskell:

phonenumber = do
    department <- getDepartment "Foo" company
    leader     <- getLeader department
    phone      <- getPhone leader
    return $ getPhoneNumber phone

A lot less code, and it describes only what you want to achieve, using nothing but normal language elements: just the monad syntax you use all the time.

Scala:

val phonenumber = for {
    department <- company.getDepartment("Foo")
    leader     <- department.getLeader()
    phone      <- leader.getPhone()
} yield phone.getPhoneNumber()

Just the same with slightly different syntax. Just the normal for-comprehension syntax.

Groovy:

phonenumber = company?.getDepartment("foo")?.getLeader()?.getPhone()?.getPhoneNumber();

A specialised language element only for this special purpose. It's shorter than Haskell or Scala but less powerful. (You can't use the intermediate result in later expressions, but it's not needed in this case anyway.)

Ruby:

phonenumber = company.andand.getDepartment("foo").andand.getLeader().andand.getPhone().andand.getPhoneNumber();

Made possible by metaprogramming.
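
For comparison, Java 8 and later ship java.util.Optional, which allows a chain close to the Haskell and Scala versions. The following is only a minimal sketch: the Company, Department, Leader and Phone classes are hypothetical stand-ins for the ones in the Java example above, and their fields are assumed to be nullable.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class PhoneLookup {
    // Hypothetical domain classes mirroring the thread's example; any field may be null.
    static final class Phone {
        final String number;
        Phone(String number) { this.number = number; }
    }
    static final class Leader {
        final Phone phone;
        Leader(Phone phone) { this.phone = phone; }
    }
    static final class Department {
        final Leader leader;
        Department(Leader leader) { this.leader = leader; }
    }
    static final class Company {
        final Map<String, Department> departments = new HashMap<>();
        Department getDepartment(String name) { return departments.get(name); }
    }

    // Each map() step is skipped once an earlier step produced null,
    // so the chain yields an empty Optional instead of throwing an NPE.
    static Optional<String> phoneNumber(Company company, String departmentName) {
        return Optional.ofNullable(company)
                .map(c -> c.getDepartment(departmentName))
                .map(d -> d.leader)
                .map(l -> l.phone)
                .map(p -> p.number);
    }

    public static void main(String[] args) {
        Company company = new Company();
        company.departments.put("Foo", new Department(new Leader(new Phone("555-0100"))));
        company.departments.put("Bar", new Department(null)); // department without a leader

        System.out.println(phoneNumber(company, "Foo").orElse("unknown"));
        System.out.println(phoneNumber(company, "Bar").orElse("unknown"));
    }
}
```

Unlike ?., the caller has to unwrap the result explicitly (here with orElse), which is the kind of extra ceremony the thread is weighing against the safety it buys.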

brian Sun 20 Apr 2008

Helium: the plaintext is formatted like markdown: http://www.fandev.org/doc/docLib/Fandoc.html. Although I've seen a couple of bugs where we have duplicate text - so sometime this week I will take a look at some of these posts to see what is happening.

The issue I see with marking types as nullable or not doesn't seem easily tackled without tackling all variables. It seems like it would worm its way into all the code with lots of null/nonnull casting and be one of those things that you really have to deal with when you are just prototyping and don't care (aka checked exceptions). It also seems like it would add an awful lot of complexity.

One thing we could do is annotate with a @nonnull facet, and allow lint-like compiler plugins. I'll also take a look at Nice (it's been a long time since I've looked at it).

brian Sun 20 Apr 2008

I added to roadmap doc:

Can we find a simple solution to enhance the type system to support nullable/non-nullable types? It would be nice to have some syntax sugar for checking nulls.

andy Sun 20 Apr 2008

I'm not sure I see the benefit of that (unless I am missing something). You still have to denote an "uninitialized" value somehow - like using -1 for a Java int. So code expecting a valid value will still fail - or worse will silently work but produce the wrong results. Using null seems much more elegant than some arbitrary value - and is actually something I really like about Fan.

I do however think (and we've discussed it before) we should have some syntax sugar that makes dealing with null checking easier.

jodastephen Sun 20 Apr 2008

Hmmm, my proposal meant that method parameter variables could be relied on to be not-null (unless qualified with !). Local variables would still initialize to null if you want, and methods could return nulls if required.

Thus, this doesn't affect your examples of using null instead of -1 to mark the end of a stream. The API would just be changed to be something like:

Int! readChar() {
  ...
}

The caller can still check for null to find the end of stream.

Thus, the proposal means that (a) you can document method parameters and return types using language syntax rather than documentation, and (b) the compiler will check that a method parameter declared as non-null really is non-null before you get to use it. I think that could be rather elegant.
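
A rough Java sketch of what the two halves of this proposal could look like once compiled. It is an illustration only: the class name is made up, Objects.requireNonNull stands in for the generated entry check, and a boxed Integer return stands in for Int!.

```java
import java.util.Objects;

final class NullContractSketch {
    // Stands in for "Int parseInt(Str str)": the parameter is declared non-null,
    // so a check runs at method entry, before the body uses the value.
    static int parseInt(String str) {
        Objects.requireNonNull(str, "str");
        return Integer.parseInt(str.trim());
    }

    // Stands in for "Int! readChar()": the boxed return type may be null,
    // and null here means end of stream.
    static Integer readChar(String source, int pos) {
        return pos < source.length() ? Integer.valueOf(source.charAt(pos)) : null;
    }

    public static void main(String[] args) {
        System.out.println(parseInt(" 42 "));

        // The caller still checks for null to find the end of the stream.
        int pos = 0;
        Integer ch;
        while ((ch = readChar("hi", pos++)) != null) {
            System.out.println((char) ch.intValue());
        }

        try {
            parseInt(null);
        } catch (NullPointerException e) {
            System.out.println("rejected at entry: " + e.getMessage());
        }
    }
}
```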

andy Mon 21 Apr 2008

Ah, yeah, I didn't read your second post close enough. I understand now, that would be nice to have if it works out cleanly.

tompalmer Sat 14 Jun 2008

+1 on the general idea. But despite possible syntax issues, it needs to be ? instead of ! for nullable. That's what other languages with similar features use.

cbeust Sat 14 Jun 2008

Brian,

As a point of reference, take a look at Scala's Option type (here is an overview: http://blog.lostlake.org/index.php?/archives/50-The-Scala-Option-class-and-how-lift-uses-it.html).

I'm not saying how I feel about it in order not to influence you, but I'd be interested in hearing your opinion.

-- Cedric

brian Mon 16 Jun 2008

My feelings on this feature are that I really like the general idea. But when I think about implementation I get nervous. It seems like something which can easily devolve into a hell like checked exceptions.

There are two ways to approach the problem that I've heard:

  1. As Cedric pointed out, something like Haskell's Maybe or Scala's Option class
  2. Something like Nice's ?, which to me is kind of like const in C and C++

I'm not thrilled with either. Scala's approach (as I grok it) is to generate a special null instance for every type. Since I haven't programmed using a language that does that, I don't know what it is like to use. To an outsider (me) it doesn't seem to really solve the problem all that well.

Using some type of marker such as nonnull or ? seems more elegant at the outset. But then I can see a lot of ugly casting (or some such syntax) to please the compiler.

My general philosophy has been to keep the type system simple as a compromise between static and dynamic languages. I'm not opposed to adding something like this to the type system, but it has to be really well designed - and I don't personally have any great ideas.

For further discussion consider this use case: a Fan convention is to provide an optional checked parameter. When checked is true a failure raises an exception, and when checked is false the method returns null. I use this pattern for all the parsing methods and just about every lookup method. That is a case where we might return null, but if using checked then I know that either I'm getting non-null or an exception, so I don't want the compiler making me do busy work.
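
The checked convention can be sketched in Java as follows. This is an approximation, not Fan's actual API: the class and method are hypothetical, and NumberFormatException stands in for whatever error the real parsing methods raise.

```java
final class CheckedConvention {
    // checked == true: a failure raises an exception.
    // checked == false: a failure returns null instead.
    static Integer parseInt(String s, boolean checked) {
        try {
            return Integer.valueOf(s);
        } catch (NumberFormatException e) {
            if (checked) throw e;
            return null;
        }
    }

    public static void main(String[] args) {
        // With checked=true the result is never null, yet the declared return
        // type still has to admit null because of the unchecked case.
        int a = parseInt("12", true);
        Integer b = parseInt("twelve", false);
        System.out.println(a + " " + b);
    }
}
```

The sketch shows the tension: one signature serves both modes, so a nullable return type would force callers of the checked mode to handle a null that can never arrive.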

tompalmer Mon 16 Jun 2008

My thoughts: Scala's Option is effectively a list of up to one item, and it's not bad to use. But I think ? is better for something like Fan.

But I also don't recommend Nice rules, either. In fact, I think all casting (including between nullable and not-nullable) should be transparent to the programmer. Duck-typed languages show that people mostly know what they are doing without being reminded. When I tried this in Fan, and it worked, I was super happy:

Int i := 1
Obj obj := i
Num n := obj

I was also happy to see Str str := obj throw a casting error but not require me to say I was casting. I'd love to see that behavior carried further to automatic downcasting everywhere the types are seemingly compatible. And I think the same can apply to nullable types. If you ever try to assign a null to a non-nullable type, a null check is automatically inserted, and you get an error thrown if it's null. But you aren't forced to check manually for not-null as required in Nice.

I'm not sure I like the checked idea, but the return type would have to be ? for a case like that. My main concern with not-null is data structures where usually not-null is good, but what if it's data from a user form or otherwise sparsely populated, and an Int needs to be nullable unexpectedly? (For the user form case, my answer is that all the freeform fields should be represented as strings, at least in the intermediate forms, and empty strings work great for representing that. But a few data transformations away, you might be back to Ints again.)

helium Mon 16 Jun 2008

And I think the same can apply to nullable types. If you ever try to assign a null to a non-nullable type, a null check is automatically inserted, and you get an error thrown if it's null. But you aren't forced to check manually for not-null as required in Nice.

You explicitly tell the compiler: "Hey, this is an optional value, maybe it contains a value, maybe not", and then you can use it where non-optional values are required without giving a default value or something?

In which situations do you think you would prefer a null pointer exception over a default value? The language simply needs an operator like C#'s ??.

definitlyHasAValue := mightHaveAValue ?? defaultValue

If I want an exception to be thrown something like this would be cool:

definitlyHasAValue := mightHaveAValue ?? throw SomeException.make("WTF is going on?")
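
Both forms of ?? can be approximated in Java with two small helpers. This is a sketch under assumptions: orElse and orThrow are made-up names, not an existing API, and the exception type is arbitrary.

```java
import java.util.function.Supplier;

final class Coalesce {
    // value ?? fallback
    static <T> T orElse(T value, T fallback) {
        return value != null ? value : fallback;
    }

    // value ?? throw ...  (the exception is only created when value is null)
    static <T, X extends RuntimeException> T orThrow(T value, Supplier<X> error) {
        if (value == null) throw error.get();
        return value;
    }

    public static void main(String[] args) {
        String mightHaveAValue = null;
        String definitelyHasAValue = orElse(mightHaveAValue, "default");
        System.out.println(definitelyHasAValue);

        try {
            orThrow(mightHaveAValue, () -> new IllegalStateException("no value"));
        } catch (IllegalStateException e) {
            System.out.println("thrown: " + e.getMessage());
        }
    }
}
```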

tompalmer Mon 16 Jun 2008

I'm informal like that. I see explicit types as making analysis, refactoring, and performance easier. If duck typing works mostly well enough (witness plenty of examples in dynamic languages), then I don't think it's necessary to force people to the extremes of static type checking in every way possible. Just so long as you can accurately identify a type in any particular case, that gets the job done for me. And it seems to me like Fan has somewhat of the same mentality.

But I agree that helpful language or API features for dealing with null values conveniently could still be nice.

JohnDG Tue 17 Jun 2008

I don't think anything is needed here. That said, a default null object for every type would be nice. A null object has two properties: any method of the null object returning another object will return the null object for that type, down to primitives (in languages that have them), which will return anything at all; and the null object of some type is not equal to (or the same as) any other object, except for itself.

Null objects are nice do-nothing objects that simplify code even while they eliminate many NPEs. They're also great for developing automated tests.
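
The two properties can be sketched by hand in Java. Everything here is hypothetical (the Phone and Leader interfaces, the findLeader lookup); a language-level version would generate the null objects rather than have them written out.

```java
final class NullObjectSketch {
    interface Phone { String number(); }
    interface Leader { Phone phone(); }

    // One shared null object per type. Identity-based equals (inherited from
    // Object) means each one is equal only to itself.
    static final Phone NULL_PHONE = new Phone() {
        public String number() { return ""; } // arbitrary value at the bottom of the chain
    };
    static final Leader NULL_LEADER = new Leader() {
        public Phone phone() { return NULL_PHONE; } // a null object returns a null object
    };

    // Hypothetical lookup that hands back the null object instead of null.
    static Leader findLeader(String department) {
        if ("Foo".equals(department)) {
            Phone phone = () -> "555-0100";
            return () -> phone;
        }
        return NULL_LEADER;
    }

    public static void main(String[] args) {
        // No null checks needed: the chain is safe even when nothing is found.
        System.out.println(findLeader("Foo").phone().number());
        System.out.println(findLeader("Bar").phone().number().isEmpty());
    }
}
```

The cost is that a missing leader silently yields an empty string rather than failing, which is the "silently work but produce the wrong results" risk raised earlier in the thread.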

brian Wed 18 Jun 2008

This continues to be a great discussion, although I don't think we've made much progress other than what I've written up in the roadmap:

Can we find a simple solution to enhance the type system to support nullable/non-nullable types? It would be nice to have some syntax sugar for checking nulls.

We all agree that some type of enhancement to the type system would be nice, but no elegant solution has emerged that I can put my finger on.

I definitely still think we need some syntax sugar like C# ?? or Groovy's ?. and ?: operators.

helium Wed 18 Jun 2008

So what do we have?

Null by default and optional not-null types by appending ! or not-null types by default and optional nullable types by appending ?.

Null throws NPEs or a message-eating null like in Objective-C.

tompalmer Wed 18 Jun 2008

Here's the programming convention I want to avoid:

  1. Programmer usually forgets that nulls exist (except when wanted at arbitrary times) and codes as if everything is non-null.
  2. Random code paths lead to unexpected null pointer exceptions.
  3. Scared and confused, programmer adds arbitrary null-checking statements wherever it seems might fix the issue, cluttering code and perhaps not solving all the problems anyway.

Note that the third step can be an issue (though less ugly) with handy operators like ?.. Either make . behave like ?. (which I think has risks of unexpected behavior), or let people know they are safe by defaulting to non-nullable.

I just want a low overhead way to let programmers know that "everything will be all right".

tompalmer Thu 19 Jun 2008

And one more way to look at it. I see a subtype as a subset. Int is a subtype of Int?. Downcasting assignment throws CastErr if incompatible. Going from Int? to Int is effectively downcasting. As such, an equivalent attempt to assign a null to Int should throw a NullErr. The concepts are very analogous.

katox Sat 30 Aug 2008

Also, if nullable ? is introduced, then we could change the meaning of as like so (and potentially get rid of other forms of explicit inline casting):

a := x as Int // throws NullErr if null and CastErr unless Int

b := x as Int? // same behavior as current Fan/C#

+1 to Tom's idea. I have personally disliked paren-casting ever since I first saw it in C. Things are really hideous when typing expressions like

((Bar)((Foo)foo.getObj())?.doBarObj())?.bar()

It would be much more readable to specify the same as

((foo.getObj() as Foo?)?.doBarObj() as Bar?)?.bar()

As the parens can be dropped it could be shortened to

((foo.getObj as Foo?)?.doBarObj as Bar?)?.bar

though the last case confuses me a bit because it is harder to visually differentiate local objects (foo) from slots (getObj, doBarObj, bar).
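
The two proposed meanings of as can be sketched as Java helpers. The names asNonNull and asNullable are invented for illustration; NullPointerException and ClassCastException stand in for Fan's NullErr and CastErr.

```java
final class AsCast {
    // "x as Int": null is rejected, and a wrong type fails the cast.
    static <T> T asNonNull(Object x, Class<T> type) {
        if (x == null) throw new NullPointerException("null is not a " + type.getSimpleName());
        return type.cast(x); // throws ClassCastException for a wrong type
    }

    // "x as Int?": null passes through, and a wrong type yields null, as C#'s as does.
    static <T> T asNullable(Object x, Class<T> type) {
        return type.isInstance(x) ? type.cast(x) : null;
    }

    public static void main(String[] args) {
        Object x = 42;
        Integer a = asNonNull(x, Integer.class);
        Integer b = asNullable("not a number", Integer.class);
        System.out.println(a + " " + b);
    }
}
```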
