brian Sun 30 Jul 2006
I'm getting pretty close to having the parser rewritten in Fan. One of the big trade-offs I made to resolve grammar ambiguities is that you can't have local variables or slot names which conflict with an imported type. That solves a ton of problems when narrowing down what an expression means when it starts with a type literal (something I can do now via my two-pass parser). I think that restriction should be ok, especially since our convention is that slots and local variables should always start with a lower-case letter. Do you guys agree with that restriction?
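The restriction can be sketched in a few lines. This is a minimal illustration in Java, not the actual Fan parser: the class `NameRules`, its methods, and the sample type names `Str` and `Int` are all assumptions made up for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of the Fan compiler.
final class NameRules {
    // Type names imported into the file being parsed (assumed to be
    // collected in a first pass, before expressions are parsed).
    private final Set<String> importedTypes;

    NameRules(Set<String> importedTypes) {
        this.importedTypes = importedTypes;
    }

    // The restriction: a local variable or slot may not share its name
    // with an imported type.
    boolean isLegalLocalOrSlotName(String name) {
        return !importedTypes.contains(name);
    }

    // Because of the restriction, an identifier that matches an imported
    // type at the start of an expression can only be a type literal.
    boolean startsWithTypeLiteral(String firstIdentifier) {
        return importedTypes.contains(firstIdentifier);
    }

    public static void main(String[] args) {
        NameRules rules = new NameRules(new HashSet<>(Arrays.asList("Str", "Int")));
        System.out.println(rules.isLegalLocalOrSlotName("Str"));   // name clashes with a type
        System.out.println(rules.isLegalLocalOrSlotName("count")); // lower-case local is fine
        System.out.println(rules.startsWithTypeLiteral("Int"));
    }
}
```

With the lower-case convention for slots and locals, the first check should rarely reject anything in practice.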
andy Sun 30 Jul 2006
Yeah, that sounds reasonable.
brian Sun 30 Jul 2006
I reached a major milestone today - the parser has been ported/rewritten in Fan.
One thing I decided not to do (actually I did it, then ripped it out) was support commas in field declarations, local variable declarations, and for loop init/update clauses. While it is a nice convenience, I feel the way it combines with explicit and inferred typing is too confusing. With fields it is even more confusing when combined with getters/setters. I think inferred typing and the fact that most loops are closures will also help offset this need.
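For reference, the Java form of the feature being left out looks roughly like the sketch below. This is Java, not Fan syntax, and the class and method names are made up for illustration. The `ys[]` declarator shows how one explicit type can already mean different things for each name in the list.

```java
// Hypothetical example class showing Java's comma-separated declarations.
final class CommaDeclarations {
    // Several declarators after one explicit type.
    static int sumPair() {
        int a = 1, b = 2;
        return a + b;
    }

    // One type, two different results: x is an int, ys is an int[].
    static int mixedDeclarators() {
        int x = 1, ys[] = {2, 3};
        return x + ys.length;
    }

    // Commas in both the init and update clauses of a for loop.
    static int stepsUntilMeet(int lo, int hi) {
        int steps = 0;
        for (int i = lo, j = hi; i < j; i++, j--) {
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(sumPair());
        System.out.println(mixedDeclarators());
        System.out.println(stepsUntilMeet(0, 10));
    }
}
```

Java's own inferred-type keyword, `var`, is not allowed in a declaration with more than one declarator, which is a similar way of avoiding the mix of commas and inference.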
Now that the parser is written in Fan, it provides another performance metric. If you remember, the tokenizer's performance in Fan compared to Java was miserable - a 7.5 times degradation. The Fan parser had a more reasonable 40% degradation. Time to parse sysTest:
Java: 364ms
Fan: 607ms
So we can surmise from these numbers that Fan code working with typical application data structures runs about 40% slower than similar Java code. However, code working with primitives, like a tokenizer, is significantly slower. That isn't too surprising considering the biggest trade-off we made was to have all primitives boxed. My gut still tells me we have the right compromises in place.
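To make the boxing cost concrete, here is a small Java sketch of the kind of inner loop a tokenizer runs, written once with primitives and once with wrapper types. It assumes that boxed primitives in Fan behave roughly like Java's `Integer` and `Character`; the class and method names are hypothetical, and no timing is claimed for either version.

```java
// Hypothetical example, not taken from the Fan tokenizer.
final class BoxingSketch {
    // Primitive version: the index, counter, and each character are plain
    // int/char values with no wrapper objects involved.
    static int countDigitsPrimitive(String src) {
        int count = 0;
        for (int i = 0; i < src.length(); i++) {
            char c = src.charAt(i);
            if (c >= '0' && c <= '9') {
                count++;
            }
        }
        return count;
    }

    // Boxed version: the same logic, but the index, counter, and character
    // are held as Integer/Character objects, so each comparison unboxes
    // and each increment unboxes, adds, and boxes again.
    static Integer countDigitsBoxed(String src) {
        Integer count = 0;
        for (Integer i = 0; i < src.length(); i++) {
            Character c = src.charAt(i);
            if (c >= '0' && c <= '9') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String src = "x = 42 + 7";
        System.out.println(countDigitsPrimitive(src));
        System.out.println(countDigitsBoxed(src));
    }
}
```

Both methods return the same count; the difference is only in how much wrapping and unwrapping happens per character, which is the work a character-by-character tokenizer repeats most.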
brian Wed 2 Aug 2006
I went back and did a bit more research, and I wasn't calculating the performance metric correctly for the Java side of things. The actual numbers are:
Java: 168ms
Fan: 607ms
So Fan is 3.8 times slower: better than the tokenizer, but still a ways off from my goal of a 2x performance hit. That said, I have a lot of optimizations I want to implement with the new compiler.
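The timings in this thread can be compared with a little arithmetic. The sketch below is only a worked calculation on the quoted numbers; the class and method names are hypothetical. It reports two views: how many times longer Fan took, and how much lower Fan's throughput was. With the first measurement (364ms vs 607ms) Fan took about 1.67 times as long, which matches a throughput drop of about 40%. With the corrected Java time (168ms vs 607ms), the quoted numbers give a ratio of about 3.6, in the neighborhood of the 3.8 figure.

```java
// Hypothetical helper for comparing the parse timings quoted above.
final class ParseTimings {
    // How many times longer the Fan run took than the Java run.
    static double slowdown(double javaMs, double fanMs) {
        return fanMs / javaMs;
    }

    // Percentage drop in throughput (work per unit time) going from Java to Fan.
    static double throughputLossPercent(double javaMs, double fanMs) {
        return (1.0 - javaMs / fanMs) * 100.0;
    }

    public static void main(String[] args) {
        double fanMs = 607;
        double[] javaRuns = {364, 168}; // first measurement, then the corrected one
        for (double javaMs : javaRuns) {
            System.out.println(String.format(
                "Java %.0fms vs Fan %.0fms: %.2fx as long, %.1f%% lower throughput",
                javaMs, fanMs,
                slowdown(javaMs, fanMs),
                throughputLossPercent(javaMs, fanMs)));
        }
    }
}
```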