#665 Design Proposal: Fan Repos

brian Sat 11 Jul 2009

This is a design proposal to address ticket 425 "Separate pod install location" along with a general purpose framework for distributing, installing, and managing Fan software artifacts.

Repo

A Fan repo is a directory which stores files to encapsulate a runtime environment including both library artifacts and configuration/session artifacts.

A repo directory contains three primary top-level parts:

some-repo-dir/
  repo.props
  lib/
    fan/
    java/
    dotnet/
    js/
  etc/
    sys/
      pod.props
      ext2mime.props
    build/
      pod.props
    flux/
    podA/
    podB/

All repos contain a file called "repo.props" which is a Fan props file as defined by sys::InStream.readProps. I chose this format because it is simple enough to be read by bash scripts or simple C programs, although XML or Fan serialization might be alternatives.

The "lib" directory is used to organize libraries by platform. For example this is where we stick Fan pods, Java jars, .NET DLLs, native libraries, etc.

The "etc" directory is used to organize configuration by pod. Files like sys.props and the flux configuration migrate to the etc directories.

Working Repo

Every Fan VM is booted off exactly one repo, which we call the working repo. The working repo is determined in this order:

  1. As command line switch: fan -repo {dir} {main}
  2. If no switch, then check FAN_REPO environment variable
  3. If no environment, then use repo of Fan launcher itself (the core distro repo)
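
The lookup order above could be sketched roughly as follows. This is only an illustration in Python, not the launcher's actual logic; the function name and its arguments are hypothetical.

```python
import os

def resolve_working_repo(cli_repo, launcher_repo, env=None):
    """Pick the working repo: -repo switch, then FAN_REPO, then the launcher's repo."""
    env = os.environ if env is None else env
    if cli_repo:                    # 1. explicit "-repo {dir}" switch
        return cli_repo
    if env.get("FAN_REPO"):         # 2. FAN_REPO environment variable
        return env["FAN_REPO"]
    return launcher_repo            # 3. repo of the Fan launcher (core distro repo)
```

Passing the environment as an argument is just a convenience so the three cases can be exercised without touching the real process environment.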

Questions:

  • in previous discussions we talked about using "user.home" as a default repo - I am not sure about that versus requiring it to be explicit. Comments welcome.

Repo Inheritance

A repo can delegate to additional repos using a mechanism called repo inheritance. We configure this in "repo.props":

repo.def.distro=/opt/fan-1.0
repo.def.gac=/opt/fan-libs
repo.inherit=gac,distro

I am not sure about the syntax, but I am thinking that "repo.def.{name}" declares a name/directory pair for a repo, and "repo.inherit" is a list of repo names to inherit.
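
To make the tentative syntax concrete, here is a rough sketch of how those keys might be read. It is written in Python purely for illustration and handles only plain "key=value" lines, not the full props format; the function name is hypothetical.

```python
def parse_repo_props(text):
    """Split the proposed repo.props keys into (name -> dir, inherit list)."""
    defs, inherit = {}, []
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:         # skip blank or non key=value lines
            continue
        key, val = (s.strip() for s in line.split("=", 1))
        if key.startswith("repo.def."):
            defs[key[len("repo.def."):]] = val      # repo.def.{name}={dir}
        elif key == "repo.inherit":
            inherit = [n.strip() for n in val.split(",") if n.strip()]
    return defs, inherit

example = """
repo.def.distro=/opt/fan-1.0
repo.def.gac=/opt/fan-libs
repo.inherit=gac,distro
"""
defs, inherit = parse_repo_props(example)
```

Here `defs` maps each declared name to its directory and `inherit` keeps the names in the order they were listed, which is what the priority order depends on.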

Questions:

  • should we always include the repo of the launcher as a "system level repo", or make it explicit? Do we consider the launcher part of a given repo, or is it more like how the JVM uses whatever is configured in the Registry (on Windows)?
  • does the idea of assigning a name to a repo make sense? If so should we require that all repos have a unique name in the flattened inheritance list?

Priority Order

When resolving information for a given Fan VM, we consult the working repo which may inherit zero or more repos. The order of inheritance is important - we call this priority order. The working repo always gets top billing in priority order, followed by inherited repos in the order they were listed. If a repo is inherited multiple times then its priority order is determined by its first usage.
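
One way to picture the flattening is the sketch below (Python, illustration only). It assumes that inherited repos may themselves inherit other repos and that the walk is depth-first; the proposal does not pin either point down, and the names used are hypothetical.

```python
def priority_order(working, inherits):
    """Flatten repo inheritance; the first usage of a repo fixes its position.

    `inherits` maps a repo name to the ordered list of repos it inherits.
    """
    order, seen = [], set()

    def visit(repo):
        if repo in seen:            # already placed by an earlier usage
            return
        seen.add(repo)
        order.append(repo)
        for parent in inherits.get(repo, []):
            visit(parent)           # depth-first walk (an assumption)

    visit(working)                  # working repo always comes first
    return order

order = priority_order("dev", {"dev": ["gac", "distro"], "gac": ["distro"]})
```

With this input "distro" is reached twice, once through "gac" and once directly, but it appears only once in the result, at the position of its first usage.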

Pod Resolution

The list of Fan pods which encapsulate the runtime as returned from sys::Pod.list is the list of pods found in "lib/fan" in priority order.

For example if a "foo.pod" is found in both the working repo and an inherited repo, then the version in working repo is used. Maybe make that a warning? The "fan -pods" dump will be enhanced to dump repos.
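
As a rough sketch of that rule (Python, illustration only, with a hypothetical function name): scan "lib/fan" of each repo in priority order and keep the first file found for each pod name.

```python
from pathlib import Path

def resolve_pods(repos):
    """Map pod name -> pod file, scanning lib/fan of each repo in priority order."""
    pods = {}
    for repo in repos:
        lib = Path(repo) / "lib" / "fan"
        if not lib.is_dir():
            continue
        for pod_file in sorted(lib.glob("*.pod")):
            # setdefault keeps the entry from the higher-priority repo
            pods.setdefault(pod_file.stem, pod_file)
    return pods
```

A warning for shadowed pods, if we want one, would go at the point where `setdefault` finds an existing entry.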

Etc Resolutions

Each pod gets its own directory in "etc" under the working repo. In addition, there will be reflective access to the inherited repos to implement various merge/override mechanisms for configuration. By default there will be a simple lookup mechanism (file based override) and a prop merge mechanism (selective name/value pair overrides for props files). If we do the symbol feature it will most likely replace props files. In the future we can add a Fan serialization merge mechanism.

Core files found today in "lib" will be moved:

  • sys.props => etc/sys/pod.props and etc/build/pod.props (merge override)
  • ext2mime.props => etc/sys/ (merge override)
  • log.props => etc/sys/ (merge override)
  • timezones.ftz => etc/sys (file override)
  • units.fog => etc/sys (file override)
  • types.db => etc/sys (generated in working repo)

In addition the flux configuration will migrate to "etc/flux" with file based override capability.
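
The "merge override" idea for props files might look roughly like this (a Python sketch for illustration; the function and the example keys are hypothetical). Each repo contributes a name/value map, and values from higher-priority repos replace only the names they redefine.

```python
def merge_props(layers):
    """Merge props maps given in priority order (highest priority first)."""
    merged = {}
    for props in reversed(layers):  # start with the lowest-priority repo
        merged.update(props)        # higher-priority values overwrite
    return merged

distro_props = {"timezone": "UTC", "debug": "false"}
working_props = {"debug": "true"}
merged = merge_props([working_props, distro_props])
# merged == {"timezone": "UTC", "debug": "true"}
```

A "file override", by contrast, would not merge anything: the whole file from the highest-priority repo that has it is used as-is.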

New APIs

New reflection support for repos:

const class Repo
{
   static Repo working()
   Repo[] inherits()  // flattened list of inheritance   
   File dir()
}

const class Pod
{
  Repo repo() // what repo was this pod loaded out of
  Str:Str props(Str filename := "pod.props")  // merged props
  File etcDir() // etc dir in working repo
  File? etc(Uri path, Bool checked := true) // find file in etc via priority order
}
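
To illustrate the intended behavior of the proposed Pod.etc lookup, here is a rough model in Python (not Fan, and not an implementation of the API above). It assumes the checked flag follows the usual convention of raising when nothing is found and returning null otherwise; the function name and arguments are hypothetical.

```python
from pathlib import Path

def find_etc(repos, pod, rel_path, checked=True):
    """Return the first etc/{pod}/{rel_path} found in priority order."""
    for repo in repos:
        candidate = Path(repo) / "etc" / pod / rel_path
        if candidate.is_file():
            return candidate        # highest-priority match wins
    if checked:
        raise FileNotFoundError(f"etc/{pod}/{rel_path} not found in any repo")
    return None                     # unchecked lookups just report absence
```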

We might also potentially expose names for use via reflection. It might be cool especially for tools to lookup the "gac" repo. But to enable that we probably want to require all repos be given a unique name in the entire flattened list.

Distribution from Cloud

Although I am not quite ready to tackle cloud based distribution, that is one of the things I hope to eventually address with this design. I have some basic ideas, but they are not fleshed out:

  • new tool called "fanr" used to manage repos
  • "fanr" can query/download cloud repos to local repos
  • new build tasks to sync local repos with remote repos
  • new "podx" format which can explode files into the lib/etc directory on installation (when single pod file doesn't cut it)

Like I said, just some basic ideas, nothing concrete. But I think the proposed repo design will suit cloud distribution well.

Typical Setup

I'd expect the simple common use case will be to utilize two repos:

  • "rel" repo that uses released Fan build
  • "dev" repo that stores all the src/pods/config for stuff in development (or maybe in production this is called the app repo)

To set up this scenario:

  1. Install release into /somewhere/fan/rel
  2. Create /somewhere/fan/dev/repo.props:
    repo.def.rel=/somewhere/fan/rel
    repo.inherit=rel
  3. Set FAN_REPO=/somewhere/fan/dev/

If we decide to make repo of Fan launcher implicit then potentially skip step 2.

Misc Notes

Some things must be defined before JVM is launched such as the native "java.library.path" and core JVM classpath. Right now this happens via executable written in C on Windows and via fanlaunch Bash script on Unix. Things get way more complex now if we want to allow any repo to add to the JVM native/jar classpath. So do we want that? If so I will probably need some help on Bash side. Or we could provide a C launcher for every platform (but I can't cross compile easily now). Other ideas how to launch?

I am thinking that we could potentially allow a working repo to declare named repos but not actually inherit them. Then allow a script to use a named repo in its shebang - this might be a more elegant solution for bootstrap compile where we require a full installation to build the core. Another approach I am considering is just to require you to set up two environment variables and then have a batch/bash script which does the buildboot/bootpods.

So what do you guys think?

mr_bean Sat 11 Jul 2009

This is a pretty extensive design. I am still trying to grok it all.

What I am not quite sure of is suppose I have a pod in repo/dir/etc/1, can it call/link to a pod in another repo dir?

As to your default "user.home" repo, I like the multilayered defaults. The sequence you suggest works for me.

cheeser Sat 11 Jul 2009

I don't like the idea of having to explicitly list the distro repo, to be honest. That'd make deployment a hassle as you'd need to update the app distro to find the fan distro. I'd rather have the distro repo as an implied repo. But other than that, this looks OK after a first glance...

brian Sat 11 Jul 2009

@mr_bean - thanks for the feedback

What I am not quite sure of is suppose I have a pod in repo/dir/etc/1, can it call/link to a pod in another repo dir?

The intent of the design is to create a virtual directory image of pods and config files which spans multiple physical directories. But once the repos are defined, they are treated as a flat namespace. So to answer your question: you can load any pod from the working repo and any repos your working repo inherits.

As to your default "user.home" repo, I like the multilayered defaults. The sequence you suggest works for me.

Are you voting that "user.home" is the implicit working repo or that you have to manually define it via FAN_REPO?

@cheeser

I don't like the idea of having to explicitly list the distro repo, to be honest.

So the repo associated with the launcher is always implied? I would say it is implied to be last in priority order - agree?

Also do you think this design works for TradeWinds? Side note: I keep seeing that name all over Maine.

cheeser Sat 11 Jul 2009

I think that'd work for tradewinds. The way the ruby support works, from what I can tell, is that you configure glassfish with where the ruby runtime is. Then you can simply deploy the dir where your ruby app is. This is what I'm shooting for with tradewinds as well. Then deploying a fan app on glassfish would be as simple as unzipping something (or maybe still leaving it zipped; I have yet to write that deployer) and letting the system take over from there. Ideally there'd be no configuration needed in the deployed app at all. At least, as far as pod paths/repos are concerned.

andy Sun 12 Jul 2009

+1 on implied distro repo (bottom of priority order sounds correct)

KevinKelley Sun 12 Jul 2009

Does this have impact on compiling resources into a pod? Maybe icons/graphics; maybe default configurations.

Since pods use the zip-file format, there's already some mechanism for packaging file-based resources. Is/should the /etc/ lookup default to pod-internal resources, overridable by external files? Then a distro could be mostly just a pod file, that would use local storage under the /etc/ location for user configs and overrides.

Just as a passing thought, I wonder about merging Java jars into pods, in a way that they could be accessible to the Java runtime.

brian Sun 12 Jul 2009

Does this have impact on compiling resources into a pod? Maybe icons/graphics;

Currently resources in the zip are accessible via sys::Pod.files. We can potentially open that up to automatically expose files in the /etc of the repos too. Although it would be pretty trivial to build solutions with the individual APIs like Pod.files and Pod.etc (or maybe just add a convenience). I guess I am inclined to leave resources alone until we get a little experience under our belt.

But regarding the core symbols in each pod, then yes my plan is that they live in the pod zip file, and you only use /etc for overrides.

Just as a passing thought, I wonder about merging Java jars into pods, in a way that they could be accessible to the Java runtime.

Potentially, but they'd probably have to be extracted. But pods themselves are valid jars too since they are just zip files. This is actually how Java native code is loaded (straight from the pod file).

tompalmer Mon 13 Jul 2009

Mostly seems decent. Are you sure it's not possible to use a modified (after JVM start) native library path environment variable for controlling where native code is loaded from? I thought I'd seen that as being possible before, but I can't remember for sure. It would give you lots more flexibility for nice handling of (OS-level) native code, if you can do that.

Overall the design seems decent, by the way. I haven't thought enough about specifics beyond my comment above, however.

Well, maybe a question about whether it might be possible to store different versions of the same pod in the same repo, or should developers be expected to do a good job of renaming their pod (e.g., pod2) if they break compatibility?

qualidafial Mon 13 Jul 2009

Here is my stream-of-consciousness as I read through the proposal and comments. Sorry if any of my thoughts are half-baked or half-expressed. I blame the caffeine.

  • I think repos should be named, and that repo.props is a good place to put this name.
  • I like having inherited distros enumerated by name in repo.props. However I would prefer the term "dependencies" or just "depends" over "inherited."
  • I don't like putting hardcoded file paths in repo.props:
    • This will hamper shareability in the same way that sys.props currently has to be tweaked to my local file paths every time I get a fresh clone from mercurial. This is particularly annoying when team members are using different operating systems e.g. Windows and Linux have very different filesystem paths.
    • Two projects may use a common repo, yet depend on different versions of that repo's dependent pods. Using hard-coded file paths will force you to duplicate those repos instead of sharing the common one, possibly splintering the maintenance effort on that common repo.
    • Accordingly, I would prefer the repo list to be a command-line argument. e.g. fan -repos ~/dev/foo:~/dev/bar:~/dev/baz foo
      • This would require types.db to be generated independently for each repo, so that swapping out a downstream repo does not yield an inconsistent type database for either configuration.
  • System level repo should be implicit but could be overridden on command-line via --bootrepo.
  • Rename Repo.inherits() to Repo.list() (like Pod.list)?

brian Tue 14 Jul 2009

Are you sure it's not possible to use a modified (after JVM start) native library path environment variable for controlling where native code is loaded from?

I don't think I've ever been able to get that to work - but if anyone knows how please let me know (that would make life way easier).

Well, maybe a question about whether it might be possible to store different versions of the same pod in the same repo, or should developers be expected to do a good job of renaming their pod (e.g., pod2) if they break compatibility?

We talked about this on IRC - my take is that things need to be versioned together or pods need to be renamed (maybe with a tool to do it via fcode without source).

I like having inherited distros enumerated by name in repo.props. However I would prefer the term "dependencies" or just "depends" over "inherited."

I think we should use a different term than depends since that term is used with pods to mean something slightly different. But I'd like to hear other comments.

This would require types.db to be generated independently for each repo, so that swapping out a downstream repo does not yield an inconsistent type database for either configuration.

The problem with this is that it becomes very difficult to tie a unique repo list configuration to a cached type database. The type database is a merge of all the pods, so it can't be done per downstream repo. That was my original decision for setting up repos with config files versus command line.

System level repo should be implicit but could be overridden on command-line via --bootrepo.

yeah, I think we've reached that consensus, although not sure I am going to allow you to change bootrepo since by that time sys.jar has already been loaded (unless we push more smarts into non-JVM launcher which always sucks).

Rename Repo.inherits() to Repo.list() (like Pod.list)?

As a static method, that might be a little more consistent actually.

mr_bean Wed 15 Jul 2009

@brian

I originally commented:

As to your default "user.home" repo, I like the multilayered defaults. The sequence you suggest works for me.

You responded:

Are you voting that "user.home" is the implicit working repo or that you have to manually define it via FAN_REPO?

My answer:

Sorry, I expressed that poorly. I meant to say I like this sequence you proposed above:

  1. As command line switch: fan -repo {dir} {main}
  2. If no switch, then check FAN_REPO environment variable
  3. If no environment, then use repo of Fan launcher itself (the core distro repo)

brian Sat 18 Jul 2009

I think for starters I am going to keep it simple with just support for one or two repos. But with this design we should be able to add additional repos easily. I want to get the kinks worked out with a simple solution first.

jodastephen Sun 19 Jul 2009

Quick thoughts. Overall seems good.

I dislike the directory name etc though. It looks like pods would make more sense.

I also remain of the opinion that structured namespacing of pod names will be needed, to avoid clashes (I strongly dislike the comFooBar type names for this and would prefer com.foo.bar).

brian Sun 19 Jul 2009

Thanks for taking a look Stephen

I dislike the directory name etc though. It looks like pods would make more sense.

I picked "etc" because I think it is really similar to the etc dir in Unix - a place to stick "host wide configuration files", etc (see Wikipedia). Although I am open to other suggestions. However I don't think "pods" works because if anything "the pod files" themselves are under "lib".

I strongly dislike the comFooBar type names for this and would prefer com.foo.bar

I don't think we will ever allow dots to be used in pod names because it would cause havoc with the grammar. Yes Java uses dots (which I am sure you are used to), but they are not structured in Java - a package name is just a single identifier that happens to include dots. I think identifier style is mostly a matter of taste - my favorite is dashes (Lisp style), but that doesn't work with the grammar either. But pod names do have to be a valid Fan identifier, and the convention right now is to use camel case across the board for Fan identifiers. Other options would be underbar or some new double symbol that would be unambiguous.
