[MLton] MLton and shared libraries

Jens Axel Søgaard jensaxel@soegaard.net
Thu, 14 Apr 2005 16:48:41 +0200


Matthew Fluet:

> Generating shared libraries with MLton is not a supported option at this 
> time.  More than likely, the segmentation fault you see is arising from 
> the shared library not starting up the MLton runtime properly -- any SML 
> function that gets called needs a SML heap available to service allocation 
> and garbage collection.  I suspect that somehow you need to indicate to 
> gcc/ld that what would have been main() in the executable should be the 
> init code of the shared library.

Your suspicion is without doubt correct.

> That isn't to discourage, but I think it will take a little hacking of
> MLton to properly generate shared libraries.  The two obvious obstacles to
> overcome are 

> 1) getting the shared library to initialize the SML heap and
> run all the top-level effects (such as allocating and initializing
> globals) of the program with exported functions, and 2) ensuring that the
> suffix of the top-level program does not exit, but instead enters a state 
> where the program is waiting to service exported function requests.

Reading up on shared libraries and studying the compiler source, I realize
that the problem is a bit harder than I initially thought. I still hope
that it is doable (although I might need to wait for the summer holiday).
The following is an attempt to "think aloud" - if I am on the wrong track,
please correct me.

Since it is the job of the operating system to do the dynamic linking
when a program depending on shared libraries is run, it is important
to know the format of object files used. In the case of Linux the
format is ELF.

The normal GNU tools such as gcc and ld know how to generate ELF files,
but there are a few things to be aware of.

First of all the code should be position independent[1]. On systems
with file memory mapping this allows the memory pages containing the
library to be shared between several processes thus reducing memory
usage.

Each library can have a constructor and a destructor. As Terpstra
wrote, the GNU tool chain follows the convention that exported
functions named _init and _fini are the constructor and destructor.
However, Drepper, in "How to Write Shared Libraries" [2], states that
this is wrong (_init and _fini can override the system initialization
and destruction). Instead, the constructor and destructor should be
marked with a function attribute:

   void
   __attribute ((constructor))
   init_function (void)
   {
   ...
   }


   void
   __attribute ((destructor))
   fini_function (void)
   {
   ...
   }

In general Drepper's HowTo is quite thorough.
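
To make the attribute approach concrete, here is a minimal sketch of a
library whose exported function relies on the constructor having run.
The names runtime_ready and f are placeholders of mine (f only mirrors
the test function used further down), and the "runtime" is just a flag;
nothing here is MLton code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical flag standing in for "the runtime has been set up". */
static int runtime_ready = 0;

/* Run by the dynamic loader when the shared object is loaded
   (or before main() when linked into an executable). */
__attribute__((constructor))
static void init_function(void)
{
    runtime_ready = 1;
    fprintf(stderr, "init_function: runtime set up\n");
}

/* Run when the shared object is unloaded or the process exits normally. */
__attribute__((destructor))
static void fini_function(void)
{
    runtime_ready = 0;
    fprintf(stderr, "fini_function: runtime torn down\n");
}

/* Hypothetical exported entry point: only valid after the constructor. */
int f(int x)
{
    (void)x;  /* the argument is ignored, as in the SML test function */
    assert(runtime_ready && "constructor has not run");
    return 42;
}
```

Built with gcc -fpic -shared, the constructor runs when the library is
loaded, so a caller never has to invoke an explicit init function.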


Attempting to grok the native backend of MLton would be futile for me,
so I decided to take a look at the C-backend.

The main() function is in the file "mlton/include/c-main.h" [3] which
contains:

int main (int argc, char **argv) {					\
	struct cont cont;						\
	Initialize (al, cs, mg, mfs, mmc, pk, ps);			\
	if (gcState.isOriginal) {					\
		real_Init();						\
		PrepFarJump(mc, ml);					\
	} else {							\
		/* Return to the saved world */				\
		nextFun = *(int*)(gcState.stackTop - WORD_SIZE);	\
		cont.nextChunk = nextChunks[nextFun];			\
	}								\
	/* Trampoline */						\
	while (1) {							\
  		cont=(*(struct cont(*)(void))cont.nextChunk)();		\
  		cont=(*(struct cont(*)(void))cont.nextChunk)();		\
  		cont=(*(struct cont(*)(void))cont.nextChunk)();		\
  		cont=(*(struct cont(*)(void))cont.nextChunk)();		\
  		cont=(*(struct cont(*)(void))cont.nextChunk)();		\
  		cont=(*(struct cont(*)(void))cont.nextChunk)();		\
  		cont=(*(struct cont(*)(void))cont.nextChunk)();		\
  		cont=(*(struct cont(*)(void))cont.nextChunk)();		\
	}								\
}

Putting the Initialize in a separate function ought to be easy (I hope).

But what about the trampoline?

My test function

   val e = _export "f": int -> int;
   val _ = e (fn (i) => 42);

results in the following piece of code:

   Word32 f (Word32 x0) {
           MLton_FFI_op = 0;
           MLton_FFI_Word32_array[0] = x0;
           MLton_callFromC ();
           return MLton_FFI_Word32_array[0];
   }

And MLton_callFromC is also in c-main.h :

int nextFun;								\
bool returnToC;								\
void MLton_callFromC () {						\
	struct cont cont;						\
	GC_state s;							\
									\
	if (DEBUG_CCODEGEN)						\
		fprintf (stderr, "MLton_callFromC() starting\n");	\
	s = &gcState;							\
	s->savedThread = s->currentThread;				\
	s->canHandle += 3;						\
	/* Switch to the C Handler thread. */				\
	GC_switchToThread (s, s->callFromCHandler, 0);			\
	nextFun = *(int*)(s->stackTop - WORD_SIZE);			\
	cont.nextChunk = nextChunks[nextFun];				\
	returnToC = FALSE;						\
	do {								\
  		cont=(*(struct cont(*)(void))cont.nextChunk)();		\
	} while (not returnToC);					\
	GC_switchToThread (s, s->savedThread, 0);      			\
  	s->savedThread = BOGUS_THREAD;					\
	if (DEBUG_CCODEGEN)						\
		fprintf (stderr, "MLton_callFromC done\n");		\
}

Ah! So callFromC switches to a special thread for handling calls
to exported functions, finds the function to call on the stack
and starts a trampoline, checking after each chunk whether we are
to return to the C level.
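
The control flow just described can be modelled in a few lines of
ordinary C. This is a toy sketch, not the MLton runtime: struct cont,
nextChunk and returnToC are borrowed from c-main.h for readability,
while chunk_body, chunk_done, ffi_arg and toy_callFromC are made up,
and thread switching and the SML stack are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A continuation names the next chunk to run (generic function pointer). */
struct cont { void (*nextChunk)(void); };
typedef struct cont (*chunk_fn)(void);

static bool returnToC;  /* set by a chunk when control should go back to C */
static int ffi_arg;     /* plays the role of MLton_FFI_Word32_array[0] */

static struct cont chunk_done(void);

/* Toy "SML" body of the exported function: ignore the argument, yield 42. */
static struct cont chunk_body(void)
{
    struct cont next;
    ffi_arg = 42;
    next.nextChunk = (void (*)(void))chunk_done;
    return next;
}

/* Final chunk: ask the trampoline to stop. */
static struct cont chunk_done(void)
{
    struct cont next;
    returnToC = true;
    next.nextChunk = NULL;
    return next;
}

/* Same loop shape as MLton_callFromC: bounce between chunks until a
   chunk sets returnToC. */
int toy_callFromC(int x)
{
    struct cont cont;
    ffi_arg = x;
    cont.nextChunk = (void (*)(void))chunk_body;
    returnToC = false;
    do {
        assert(cont.nextChunk != NULL);
        cont = ((chunk_fn)cont.nextChunk)();
    } while (!returnToC);
    return ffi_arg;
}
```

Each chunk returns instead of calling the next one, so the C stack
stays flat no matter how many chunks run before returnToC is set.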

What was it Matthew said I should look out for?

 > 1) getting the shared library to initialize the SML heap and
 > run all the top-level effects (such as allocating and initializing
 > globals) of the program with exported functions,

Hm. Does Initialize handle "allocating and initializing globals"?
Let's see. Initialize is a macro defined in /include/main.h, and
it initializes "system variables". It ends in a call to MLton_init.
The function MLton_init lives in /runtime/gc.c and initializes
the garbage collection and retrieves the command line arguments.

Nothing about globals yet, but the next part of main is:

	if (gcState.isOriginal) {					\
		real_Init();						\
		PrepFarJump(mc, ml);					\
	} else {							\
		/* Return to the saved world */				\
		nextFun = *(int*)(gcState.stackTop - WORD_SIZE);	\
		cont.nextChunk = nextChunks[nextFun];			\
	}								\

The real_Init looks promising, and in codegen/c-codegen.fun [4] one sees

       fun declareReals () =
	 (print "static void real_Init() {\n"
	  ; List.foreach (reals, fn (g, r) =>
			  print (concat ["\tglobalReal",
					 RealSize.toString (RealX.size r),
					 "[", C.int (Global.index g), "] = ",
					 RealX.toC r, ";\n"]))
	  ; print "}\n")

being part of the larger function outputDeclarations. The treatment of
globals depends on their type - I am a little unsure whether all
globals that can't simply be output as C literals are initialized
in the same way as the reals. Studying c-codegen.fun some more
will probably clear this up for me.
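
Reading the print statements in declareReals literally, the generated C
should have roughly the following shape. This is my reconstruction, not
actual compiler output: the array name globalReal64 assumes that
RealSize.toString yields "64", and the array size, the two values and
the helper toy_real_after_init are invented for illustration.

```c
#include <assert.h>

/* Assumed declaration of the global table for 64-bit reals. */
static double globalReal64[2];

/* Shape of the function printed by declareReals: one assignment per
   real-valued global, indexed by Global.index. */
static void real_Init(void)
{
    globalReal64[0] = 0.5;
    globalReal64[1] = 2.0;
}

/* Hypothetical helper so the effect of real_Init can be observed. */
double toy_real_after_init(int i)
{
    assert(i >= 0 && i < 2);
    real_Init();
    return globalReal64[i];
}
```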

The other thing to worry about was:

 > 2) ensuring that the
 > suffix of the top-level program does not exit, but instead enters a state
 > where the program is waiting to service exported function requests.

to which Weeks added

 > I hope this can be easily achieved using Thread_returnToC, as it is
 > done in SML functions exported via _export.  See the "register"
 > function in basis-library/mlton/thread.sml.

Is this necessary if one puts the initialization part of main()
into a separate function?
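
One reason it might still be needed: in the test program the
registration "val _ = e (fn (i) => 42)" is itself a top-level effect,
so the top-level code has to run before f can be called. The sketch
below is a toy model of that ordering under my own assumptions;
registered_f, sml_f, run_top_level, toy_lib_init and toy_f are all
hypothetical names and none of this is MLton code.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*exported_fn)(int);

static exported_fn registered_f = NULL;  /* filled in by "top-level" code */
static int initialized = 0;

/* Stands in for the SML closure (fn (i) => 42). */
static int sml_f(int i)
{
    (void)i;
    return 42;
}

/* Stands in for running the program's top-level declarations. A normal
   executable would exit at the end; a library must return to its caller
   and wait for calls to exported functions. */
static void run_top_level(void)
{
    registered_f = sml_f;  /* the effect of: val _ = e (fn (i) => 42) */
}

/* Library constructor: set up state once, then run the top level. */
__attribute__((constructor))
static void toy_lib_init(void)
{
    if (initialized)
        return;
    initialized = 1;
    run_top_level();
}

/* Exported wrapper: only meaningful once the top level has run. */
int toy_f(int x)
{
    assert(registered_f != NULL);
    return registered_f(x);
}
```

In this model, moving only the heap setup into the constructor would
leave registered_f unset, which is the situation Matthew's point 1
warns about.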


A tangential question: Is it possible for main() to be called
several times (perhaps due to multiple threads)? Otherwise the
if in:

	if (gcState.isOriginal) {					\
		real_Init();						\
		PrepFarJump(mc, ml);					\
	} else {							\
		/* Return to the saved world */				\
		nextFun = *(int*)(gcState.stackTop - WORD_SIZE);	\
		cont.nextChunk = nextChunks[nextFun];			\
	}								\

seems unnecessary.





[1] The gcc option is -fpic. The manual for gcc says about -shared
     that "Only a few systems support this option". Hmm. One should
     probably use both options - just in case.

[2] "How to Write Shared Libraries", Ulrich Drepper, RedHat Inc
     <http://people.redhat.com/drepper/dsohowto.pdf>

[3] <http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/include/c-main.h?rev=1.12&content-type=text/vnd.viewcvs-markup>

[4] <http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/mlton/codegen/c-codegen/c-codegen.fun?rev=1.101&content-type=text/vnd.viewcvs-markup>


-- 
Jens Axel Søgaard