Arc Forumnew | comments | leaders | submitlogin
1 point by stefano 5882 days ago | link | parent

> How about project-based development then?

I don't quite understand that. With pack.arc development is project-based. Maybe we have different opinions of what a "project" is. To me, it is a directory with a file proj.arc describing the structure of the project (a 'defproject declaration).

> write an interpreter from scratch

Really difficult but really needed. The mzscheme dependency is quite big compared to how small as a language Arc is. The main efficency problem, as you said, is function dispatch, because we have to check if it is a function, a list, etc. One thing I don't like very much about SNAP is the dependency on the boost libraries: it is a huge dependency. Is it really needed?

Another problem with an interpreter from scratch is the GC: it is very difficult and time consuming to write an efficient, concurrent and stable GC. A good solution would be to use the Boehm-Weiser GC: it is easy to integrate in any interpreter (I don't know if it works with SNAP's process' model, though) and it is a really good GC. Even the mono project and gcj use it.



2 points by almkglor 5881 days ago | link

> With pack.arc development is project-based.

Ah, right. Of course, that's why there's 'defproject, right?

> because we have to check if it is a function, a list, etc

As an idea: generally writes to global variables are much rarer than reads from global variables; in fact, practically speaking nearly every global variable is going to be a constant. We could move the cost of checking if a call is a function, a fake arc-f function, or a data structure to the writing of global variables rather than the read.

Basically calls where the expression in function position is a reference to a global variable are transformed to callsites which monitor that global. The callsite initially determines the type of the value in the global (or creates an error-throwing lambda if the global is still unbound) and determines the proper function to perform for that call (normal function call, or a list lookup, or a table lookup, etc). The callsite also registers its presence to the global.

If the global is written, the global also notifies all living callsites (we thus need weak references for this), which will then update themselves with the new value.

This is actually "for-free" in SNAP, because there's an overhead in reading globals (copying from the global memory-space to the process memory-space), and SNAP thus needs to monitor writes to globals so it can cache reads.

> One thing I don't like very much about SNAP is the dependency on the boost libraries: it is a huge dependency. Is it really needed?

The bits of boost I've used so far are mostly the really, really good smart pointers; while I've built toy smart pointer classes I'm not sure I'd want those toys in a serious project. Also, I intend to use boost for portable mutexes. Now if only boost had decent portable asynchronous I/O...

Alternatively we could wait a bit for C++0x, which will have decent smart pointers which I believe are based on boost.

> Boehm-Weiser GC: it is easy to integrate in any interpreter (I don't know if it works with SNAP's process' model, though)

Well, one advantage of the process-local model is that process-local memory allocations won't get any additional overhead when the interpreter is multithreaded; AFAIK any malloc() drop-in replacement either needs to be protected by locks in a multithreaded environment, or will do some sort of internal synchronization anyway. In effect we have one memory pool per process, allocating large amounts of memory from the system and splitting it up according to then needs of the process.

Since processes aren't supposed to refer to other process's memory, the Boehm-Weiser GC won't have anything to actually trace across allocated memory areas anyway.

And I probably should start using tagged pointers instead of ordinary pointers ^^. They're even implementable as a C++ class wrapping a union.

In any case a copying algorithm already exists because we need to copy messages across processes anyway: minor changes are necessary to extend it to a copying GC.

-----