Arc Forum

1 point by Pauan 4456 days ago | link | parent

Well, my answer to the module problem is to just use hyper-static scope. Vastly simpler and vastly faster while providing all the benefits of a solid module system.

As far as Arc is concerned, there would need to be a couple changes:

---

1) A distinction between creating a new variable and assigning to an existing one. In other words, like Scheme's define vs set!

Personally, I would use (var foo 1) to mean "create a new variable foo in the current scope" and (= foo 1) to mean "assign to the already existing variable foo"

---

2) Mutually recursive functions would be a bit trickier, but I already solved that problem in Nulan by introducing a new macro called "defs". Conveniently, Arc already has a "defs" macro, but it's pretty much unused. It could be easily repurposed for mutually recursive functions.

---

Once again, though, both these changes break compatibility with Arc 3.1. But I think they would be very good changes to make. In fact, I think they're so good, I would make Arc/Nu hyper-static, if I still programmed in Arc.

If you like, I can go into more detail about the benefits/drawbacks of hyper-static scope, and also give some details about one way to implement it.

3 points by Pauan 4456 days ago | link

I just realized: it's possible to add in hyper-static scope to Arc while retaining full backwards compatibility. (Crazy, no?)

Here's how you do it. The definition of "=" is the same: if the variable exists, mutate it, otherwise create a new variable.

But now you add in a new primitive called "var", which always creates a new variable, even if it already exists.

Existing Arc code uses "=", so it will get the normal dynamic vars, but new code can use "var" to get hyper-static scope. And the two play nicely together, which lets you intermix dynamic/hyper-static scope as much as you want.

Oh yeah, another thing... this whole "var" thing actually allows for mutually recursive functions without using "defs", so I guess it's the ultimate compromise: all the namespace benefits of hyper-static scope with all the conveniences of dynamic scope.

---

The way to implement this is easy. You have a hash table (or similar) at compile-time (per scope) that maps symbols to symbols. With this Arc program...

  (= foo 1)

...the hash table looks like this:

  foo -> foo

Not very exciting: it simply maps the symbol "foo" to itself. But now let's use "var":

  (var foo 1)

Now the hash table looks like this:

  foo  -> foo2
  foo2 -> foo2

Which means that whenever the compiler sees the symbol "foo", it will replace it with "foo2". So that means that this Arc program...

  (var bar 1)
  bar
  (var bar 1)
  bar
  (var bar 1)
  bar

...will get replaced at compile-time with this:

  (= bar 1)
  bar
  (= bar2 1)
  bar2
  (= bar3 1)
  bar3

The only issue is handling "var" inside functions, like this:

  (fn ()
    (var bar 1))

In that case, the "var" is local to the function, similar to using "let". But I'm sure you've already got all that worked out, since function arguments work.
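A rough sketch of this renaming table in Python (not from the thread; the names Renamer, declare, resolve and rename_program are made up for illustration). It only handles top-level forms with literal values, and it ignores the function-local case and any clash with a user symbol that is literally named "bar2":

```python
class Renamer:
    """Compile-time table mapping each source symbol to its current unique symbol."""

    def __init__(self):
        self.table = {}   # source symbol -> unique symbol currently in effect
        self.counts = {}  # source symbol -> how many bindings exist so far

    def resolve(self, name):
        """A reference or (= name ...): reuse the mapping, or map the symbol to itself."""
        if name not in self.table:
            self.table[name] = name
            self.counts[name] = 1
        return self.table[name]

    def declare(self, name):
        """(var name ...): always allocate a fresh unique symbol."""
        n = self.counts.get(name, 0) + 1
        self.counts[name] = n
        unique = name if n == 1 else f"{name}{n}"
        self.table[name] = unique
        return unique


def rename_program(forms):
    """Rewrite a list of top-level forms; tuples are forms, strings are symbols."""
    renamer = Renamer()
    out = []
    for form in forms:
        if isinstance(form, tuple) and form[0] == "var":
            _, name, value = form
            out.append(("=", renamer.declare(name), value))
        elif isinstance(form, tuple) and form[0] == "=":
            _, name, value = form
            out.append(("=", renamer.resolve(name), value))
        else:
            out.append(renamer.resolve(form))
    return out


program = [("var", "bar", 1), "bar", ("var", "bar", 1), "bar", ("var", "bar", 1), "bar"]
renamed = rename_program(program)
```

Under these assumptions, `renamed` corresponds to the (= bar 1) / bar2 / bar3 listing above, and a "var" that follows a plain "=" on the same symbol yields the "foo2" mapping.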

---

In any case, you probably aren't yet sure why hyper-static scope is so useful. If you want, I'd be happy to go into more details about it.

-----

1 point by rocketnia 4456 days ago | link

"I just realized: it's possible to add in hyper-static scope to Arc while retaining full backwards compatibility. (Crazy, no?)"

I think you forgot the main reason why backwards-compatibility isn't very feasible: Macros.

  ; ===== util.file ====================================================
  
  (var fun-map
    (fn (func . seqs)
      ...
      ))
  
  ; Lets you write (map x seq (+ 1 x)) in place of
  ; (fun-map (fn (x) (+ 1 x)) seq).
  (var map
    (mc (x seq . body)
      ...
      ))
  
  
  ; ===== this-is-fuuun.file ===========================================
  
  (import util)
  
  (var fun-map (list "  | O | X"
                     "--+---+--"
                     "X | X | O"
                     "--+---+--"
                     "  | X |  "))
  
  (pr (map line fun-map (+ line "\n")))

Arc macros behave as though macroexpansion were simply about constructing some lists of symbols. But we really want each macro-inserted symbol to be looked up in that macro's lexical environment.

I think you've been resolving this by writing macros so that the procedures are inserted directly into the macro result, rather than referred to indirectly by symbols. Right? I don't remember how you succeed at doing this when you compile your code to JavaScript. Maybe I'm thinking of two separate languages? Anyway, some previous discussion of this approach is at http://www.arclanguage.com/item?id=14849.

---

In Penknife, I handled macros by taking advantage of an existing assumption I was making about modules: Assume there are no side effects during the loading of the program, so that we can record the macroexpansion results to a file as a precompiled program without corrupting the program's behavior. Then any code that had a macro in scope at compile time will have a doppelganger of that macro in scope at run time. Whenever we encounter a variable during program execution, we can resolve it by looking up a macro value, accessing its lexical environment, and repeating until the original variable binding is in scope.

Penknife didn't really embrace the hyper-static global environment, but it would have been built upon the same sort of basis: Each file would have started in its own fresh environment, and some commands (namely, imports) would have worked by replacing the current environment.

---

"The definition of "=" is the same: if the variable exists, mutate it, otherwise create a new variable."

The behavior I'd use is that any compile-time variable access (even under a lambda) creates a new, uninitialized variable binding if a binding doesn't already exist.

  ; Create bindings for 'even and 'odd, then set the value of 'even.
  (= even (fn (x) (case x 0 t (odd:- x 1))))
  
  ; Set the value of 'odd.
  (= odd (fn (x) (case x 0 nil (even:- x 1))))

If you wait to create the variable bindings until assignment time, then even's reference to "odd" is initially unbound, and you have to somehow associate it with the binding of 'odd created in the second line.

-----

2 points by Pauan 4456 days ago | link

"I think you forgot the main reason why backwards-compatibility isn't very feasible: Macros."

I didn't forget: Nulan completely solved the macro hygiene problem after all. But that's a more extensive change so I figured I'd save it for after the basic hyper-static scope system is in place.

In fact, assuming Arcueid does implement my proposal, I would actually go in and make a new version of arc.arc that uses "var" and has hygienic macros by default. Then you could simply load up the new arc.arc to get all the shininess. But loading the old arc.arc would have full compat with existing Arc 3.1 programs.

---

"I think you've been resolving this by writing macros so that the procedures are inserted directly into the macro result, rather than referred to indirectly by symbols. Right?"

Nope. Macro hygiene in Nulan just uses the already existing box implementation. It's really easy, really simple, and really fast. Seriously, boxes are awesome. No need to complicate things.

The way to solve it in Arc: just provide a function called "get-variable-box" which is only available at compile-time and it returns the box for the variable.

Then you change quasiquote so it uses "get-variable-box" rather than inserting the symbol directly. Bam, hygienic macros with no additional runtime cost, and extremely small compile-time cost. And they look and feel just like Arc macros, so you don't lose any power or convenience. No clunky Scheme macros, huzzah!

Once I understood that the fundamental problem was with dynamic scope, and the best way to solve it is with boxes (or similar), everything became super easy and awesome.
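To make the "get-variable-box" idea concrete, here is a small Python simulation (an illustration only, not Nulan's or Arc's actual code; Box, Scope and evaluate are hypothetical names). A macro that inserts the box it captured at definition time keeps working after the user shadows the name, while a macro that inserts the bare symbol does not:

```python
class Box:
    """A mutable cell holding one variable's value."""
    def __init__(self, value=None):
        self.value = value


class Scope:
    """Compile-time mapping from symbol names to boxes."""
    def __init__(self):
        self.boxes = {}

    def var(self, name, value=None):
        # Always create a fresh box; code compiled earlier keeps the old one.
        self.boxes[name] = Box(value)
        return self.boxes[name]

    def get_variable_box(self, name):
        return self.boxes[name]


def evaluate(expr, scope):
    """Tiny evaluator: boxes are read directly, strings are looked up as symbols."""
    if isinstance(expr, Box):
        return expr.value
    if isinstance(expr, str):
        return scope.get_variable_box(expr).value
    if isinstance(expr, tuple):
        fn, *args = [evaluate(e, scope) for e in expr]
        return fn(*args)
    return expr


scope = Scope()
scope.var("fun-map", lambda f, seq: [f(x) for x in seq])
scope.var("inc", lambda x: x + 1)
scope.var("nums", [1, 2, 3])

# Captured when the macro is defined, like a quasiquote using get-variable-box.
fun_map_box = scope.get_variable_box("fun-map")

def unhygienic_map(f, seq):
    return ("fun-map", f, seq)      # inserts the symbol

def hygienic_map(f, seq):
    return (fun_map_box, f, seq)    # inserts the box

# The user later shadows fun-map with plain data.
scope.var("fun-map", [9, 9, 9])

hygienic_result = evaluate(hygienic_map("inc", "nums"), scope)
try:
    evaluate(unhygienic_map("inc", "nums"), scope)
    unhygienic_failed = False
except TypeError:
    unhygienic_failed = True
```

The box is resolved once, when the macro is defined, so expansion adds no run-time lookup in this sketch.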

---

"The behavior I'd use is that any compile-time variable access (even under a lambda) creates a new, uninitialized variable binding if a binding doesn't already exist."

Yeah I'd do that too, if I wanted to graft dynamic variables onto a hyper-static system. But since Arc uses dynamic variables, I proposed to graft hyper-static onto it instead.

-----

1 point by rocketnia 4453 days ago | link

"But that's a more extensive change so I figured I'd save it for after the basic hyper-static scope system is in place."

The middle ground doesn't seem worthwhile to me. When programmers work with Arc-style unhygienic macros, at each use site, the variables in scope must (mostly) match the variables the macro author expected. So I think people who like using macros will be happiest if they systematically keep the variable names consistent across all the code in their program (even others' code), at which point namespace mechanisms just get in the way.

---

"Nope. Macro hygiene in Nulan just uses the already existing box implementation. It's really easy, really simple, and really fast. Seriously, boxes are awesome. No need to complicate things."

I think you caught me on a technicality. :) I see "procedures are inserted directly into the macro result" as a general approach. Mutable boxes make it possible for this approach to achieve late binding. Elsewhere in this discussion you tilt the technicality closer to my phrasing, since you recommend letting users build boxes out of getter and setter procedures.

Anyway, I'm a fan of that approach when it works, but it doesn't work so well when compilation is involved: The macroexpanded code contains unserializable values--namely, the procedures or boxes we're talking about. This is a lesson I learned with Penknife, where I at first had macros insert boxes, and then had to reengineer this so macros inserted step-by-step treasure maps for how to find a variable from the run time environment.

---

"Yeah I'd do that too, if I wanted to graft dynamic variables onto a hyper-static system. But since Arc uses dynamic variables, I proposed to graft hyper-static onto it instead."

How do you make the even/odd code work? Under the approach you described, the first line refers to an undefined variable (odd), and I interpret that as an error. I was recommending a fix.

-----

1 point by Pauan 4452 days ago | link

"The middle ground doesn't seem worthwhile to me."

Retaining Arc compatibility in general doesn't seem worthwhile to me, but a lot of people want it, so I gave a system that retains Arc compatibility while tacking on some new shininess. Nulan doesn't have to worry about Arc compatibility, so it has pure hyper-static scope and hygienic macros by default.

---

"How do you make the even/odd code work? Under the approach you described, the first line refers to an undefined variable (odd), and I interpret that as an error. I was recommending a fix."

Easy: I have a macro called "defs" that handles mutual recursion:

  (defs
    even (x)
      (if (is x 0)
        t
        (odd (- x 1)))
    odd (x)
      (if (is x 0)
        nil
        (even (- x 1))))

The above macroexpands into this:

  (var even)
  (var odd)
  (= even (fn (x)
    (if (is x 0)
      t
      (odd (- x 1)))))
  (= odd (fn (x)
    (if (is x 0)
      nil
      (even (- x 1)))))

Basically, it first creates new boxes, and then it assigns the functions to the boxes. This is one of a few reasons why I prefer mutable boxes over immutable boxes. Though, you could probably have "defs" expand to a Y-combinator instead, if you really wanted immutability...

---

"I think you caught me on a technicality. :)"

Maybe it was just a simple miscommunication. What you were talking about sounded exactly like the technique of splicing function values using macros:

http://www.arclanguage.org/item?id=14507

What I'm talking about happens entirely at compile-time using boxes. The effect is very similar, but the implementation is very different.

---

"Anyway, I'm a fan of that approach when it works, but it doesn't work so well when compilation is involved: The macroexpanded code contains unserializable values--namely, the procedures or boxes we're talking about"

Sure, if I cared about serialization, I'd have to make it more complicated. Thankfully, the only serialization I care about right now is compiling to JavaScript code, which is easy enough to do with variable renaming.

Also, what's the point in serializing boxes since functions still can't be serialized? If you found a way to serialize functions, then it'd be much more useful to be able to serialize boxes.

-----

1 point by rocketnia 4451 days ago | link

"Easy: I have a macro called "defs" that handles mutual recursion"

While I appreciate 'defs, it's a non-answer. The even/odd example I posted and the evenp/oddp example dido posted are idiomatic Arc code. While you and I don't care much about Arc compatibility, it's something dido wants for Arcueid, so these examples should work without modification.

---

I'm about to disagree with myself, but first I want to reiterate and clarify what I was saying at "caught me on a technicality":

For this discussion I don't see much of a reason to distinguish between macros which insert mutable boxes and macros which insert functions. Either system can pretty much support the other as a special case: We can translate spliced boxes into spliced getter/setter functions, and we can translate spliced functions into spliced functions-in-the-box. Because of that equivalence, these systems share the disadvantage of being challenging to serialize.

If dido considers compilation to be important (do you, dido?), then this hygiene approach might be unsuitable, and thus the use of first-class namespaces might be unsuitable. (As I explained at "match the variables the macro author expected," first-class namespaces make hygiene more important.)

---

"What I'm talking about happens entirely at compile-time using boxes."

Ah. I think you have a point!

For compiling Nulan to JavaScript, I guess the boxes you're using aren't arbitrary getter/setter functions, and they aren't merely some mutable container either; they're globally associated with a JavaScript variable name. When you compile the macroexpansion result and it contains a (get-variable-box ...) form, you decide on its JavaScript variable name at that time. If the macroexpansion result contains a box, you use the attached variable name to compile it to JavaScript. Am I getting this right? This sounds very workable. :) And whaddayaknow, Nulan works. ^_-

I seem to remember understanding this before, when you and I talked about Nulan compilation in depth, but I guess I had to retrace the steps just now.

Anyhow, get-variable-box is fantastic IMO, but first-class namespaces still might not be ideal for Arcueid due to Arc's unhygienic macros.

dido, are you comfortable with breaking existing Arc macro idioms in favor of hygiene?

---

I have a convoluted but surprisingly comprehensive idea of how to integrate get-variable-box into a system that's compatible with unhygienic Arc macros, but I've put it in a separate simultaneous post: http://arclanguage.org/item?id=17464

Actually, it's two separate posts, because it's otherwise too long for the forum. If this becomes a tl;dr scenario, I won't be surprised. ^_^

-----

3 points by Pauan 4450 days ago | link

"While I appreciate 'defs, it's a non-answer. The even/odd example I posted and the evenp/oddp example dido posted are idiomatic Arc code. While you and I don't care much about Arc compatibility, it's something dido wants for Arcueid, so these examples should work without modification."

For this example, let's suppose there was a file "foo.arc" that contained idiomatic Arc code that implements evenp/oddp. This code works in Arc 3.1. It will work in my system as well, because undefined symbols automatically create new boxes. Basically, it'll work, but name collisions are possible, just like in Arc 3.1.

If you then write a new file "bar.arc" that uses hyper-static idioms (var, defs, etc.), it can import "foo.arc" and everything will work fine. "foo.arc" will clobber any existing evenp/oddp definitions, but "bar.arc" will not clobber "foo.arc". And of course "bar.arc" can use "w/include" and "w/exclude" to prevent "foo.arc" from clobbering things.

If you wanted to make it so that "foo.arc" behaves correctly without needing to use "w/include" and "w/exclude", you would indeed need to rewrite it to use "defs". But it's still usable even without a rewrite. So it's a perfectly graceful degradation.

My system is designed so that it can correctly use all existing Arc 3.1 code, while new code is written with the hyper-static idioms. Then, slowly, old code can be migrated to use hyper-static scope, until eventually you could make Arc purely hyper-static.

There's three issues I see with my proposal:

1) If you're writing Arc code in a hyper-static fashion, you really want "arc.arc" to be changed to be hyper-static. But old Arc code will need the non-hyper-static "arc.arc". I think the simplest solution to this is to have two versions of "arc.arc", one that uses hyper-static scope, and one that doesn't. Then you would need to make sure to load the non-hyper-static version before loading Arc 3.1 code. This could be automated a tiny bit by using a macro, something like "w/arc3".

2) "load" occurs at run-time, which is why my definition of "w/include" needed to use "eval". Nulan doesn't have this problem because file importing occurs at compile-time. Perhaps the best way to solve this is to keep "load" as-is, and add in a new "import" macro that does all its work at compile-time.

3) If you think (eventually) making Arc purely hyper-static is a bad thing, you won't like my proposal.

---

"Am I getting this right? This sounds very workable. :) And whaddayaknow, Nulan works. ^_-"

Yes, that's more or less correct. The one detail that's different is... Nulan doesn't have a "get-variable-box" function. The reason is that "quote" internally uses (the equivalent of) "get-variable-box". So in Nulan, rather than using "get-variable-box", you'd just use "quote". And if you want to break hygiene, you'd explicitly use the "sym" function.

-----

1 point by rocketnia 4450 days ago | link

I mostly followed along, but I don't understand "It will work in my system as well, because undefined symbols automatically create new boxes." You were talking about having them create new boxes at assignment time, and I was recommending compiling-a-reference-time instead so that we don't get an unbound variable error in the first definition.

-----

1 point by Pauan 4450 days ago | link

How it works is, anytime the compiler sees an undefined symbol, it creates a new box for it, as if it had been created with "var".

Another way to think about it is... the compiler would replace this:

  (= foo (fn () ... bar ...))
  (= bar (fn () ... foo ...))

With this:

  (var bar)
  (var foo)
  (= foo (fn () ... bar ...))
  (= bar (fn () ... foo ...))

What happened is, when it encountered the undefined variable "bar", it created a new box for it. Then it encountered the undefined variable "foo", so it created a new box for it. Then it did the assignments like normal.
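Here is a minimal Python simulation of that behavior (a sketch under assumptions, not Arcueid's or Nulan's implementation; Box, Scope, ref and assign are invented names). A compile-time reference to an unknown symbol creates an uninitialized box, and the later "=" fills in the same box:

```python
class Box:
    UNSET = object()  # marker for "created but not yet assigned"

    def __init__(self, name):
        self.name = name
        self.value = Box.UNSET

    def get(self):
        if self.value is Box.UNSET:
            raise NameError(f"{self.name} is used before it is assigned")
        return self.value


class Scope:
    def __init__(self):
        self.boxes = {}

    def ref(self, name):
        """Compile-time reference: reuse the box, or create an uninitialized one."""
        if name not in self.boxes:
            self.boxes[name] = Box(name)
        return self.boxes[name]

    def assign(self, name, value):
        """(= name value): mutate the existing box, creating it if needed."""
        self.ref(name).value = value


scope = Scope()

# Compiling (= even (fn (x) ... odd ...)): "odd" is resolved to a box right now,
# even though nothing has been assigned to it yet.
odd_box = scope.ref("odd")
scope.assign("even", lambda x: True if x == 0 else odd_box.get()(x - 1))

# Compiling (= odd (fn (x) ... even ...)): "even" already has a box.
even_box = scope.ref("even")
scope.assign("odd", lambda x: False if x == 0 else even_box.get()(x - 1))

is_ten_even = scope.ref("even").get()(10)
is_seven_even = scope.ref("even").get()(7)
```

Calling the first function before the second assignment would raise the NameError in this sketch, which is the run-time counterpart of "uninitialized binding".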

Given how you said "compiling-a-reference-time", I think we're talking about the same thing. Why did you mention assignment time?

-----

3 points by rocketnia 4450 days ago | link

"Why did you mention assignment time?"

We've just had a long exchange about you creating boxes at assignment time and me using compiling-a-reference time instead. Here's a recap:

---

You: Here's how you do it. The definition of "=" is the same: if the variable exists, mutate it, otherwise create a new variable. But now you add in a new primitive called "var"[...]

Me: The behavior I'd use is that any compile-time variable access (even under a lambda) creates a new, uninitialized variable binding if a binding doesn't already exist.

You: Yeah I'd do that too, if I wanted to graft dynamic variables onto a hyper-static system. But since Arc uses dynamic variables, I proposed to graft hyper-static onto it instead.

Me: How do you make the even/odd code work? Under the approach you described, the first line refers to an undefined variable (odd), and I interpret that as an error. I was recommending a fix.

You: Easy: I have a macro called "defs" that handles mutual recursion

Me: While I appreciate 'defs, it's a non-answer. The even/odd example [...] should work without modification.

You: It will work in my system as well, because undefined symbols automatically create new boxes.

---

At least we seem to be agreeing now. ^_^;

-----

3 points by Pauan 4450 days ago | link

Ah, sorry, huge miscommunication and misunderstanding on my part. I've actually been agreeing with you all along.

A large part of the problem is that I've been thinking about my proposal as two separate parts: one part deals with backwards compat with Arc, and the other part describes a hyper-static system for Arc.

When I was talking about "defs", I was talking about the hyper-static part. But you were talking about the backwards compat part. Hilarity (?) ensues.

-----

1 point by rocketnia 4450 days ago | link

Okay, we're on the same page now then. ^_^

Having both kinds of scope as options would be great.

-----

1 point by dido 4455 days ago | link

I found this discussion:

http://c2.com/cgi/wiki?HyperStaticGlobalEnvironment

that seems to describe the issues in more detail. However, I don't see how it directly solves the problems that the module system is attempting to solve. To adapt an example from the Pickaxe Book, we have implementations of the trigonometric functions like 'sin', 'cos', etc. in the system. Now, say I wanted to work on a simulation of good and evil, and define a function called 'sin' as well, inside a file called 'moral.arc'. Then you find you want to write a program to find out how many angels can dance on the head of a pin, and you need both the standard trigonometric functions, and my moral.arc. If you loaded moral.arc, without hyper-static scope my definition of 'sin' would stomp on the built-in definition of the trigonometric function, and you'd be unable to use both at the same time. With hyper-static scope, code that comes after (load "moral.arc") will still see only its definition of sin in the same way. However, you could write code before loading moral.arc that used the built-in definition of sin and it would be unaffected by its subsequent redefinition by moral.arc.

I do see from this, though, how a hyper-static scope could be used as the basis for the implementation of a module system (and in fact I may actually do so in Arcueid if it can really be done with full compatibility, of which I am not quite entirely convinced). It would be fairly straightforward to write my primitives as macros in an Arc with hyper-static scope.

What I'm more concerned about here is how modules are expressed, because while an underlying implementation can be easily enough changed, a poorly-designed convention for expressing and interfacing with modules might be problematic and once codebases start cropping up that use it, conversion can be hard.

-----

1 point by Pauan 4455 days ago | link

"With hyper-static scope, code that comes after (load "moral.arc") will still see only its definition of sin in the same way."

Right, but you see, that situation is actually trivially solved with hyper-static scope:

  ; math.arc
  (var sin ...)
  
  ; moral.arc
  (var sin ...)
  
  ; other.arc
  (load "math.arc")
  (var math-sin sin)
  (load "moral.arc")
  
  ... use sin and math-sin ...

Of course, there could be a utility macro called "w/rename" that does this for you:

  (w/rename (sin math-sin)
    (load "math.arc"))
  (load "moral.arc")
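The same load order can be simulated in Python to show what each name ends up referring to (an illustrative sketch; Scope, load_math and load_moral are hypothetical stand-ins for the files above, and the "moral" sin is just a toy predicate):

```python
import math


class Box:
    def __init__(self, value):
        self.value = value


class Scope:
    def __init__(self):
        self.boxes = {}

    def var(self, name, value):
        # A fresh box every time: earlier references keep their old box.
        self.boxes[name] = Box(value)

    def lookup(self, name):
        return self.boxes[name]


def load_math(scope):
    scope.var("sin", math.sin)


def load_moral(scope):
    scope.var("sin", lambda deed: deed in {"envy", "greed"})


scope = Scope()
load_math(scope)

# (var math-sin sin): keep the trigonometric sin under another name.
scope.var("math-sin", scope.lookup("sin").value)

# Code compiled at this point resolves "sin" to the math box once and for all.
early_sin_box = scope.lookup("sin")

def height(angle):
    return early_sin_box.value(angle)

load_moral(scope)

math_result = scope.lookup("math-sin").value(0.0)
moral_result = scope.lookup("sin").value("envy")
early_result = height(math.pi / 2)
```

After both loads, "sin" means the moral version for new code, "math-sin" still reaches the trigonometric one, and `height` is unaffected by the redefinition.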

---

"(and in fact I may actually do so in Arcueid if it can really be done with full compatibility, of which I am not quite entirely convinced)"

The system I have described is fully compatible, but it isn't "pure" hyper-static scope. It's a weird hybrid between hyper-static and dynamic, with the advantages/disadvantages of both. If you wanted to make a pure hyper-static system, like Nulan, you would have to give up Arc compatibility.

---

"What I'm more concerned about here is how modules are expressed, because while an underlying implementation can be easily enough changed, a poorly-designed convention for expressing and interfacing with modules might be problematic and once codebases start cropping up that use it, conversion can be hard."

I think my proposal is the best possible module system for Arc while retaining compatibility. If you gave up compatibility, it would be possible to design better systems. I am curious about what kind of conventions and interfaces you're talking about specifically, though.

---

I'm writing up a post giving more details on how to rename/include/exclude variables in a hyper-static system.

-----

1 point by Pauan 4455 days ago | link

Okay, here's the remaining stuff. One thing we want to do is hide variables. To do this, we need a new primitive called "del". It works like this:

  (var foo 1)
  
  (def bar () foo)
  
  (del foo)

Now, if you try to use "foo", it will throw an error saying "foo is undefined". But if you call (bar) it will correctly return 1. So, using "del" doesn't really remove the variable, it just hides it. This can be used to control which variables your library exports.
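In box terms, "del" only removes the name-to-box mapping. A short Python simulation of the example (hypothetical Scope/Box classes, not an actual Arc API):

```python
class Box:
    def __init__(self, value):
        self.value = value


class Scope:
    def __init__(self):
        self.boxes = {}

    def var(self, name, value):
        self.boxes[name] = Box(value)

    def lookup(self, name):
        if name not in self.boxes:
            raise NameError(f"{name} is undefined")
        return self.boxes[name]

    def delete(self, name):
        # Hide the name; the box itself lives on wherever it was captured.
        del self.boxes[name]


scope = Scope()
scope.var("foo", 1)

# Compiling (def bar () foo) resolves foo to its box immediately.
foo_box = scope.lookup("foo")

def bar():
    return foo_box.value

scope.delete("foo")

bar_result = bar()
try:
    scope.lookup("foo")
    foo_visible = True
except NameError:
    foo_visible = False
```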

---

Another thing that would be really nice is a way to get at the actual box for a variable. To do this, we need a new function called "get-variable-box" that accepts a symbol and returns a box.

There's a few things you can do with this. One of them is to create a "w/exclude" macro which lets you hide the variables you specify:

  (mac w/exclude (vars . body)
    `(do ,@body
         ,@(map (fn (x)
                  (if (bound x)
                      `(var ,x ,(get-variable-box x))
                      `(del ,x)))
                vars)))

And you use it like this:

  (w/exclude (qux corge)
    (load "foo.arc"))

This will import all the variables from "foo.arc" except "qux" and "corge". Of course it isn't limited to just importing. You can use it to write a library that has private variables:

  ; qux and corge are private: they can be seen inside this code block, but not outside
  (w/exclude (qux corge)
    (var qux ...)
    (var corge ...)
    ...)

Another thing this enables is hygienic macros, which I already explained (just change quasiquote to use "get-variable-box" rather than returning the symbol directly).

---

And we can define "w/rename" using "w/exclude":

  (mac w/rename (vars . body)
    (let p pair.vars
      `(w/exclude ,(map car p)
         ,@body
         ,@(map (fn ((x y))
                  `(var ,y ,x))
                p))))

---

We also want the ability to only import certain variables... but there's actually multiple ways to do this. You could have a primitive called "w/new-scope" that creates a new dynamic scope, similar to wrapping the expression in `(fn () ...)` except it works dynamically rather than lexically.

Another option would be a "w/new-namespace" primitive that returns an object that maps symbols to boxes. This is more flexible, but I'm not sure how fast it would be.

I'm going to go with the "w/new-scope" route, but if you have a better idea, I'm all ears. This is one part of Nulan that isn't quite fleshed out to my satisfaction yet. Also, I'm really not fond of using "eval" here.

  (mac w/include (vars . body)
    (let u (eval `(w/new-scope ,@body
                    (list ,@vars)))
      (map (fn (x y)
             `(var ,x ,y))
           vars
           u)))

Now if you say this:

  (w/include (qux corge)
    (load "foo.arc"))

It will only import "qux" and "corge" and nothing else. Just like "w/exclude", this isn't limited to just importing: you can use it to create private variables in libraries as well.
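A rough Python model of what "w/include" is meant to achieve (assumptions: a scope is a plain dict from names to boxes, and the new scope is modeled as a copy of that dict, which glosses over how "w/new-scope" would really work; all names are illustrative):

```python
class Box:
    def __init__(self, value):
        self.value = value


def with_include(scope, names, body):
    """Run body in a child scope, then copy only the listed boxes back out."""
    inner = dict(scope)          # child scope sees everything the parent has
    body(inner)                  # definitions made by body land in the child
    for name in names:
        scope[name] = inner[name]


def load_foo(scope):
    # Stand-in for (load "foo.arc"): defines three top-level variables.
    scope["qux"] = Box(1)
    scope["corge"] = Box(2)
    scope["helper"] = Box(3)


scope = {"helper": Box("mine")}
with_include(scope, ["qux", "corge"], load_foo)

imported = sorted(scope)
helper_value = scope["helper"].value
```

Only the listed names cross the boundary, so the pre-existing "helper" is not clobbered by the one defined inside the loaded file.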

-----

1 point by Pauan 4455 days ago | link

By the way, I think Arc also needs aliases. I know of at least two ways to make aliases: hard and soft. A hard alias is simply having two different symbols refer to the same box.

A soft alias is like a symlink: a new box that points to the other box. The way that I handled this in Nulan is to have &get and &set methods on each box.

These methods are expanded at compile-time like macros, but rather than expanding when the box is the first element of the list, they expand whenever the box isn't the first element of the list.

Translating it into Arc, it might look like this:

  (var foo 1)

  (defget bar ()
    `foo)

  (defset bar (x)
    `(= foo ,x))

Now whenever the compiler sees the box "bar" it will replace it with the box "foo", and when assigning to the box "bar", it will replace it with an assignment to the box "foo".

Nulan uses this not only to link variables together, but also for other purposes. For instance, assuming there was a function "get-cwd" and "set-cwd", you could do this:

  (defget cwd ()
    `(get-cwd))

  (defset cwd (x)
    `(set-cwd ,x))

And now `cwd` would expand to `(get-cwd)` and `(= cwd 1)` would expand to `(set-cwd 1)`.
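As a sketch of how such an expansion pass could look (written in Python over tuple-based forms; defget, defset and expand here are illustrative stand-ins, not Nulan's real implementation), symbols with a getter are rewritten everywhere except in head position, and assignments to symbols with a setter are rewritten as a whole:

```python
getters = {}  # symbol -> function returning the replacement form
setters = {}  # symbol -> function taking the value form, returning a form


def defget(name, expansion):
    getters[name] = expansion


def defset(name, expansion):
    setters[name] = expansion


def expand(expr):
    """Expand getter/setter symbols in a form made of tuples, strings and literals."""
    if isinstance(expr, str):
        return getters[expr]() if expr in getters else expr
    if isinstance(expr, tuple) and expr:
        head, *rest = expr
        if head == "=" and len(rest) == 2:
            target, value = rest
            if isinstance(target, str) and target in setters:
                return setters[target](expand(value))
            return ("=", target, expand(value))
        # A symbol in head position is left alone; nested forms are still expanded.
        new_head = expand(head) if isinstance(head, tuple) else head
        return (new_head, *(expand(e) for e in rest))
    return expr


defget("cwd", lambda: ("get-cwd",))
defset("cwd", lambda x: ("set-cwd", x))
defget("bar", lambda: "foo")
defset("bar", lambda x: ("=", "foo", x))

read_cwd = expand(("prn", "cwd"))
write_cwd = expand(("=", "cwd", 1))
read_alias = expand("bar")
write_alias = expand(("=", "bar", ("+", "bar", 1)))
```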

The reason I mention this here is that the namespace macros (w/exclude, w/rename, and w/include) should probably use hard aliases rather than using "var".

-----

1 point by dido 4455 days ago | link

"I think my proposal is the best possible module system for Arc while retaining compatibility. If you gave up compatibility, it would be possible to design better systems. I am curious about what kind of conventions and interfaces you're talking about specifically, though."

Well, you did read the linked post from the Arcueid blog I used to start this discussion didn't you? I was just thinking about a simple mechanism for qualifying free variables the way most other languages such as Ruby, OCaml, and various Scheme dialects (e.g. Scheme48, Bigloo, and Guile) have. I would think that the use of hyper-static scope could provide an implementation mechanism for this simple sort of module system I envision. I think the mechanisms you have in your follow-up are rather overly complex and don't even address the simplest use case I gave in my example. How would you use the mechanism you've described to allow code that includes both math.arc and moral.arc to let code below it use both the definitions in math.arc and moral.arc at the same time? My proposal would just have:

  ; math.arc
  (module Math (def sin (x) ...)
   ... ; many other definitions
  )

  ; moral.arc
  (module Moral (def sin (x) ...)
  ... ; many other definitions
  )

  ; other.arc
  (load "math.arc")
  (load "moral.arc")
  (Math::sin x)
  (Moral::sin y)

These are obviously extensions to standard Arc, but they will not interfere with most plain vanilla Arc code (unless someone just happens to use the scope-resolution :: ssyntax I've chosen in their variable names too, which I think is unlikely).
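For comparison with the box-based sketches, qualified lookup of this kind can be modeled in Python as a two-level table (purely illustrative; the module and resolve helpers are assumptions, and what should happen when two libraries pick the same module name is left out):

```python
import math

modules = {}  # module name -> {symbol -> value}


def module(name, definitions):
    """Stand-in for (module Name ...): register a named table of definitions."""
    modules[name] = dict(definitions)


def resolve(qualified):
    """Resolve a Module::symbol reference."""
    module_name, _, symbol = qualified.partition("::")
    return modules[module_name][symbol]


# math.arc
module("Math", {"sin": math.sin})

# moral.arc (a toy predicate standing in for the moral sin)
module("Moral", {"sin": lambda deed: deed in {"envy", "greed"}})

# other.arc: both definitions stay reachable regardless of load order.
math_value = resolve("Math::sin")(0.0)
moral_value = resolve("Moral::sin")("greed")
```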

-----

1 point by Pauan 4455 days ago | link

"Well, you did read the linked post from the Arcueid blog I used to start this discussion didn't you?"

Yes. It sounds vastly more complicated than hyper-static scope. With more boilerplate too. I am quite aware of that style of module system.

---

"I think the mechanisms you have in your follow-up are rather overly complex [...]"

Your mechanism has some immediately obvious problems, like the fact that namespace names can collide, and the fact it has extra boilerplate required in every Arc file. It's also probably slower. Hyper-static scope has none of those problems.

---

"I would think that the use of hyper-static scope could provide an implementation mechanism for this simple sort of module system I envision."

It seems you're not quite grasping how hyper-static scope works and what it can do. Once you see it, I think you'll stop seeing hyper-static scope as being a mere implementation strategy (for a worse module system), and just use hyper-static scope by itself.

---

"[...] and don't even address the simplest use case I gave in my example. How would you use the mechanism you've described to allow code that includes both math.arc and moral.arc to let code below it use both the definitions in math.arc and moral.arc at the same time?"

I already described it (http://arclanguage.org/item?id=17417). You can use a plain-old "var", or you can use something like "w/rename". Which parts do you see as complicated?

-----

1 point by dido 4455 days ago | link

"You can use a plain-old "var", or you can use something like "w/rename". Which parts do you see as complicated?"

Well, while your way is indeed not conceptually complicated, to my mind it creates complications in practical use. I don't think you fully realise just what the use of your proposed module system entails in an environment with lots of third-party libraries. You would need to do what amounts to a declaration of what symbols you want to use just after loading every library, placing a burden on every user of such library. You complain that my method requires boilerplate in every Arc file, but your method requires non-trivial renaming of every symbol that might conflict with another library to be loaded later after almost every library load, which I think is much worse than the two-symbol boilerplate my method requires. I don't see how that is any better, and to my mind that places a needless burden of bookkeeping onto the programmer.

Worse yet, forgetting to make such a declaration would result in a symbol getting a spurious binding in the top-level global environment, which might cause difficult to find bugs where a library defines a symbol that was used improperly in local code, so instead of getting an unbound symbol error one would get unexpected behaviour. What if, for instance, I wanted to use both moral.arc and math.arc at the same time, but didn't need the definition of sin in math.arc but needed other stuff it provided, but wanted to use the definition of sin in moral.arc. You'd say that I should just load math.arc before moral.arc but if I forgot about this and reversed the order of loading then my code might use the definition of sin in math.arc unexpectedly. And what would happen if a library happens to load another library that had naming conflicts with another library I'm using? My proposal doesn't suffer from this problem.

The purpose of a module system is to allow the programmer to control and manage name clashes, and your proposed mechanism, while I admit it can be used to accomplish the job, is much too low-level to be really useful for the kinds of use cases I envision.

True, the system I propose can have module names colliding, but it doesn't seem to be such a serious problem in actual practice for the other popular languages that make use of a similar system. And well, if I do wind up implementing hyper-static scope, that takes care of that uncommon case easily enough, and you only need to rename the module name.

Hyper-static scope is an interesting idea, and I may even actually implement your proposed hybrid version in Arcueid, but it is much too primitive on its own to provide a usable module system in my opinion.

-----

2 points by Pauan 4455 days ago | link

"You would need to do what amounts to a declaration of what symbols you want to use just after loading every library, placing a burden on every user of such library."

No you don't. It depends on what the conflict is and what you're trying to accomplish. In the easiest case, there's no extra code needed. In the hardest case, you can use something like "w/prefix" which renames all the variables in the file:

  (w/prefix Math::
    (load "math.arc"))
  (w/prefix Moral::
    (load "moral.arc"))

  ... use Math::sin and Moral::sin ...

One problem with your proposal is that the library author decides what the prefix is. But then two different libraries can use the same prefix (imagine two libraries both using the "Math" module name).

Instead, in my system, it's the one who does the importing that decides what the prefix is. Because only the importer of the library has enough information to correctly resolve name collisions: the library author doesn't have enough information.
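To make the importer-chosen prefix idea concrete, here is a minimal Python sketch. It is only a model: "w/prefix" is a proposed form rather than an existing Arc 3.1 macro, and `load_with_prefix`, `math_lib`, and `moral_lib` are hypothetical names invented for the illustration.

```python
import math

def load_with_prefix(env, prefix, definitions):
    """Copy a library's definitions into env under a prefix chosen by the importer."""
    for name, value in definitions.items():
        env[prefix + name] = value

# Stand-ins for the definitions that math.arc and moral.arc would provide.
math_lib = {"sin": math.sin}
moral_lib = {"sin": lambda deed: "moral failing: " + deed}

env = {}
load_with_prefix(env, "Math::", math_lib)
load_with_prefix(env, "Moral::", moral_lib)

print(env["Math::sin"](0))
print(env["Moral::sin"]("envy"))
```

Neither library mentions a prefix anywhere in its own definitions, so two libraries cannot clash over a module name; the importer picks a different prefix for each.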

---

"What if, for instance, I wanted to use both moral.arc and math.arc at the same time, but didn't need the definition of sin in math.arc but needed other stuff it provided, but wanted to use the definition of sin in moral.arc. You'd say that I should just load math.arc before moral.arc but if I forgot about this and reversed the order of loading then my code might use the definition of sin in math.arc unexpectedly."

In that situation, your system would require you to use the module name as a prefix, increasing verbosity by quite a bit.

And no, I wouldn't say "load them in the right order". I'd say "load them in the right order, OR use w/include, OR use w/exclude, OR use w/rename, OR use w/prefix". You have many options in my system to resolve conflicts: use the one you like the best.

In that particular case, I'd probably just use w/exclude to exclude "sin" from the "math.arc" library. Much less verbosity than your system.
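A minimal Python sketch of how w/include, w/exclude and w/rename could behave as filters applied while loading. This is an assumption about their semantics based on the descriptions in this thread, not an implementation from Nulan or Arc/Nu; the `load` function and its keyword arguments are hypothetical.

```python
def load(env, definitions, include=None, exclude=(), rename=None):
    """Bring a library's definitions into env, filtered by the importer."""
    rename = rename or {}
    for name, value in definitions.items():
        if include is not None and name not in include:
            continue  # w/include: only the listed names are imported
        if name in exclude:
            continue  # w/exclude: the listed names are skipped
        env[rename.get(name, name)] = value  # w/rename: import under a new name

# Stand-ins for the two libraries.
math_lib = {"sin": "sin from math.arc", "cos": "cos from math.arc"}
moral_lib = {"sin": "sin from moral.arc"}

# Excluding sin from math.arc makes the load order irrelevant.
env_a = {}
load(env_a, math_lib, exclude={"sin"})
load(env_a, moral_lib)

env_b = {}
load(env_b, moral_lib)
load(env_b, math_lib, exclude={"sin"})

# Renaming keeps both definitions available under different names.
env_c = {}
load(env_c, math_lib, rename={"sin": "math-sin"})
load(env_c, moral_lib)

print(env_a["sin"], "|", env_a["cos"])
print(env_c["math-sin"], "|", env_c["sin"])
```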

---

"The purpose of a module system is to allow the programmer to control and manage name clashes, and your proposed mechanism, while I admit it can be used to accomplish the job, is much too low-level to be really useful for the kinds of use cases I envision."

Requiring module prefixes in the case of conflict is an easier rule to follow and it has some benefits, but I wouldn't call it higher level.

---

"And what would happen if a library happens to load another library that had naming conflicts with another library I'm using? My proposal doesn't suffer from this problem."

Simple: you use w/exclude or w/rename or w/include or simply load them in the right order. Your proposal does have that problem because in the case of conflict, you now need to prefix the variable with the module's name, in other words, saying Math::sin rather than just sin.

With your proposal, you have boilerplate in every Arc file (the "module" form), and you have to use the module name's prefix in case of conflict.

With my system, in the very common case that there isn't any conflict, there's zero boilerplate.

And in the case where there is conflict, you can usually get by just fine by simply using w/exclude or w/include. And in the quite rare case where you need to use the same symbol from two libraries, you can either use w/rename or w/prefix.

Using w/prefix is about the same amount of boilerplate as your system, except that there's no possibility for namespace name clashes.

The benefit of hyper-static scope (aside from being simpler to understand and implement) is that you don't need to always use prefixes. There's a sliding scale, with "simply load them in the right order" being the most concise, and "w/prefix" being the most verbose.

With your system, you're stuck with full verbosity every time there's a conflict. With hyper-static scope, you have many options to resolve the conflict, with varying amounts of verbosity and control.

Your system might work out well in other languages, but I don't think it's well suited for Arc, a language that emphasizes simplicity, axioms, conciseness, and raw power.

P.S. The system I'm describing has some similarities to Factor's module system and Racket's module system, except that, because it's based on hyper-static scope, it's much simpler and easier to implement.

-----

2 points by dido 4455 days ago | link

I don't know. The boilerplate that my proposal uses is also a form of documentation. So I know right away that the 'sin' function I'm using is from the Math module, or from the Moral module. If the boilerplate is too much in a given piece of code (e.g. you know that a large part of the code makes use of only the definitions in the Math module), then that's what import is for. Brevity is nice, but there is such a thing as too much brevity. Everything should be made as simple as possible, but not simpler.

I concede that your proposed system is much more general and flexible, so much so that my proposal could actually use it as a basis for its underlying implementation. Our disagreement seems to be more about convention and notation. You seem to feel it improper to suggest notational conventions for a module system, and you place responsibility for this squarely on the users of third-party libraries. While you indeed propose many methods for accomplishing what a module system is supposed to accomplish, if I were to study someone else's code I'd need to know which method(s) were employed there. The well-known Perl adage "there's more than one way to do it" is a philosophy that not even Perl sticks to when it comes to organising libraries on CPAN. They have guidelines on the structure of libraries that are generally followed. Libraries that don't follow the guidelines generally see much less use, because their users have to do extra work to get them to play nicely with the libraries that do follow the guidelines.

My real goal in proposing the design of a module system, as I state in the first paragraph of my original blog post, is to encourage the development of third-party libraries. I don't know that placing responsibility for the management of namespaces used by third party libraries entirely in the hands of the user of a library as you propose helps to further that goal.

-----

1 point by Pauan 4455 days ago | link

With my system, if you want to you can just always use w/prefix and it will behave like your system. But you have the option to use less verbosity.

By the way, you keep mentioning how the "burden is placed on the user of the library", and that's exactly right. The library author cannot and should not be expected to predict everything that can happen. The user of the library is the only one with enough information.

You talked about a system using many third-party libraries. Let's look at how that plays out. Because there's so many libraries being used, there's a pretty good chance of conflict. In your system, that would mean that when a conflict occurs, you need to go in and change all the uses of the variable to use the prefix.

And what if you upgrade the library, or if you swap it out for another library? Now you either gotta add prefixes (in case there weren't any), or you gotta change the prefixes.

And because doing this prefix change is a huge pain in the butt, you'd either encourage users to always prefix their variables (à la Python), or you'd need a smart IDE to do it for the user. Either way, your code ends up being a lot more verbose.

With your system, all conflict resolution happens inside the actual code itself. Which means when conflicts happen or change, you have to change your code.

With hyper-static scope, you just change the imports at the top of the file. The code itself stays the same. This is vastly less verbose and vastly more maintainable. It also opens up the possibility of something like RubyGems, with dependency information kept in a separate file.

---

You mentioned knowing "whether a variable is from the math module or not", but ironically, hyper-static scope handles that case wonderfully well. Because in hyper-static scope, all variables are resolved at compile-time to a unique box.

Which means that it's trivial to look up which variables a module uses, and which module the variable was originally defined in. This could be a command-line utility, or it could be built into some IDE.

As an example of that, check out this IDE I designed for Nulan:

http://pauan.github.com/nulan/doc/tutorial.html

If you click on a variable, it will highlight it. Try entering this:

  box foo = 1
  foo
  box foo = 2
  foo

Now try clicking on the first "foo", then the third "foo". Basically, it knows exactly which variable is which, because of boxes. And this is really really easy to do.
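Here is a minimal Python sketch of why the first and third "foo" are different variables. It models the idea of boxes only; the `Box` and `Env` classes are hypothetical and are not Nulan's actual implementation.

```python
class Box:
    """A unique storage cell; every definition creates a fresh one."""
    def __init__(self, name, value=None):
        self.name = name
        self.value = value

class Env:
    """Maps names to boxes while the program is being compiled."""
    def __init__(self):
        self.names = {}

    def var(self, name, value):
        # Always create a new box, shadowing any earlier box with the same name.
        box = Box(name, value)
        self.names[name] = box
        return box

    def resolve(self, name):
        # A reference is resolved once, to whichever box is current at that point.
        return self.names[name]

env = Env()
first_def = env.var("foo", 1)     # box foo = 1
first_use = env.resolve("foo")    # foo
second_def = env.var("foo", 2)    # box foo = 2
second_use = env.resolve("foo")   # foo

print(first_use is first_def)
print(second_use is second_def)
print(first_use is second_use)
```

Because each use holds a reference to a specific box rather than to a name, a tool can tell which definition a given occurrence belongs to by comparing box identities.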

Factor also has a wonderful integrated IDE that can do this (and more). I mention Factor because although Factor doesn't use hyper-static scope, its module system is quite similar to the system I'm proposing. The biggest difference is that in Factor, all conflicts must be explicitly resolved, whereas with hyper-static scope, some conflicts can be resolved simply by changing the load order. I don't think either style is really superior; it's more a matter of taste.

I do admit that requiring module prefixes is a way to add self-documentation without the use of an IDE or whatever. But that comes at the high cost of verbosity and flexibility. I personally don't think it's worth it.

If you want that kind of documentation, you're free to use w/prefix, or just use a comment at the top of the file. You might say, "but then users will be lazy and won't do it", and, well, yeah, because it's a pain in the butt. Arc doesn't strike me as the language to force users to do things they don't want to do.

And if there were a command-line utility that would tell you which variable belongs to which module, you could just autogenerate the documentation whenever you want, rather than having it hardcoded into the file.

-----

1 point by dido 4455 days ago | link

"By the way, you keep mentioning how the 'burden is placed on the user of the library', and that's exactly right. The library author cannot and should not be expected to predict everything that can happen. The user of the library is the only one with enough information."

While I concede that this is true in general, that also does not mean that the author of a library should not be permitted to provide sensible defaults to allow someone to use the library with a minimum amount of fuss, and at the same time give the user the power to override these defaults when required.

"You talked about a system using many third-party libraries. Let's look at how that plays out. Because there's so many libraries being used, there's a pretty good chance of conflict. In your system, that would mean that when a conflict occurs, you need to go in and change all the uses of the variable to use the prefix."

In my system, every variable in a separate module in general has to have a prefix, just as Python does. That prefix is set by the library author but can be changed by the user of the library if required. You can dispense with the prefixes temporarily by using import in order to manage this verbosity.

"You mentioned knowing 'whether a variable is from the math module or not', but ironically, hyper-static scope handles that case wonderfully well. Because in hyper-static scope, all variables are resolved at compile-time to a unique box."

Doesn't help. Your program knows, but you, the programmer, can't easily know this by mere inspection of the code or a snippet of code. You may even need to compare source files from different libraries in order to resolve this question fully under the system you propose, or use a special-purpose IDE or tools. I think being able to tell by inspection is much more important. With my proposed system, if you see Math::sin or (import Math ... (sin x) ...) then you'd know where it's coming from. An import form has only local effects that end at the closing parenthesis. Use of hyper-static scope in the way you propose, on the other hand, has unpredictable global effects that can be difficult to trace. I personally don't feel that my approach asks too high a price in verbosity and flexibility, and both of these costs can be ameliorated to a certain degree by using import.

Don't get me started on IDEs. If a language needs a special-purpose IDE in order to be usable, well, I consider that a very serious shortcoming. That's just another kind of forcing users to do things they don't want to do.

-----

1 point by Pauan 4455 days ago | link

By the way... I think it's useful to use this as a quick test of the viability of a module system:

http://lambda-the-ultimate.org/node/3991#comment-60418

Hyper-static scope easily handles all of that, as discussed here:

http://akkartik.name/2012-11-11-lexical-global-scope.html

(That discussion is actually a subthread for this discussion: http://arclanguage.org/item?id=16986)

-----

1 point by rocketnia 4453 days ago | link

"That discussion is actually a subthread for this discussion"

Er, is it the other way around? It looks like at least this post of mine came after the email discussion: http://arclanguage.org/item?id=16995

---

"Hyper-static scope easily handles all of that"

For whatever it's worth, I still technically disagree with the "all" here. I admit your approach handles the practical cases.

Your approach has developers hardcoding filenames within their source code. If a developer wants to use two files of the same name, they must find a way to segregate the files into subfolders, or they must rename a file and invade some library source code to rewrite the filename occurrences. Please let me know if I'm wrong about this. :)

The LtU post also goes over at least one use case where a library user wants to update the dependencies of the library without also updating the library itself. Fortunately, this time I think we can agree that this isn't the problem we're discussing. :) It's something I care about in a module system, but it's not directly related to name collision.

-----

1 point by Pauan 4452 days ago | link

"Er, is it the other way around? It looks like at least this post of mine came after the email discussion"

Yes, but the discussion started with the Arc topic and then moved to e-mail and then moved back to the Arc topic.

---

"I admit your approach handles the practical cases."

Well then! I'll consider that "all", since I only care about the practical cases.

---

"Your approach has developers hardcoding filenames within their source code. If a developer wants to use two files of the same name, they must find a way to segregate the files into subfolders, or they must rename a file and invade some library source code to rewrite the filename occurrences. Please let me know if I'm wrong about this. :)"

Yes. It is tied to the filesystem, or website URLs, or Git commits, or whatever. But, the filesystem already enforces a "no two files with the same name in the same folder" rule, so no biggie.

If you wanted to create something that manages dependencies at a more abstract level, that's fine, and you can build it on top of my system, but I personally don't see much use for that (yet).

If your worry isn't about filenames at all, and is simply about putting filenames into the source code, I think the answer is really easy: just do something like RubyGems, where you have a standard file called "dependencies" that imports all the dependencies in the right order. And then when you want to load the library, you'd just load the "dependencies" file. This doesn't require any changes to my system, since it's purely user-convention.

This still allows for putting the dependency information straight into the source code, which is useful for quickie scripts and such. But big projects and libraries would use the "dependencies" convention. And as Ruby showed, this kind of user-convention can be applied after the language is already in use. So it doesn't need to be baked in ahead of time, though there might be some minor transition pain. But I'll worry about that once libraries and projects become big enough that a "dependencies" convention becomes useful.

-----

1 point by Pauan 4456 days ago | link

P.S. Another strategy is to use boxes, which is what Nulan does. Using boxes has some extra benefits, including making macros completely hygienic, but I figured it would be easier for you to implement variable renaming.

-----

2 points by dido 4453 days ago | link

Pauan, although I disagree with you on precisely how the hyper-static scope primitives should be used to create an actual module system, the idea seems to be general enough that it might actually be worth implementing in Arcueid in more or less the manner which you describe. The only problem I have is what happens with forward references, e.g. such as arise with mutual recursion. To go with the classic example of mutual recursion:

  (def evenp (x) (if (is x 0) t (oddp (- x 1))))
  (def oddp (x) (if (is x 0) nil (evenp (- x 1))))

Without a special operator, it seems impossible to define something like this in pure hyper-static scope. With the sort of hybrid hyper-static scope you envision, what would the compiler do if it got this sort of definition?

The way I understand it, the compiler would create a box for oddp the way an actual binding using var would have, but this box would be empty as it were. If no subsequent definition of oddp followed, that would result in an unbound free variable error at runtime when evenp was used, exactly the way the reference Arc implementation would. If oddp were defined, however, it would then fill in the empty box that the reference to it in evenp created, and so the call to oddp would make use of the first definition of oddp that followed it, just as a non-hyper-static system would. Subsequent definitions of oddp using var would have no effect on evenp though, unless evenp were redefined.

One problem though is what happens when you try to redefine evenp in terms of a new oddp. Without having a way to remove the old bindings of evenp and oddp, this seems to be impossible. One of them will continue to use the old definition of the other no matter what order you define them.
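The empty-box treatment of forward references described above can be sketched in Python as follows. This is a model of the idea under stated assumptions, not Arcueid's compiler: the `Box` and `Env` classes and the `reference` and `define` methods are hypothetical, and `define` is assumed to fill an existing box the way "=" would.

```python
class Box:
    """A cell that may be created empty by a forward reference."""
    def __init__(self, name):
        self.name = name
        self.filled = False
        self.value = None

    def get(self):
        if not self.filled:
            raise NameError("unbound variable: " + self.name)
        return self.value

    def set(self, value):
        self.value = value
        self.filled = True

class Env:
    def __init__(self):
        self.names = {}

    def reference(self, name):
        # A forward reference creates an empty box if none exists yet.
        if name not in self.names:
            self.names[name] = Box(name)
        return self.names[name]

    def define(self, name, value):
        # Fill the existing (possibly empty) box rather than replacing it.
        box = self.reference(name)
        box.set(value)
        return box

env = Env()
oddp_box = env.reference("oddp")   # evenp mentions oddp before it is defined

def evenp(x):
    return True if x == 0 else oddp_box.get()(x - 1)

env.define("evenp", evenp)

# Calling evenp before oddp is defined hits the empty box.
try:
    evenp(3)
    unbound_error = False
except NameError:
    unbound_error = True

evenp_box = env.reference("evenp")

def oddp(x):
    return False if x == 0 else evenp_box.get()(x - 1)

env.define("oddp", oddp)           # fills the box that evenp already holds

print(unbound_error, evenp(4), evenp(3))
```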

-----

1 point by Pauan 4452 days ago | link

"Without a special operator, it seems impossible to define something like this in pure hyper-static scope"

Nulan uses a very simple "defs" macro for this:

http://arclanguage.org/item?id=17449

Ordinary Arc wouldn't need such a macro, since the system I describe isn't purely hyper-static.

---

"The way I understand it, the compiler would create a box for oddp the way an actual binding using var would have, but this box would be empty as it were."

Yes, exactly.

---

"One problem though is what happens when you try to redefine evenp in terms of a new oddp. Without having a way to remove the old bindings of evenp and oddp, this seems to be impossible. One of them will continue to use the old definition of the other no matter what order you define them."

You don't change the bindings. You mutate the box using "=". The box itself stays the same, it just has a different value at runtime.

So if you want to change oddp in such a way that evenp notices the changes, you would say this:

  (= oddp ...)

And if you want to change oddp in such a way that evenp doesn't notice the changes, you would say this:

  (var oddp ...)

In fact, that's how Nulan defines self-recursive and mutually-recursive functions. Using Arc syntax, this:

  (def foo () ...)

Would get macroexpanded into this:

  (var foo)
  (= foo (fn () ...))

Notice that it first creates the box, and then assigns to it.
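The difference between mutating a box with "=" and creating a new one with "var" can be sketched in Python like this. The helper names `var` and `assign` are hypothetical stand-ins for the two forms, and the lambdas stand in for function bodies.

```python
class Box:
    def __init__(self, value=None):
        self.value = value

names = {}  # compile-time mapping from names to their current boxes

def var(name, value=None):
    """Model of (var name value): always create a fresh box."""
    names[name] = Box(value)
    return names[name]

def assign(name, value):
    """Model of (= name value): mutate the box the name currently refers to."""
    names[name].value = value

# (def oddp ...) modeled as (var oddp) followed by (= oddp (fn ...))
var("oddp")
assign("oddp", lambda x: "old oddp")

oddp_seen_by_evenp = names["oddp"]   # evenp's reference is resolved once, here

def evenp():
    return oddp_seen_by_evenp.value(0)

assign("oddp", lambda x: "mutated oddp")
after_assign = evenp()               # evenp notices the change

var("oddp", lambda x: "fresh oddp")
after_var = evenp()                  # evenp still uses its original box
current = names["oddp"].value(0)     # code compiled later sees the new box

print(after_assign, "|", after_var, "|", current)
```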

-----

1 point by dido 4455 days ago | link

If I understand this correctly, it would allow Arcueid's compiler to resolve global variables at compile-time, rather than at run-time. No need for variable renaming. There is currently a genv instruction in Arcueid's virtual machine which basically takes a symbol referring to a global variable and finds its binding in the global environment. If we do hyper-static scope, the genv instruction can be changed to take a reference to the actual global variable instead. Thus, the mappings of symbols to free variables are only required when compiling an expression.
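A minimal Python sketch of the difference between resolving a global at run-time and at compile-time. The tuple-based instruction format here is invented for the illustration and is not Arcueid's actual virtual machine; only the instruction name genv comes from the post above.

```python
class Box:
    def __init__(self, value):
        self.value = value

global_env = {"x": Box(10)}   # symbol -> box, needed only while compiling

def compile_dynamic(symbol):
    # Run-time lookup: the instruction carries only the symbol.
    return ("genv", symbol)

def compile_static(symbol):
    # Compile-time lookup: the instruction carries the box itself.
    return ("genv", global_env[symbol])

def run(instruction):
    _, operand = instruction
    if isinstance(operand, Box):
        return operand.value             # no symbol lookup at run-time
    return global_env[operand].value     # symbol looked up on every execution

dynamic_code = compile_dynamic("x")
static_code = compile_static("x")

global_env["x"] = Box(20)   # models a later (var x 20) that rebinds the name

dynamic_result = run(dynamic_code)
static_result = run(static_code)
print(dynamic_result, static_result)
```

The statically compiled instruction keeps referring to the box that was current when it was compiled, which is the hyper-static behaviour, while the dynamic one follows the name to its newest binding.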

-----

1 point by Pauan 4455 days ago | link

Yes, exactly.

-----