Arc Forum

1 point by dido 4637 days ago | link | parent

I found this discussion:

http://c2.com/cgi/wiki?HyperStaticGlobalEnvironment

that seems to describe the issues in more detail. However, I don't see how it directly solves the problems that the module system is attempting to solve.

To adapt an example from the Pickaxe Book, we have implementations of the trigonometric functions like 'sin', 'cos', etc. in the system. Now, say I wanted to work on a simulation of good and evil, and define a function called 'sin' as well, inside a file called 'moral.arc'. Then you find you want to write a program to find out how many angels can dance on the head of a pin, and you need both the standard trigonometric functions, and my moral.arc. If you loaded moral.arc, without hyper-static scope my definition of 'sin' would stomp on the built-in definition of the trigonometric function, and you'd be unable to use both at the same time.

With hyper-static scope, code that comes after (load "moral.arc") will still see only its definition of sin in the same way. However, you could write code before loading moral.arc that used the built-in definition of sin and it would be unaffected by its subsequent redefinition by moral.arc.
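To make the behaviour concrete, here is a rough Python model of a hyper-static global environment. It is only a sketch: the Box and HyperStaticEnv names are made up for illustration and are not part of Arc or Arcueid. The assumption is that every definition creates a fresh box, and that code resolves a name to a box once, at the point where the code is defined.

```python
class Box:
    """A mutable cell holding one variable's value (hypothetical helper)."""
    def __init__(self, value):
        self.value = value


class HyperStaticEnv:
    """Toy global environment: each definition creates a fresh box."""
    def __init__(self):
        self.names = {}

    def var(self, name, value):
        # A redefinition never mutates the old box; it only rebinds the name.
        self.names[name] = Box(value)

    def resolve(self, name):
        # Meant to be called once, when code is defined, not when it runs.
        return self.names[name]


env = HyperStaticEnv()
env.var("sin", "trigonometric sin")   # the built-in definition

early_box = env.resolve("sin")        # code written before the load

env.var("sin", "moral sin")           # the effect of (load "moral.arc")

late_box = env.resolve("sin")         # code written after the load

print(early_box.value)  # trigonometric sin
print(late_box.value)   # moral sin
```

Code that resolved "sin" before the load keeps the trigonometric box, while code after the load sees only the moral one, which is exactly the limitation described above.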

I do see from this, though, how a hyper-static scope could be used as the basis for the implementation of a module system (and in fact I may actually do so in Arcueid if it can really be done with full compatibility, of which I am not quite entirely convinced). It would be fairly straightforward to write my primitives as macros in an Arc with hyper-static scope.

What I'm more concerned about here is how modules are expressed, because while an underlying implementation can be easily enough changed, a poorly-designed convention for expressing and interfacing with modules might be problematic and once codebases start cropping up that use it, conversion can be hard.

1 point by Pauan 4637 days ago | link

"With hyper-static scope, code that comes after (load "moral.arc") will still see only its definition of sin in the same way."

Right, but you see, that situation is actually trivially solved with hyper-static scope:

  ; math.arc
  (var sin ...)
  
  ; moral.arc
  (var sin ...)
  
  ; other.arc
  (load "math.arc")
  (var math-sin sin)
  (load "moral.arc")
  
  ... use sin and math-sin ...

Of course, there could be a utility macro called "w/rename" that does this for you:

  (w/rename (sin math-sin)
    (load "math.arc"))
  (load "moral.arc")

---

"(and in fact I may actually do so in Arcueid if it can really be done with full compatibility, of which I am not quite entirely convinced)"

The system I have described is fully compatible, but it isn't "pure" hyper-static scope. It's a weird hybrid between hyper-static and dynamic, with the advantages/disadvantages of both. If you wanted to make a pure hyper-static system, like Nulan, you would have to give up Arc compatibility.

---

"What I'm more concerned about here is how modules are expressed, because while an underlying implementation can be easily enough changed, a poorly-designed convention for expressing and interfacing with modules might be problematic and once codebases start cropping up that use it, conversion can be hard."

I think my proposal is the best possible module system for Arc while retaining compatibility. If you gave up compatibility, it would be possible to design better systems. I am curious about what kind of conventions and interfaces you're talking about specifically, though.

---

I'm writing up a post giving more details on how to rename/include/exclude variables in a hyper-static system.

-----

1 point by Pauan 4637 days ago | link

Okay, here's the remaining stuff. One thing we want to do is hide variables. To do this, we need a new primitive called "del". It works like this:

  (var foo 1)
  
  (def bar () foo)
  
  (del foo)

Now, if you try to use "foo", it will throw an error saying "foo is undefined". But if you call (bar) it will correctly return 1. So, using "del" doesn't really remove the variable, it just hides it. This can be used to control which variables your library exports.
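As a rough illustration of why this works, here is a small Python model. It is a sketch under the assumption that names map to boxes and that a function captures the box when it is defined; the helper names (Box, var, delete, lookup) are invented for the example.

```python
class Box:
    """A mutable cell holding one variable's value (hypothetical helper)."""
    def __init__(self, value):
        self.value = value


names = {}  # toy global environment: symbol -> box


def var(name, value):
    names[name] = Box(value)      # fresh box per definition


def delete(name):
    del names[name]               # hide the symbol; the box itself survives


def lookup(name):
    if name not in names:
        raise NameError(f"{name} is undefined")
    return names[name]


var("foo", 1)
foo_box = lookup("foo")           # resolved when bar is defined


def bar():
    return foo_box.value          # bar holds the box, not the symbol


delete("foo")

try:
    lookup("foo")
    hidden = False
except NameError:
    hidden = True

print(hidden)  # True
print(bar())   # 1
```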

---

Another thing that would be really nice is a way to get at the actual box for a variable. To do this, we need a new function called "get-variable-box" that accepts a symbol and returns a box.

There are a few things you can do with this. One of them is to create a "w/exclude" macro which lets you hide the variables you specify:

  (mac w/exclude (vars . body)
    `(do ,@body
         ,@(map (fn (x)
                  (if (bound x)
                      `(var ,x ,(get-variable-box x))
                      `(del ,x)))
                vars)))

And you use it like this:

  (w/exclude (qux corge)
    (load "foo.arc"))

This will import all the variables from "foo.arc" except "qux" and "corge". Of course it isn't limited to just importing. You can use it to write a library that has private variables:

  ; qux and corge are private: they can be seen inside this code block, but not outside
  (w/exclude (qux corge)
    (var qux ...)
    (var corge ...)
    ...)

Another thing this enables is hygienic macros, which I already explained (just change quasiquote to use "get-variable-box" rather than returning the symbol directly).

---

And we can define "w/rename" using "w/exclude":

  (mac w/rename (vars . body)
    (let p pair.vars
      `(w/exclude ,(map car p)
         ,@body
         ,@(map (fn ((x y))
                  `(var ,y ,x))
                p))))

---

We also want the ability to only import certain variables... but there are actually multiple ways to do this. You could have a primitive called "w/new-scope" that creates a new dynamic scope, similar to wrapping the expression in `(fn () ...)` except it works dynamically rather than lexically.

Another option would be a "w/new-namespace" primitive that returns an object that maps symbols to boxes. This is more flexible, but I'm not sure how fast it would be.

I'm going to go with the "w/new-scope" route, but if you have a better idea, I'm all ears. This is one part of Nulan that isn't quite fleshed out to my satisfaction yet. Also, I'm really not fond of using "eval" here.

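  (mac w/include (vars . body)
    ; evaluate body in a fresh scope and collect the values of vars
    (let u (eval `(w/new-scope ,@body
                    (list ,@vars)))
      ; wrap the definitions in "do" so the expansion is a single form
      `(do ,@(map (fn (x y)
                    `(var ,x ,y))
                  vars
                  u))))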

Now if you say this:

  (w/include (qux corge)
    (load "foo.arc"))

It will only import "qux" and "corge" and nothing else. Just like "w/exclude", this isn't limited to just importing: you can use it to create private variables in libraries as well.
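The filtering done by "w/exclude" and "w/include" can be modelled in a few lines of Python. This is only a sketch: it assumes a library is just a table from symbols to boxes, and import_bindings and foo_lib are hypothetical names, not part of any Arc implementation.

```python
def import_bindings(names, library, exclude=(), include=None):
    """Copy symbol -> box bindings from `library` into `names`.

    exclude: symbols to skip (roughly w/exclude).
    include: if given, the only symbols to copy (roughly w/include).
    """
    for symbol, box in library.items():
        if symbol in exclude:
            continue
        if include is not None and symbol not in include:
            continue
        names[symbol] = box
    return names


# Stand-in for the bindings that (load "foo.arc") would create.
foo_lib = {"qux": [1], "corge": [2], "grault": [3]}

# w/exclude: everything except qux and corge; an older qux stays visible.
excluded = import_bindings({"qux": ["old qux"]}, foo_lib,
                           exclude={"qux", "corge"})

# w/include: only qux and corge.
included = import_bindings({}, foo_lib, include={"qux", "corge"})

print(sorted(excluded))  # ['grault', 'qux']
print(sorted(included))  # ['corge', 'qux']
```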

-----

1 point by Pauan 4637 days ago | link

By the way, I think Arc also needs aliases. I know of at least two ways to make aliases: hard and soft. A hard alias is simply having two different symbols refer to the same box.

A soft alias is like a symlink: a new box that points to the other box. The way that I handled this in Nulan is to have &get and &set methods on each box.

These methods are expanded at compile-time like macros, but rather than expanding when the box is the first element of the list, they expand whenever the box isn't the first element of the list.

Translating it into Arc, it might look like this:

  (var foo 1)

  (defget bar ()
    `foo)

  (defset bar (x)
    `(= foo ,x))

Now whenever the compiler sees the box "bar" it will replace it with the box "foo", and when assigning to the box "bar", it will replace it with an assignment to the box "foo".

Nulan uses this not only to link variables together, but also for other purposes. For instance, assuming there were functions "get-cwd" and "set-cwd", you could do this:

  (defget cwd ()
    `(get-cwd))

  (defset cwd (x)
    `(set-cwd ,x))

And now `cwd` would expand to `(get-cwd)` and `(= cwd 1)` would expand to `(set-cwd 1)`.
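Here is a rough Python model of that expansion step, treating expressions as nested lists of strings and literals. It is only a sketch of the idea, not how Nulan implements it; the getters and setters tables and the expand function are made-up names.

```python
getters = {}  # symbol -> function returning the replacement expression
setters = {}  # symbol -> function taking the (expanded) value expression


def expand(expr):
    """Rewrite an expression tree made of nested lists, strings and literals."""
    if isinstance(expr, str):
        # A bare symbol: use its getter if one is registered.
        return getters[expr]() if expr in getters else expr
    if isinstance(expr, list) and expr:
        is_assignment = (expr[0] == "=" and len(expr) == 3
                         and isinstance(expr[1], str) and expr[1] in setters)
        if is_assignment:
            return setters[expr[1]](expand(expr[2]))
        # Leave the head alone; only non-head positions are expanded.
        return [expr[0]] + [expand(e) for e in expr[1:]]
    return expr


# Rough equivalents of (defget cwd ...) and (defset cwd ...).
getters["cwd"] = lambda: ["get-cwd"]
setters["cwd"] = lambda x: ["set-cwd", x]

print(expand("cwd"))            # ['get-cwd']
print(expand(["=", "cwd", 1]))  # ['set-cwd', 1]
print(expand(["prn", "cwd"]))   # ['prn', ['get-cwd']]
```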

The reason I mention this here is that the namespace macros (w/exclude, w/rename, and w/include) should probably use hard aliases rather than using "var".

-----

1 point by dido 4637 days ago | link

"I think my proposal is the best possible module system for Arc while retaining compatibility. If you gave up compatibility, it would be possible to design better systems. I am curious about what kind of conventions and interfaces you're talking about specifically, though."

Well, you did read the linked post from the Arcueid blog I used to start this discussion didn't you? I was just thinking about a simple mechanism for qualifying free variables the way most other languages such as Ruby, OCaml, and various Scheme dialects (e.g. Scheme48, Bigloo, and Guile) have. I would think that the use of hyper-static scope could provide an implementation mechanism for this simple sort of module system I envision. I think the mechanisms you have in your follow-up are rather overly complex and don't even address the simplest use case I gave in my example. How would you use the mechanism you've described to allow code that includes both math.arc and moral.arc to let code below it use both the definitions in math.arc and moral.arc at the same time? My proposal would just have:

  ; math.arc
  (module Math
    (def sin (x) ...)
    ... ; many other definitions
    )

  ; moral.arc
  (module Moral
    (def sin (x) ...)
    ... ; many other definitions
    )

  ; other.arc
  (load "math.arc")
  (load "moral.arc")
  (Math::sin x)
  (Moral::sin y)

These are obviously extensions to standard Arc, but they will not interfere with most plain vanilla Arc code (unless someone just happens to use the scope-resolution :: ssyntax I've chosen in their variable names too, which I doubt is likely).

-----

1 point by Pauan 4637 days ago | link

"Well, you did read the linked post from the Arcueid blog I used to start this discussion didn't you?"

Yes. It sounds vastly more complicated than hyper-static scope. With more boilerplate too. I am quite aware of that style of module system.

---

"I think the mechanisms you have in your follow-up are rather overly complex [...]"

Your mechanism has some immediately obvious problems, like the fact that namespace names can collide, and the fact it has extra boilerplate required in every Arc file. It's also probably slower. Hyper-static scope has none of those problems.

---

"I would think that the use of hyper-static scope could provide an implementation mechanism for this simple sort of module system I envision."

It seems you're not quite grasping how hyper-static scope works and what it can do. Once you see it, I think you'll stop seeing hyper-static scope as being a mere implementation strategy (for a worse module system), and just use hyper-static scope by itself.

---

"[...] and don't even address the simplest use case I gave in my example. How would you use the mechanism you've described to allow code that includes both math.arc and moral.arc to let code below it use both the definitions in math.arc and moral.arc at the same time?"

I already described it (http://arclanguage.org/item?id=17417). You can use a plain-old "var", or you can use something like "w/rename". Which parts do you see as complicated?

-----

1 point by dido 4637 days ago | link

"You can use a plain-old "var", or you can use something like "w/rename". Which parts do you see as complicated?"

Well, while your way is indeed not conceptually complicated, to my mind it creates complications in practical use. I don't think you fully realise just what the use of your proposed module system entails in an environment with lots of third-party libraries. You would need to do what amounts to a declaration of what symbols you want to use just after loading every library, placing a burden on every user of such library. You complain that my method requires boilerplate in every Arc file, but your method requires non-trivial renaming of every symbol that might conflict with another library to be loaded later after almost every library load, which I think is much worse than the two-symbol boilerplate my method requires. I don't see how that is any better, and to my mind that places a needless burden of bookkeeping onto the programmer.

Worse yet, forgetting to make such a declaration would result in a symbol getting a spurious binding in the top-level global environment, which might cause difficult-to-find bugs where a library defines a symbol that was used improperly in local code, so instead of getting an unbound-symbol error one would get unexpected behaviour. What if, for instance, I wanted to use both moral.arc and math.arc at the same time, but didn't need the definition of sin in math.arc but needed other stuff it provided, but wanted to use the definition of sin in moral.arc. You'd say that I should just load math.arc before moral.arc but if I forgot about this and reversed the order of loading then my code might use the definition of sin in math.arc unexpectedly. And what would happen if a library happens to load another library that had naming conflicts with another library I'm using? My proposal doesn't suffer from this problem.

The purpose of a module system is to allow the programmer to control and manage name clashes, and your proposed mechanism, while I admit it can be used to accomplish the job, is much too low-level to be really useful for the kinds of use cases I envision.

True, the system I propose can have module names colliding, but it doesn't seem to be such a serious problem in actual practice for the other popular languages that make use of a similar system. And well, if I do wind up implementing hyper-static scope, that takes care of that uncommon case easily enough, and you only need to rename the module name.

Hyper-static scope is an interesting idea, and I may even actually implement your proposed hybrid version in Arcueid, but it is much too primitive on its own to provide a usable module system in my opinion.

-----

2 points by Pauan 4636 days ago | link

"You would need to do what amounts to a declaration of what symbols you want to use just after loading every library, placing a burden on every user of such library."

No, you don't. It depends on what the conflict is and what you're trying to accomplish. In the easiest case, there's no extra code needed. In the hardest case, you can use something like "w/prefix" which renames all the variables in the file:

  (w/prefix Math::
    (load "math.arc"))
  (w/prefix Moral::
    (load "moral.arc"))

  ... use Math::sin and Moral::sin ...

One problem with your proposal is that the library author decides what the prefix is. But then two different libraries can use the same prefix (imagine two libraries both using the "Math" module name).

Instead, in my system, it's the one who does the importing that decides what the prefix is, because only the importer of the library has enough information to correctly resolve name collisions. The library author doesn't.
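A rough Python model of importer-chosen prefixes, assuming a library is just a table from symbols to definitions. The load_with_prefix function and the library tables are hypothetical stand-ins for what "w/prefix" and "load" would do.

```python
def load_with_prefix(names, prefix, library):
    """Bind every symbol of `library` in `names` under an importer-chosen prefix."""
    for symbol, box in library.items():
        names[prefix + symbol] = box
    return names


# Stand-ins for the bindings created by math.arc and moral.arc.
math_lib = {"sin": "trigonometric sin", "cos": "trigonometric cos"}
moral_lib = {"sin": "moral sin"}

# A second library whose author also thinks of it as "Math".
other_math_lib = {"sin": "another author's sin"}

names = {}
load_with_prefix(names, "Math::", math_lib)
load_with_prefix(names, "Moral::", moral_lib)
# The importer resolves the clash by picking a different prefix.
load_with_prefix(names, "Math2::", other_math_lib)

print(names["Math::sin"])   # trigonometric sin
print(names["Moral::sin"])  # moral sin
print(names["Math2::sin"])  # another author's sin
```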

---

"What if, for instance, I wanted to use both moral.arc and math.arc at the same time, but didn't need the definition of sin in math.arc but needed other stuff it provided, but wanted to use the definition of sin in moral.arc. You'd say that I should just load math.arc before moral.arc but if I forgot about this and reversed the order of loading then my code might use the definition of sin in math.arc unexpectedly."

In that situation, your system would require you to use the module name as a prefix, increasing verbosity by quite a bit.

And no, I wouldn't say "load them in the right order". I'd say "load them in the right order, OR use w/include, OR use w/exclude, OR use w/rename, OR use w/prefix". You have many options in my system to resolve conflicts: use the one you like the best.

In that particular case, I'd probably just use w/exclude to exclude "sin" from the "math.arc" library. Much less verbosity than your system.

---

"The purpose of a module system is to allow the programmer to control and manage name clashes, and your proposed mechanism, while I admit it can be used to accomplish the job, is much too low-level to be really useful for the kinds of use cases I envision."

Requiring module prefixes in the case of conflict is an easier rule to follow and it has some benefits, but I wouldn't call it higher level.

---

"And what would happen if a library happens to load another library that had naming conflicts with another library I'm using? My proposal doesn't suffer from this problem."

Simple: you use w/exclude or w/rename or w/include or simply load them in the right order. Your proposal does have that problem because in the case of conflict, you now need to prefix the variable with the module's name, in other words, saying Math::sin rather than just sin.

With your proposal, you have boilerplate in every Arc file (the "module" form), and you have to use the module name's prefix in case of conflict.

With my system, in the very common case that there isn't any conflict, there's zero boilerplate.

And in the case where there is conflict, you can usually get by just fine by simply using w/exclude or w/include. And in the quite rare case where you need to use the same symbol from two libraries, you can either use w/rename or w/prefix.

Using w/prefix is about the same amount of boilerplate as your system, except that there's no possibility for namespace name clashes.

The benefit of hyper-static scope (aside from being simpler to understand and implement) is that you don't need to always use prefixes. There's a sliding scale, with "simply load them in the right order" being the most concise, and "w/prefix" being the most verbose.

With your system, you're stuck with full verbosity every time there's a conflict. With hyper-static scope, you have many options to resolve the conflict, with varying amounts of verbosity and control.

Your system might work out well in other languages, but I don't think it's well suited for Arc, a language that emphasizes simplicity, axioms, conciseness, and raw power.

P.S. The system I'm describing has some similarities to Factor's module system and Racket's module system, except that, because it's based on hyper-static scope, it's much simpler and easier to implement.

-----

2 points by dido 4636 days ago | link

I don't know. The boilerplate that my proposal uses is also a form of documentation. So I know right away that the 'sin' function I'm using is from the Math module, or from the Moral module. If the boilerplate is too much in a given piece of code (e.g. you know that a large part of the code makes use of only the definitions in the Math module), then that's what import is for. Brevity is nice, but there is such a thing as too much brevity. Everything should be made as simple as possible, but not simpler.

I concede that your proposed system is much more general and flexible, so much so that my proposal could actually use it as a basis for its underlying implementation. Our disagreement seems to be more about actual convention and notation. You seem to feel it improper to suggest notational conventions for a module system, and you place responsibility for this squarely on the users of third-party libraries. While you indeed propose many methods for accomplishing what a module system is supposed to accomplish, if I were to study someone else's code I'd need to know which method(s) were employed there. The well-known Perl adage "there's more than one way to do it" is a philosophy that not even Perl sticks to when it comes to organising libraries on CPAN. They have guidelines on the structure of libraries that are generally followed. Libraries that don't follow the guidelines generally see much less use, because their users have to work to get them to play nice with the libraries that do follow the guidelines.

My real goal in proposing the design of a module system, as I state in the first paragraph of my original blog post, is to encourage the development of third-party libraries. I don't know that placing responsibility for the management of namespaces used by third party libraries entirely in the hands of the user of a library as you propose helps to further that goal.

-----

1 point by Pauan 4636 days ago | link

With my system, if you want to you can just always use w/prefix and it will behave like your system. But you have the option to use less verbosity.

By the way, you keep mentioning how the "burden is placed on the user of the library", and that's exactly right. The library author cannot and should not be expected to predict everything that can happen. The user of the library is the only one with enough information.

You talked about a system using many third-party libraries. Let's look at how that plays out. Because there's so many libraries being used, there's a pretty good chance of conflict. In your system, that would mean that when a conflict occurs, you need to go in and change all the uses of the variable to use the prefix.

And what if you upgrade the library, or if you swap it out for another library? Now you either gotta add prefixes (in case there weren't any), or you gotta change the prefixes.

And because doing this prefix change is a huge pain in the butt, you'd either encourage users to always prefix their variables (ala Python), or you'd need a smart IDE to do it for the user. Either way, your code ends up being a lot more verbose.

With your system, all conflict resolution happens inside the actual code itself. Which means when conflicts happen or change, you have to change your code.

With hyper-static scope, you just change the imports at the top of the file. The code itself stays the same. This is vastly less verbose and vastly more maintainable. It also opens up the possibility of something like RubyGems, with dependency information kept in a separate file.

---

You mentioned knowing "whether a variable is from the math module or not", but ironically, hyper-static scope handles that case wonderfully well. Because in hyper-static scope, all variables are resolved at compile-time to a unique box.

Which means that it's trivial to look up which variables a module uses, and which module the variable was originally defined in. This could be a command-line utility, or it could be built into some IDE.
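A rough Python model of such a lookup, under the assumption that each box records the file that created it. The origin field and the origins function are hypothetical bookkeeping added for the example, not something the proposal specifies.

```python
from dataclasses import dataclass


@dataclass
class Box:
    value: object
    origin: str  # file that created the box (hypothetical bookkeeping)


names = {}  # toy global environment: symbol -> box


def var(name, value, origin):
    names[name] = Box(value, origin)  # fresh box per definition


# Roughly what loading math.arc and then moral.arc would do.
var("sin", "trigonometric sin", "math.arc")
var("cos", "trigonometric cos", "math.arc")
var("sin", "moral sin", "moral.arc")


def origins(symbols):
    """Report which file each symbol currently resolves to."""
    return {s: names[s].origin for s in symbols}


print(origins(["sin", "cos"]))  # {'sin': 'moral.arc', 'cos': 'math.arc'}
```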

As an example of that, check out this IDE I designed for Nulan:

http://pauan.github.com/nulan/doc/tutorial.html

If you click on a variable, it will highlight it. Try entering this:

  box foo = 1
  foo
  box foo = 2
  foo

Now try clicking on the first "foo", then the third "foo". Basically, it knows exactly which variable is which, because of boxes. And this is really really easy to do.

Factor also has a wonderful integrated IDE that can do this (and more). I mention Factor because although Factor doesn't use hyper-static scope, its module system is quite similar to the system I'm proposing. The biggest difference is that in Factor, all conflicts must be explicitly resolved, whereas with hyper-static scope, some conflicts can be resolved simply by changing the load order. I don't think either style is really superior; it's more a matter of taste.

I do admit that requiring module prefixes is a way to add self-documentation without the use of an IDE or whatever. But that comes at the high cost of verbosity and flexibility. I personally don't think it's worth it.

If you want that kind of documentation, you're free to use w/prefix, or just use a comment at the top of the file. You might say, "but then users will be lazy and won't do it", and, well, yeah, because it's a pain in the butt. Arc doesn't strike me as the language to force users to do things they don't want to do.

And if there were a command-line utility that would tell you which variable belongs to which module, you could just autogenerate the documentation whenever you want, rather than having it hardcoded into the file.

-----

1 point by dido 4636 days ago | link

"By the way, you keep mentioning how the 'burden is placed on the user of the library', and that's exactly right. The library author cannot and should not be expected to predict everything that can happen. The user of the library is the only one with enough information."

While I concede that this is true in general, that also does not mean that the author of a library should not be permitted to provide sensible defaults to allow someone to use the library with a minimum amount of fuss, and at the same time give the user the power to override these defaults when required.

"You talked about a system using many third-party libraries. Let's look at how that plays out. Because there's so many libraries being used, there's a pretty good chance of conflict. In your system, that would mean that when a conflict occurs, you need to go in and change all the uses of the variable to use the prefix."

In my system, every variable in a separate module in general has to have a prefix, just as Python does. That prefix is set by the library author but can be changed by the user of the library if required. You can dispense with the prefixes temporarily by using import in order to manage this verbosity.

"You mentioned knowing 'whether a variable is from the math module or not', but ironically, hyper-static scope handles that case wonderfully well. Because in hyper-static scope, all variables are resolved at compile-time to a unique box."

Doesn't help. Your program knows, but you, the programmer, can't easily know this by mere inspection of the code or a snippet of code. You may even need to compare source files from different libraries in order to resolve this question fully under the system you propose, or use a special-purpose IDE or tools. I think being able to tell by inspection is much more important. With my proposed system, if you see Math::sin or (import Math ... (sin x) ...) then you'd know where it's coming from. An import form has only local effects that end at the closing parenthesis. Use of hyper-static scope in the way you propose, on the other hand, has unpredictable global effects that can be difficult to trace. I personally don't feel that the cost in verbosity and flexibility is too high a price to pay, and both can be ameliorated to a certain degree by using import.

Don't get me started on IDEs. If a language needs a special-purpose IDE in order to be usable, well, I consider that a very serious shortcoming. That's just another kind of forcing users to do things they don't want to do.

-----

1 point by Pauan 4636 days ago | link

By the way... I think it's useful to use this as a quick test of the viability of a module system:

http://lambda-the-ultimate.org/node/3991#comment-60418

Hyper-static scope easily handles all of that, as discussed here:

http://akkartik.name/2012-11-11-lexical-global-scope.html

(That discussion is actually a subthread for this discussion: http://arclanguage.org/item?id=16986)

-----

1 point by rocketnia 4635 days ago | link

"That discussion is actually a subthread for this discussion"

Er, is it the other way around? It looks like at least this post of mine came after the email discussion: http://arclanguage.org/item?id=16995

---

"Hyper-static scope easily handles all of that"

For whatever it's worth, I still technically disagree with the "all" here. I admit your approach handles the practical cases.

Your approach has developers hardcoding filenames within their source code. If a developer wants to use two files of the same name, they must find a way to segregate the files into subfolders, or they must rename a file and invade some library source code to rewrite the filename occurrences. Please let me know if I'm wrong about this. :)

The LtU post also goes over at least one use case where a library user wants to update the dependencies of the library without also updating the library itself. Fortunately, this time I think we can agree that this isn't the problem we're discussing. :) It's something I care about in a module system, but it's not directly related to name collision.

-----

1 point by Pauan 4634 days ago | link

"Er, is it the other way around? It looks like at least this post of mine came after the email discussion"

Yes, but the discussion started with the Arc topic and then moved to e-mail and then moved back to the Arc topic.

---

"I admit your approach handles the practical cases."

Well then! I'll consider that "all", since I only care about the practical cases.

---

"Your approach has developers hardcoding filenames within their source code. If a developer wants to use two files of the same name, they must find a way to segregate the files into subfolders, or they must rename a file and invade some library source code to rewrite the filename occurrences. Please let me know if I'm wrong about this. :)"

Yes. It is tied to the filesystem, or website URLs, or Git commits, or whatever. But, the filesystem already enforces a "no two files with the same name in the same folder" rule, so no biggie.

If you wanted to create something that manages dependencies at a more abstract level, that's fine, and you can build it on top of my system, but I personally don't see much use for that (yet).

If your worry isn't about filenames at all, and is simply about putting filenames into the source code, I think the answer is really easy: just do something like RubyGems, where you have a standard file called "dependencies" that imports all the dependencies in the right order. And then when you want to load the library, you'd just load the "dependencies" file. This doesn't require any changes to my system, since it's purely user-convention.

This still allows for putting the dependency information straight into the source code, which is useful for quickie scripts and such. But big projects and libraries would use the "dependencies" convention. And as Ruby showed, this kind of user-convention can be applied after the language is already in use. So it doesn't need to be baked in ahead of time, though there might be some minor transition pain. But I'll worry about that once libraries and projects become big enough that a "dependencies" convention becomes useful.

-----