My mistake, I said B-tree when I meant "binary tree"[1]. It has been corrected.
AVL is a particular kind of self-balancing binary search tree[2], which is what I want. Which kind of tree isn't so important, just so long as it's simple and fast enough.
---
* [1]: B-trees are a self-balancing n-ary search tree, and judging by the Wikipedia article, I probably won't use them in Nulan, at least not for a while.
* [2]: Not every binary tree is a search tree, and not every search tree is self-balancing.
I want all three properties because of the good worst-case performance and because I believe that it will be easier to parallelize if the trees are roughly balanced.
As for why I chose binary over n-ary, that's simply for the sake of simplicity. Since it'll be a core datatype, I don't want the programmer API being too complicated.
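To make "self-balancing binary search tree" concrete, here is a minimal Python sketch of an AVL insert. It is not Nulan code. It assumes an immutable (persistent) tree, which is my reading of the immutability goal rather than something stated here, and every name in it (`Node`, `make`, `balance`, `insert`, and so on) is made up for illustration.

```python
from typing import NamedTuple, Optional

class Node(NamedTuple):
    # Immutable node: "changing" the tree always builds new nodes.
    key: int
    left: Optional["Node"]
    right: Optional["Node"]
    height: int

def height(n):
    return n.height if n is not None else 0

def make(key, left, right):
    # Build a node and cache its height.
    return Node(key, left, right, 1 + max(height(left), height(right)))

def rotate_right(n):
    l = n.left
    return make(l.key, l.left, make(n.key, l.right, n.right))

def rotate_left(n):
    r = n.right
    return make(r.key, make(n.key, n.left, r.left), r.right)

def balance(n):
    # Restore the AVL invariant: subtree heights differ by at most 1.
    diff = height(n.left) - height(n.right)
    if diff > 1:
        left = n.left
        if height(left.left) < height(left.right):
            left = rotate_left(left)       # left-right case
        return rotate_right(make(n.key, left, n.right))
    if diff < -1:
        right = n.right
        if height(right.right) < height(right.left):
            right = rotate_right(right)    # right-left case
        return rotate_left(make(n.key, n.left, right))
    return n

def insert(n, key):
    # Returns a new tree; the old tree is left untouched.
    if n is None:
        return make(key, None, None)
    if key < n.key:
        return balance(make(n.key, insert(n.left, key), n.right))
    if key > n.key:
        return balance(make(n.key, n.left, insert(n.right, key)))
    return n  # key already present

def contains(n, key):
    while n is not None:
        if key == n.key:
            return True
        n = n.left if key < n.key else n.right
    return False

def to_list(n):
    # In-order traversal yields the keys in sorted order.
    return [] if n is None else to_list(n.left) + [n.key] + to_list(n.right)

# Sorted input is the worst case for a plain (unbalanced) binary search tree.
tree = None
for k in range(1, 101):
    tree = insert(tree, k)
bigger = insert(tree, 500)
```

Because nothing is modified in place, `tree` still holds exactly the keys 1..100 after `bigger` is built, and the rotations keep lookups logarithmic even though the keys arrived in sorted order.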
The key quote seems to be: "because environments are immutable, this has exactly the same shadowing effect as a nested $let!". Most interesting indeed.
Yup. Immutable data + mutable variables is quite amazing, it solved so many different problems I was having, all with a very simple and elegant system.
By the way... I mentioned a few times that my namespace system is "faster" but I didn't really explain why. The short answer is because all environments are immutable.
---
Now for the long answer...
Imagine a language like Scheme/Racket/JavaScript which uses pervasive lexical scope. A good optimizing Scheme compiler will be able to look at a function's lexical scope and determine what might change and what might not. The things that won't change could be inlined.
Now, that's great and all, but that property is completely destroyed by Kernel because of its pervasive (mutable) first-class environments. This not only makes it more difficult for humans to reason about programs, but it also means the compiler basically gives up, because it has almost no information about which names are used and which aren't. And because almost anything can change at any time, anywhere, it's much harder to do things like inlining.
So making a language with first-class vaus/environments pretty much forces you to run your code in an interpreter/JIT, because the cost of implementing a compiler probably isn't worth it (too much work for too little speed gain). It's also a large part of the reason why Lisps with vau were seen as "too slow" and why they were seen as "too powerful" because they had no fixed semantics.
---
But in Nulan, the environments are even more lexical than in Scheme. In particular, Scheme allows for forward references, but Nulan doesn't. That means that this code won't work...
$def! foo: -> (bar)
$def! bar: -> (foo)
...because the name `bar` doesn't exist in the lexical scope for the function `foo`. This means that functions can only depend on names that appear before them in the source code. I like this property already because it makes the control flow easier to understand.
This means the lexical scope of a vau never changes. Ever. So once a vau/function has been created, you know exactly what names are in its lexical scope, and in addition, if those names aren't vars, you also know exactly what their value is.
Not only that, but the dynamic scope of a function call never changes either. This means that you can actually execute/inline vaus at compile-time, just like macros, without removing their first-classedness.
And so the compiler/interpreter can use that extra knowledge to generate really fast code, because it knows precisely which names are in the lexical scope or not, and also knows precisely whether a name is a variable or not. If it's not a variable, it cannot change, thus it can always be safely inlined, and the default is to use immutability with variables being opt-in. This is like a compiler-writer's dream.
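A rough Python sketch of why immutable environments rule out forward references. This is only a model, not how Nulan is implemented: `extend`, `make_caller`, and the way a "function" just looks up one name are all assumptions made for illustration.

```python
from types import MappingProxyType

EMPTY = MappingProxyType({})

def extend(env, name, value):
    # Return a NEW read-only environment; the old one is never modified.
    return MappingProxyType({**env, name: value})

def make_caller(captured_env, name):
    # A stand-in for a function whose body refers to `name`.
    # It only ever sees the environment that existed when it was created.
    def call():
        return captured_env[name]
    return call

env0 = EMPTY
foo = make_caller(env0, "bar")      # foo is created before bar exists
env1 = extend(env0, "foo", foo)
env2 = extend(env1, "bar", 42)      # defining bar later creates a new env

try:
    foo()
    forward_ref_worked = True
except KeyError:
    forward_ref_worked = False      # bar is not in foo's captured scope

baz = make_caller(env2, "bar")      # created after bar, so the lookup succeeds
baz_result = baz()
```

Since `env2` can never change, the lookup of `bar` inside `baz` could be resolved once, when `baz` is created, and replaced by the constant. That is the inlining opportunity described above.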
---
Now, you might be sitting there thinking I'm crazy for not allowing forward references, but that's not quite right. Nulan does allow for forward references, you just have to explicitly use a var:
$var! bar
$def! foo: -> (bar)
$def! bar: -> (foo)
Now there's a (mutable) variable `bar` in `foo`'s lexical scope. Then the later definition of `bar` will mutate that variable, so there's no problem. If you don't want `bar` to change at a later date, you can reset it to a normal non-var like so:
$set! bar: bar
If this pattern is common enough, I can provide a vau that does it for you:
$prevar! bar
$def! foo: -> (bar)
$def! bar: -> (foo)
This means that Nulan can do everything other languages can do with mutation, you just need to be explicit about it. This makes mutation obvious in the language itself, which in turn makes the code easier to reason about, AND gives compilers a lot of wiggle room to make optimizations.
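The var trick can be modelled in Python roughly like this. Again, this is a sketch under assumptions and not Nulan's implementation: `Var`, `define`, `freeze`, and `lookup` are hypothetical names, and `bar` returns a string instead of calling `foo` so the example terminates.

```python
from types import MappingProxyType

class Var:
    # A mutable box: the only thing in this model that can change.
    def __init__(self):
        self.value = None

def extend(env, name, value):
    # Environments themselves stay immutable.
    return MappingProxyType({**env, name: value})

def lookup(env, name):
    v = env[name]
    return v.value if isinstance(v, Var) else v

def define(env, name, value):
    # Models `$def!`: assign through an existing var, otherwise bind a new name.
    existing = env.get(name)
    if isinstance(existing, Var):
        existing.value = value
        return env
    return extend(env, name, value)

def freeze(env, name):
    # Models `$set! bar: bar`: rebind the name to its current plain value.
    return extend(env, name, lookup(env, name))

env = MappingProxyType({})
env = extend(env, "bar", Var())                     # $var! bar
foo_env = env                                       # scope captured by foo
foo = lambda: lookup(foo_env, "bar")()              # $def! foo: -> (bar)
env = define(env, "foo", foo)
env = define(env, "bar", lambda: "bar was called")  # fills in the box
result = foo()                                      # the forward reference works

frozen = freeze(env, "bar")                         # bar is now a plain value
```

The environment captured by `foo` never changes. Only the contents of the box do, which is why the mutation stays visible and opt-in.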
---
Note: I didn't choose to use immutability because of performance, by the way. I discovered immutable environments because I was trying to solve the namespace problem and only afterwards I realized it also had the side benefit of making code potentially much faster. So this wasn't a case of premature optimization.
As for why I was thinking about immutability in the first place... I'll be honest, it was because of Clojure. I saw a video of Rich Hickey describing the concurrency model in Clojure and I was instantly hooked. Immutable data + mutable variables is sooooooo good for concurrency.
And even though Nulan isn't concurrent (yet) I figured I might as well add it in. Even in a single-threaded program, I hoped it would lead to clearer code, while also making it forwards-compatible with multi-core. It turns out my hope was right, because I discovered the immutable environment idea, which has more than paid for itself.
Yes, but it's not ready yet. The parser is pretty rock solid, but the evaluator is broken. This is largely because I'm moving from using cons to using B-trees and the transition has only just started.
Do you mean moving the vote buttons to the right of titles? Since they're images, and since they're laid out using a table, you would have to reorder columns of the table.
I'm not sure how to test it in a Hebrew encoding, but here's an idea. Can you look inside news.arc for a function called display-story, and turn something that looks like this:
(def display-story (i s user whence)
  (when (or (cansee user s) (s 'kids))
    (tr (display-item-number i)
        (td (votelinks s user whence))
        (titleline s s!url user whence))
    (tr (tag (td colspan (if i 2 1)))
        (tag (td class 'subtext)
          (hook 'itemline s user)
          (itemline s user)
          (when (in s!type 'story 'poll) (commentlink s user))
          (editlink s user)
          (when (apoll s) (addoptlink s user))
          (unless i (flaglink s user whence))
          (killlink s user whence)
          (blastlink s user whence)
          (blastlink s user whence t)
          (deletelink s user whence)))))
Into this:
(def display-story (i s user whence)
  (when (or (cansee user s) (s 'kids))
    (tr (tag (td colspan (if i 2 1)))
        (tag (td class 'subtext)
          (hook 'itemline s user)
          (itemline s user)
          (when (in s!type 'story 'poll) (commentlink s user))
          (editlink s user)
          (when (apoll s) (addoptlink s user))
          (unless i (flaglink s user whence))
          (killlink s user whence)
          (blastlink s user whence)
          (blastlink s user whence t)
          (deletelink s user whence)))
    (tr (display-item-number i)
        (td (votelinks s user whence))
        (titleline s s!url user whence))
    ))
Basically I'm swapping the two expressions that begin with tr. Can you try that and see if it works for you?
(The version I have may be slightly different from yours, so just pasting this fragment in may not work.)
Come back and tell us how it went. If it worked, a couple more callers of votelinks will need to be changed as well -- for comments and polls.
No, I think this idea works even in a traditional lisp without indent-sensitivity. The problem is that lisps have an if or cond form that interleaves expressions with very different semantics. Lots of other forms have multiple kinds of expressions, but usually the first expression is of a different type (bindings) from the rest, or something like that. cond is special because it continues to alternate semantics all the way down, and as the individual expressions grow complex it gets hairy to read. I think judicious use of colons in this manner makes if/cond more readable. Especially to non-lispers.
I deliberately chose the example in the commit message to be fully-parenthesized. It's taken from news.arc, and I was trying to show that the colons help distinguish tests and branches. Was that not helpful?
Ah, I didn't realize you disabled indent-sensitivity inside of parens, so I thought that example's indentation might have been important.
You could be onto something with ":". What do you think of these styles? ^_^
(Sorry for the huge examples.)
(if : !user
(submit-login-warning url title showtext text)
: (~and (or blank.url valid-url.url) ~blank.title)
(submit-page user url title showtext text retry*)
: (len> title title-limit*)
(submit-page user url title showtext text toolong*)
: (and blank.url blank.text)
(let dummy 34
(submit-page user url title showtext text bothblank*))
: (let site sitename.url
(or big-spamsites*.site recent-spam.site))
(msgpage user spammage*)
: (oversubmitting user ip 'story url)
(msgpage user toofast*)
(let s (create-story url process-title.title text user ip)
(story-ban-test user s ip url)
(when ignored.user (kill s 'ignored))
(submit-item user s)
(maybe-ban-ip s)
"newest"))
(if !user
(submit-login-warning url title showtext text)
: (~and (or blank.url valid-url.url) ~blank.title)
(submit-page user url title showtext text retry*)
: (len> title title-limit*)
(submit-page user url title showtext text toolong*)
: (and blank.url blank.text)
(let dummy 34
(submit-page user url title showtext text bothblank*))
: (let site sitename.url
(or big-spamsites*.site recent-spam.site))
(msgpage user spammage*)
: (oversubmitting user ip 'story url)
(msgpage user toofast*)
(let s (create-story url process-title.title text user ip)
(story-ban-test user s ip url)
(when ignored.user (kill s 'ignored))
(submit-item user s)
(maybe-ban-ip s)
"newest"))
When I was working on Penknife, I designed every "if" macro to insert parentheses for its else clause. Mainly I did this so I could put different kinds of conditionals like 'if and 'iflet together in what looked like a single branch, but I liked its effect on self-documentation, which is similar to the use of your ":":
(if !user
(submit-login-warning url title showtext text)
if (~and (or blank.url valid-url.url) ~blank.title)
(submit-page user url title showtext text retry*)
if (len> title title-limit*)
(submit-page user url title showtext text toolong*)
if (and blank.url blank.text)
(let dummy 34
(submit-page user url title showtext text bothblank*))
if (let site sitename.url
(or big-spamsites*.site recent-spam.site))
(msgpage user spammage*)
if (oversubmitting user ip 'story url)
(msgpage user toofast*)
let s (create-story url process-title.title text user ip)
(story-ban-test user s ip url)
(when ignored.user (kill s 'ignored))
(submit-item user s)
(maybe-ban-ip s)
"newest")
Nowadays I like the idea of implementing Arc's (a:b ...) ==> (a (b ...)) syntax in the reader, and that would make my Penknife design pattern unnecessary:
(if !user
(submit-login-warning url title showtext text)
:if (~and (or blank.url valid-url.url) ~blank.title)
(submit-page user url title showtext text retry*)
:if (len> title title-limit*)
(submit-page user url title showtext text toolong*)
:if (and blank.url blank.text)
(let dummy 34
(submit-page user url title showtext text bothblank*))
:if (let site sitename.url
(or big-spamsites*.site recent-spam.site))
(msgpage user spammage*)
:if (oversubmitting user ip 'story url)
(msgpage user toofast*)
:let s (create-story url process-title.title text user ip)
(story-ban-test user s ip url)
(when ignored.user (kill s 'ignored))
(submit-item user s)
(maybe-ban-ip s)
"newest")
This particular style really would conflict with your keyword arg syntax, and I don't expect it to work in unparenthesized expressions, but maybe you can get something out of it anyway.
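Here is roughly what that reader rule could look like, as a Python sketch over already-parsed lists. It is an assumption-laden toy and not Arc's or Penknife's actual reader: `expand_colons` is a made-up name, and symbols are plain strings.

```python
def expand_colons(form):
    # Rewrite (a b :c d e) into (a b (c d e)), recursively.
    # A symbol like ":if" starts a nested list that swallows
    # the rest of the enclosing list.
    if not isinstance(form, list):
        return form
    out = []
    for i, item in enumerate(form):
        if isinstance(item, str) and len(item) > 1 and item.startswith(":"):
            rest = [item[1:]] + form[i + 1:]
            out.append(expand_colons(rest))
            return out
        out.append(expand_colons(item))
    return out

flat = ["if", "a", "b", ":if", "c", "d", ":let", "s", "x", "y"]
nested = expand_colons(flat)
```

With this rule the flat chain above reads as `(if a b (if c d (let s x y)))`, which is the nesting the `:if` and `:let` lines in the example are meant to produce.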
Thanks for those examples! Yes, I didn't want to hardcode ':' for just this one cond use case I had in mind. I wanted to let people highlight the tests rather than the branches if they preferred.
One nit: I wish the first two gave branches more indent than tests; I think I'm comparing them to switch statements.
In this fragment from example 1:
(let dummy 34
(submit-page user url title showtext text bothblank*))
: (let site sitename.url
(or big-spamsites*.site recent-spam.site))
..and in this one from example 2:
(let dummy 34
(submit-page user url title showtext text bothblank*))
: (let site sitename.url
(or big-spamsites*.site recent-spam.site))
..the colon isn't separating the cases clearly enough IMO. There's a confounding vertical alignment that distracts the eye.
The discussion on readable-discuss suggested example 2, but it would require editor support to ensure indenting a line doesn't move the ':' at the start. Autoindent settings would also interfere. But they're like #ifdef's, so there's some precedent for doing things this way. Keeping the ':' in column 1 would eliminate the need for clarifying that the colon "does affect the indentation of a line.." That might make them more intuitive.
I hadn't considered the latter two ideas; they are very compelling indeed. I'm going to think about them more.
"Ah, I didn't realize you disabled indent-sensitivity inside of parens, so I thought that example's indentation might have been important."
Actually even that is irrelevant ^_^. Even before I disabled indent inside parens those examples wouldn't insert parens anywhere because the inner clauses are all fully parenthesized. Wart has never wrapped a line in parens if it already began with parens.
Basically paren insertion is extremely conservative and avoids messing with code as much as possible. Now that I've disabled it inside parens this is even more true.
The colors on this machine aren't quite right, but it shows what I care about:
a) Comments since they're never evaluated
b) Literals since they eval to themselves
c) Parens and ssyntax -- mostly as delimiters, but with backquotes distinguished
Everything else is unhighlighted. If the language does its job I really shouldn't be thinking about whether something's a macro. And local variables ought to be the default, so why add a little salience to Every Single One?
---
Wart comes with the vim settings for this highlighting: http://github.com/akkartik/wart/blob/2e01126102/vimrc.vim. It's very smart about ssyntax. The colors really indicate precedence. Notice in the second statement how some colons are colored like ssyntax, but not others. Or how the exclamation in mac! at the bottom isn't colored like ssyntax.
But after all that I don't want to make too many assumptions about how a new reader will view one's code. It needs to be visually balanced even without highlighting. Your typography rules remind me a little of early wart. See the if macro at the end of http://www.arclanguage.org/item?id=15137 -- and your comment on http://www.arclanguage.org/item?id=15140 :)
"If the language does its job I really shouldn't be thinking about whether something's a macro."
This is the same argument we had before... it's just not true. Macros/vaus behave fundamentally different from functions, they are not the same thing. By making them stand out, it gives your eyes something to grab onto.
Humans are wonderfully good at noticing patterns, but only if there's enough information there to pattern match on. If you don't provide this information in the syntax, it adds additional mental overhead.
You now have to memorize whether something is a vau or not (for common things like $let this isn't a problem, but for things less commonly used it can be a pain). The same goes for locals: if it isn't apparent in the syntax whether a variable is local or not, you have to mentally scan up the scope chain every single time you glance at code.
One of the problems with Lisp is that due to its lack of syntax, there are very few patterns that your mind can pick up on, so you have to do a full-blown mental parse of the source code just to determine whether something is a local or a vau or whatever.
It might not seem like much, but all the tiny extra mental overheads do add up. I believe that once you get used to my syntax, it's easier to read source code, because just by glancing at it your mind can notice all the little patterns.
Of course, there might be better criteria other than fn/vau/global/local/predicate/mutation... if so, I'd be interested in hearing it[1]. But I do think, whatever criteria you choose, it's important to have it be visually apparent so our poor human brains don't have to work so hard to parse our code. We are visual creatures, let's give our minds some visual feedback to chew on.
---
"And local variables ought to be the default, so why add a little salience to Every Single One?"
You forget that most functions/vaus are global. In fact, in my language, roughly 1/2 of the variables are globals, with the other 1/2 being locals. Out of those globals, roughly 1/3 are vaus, with the other 2/3 being fns.
You're right, there is a fine line between adding syntax to make the source code more readable, and adding syntax so it ends up looking like Perl. I've tried to add in syntax only when I feel there's a significant benefit from doing so.
I'll note that my language doesn't have that particular problem mentioned in that particular post because my language doesn't have quasiquote/unquote/unquote-splicing, so that example would just be `@Body`
---
By the way, I used to be really off-put by how Kernel uses `$?!` in symbols to give them special meaning, and I also really disliked how Shen has local variables start with a capital letter... but after trying it out for a while, I got used to it and found that it actually wasn't that bad after all. Now I think it's an overall net win.
---
For comparison, here's how my syntax highlighting for my language currently looks:
* [1]: In particular, I just realized that it might be better to use $ for constructs that introduce additional binding names. This might be more useful than a general vau/fn distinction.
Then again, after looking through the source code, there were only a handful of vaus that didn't introduce new bindings: and, catch, hook, if, or, and quote
So, given how most vaus apparently exist for name binding, I think it's best to just use the general vau/fn distinction.
"Macros/vaus behave fundamentally different from functions, they are not the same thing. ..it adds additional mental overhead."
Functions, macros, they're just ways to get certain behavior in the most readable way possible. Perhaps they add mental overhead in kernel because it's concerned about hygiene and such abstract matters.
Wanting to track your macros is OCD like wanting to avoid namespace pollution is OCD. Just relax, use what you need, remove what you don't need, and the function/macro distinction will fade into the background.
"Functions, macros, they're just ways to get certain behavior in the most readable way possible."
Sure. And their behavior is different: macros/vaus don't evaluate their arguments, functions do. That's because they're used for different purposes, so distinguishing between them is important and/or useful.
---
"Perhaps they add mental overhead in kernel because it's concerned about hygiene and such abstract matters."
I don't see what hygiene has to do with it... we're discussing making it easy to tell at a glance whether a particular variable is a function or a vau, that's all. That's true regardless of whether the vau is hygienic or not.
I'll also note that I have not actually programmed in Kernel, so all my talk about "mental overhead" is actually referring to Arc, which is a distinctly unhygienic language.
In any case, my gut says that making a distinction between vaus and functions is important, so that's what I'm doing.
---
"Wanting to track your macros is OCD like wanting to avoid namespace pollution is OCD. Just relax, use what you need, remove what you don't need, and the function/macro distinction will fade into the background."
I do indeed worry about namespaces, which is why my language is going to have fantastic namespace support, most likely built on top of first-class environments.
Not only does this allow people to write solid libraries that don't need to worry about collisions, but it also has the massively major benefit that you know exactly what a variable refers to, because each module can be studied in isolation. You can't do that when everything is in one namespace.
So this has the same benefits that lexical scope and referential transparency give you: you can study different subparts of the system in isolation without worrying about what another part is doing.
Incidentally, that's why dynamic scope is so bad: it's not enough to understand what a single function is doing, you also need to understand what the rest of the program is doing, because some other random part of the program might change the dynamic variable.
That's why "lexical by default, marking certain variables as dynamic" is superior to "dynamic by default": it increases locality because you don't need to jump around everywhere trying to figure out what everything does, you can just focus on one part of the system at a time.
That's the whole point of functional programming, and my language is intentionally designed as a functional language. In fact, I plan for all the built-in data types to be immutable as well, for the exact same reasons. This should also help immensely with concurrency, similar to Clojure.
"I'll also note that I have not actually programmed in Kernel, so all my talk about "mental overhead" is actually referring to Arc, which is a distinctly unhygienic language."
That is really interesting, that our respective experiences are so different.
I'm with you on "lexical by default" -- I'm not totally crazy :) But the simplest possible mechanism that provides similar advantages to namespaces is to just warn when a variable conflict is detected, that is, when a global is defined for a second time.
I'm trying hard to introspect here, and I think the difference between lexical scope and namespaces for me is that when I'm programming by myself I don't need a second namespace, but I do still find dynamic scope to be error-prone. My entire belief system stems from that, that one should program as if one was working alone. Everything that helps that is good, anything that isn't needed is chaff.
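The "just warn on a second definition" mechanism is small enough to sketch in a few lines of Python. This is only an illustration of the idea, not how Arc or Wart do it; `define_global` and `globals_table` are made-up names.

```python
import warnings

def define_global(env, name, value):
    # Bind `name` in the single shared global table,
    # warning (but not failing) if the name is already taken.
    if name in env:
        warnings.warn(f"redefining global {name!r}", stacklevel=2)
    env[name] = value

globals_table = {}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    define_global(globals_table, "foo", 1)   # first definition: silent
    define_global(globals_table, "foo", 2)   # second definition: warns
```

The later definition still wins. The warning only tells you that two pieces of code are fighting over the same name.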
I like the highlighting; the color makes the typography less jarring. But you're right, it's one of those things one should familiarize oneself with before judging.
Oh it's very simple. The operators + - * / < > <= >= are the only infix operators (for now). They have the usual precedence rules that other languages use.
How they work is, they take one expression on the left, and one expression on the right, and then wrap em in a list, so that `X + Y` becomes `(add X Y)`, and then "add" is the actual add function.
So they're just syntax sugar for common infix operations, that's all. That's why the last example passed "mul" to "sum" rather than "*".
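A small Python sketch of that rewrite, using precedence climbing over a flat token list. Only `add` and `mul` are names taken from the description above. The other function names (`sub`, `div`, `lt`, ...), the exact precedence numbers, and left associativity are assumptions, and this is not the actual parser.

```python
# Higher number binds tighter. Comparisons below + and -, which sit below * and /.
PRECEDENCE = {"<": 1, ">": 1, "<=": 1, ">=": 1,
              "+": 2, "-": 2,
              "*": 3, "/": 3}

# Only "add" and "mul" come from the description; the rest are guesses.
NAMES = {"+": "add", "-": "sub", "*": "mul", "/": "div",
         "<": "lt", ">": "gt", "<=": "lte", ">=": "gte"}

def to_prefix(tokens):
    # Turn a flat infix token list into nested prefix lists.
    pos = 0

    def parse(min_prec):
        nonlocal pos
        left = tokens[pos]
        pos += 1
        while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
            op = tokens[pos]
            pos += 1
            # Parsing the right side one level tighter makes operators left-associative.
            right = parse(PRECEDENCE[op] + 1)
            left = [NAMES[op], left, right]
        return left

    return parse(1)

simple = to_prefix(["X", "+", "Y"])
mixed = to_prefix(["a", "+", "b", "*", "c"])
chained = to_prefix(["a", "-", "b", "-", "c"])
```

So `X + Y` comes out as `(add X Y)`, `a + b * c` as `(add a (mul b c))`, and `a - b - c` as `(sub (sub a b) c)`.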
I got the sense that the unbalanced parens are just a pedagogical device (though I'm not sure they helped me understand it any better). I didn't actually see any code samples with unbalanced parens. The final factorial example still had balanced parens which would be parse errors according to his earlier slide.
Above all, I'm left with a frustration that there isn't code to play with so I can clarify my confusions on my own and not get held up by ambiguities in some presentation. This is a common complaint of mine. Why do people do this? Why are they so concerned with getting it right before they're willing to let the world see it? If you throw it out earlier, who knows, somebody might come and contribute earlier. I'm more motivated to contribute when something is half-baked. Once you've figured it all out, it hardly seems worthwhile :)
"This is a common complaint of mine. Why do people do this? Why are they so concerned with getting it right before they're willing to let the world see it? If you throw it out earlier, who knows, somebody might come and contribute earlier. I'm more motivated to contribute when something is half-baked. Once you've figured it all out, it hardly seems worthwhile :) "
This part of what you're saying could well be reasoning in favor of discussion before code. Coding is a process of developing ideas, but so is discussion. This person has let the world see their ideas in the form of a presentation, rather than obsessing over getting them "right" in the form of runnable code first.
Unless... you're not even asking for runnable code? Interesting. Are you asking for people to be comfortable enough to do public brain dumps of all their works-in-progress, regardless of how useful they expect them to be?
No I want runnable code. But isn't all code somewhat runnable? Otherwise it wouldn't be code. Almost any project is runnable within a few hours.
Your argument assumes that the presentation is less work than code, but I don't think that's true. He's clearly put hours of effort into the presentation, but there isn't enough for me to even be clear on what he's proposing. Code would be unambiguously concrete in this respect. Even if it only works some of the time, even if it has bugs, etc., I'd be able to get a sense of how it ought to work.
"Almost any project is runnable within a few hours."
"Your argument assumes that the presentation is less work than code, but I don't think that's true."
I take this a little personally, because there are many projects where I still don't even know what I want several years in. :) Well, programming is all about knowing what one wants, but I mean I don't even know these projects well enough to identify the core program I should start with. But I like to think I thrive on these ideas, because interesting big projects are the main reason I even give a second thought to little one-day projects.
Also, programming has a skill aspect to it. Unless someone's used a certain tool or technique before, it can be frustrating and intimidating. I personally find several things frustrating that others take for granted, like Emacs, Vim, manual memory management, the command line, and yes, riding a bicycle. :-p If someone's not ready to code up even a hackish language yet, I can relate.
I certainly didn't mean to make it personal. I didn't even think I was talking about you.
I vaguely sense that we're using very different meanings for words like "runnable", "less work", "program", "project" and "right". But now I'm afraid to pick at this further.
I'm not claiming there should be no discussion without code, or that people must have working code when making a proposal, or anything nearly that strong. In this case from the certainty and polish of the presentation I assumed he knows what he wants. And he's referred to code so we know it's not a pure spec. So the bottleneck seems to lie in me understanding his proposal. And I was suggesting that sharing whatever code he has might help me over that hump. Showing code can only ever help, never hurt.
"I certainly didn't mean to make it personal. I didn't even think I was talking about you. [...] now I'm afraid to pick at this further."
Oh, sorry. I'm personally invested in this topic, but I'm not offended. But come to think of it, my post was a few claims fluffed up with personal foibles in place of other justification, and thanks for not being eager to refute the acceptableness of my foibles. :-p
---
"In this case..."
I don't have much of an opinion in this particular case. I was spurred on by the "common complaint" that people don't share their code in progress, and I'm interested in what kind of overall strategy we should pursue in response.
- Social networks for code sharing (e.g. package managers, HTTP, GitHub)?
- Collaborative development of large-scale online worlds (e.g. Wikipedia)?
- Socially encouraging or discouraging people to program depending on their personality?
- Investigating what kinds of programming problems are so mathematically exotic that meaningful code is exactly the thing that's hardest to develop?
- Different laws and licenses related to sharing code?
---
"And he's referred to code so we know it's not a pure spec."
I don't remember that part. I did skip a few boring parts in the video. ^_^;;
Hmm, maybe I even consider runnable code to be relatively boring and forgettable. XD; Probably depends on whether it's a product I'm eager to use right away. ^_^