Random thought on my way home today: I get the name ~= for inequality, but it's awfully similar to Perl's =~ for regular expression matching. I don't know if that's a deal-breaker, though. My most-used languages don't actually have the =~ operator, but I do get a little "syntax envy" for it. So I think it's cool that there promises to be support for an infix =~, regardless.
(I started this post thinking that the RE-matching operator was ~=, but then I looked it up. Everything went better than expected.)
Yeah, I considered the issue of Perl[1]. I might use =~, but I'll probably not use =~ for regex match if I use ~= for inequality. Different things should look different, and all that. The simple option would be to just use match like in JavaScript. But maybe something else will present itself. Maybe just define equality between a regex and a string as a match?
Suggestion: wrap the memory allocation output of 002main.cc in some sort of debugging guard; maybe your makefile could have an ifdef DEBUG that would set some CFLAGS.
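Something like this, say (just a sketch; I haven't read 002main.cc closely, so the names here are made up):

# in the makefile: build with `make DEBUG=1` to turn the tracing back on
ifdef DEBUG
  CXXFLAGS += -DDEBUG
endif

// in 002main.cc, around the allocation printout
#ifdef DEBUG
  cerr << num_allocs << " " << num_bytes << " ";
#endif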
7083 4913331 wart> (prn "world") # this is a comment!<CR>
<CR>
<CR>
<CR>
what<CR>
7083 4913753 wart> (prn "what was that?")<CR>
020lookup.cc:26 No binding for at
Was it defined using indentation? Wart ignores indentation inside parens.
Well, then...the misadventures of rolling a parser by hand? Or maybe just a bug in the REPL, because loading doesn't seem to barf. Also,
$ ./wart
make: `wart_bin' is up to date.
7083 4912909 wart> (prn "Why two Enters, by the way?")<CR>
(prn "It's not like this sexp is evaluated.")<CR>
Why two Enters, by the way?
"Why two Enters, by the way?"
7083 4913331 wart> (prn "Then there are errors down here.")<CR>
004parenthesize.cc:36 Unbalanced )
dying
I probably don't understand the significant-whitespace deal. On that note: if indentation's going to be important, I'd take a leaf from Python's REPL and have some leader characters on line continuations the same width as the initial prompt, so everything lines up. Something like
wart> # return the index of x in seq, or nil if not found
>>>>> def pos(x seq n)
>>>>>   default n :to 0
>>>>>   if seq
>>>>>     if (car.seq = x)
>>>>>       n
>>>>>       (pos x cdr.seq n+1)
In all, feedback from my first foray: the REPL needs some work; silencing make after the first run, hiding the debugging figures (unless I make DEBUG), adding a PS2, and figuring out why the parser breaks in odd cases would go a long way toward improving the user experience. :)
It's not much feedback, but it's all I've got at the moment. On topic: I have nothing against the syntax you've chosen for your update, in principle. Did the double-character ambiguity get resolved? If so, you could forgo <- in favor of the typical = & ==. Maybe Haskell's /= is more appealing than ~=, unless you use that for destructive division. # frees up ; for some potentially interesting infix uses (invokes imagery of unquoting, but other symbols kind of have that covered...).
I used to be on a mac, yes, but I don't think that's why I use zsh :) It's just an apt-get away, and it spares me having to worry about bash kludges like "$@" and whatnot. But you're right, I should just use sh to minimize friction. It's not like it's a complex shell script.
Depending on how you parse number literals, there are the examples at the end of http://arclanguage.org/item?id=10149 which I used for stress-testing ssyntax/number highlighting.
Just in case there are any that are useful, I also used http://pastebin.com/YqxZydyw to test syntax highlighting. A lot of the tests have to do with recognizing Scheme numeric literals, though.
Some thoughts as they come to me (so forgive me if they're poorly formulated; I should be going to bed anyway):
1) I was appalled up until the "disjoint characters" part. Maybe open with that next time, so I know there's a disciplined way of telling what might be an infix operator. Don't scare me like that! :)
2) The global earmuffs are easy to dismiss. Angle brackets, I can live without (just change a->b names to use the word to). But
uppercase-char-p
=> (- uppercase (- char p)) ; probably not what you want
Ouch. Hyphens are by far my favorite separator in identifiers. Alas, infix will pretty much always ruin them.
3) operators surrounded by whitespace have lower precedence than operators that are not
So is this two levels of precedence, or as many levels as there are spaces? If I read right, it's the former: operators surrounded by any space versus those surrounded by no space. I like this idea, because it's a disciplined way of doing the http://arclanguage.org/item?id=16726 stuff. If you need more than two levels, cut your tomfoolery out and put in some parentheses!
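(If I read it right, that means

a*b + c/d
=> (+ (* a b) (/ c d))

while spacing out only one operator flips the grouping:

a * b+c
=> (* a (+ b c))

since unspaced operators bind tighter.)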
4) The whole thing smacks a lot of Haskell's infix operators: special characters only, can still use them in prefix & define your own (as long as you wrap the name in parentheses!), etc. The key difference is the precedence & associativity thing (where wart is more like Smalltalk). Is this for simplicity/generality, or are there any technical reasons to avoid precedence rules? Because Haskell's way still seems fairly disciplined, by letting users assign numbers to the level of precedence & specify associativity. And those at least give a certain level of control to today's Exacting Programmer On The Go (TM).
5) Haskell has a syntax for turning prefix operators into infix ones.
foo a b = a + b
1 `foo` 2 == foo 1 2
Thoughts?
6) Your point about range comparisons is an interesting one, because a long time ago I was mulling over anaphora's usefulness but lack of prevalence. Namely, it's undermined by t. If there's only one canonical false value, I was thinking it would pay to prefer returning a datum to indicate boolean truth, because then at least you can use your function in anaphoric contexts. But then, I was wondering how confusing it would make debugging code or pretty-printing results. Imagine plugging in a predicate and getting a non-boolean out of it, so it looks like the result of a computation more than an answer to a question.
7) I'm wary that the interpreter is getting more complex, so this might be temporary.
That's what worries me the most. Minimalism and special syntax don't often go well together (usually just to the extent that the syntax is itself minimal). At the same time, your efforts do seem a more measured (and user-definable) approach than the Readable guy's, so I've definitely got to hand it to you there.
8) I'd look forward to rocketnia's feedback. He has a knack for turning up odd edge cases. :P
9) Can we represent some ssyntax in infix?
Don't see why not, for the two-argument ones (&, :, ., and !, get notwithstanding). and and compose are fully-associative, so no problem. As long as the user expects precedence to follow the regular infix rules, nothing should be a surprise, even if it doesn't work like Arc's ssyntax does. E.g., using Arc's precedence, a:b.c and a.b:c will give compose top priority regardless:
arc> (ssexpand 'a:b.c)
(compose a b.c)
arc> (ssexpand 'a.b:c)
(compose a.b c)
The expectation in wart would (necessarily) be whitespace-then-left-associativity, so to match Arc's semantics, you'd have to use
a : b.c ; or just a:b.c, with left-associativity
=>
(compose a (b c))
and
a.b : c
=>
(compose (a b) c)
respectively. But this is more flexible, too, because you have the option to do
a . b:c ; hope you don't have dotted lists? Or just use a.b:c
=>
(a (compose b c))
10) I think it might even be a lot to ask for ssyntax and infix stuff to coexist. Ssyntax is already a finicky little sub-language of its own (in Arc); if it can be replaced with a more general mechanism, so much the better.
Could you explain, specifically, how infix operators react to the presence/absence of parentheses? Ignoring the treatment of tokenization (what with whitespace vs nonwhitespace operators), is the context-sensitive grammar roughly like so?
( infix a b ... ) --> ( infix a b ... )
( a infix b ... ) --> ( infix a b ... )
a infix b ... --> ( infix a b ) ...
where the last rule would create the behavior
( ... a b infix c ... ) --> ( ... a ( infix b c ) ... )
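so that, concretely,

( f a b + c d ) --> ( f a ( + b c ) d )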
That's as much as I can gather from the examples, but I'd like having a clear mental model.
"The key difference [to Haskell] is the precedence & associativity thing (where wart is more like Smalltalk). Is this for simplicity/generality, or are there any technical reasons to avoid precedence rules?"
Hmm, I started out from the perspective in http://sourceforge.net/p/readable/wiki/Rationale that precedence violates homoiconicity. But if it happens in the reader and macros always see real s-expressions I suppose there isn't a technical issue.
My only other response: 9 levels of precedence?! Cut your tomfoolery! :)
---
I momentarily considered Haskell's backticks, but there's a problem finding a reasonable character. And I didn't want to make the language more complex.
"If you need more than two levels, cut your tomfoolery out and put in some parentheses!"
Exactly! The discussion at http://arclanguage.org/item?id=16749 yesterday was invaluable in bringing me back (relatively) to the fold. And it was also easier to implement :o)
Thanks for all those comments! After mulling them over, I think I'll feel better if I can eliminate ssyntax in favor of infix operators. But there are two challenges to that:
I was trying to think of alternatives, thought "maybe a more complex symbol for one of the uses, like ..?", then wondered about a potential edge case. Really, I'm just thinking of the parsing algorithm---or, rather, lexing. If . was defined as the ssyntax is, would a..b expand into ((a) b)? Without spaces, it's fairly clear that certain arguments are "empty", since it could conceivably be tokenized as two .s. But a++b probably wouldn't tokenize to two +s. Suppose both . and .. were defined; how would a..b be resolved? Longest-operator-first?
f:g vs :syms
Could always go with another symbol for function composition. | comes to mind, but it's more like "reverse compose" at a Unix shell. On the other hand, as far as the function composition operator is concerned, I've seen mathematicians define it both ways---$(f \circ g)(x) = f(g(x))$ and $(f \circ g)(x) = g(f(x))$. No technical reason you couldn't use a pipe, just conventional.
Check out the details below, and give me your reactions. Is this too ugly to be worthwhile?
Excluding tests, this change reclaimed ~50 LoC. In all, this whole experiment has cost 225 LoC. I'm going to play with it for a bit before deciding whether to back it out.
---
Compromises:
1. In wart, unlike arc, you could mix unquote with ssyntax: car.,x, etc. This had to go.
2. You can no longer use ssyntax with operators: ++. used to expand to (++); now it's just a three-character op. Haskell's prefix syntax is now required to escape infix.
Details:
1. Periods can be part of operators, but dotted list syntax now uses ..., which is never an operator.
2. Period at end of symbol calls it. prn. is (prn), not (prn .)
3. Colon at start of symbol is part of the symbol. This was always true, for keyword args. It means I can't define :~ to support f:~g; it just didn't seem worth another special-case.
4. Bang at the end of a symbol is part of the symbol, for mac!, reverse!, etc.
5. Bang has another special-case. In keeping with left-associativity, prefix ops are always highest-precedence:
~odd?.n ; => (. (~ odd?) n)
However, ! at the start of a symbol is _lowest_ precedence:
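!odd?.n ; => (! (. odd? n))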
1. Periods can be part of operators, but dotted list syntax now uses ..., which is never an operator.
Seems a worthwhile trade-off. Dotted lists are used infrequently enough, and an ellipsis does just as well as a single dot.
2. Period at end of symbol calls it. prn. is (prn), not (prn .)
Hm. So this is like a vestigial instance of ssyntax?
3. Colon at start of symbol is part of the symbol. This was always true, for keyword args. It means I can't define :~ to support f:~g; it just didn't seem worth another special-case.
Yeah, wouldn't want a special case on top of a special case! :)
4. Bang at the end of a symbol is part of the symbol, for mac!, reverse!, etc.
Have you considered a non-operator character for this use, to ditch the special case? I'm partial to mac=, reverse=, etc. I mean, since = isn't used for comparison anyway. And assuming that = is actually not an operator character. Did you ever decide if you wanted = to be an infix operator (and thus character)?
5. Bang has another special-case.
Whoa. Did I miss where this infix notation extended to prefix operators? Or does this work the same way ssyntax did? And if so, in what sense has ssyntax been removed? :)
Is this too ugly to be worthwhile?
Hm...parsing is getting too complicated for my tastes. But then, my taste is for parentheses. :P
Still, carving out special cases so ssyntax still "mostly works" isn't quite what I envision as a way to unify away ssyntax. Basically, is "traditional" (inasmuch as Arc establishes tradition) ssyntax prohibitively useful? Or can we do without some of its uses in the name of a more general infix notation without the complications of special symbol parsing?
Ok, I've tried to be ruthless and take the ssyntax stuff out. '!' is now always part of regular symbols, just like '?'. There's no prefix version like !a. And there's also no infix version, like f!sym.
It turns out this doesn't actually bring back any parens. !x becomes no.x or not.x. And in some situations I can replace a!b with a'b. (Maybe that's too clever.)
I've also dropped support for turning x. into (x). Not even traditional arc has that. Now x. turns into (x .).
Only remaining special-cases: '...', and ':' at start of sym is part of the sym.
Whoa. Did I miss where this infix notation extended to prefix operators?
Good point. This happened as part of the elimination of ssyntax, but I figured it was intuitive to extend left-associativity to prefix ops. However, now I see that:
(f a + b) => (f (+ a b))
but:
(- a + b) => (+ (- a) b)
Is that weird?
Thanks for the comments! This really seems like an improvement over my original idea.
Thanks for the comments! This really seems like an improvement over my original idea.
I'm glad you think so. I try to make my suggestions as nonprescriptive as possible, though (in full disclosure) I'm liable to lead you in circles back to prefix notation if you follow them too far. :P
It was that or lose <=, etc.
Oh, duh. Move along, nothing to see here!
Only remaining special-cases: '...', and ':' at start of sym is part of the sym.
I'm really okay with ..., because it doesn't feel like a "special case" as much as it does a built-in keyword; I wouldn't expect to be able to redefine fn or if, either. I don't really have an opinion on the :keyword symbols.
(f a + b) => (f (+ a b))
but:
(- a + b) => (+ (- a) b)
Is that weird?
Maybe, maybe not. It's not like every other language doesn't do mixfix with their "infix" notation. I just wasn't sure how it worked. Do you declare that certain operators are prefix? Or are they all potentially prefix, like
( mixfix a b ... ) --> ( ( mixfix a ) b ... )
where mixfix is any operator, a and b are any normal token, and ... is 0 or more tokens? Or something like that?
x-1.0
What's the intuitive way to parse this?
I'd say as subtraction of a float: (- x 1.0). If nothing else, I can't imagine a reason to do ((- x 1) 0).
Is it worth getting right, or should we just say, "don't use floats with infix"?
My gut reaction is that it's worth getting right, because programming languages shouldn't be ambiguous.
I notice that a lot of these problems seem to come from using the dot. Thinking back about ssyntax now, it occurs to me that the dot is probably the least-used among them, in Arc. If I were to guess from my own code, I'd rank their usage descending as ~, :, !, ., &. But hey, we can put numbers to that:
Mind you, it's been a while, so I have no clue what all is in my personal ~/arc directory. Probably various experiments and output from things like sscontract (http://arclanguage.org/item?id=11179) and so on. All the same, the dot is low on the totem pole. I personally wouldn't be heartbroken to have to write (f x) instead of f.x, and you could reclaim the regular dotted list syntax. Would it be worthwhile to backtrack at this point and get !, :, and ~ functionality without worrying about .? There were some existing issues with ! and : (name collisions). ~ is prefix, but if you have a way of extending the infix notation for subtraction, surely it would apply to ~? Related thought: f ~ g could replace the f:~g you were worried about before.
Anyway, just some random thoughts off the top of my head. Do what you will with them.
Yeah, you're right that using period as both an infix op and inside floats is kinda messy. I use it a lot more than you, so I'm still going through the five stages of grief in giving it up. In the meantime I've hacked together support for floats. Basically, the reader tries to greedily scan in a float every time it encounters a sym-op boundary. Some increasingly weird examples:
Perhaps this is reasonable. We have a rule that's simple to explain, whose implications can be subtle to work out, but which programmers are expected to exercise taste in using. That could describe all of lisp.
"We have a rule that's simple to explain, whose implications can be subtle to work out, but which programmers are expected to exercise taste in using. That could describe all of lisp."
I don't think the syntax for numbers is very easy to explain. That's the weak link, IMO.
If it were me, I'd have no number literals, just a tool for translating number-like symbols into numbers. Of course, that approach would make arithmetic even less readable. :)
I think the next simplest option is to treat digits as a separate partition of characters like the partitions for infix and non-infix. Digits are sufficient to represent unsigned bigints with a simple syntax. Then most of the additional features of C's float syntax could be addressed by other operators:
-20.002e23
==>
(neg (20.@1002 * 10^23))
This hackish .@ operator, which simulates a decimal point, could be defined in Arc as follows:
(def dot-at (a b)
  (while (<= 2 b)
    (zap [/ _ 10] b))
  (+ a (- b 1)))
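E.g., tracing 20.@1002 by hand:

(dot-at 20 1002)
; b: 1002 -> 100.2 -> 10.02 -> 1.002
; => (+ 20 (- 1.002 1)) => 20.002, modulo float rounding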
You could avoid the need for this hack by treating . as a number character, but then you lose it as an operator.
"Basically, is "traditional" (inasmuch as Arc establishes tradition) ssyntax prohibitively useful?"
I don't think so. Nulan completely ditched Arc's ssyntax and only uses significant whitespace, ":" and ";". Yet, despite that, it's capable of getting rid of almost all parentheses.
Oh, by the way, ":" in Nulan has a completely different meaning from ":" in Arc. I just chose it because I thought it looked nice.
Actually, there's one major remaining question. This no longer works:
x-1.0
What's the intuitive way to parse this? Is it worth getting right, or should we just say, "don't use floats with infix"? Especially since wart recognizes whatever formats the underlying C recognizes:
-2.14e-3
It'll get complex fast to mix that with infix operators...
Did you ever decide if you wanted = to be an infix operator (and thus character)?
Yes, it's always been infix, so wart lost def= and function= when it gained infix ops. It was that or lose <=, etc. The question in my mind was more one of style: even if assignment can be infix, should I always use prefix?
Whenever I see things like the linked presentation (turns out I've seen it before), I can't help but think it's a solution in search of a problem. Simple parsing & evaluation models lead to expressive languages, and (I intuit) that's why their notations stick around despite people's efforts. Since time immemorial, projects like these have tried to take a simple, consistent notation and pile on an abominable number of special cases with line noise more offensive than just parentheses [1]. It's not only unfitting for Lisps, but it's limiting for language design. There's only so much you can do with abbreviations, but soon enough there's feature-creep. I see that in his estimation (http://sourceforge.net/p/readable/wiki/Retort/), the special cases are sparing. But really? He needs three kaleidoscoping levels of syntax, knee-jerks to using the different brace types to supply soothing Algol-isms, introduces an indentation scheme that needs clarification regarding edge-cases with traditional syntax, and still finds a way to need special characters with special rules for "advanced" features. I guess we can agree to disagree...
In the end, I think it's a mistake trying to make Lisp notation something it's not. Much better is trying to improve what you've got within the structure of the language, like Arc turning
(let ((a 1) (b 2))
  (+ a b))
into
(with (a 1 b 2)
  (+ a b))
People are drawn to the idea of finally making Lisp "accessible" (I guess fixing society's use of infix notation is harder :P) while still retaining homoiconicity. Yet most seem to forget that homoiconicity requires simplicity, not Lisp syntax. People get stuck thinking inside the Lisp box, but again, simple parsing & evaluation models lead to expressive languages. For instance, combine a stack machine model (http://docs.factorcode.org/content/article-evaluator.html) with a simple parsing algorithm (http://docs.factorcode.org/content/article-parser-algorithm....) and you get a homoiconic language. Of course, it's a postfix language, so its math is unparsable to mere mortals...
[1] I admit he has a point about existing abbreviations that I'm simply used to, like quote/quasiquote/comma/splice and dotted lists. I think where those notations have stood the test of time is that they're subtle. I can use the quote family and the code still looks like Lisp, instead of looking like Lisp trying to look like C.
shader, fallintothis, will you email me? Email in profile. I'm going to post my solution to infix :) in a few days or weeks, and I would like to ping y'all to make sure I can get your criticisms.
(I'm not sure if you come here often or go periods without visiting.)
I think some moderate use of parentheses is okay, but I don't really like using them everywhere. Though, I wouldn't really mind an Arc dialect that had good syntax for functions and a Nulan-style type system, even if it did use parens.
Mildly better, but it still suffered from the lisp problem that arithmetic becomes hard to read.
Meh. The prefix version reads fine, to me. I think infix notation is too much of a sacred cow ("but math!"). People bend over backwards to try to selectively write infix notation in both prefix & postfix languages, but not much ever seems to get saved in the process. Elegant parsing algorithms are typically general, so why not stick with them? Prefix math isn't that bad.
At the same time, I'm not sure that perform is general enough beyond this problem. Wouldn't it be more general to have syntax for infix expressions (prefix be damned)? You could use a macro for rudimentary infix parsing, e.g.,
(= priority* (obj + 1 - 1 * 2 / 2 expt 3))

(def operand (tok) (~priority* tok))

(def priority>= (a b)
  (>= (priority* a) (priority* b)))

(mac infix toks
  (let (ops prefix) nil
    (each tok toks
      (if (operand tok)
          (push tok prefix)
          (do (while (and ops (priority>= (car ops) tok))
                (push (list (pop ops) (pop (cdr prefix)) (pop prefix))
                      prefix))
              (push tok ops))))
    (while ops
      (push (list (pop ops) (pop (cdr prefix)) (pop prefix))
            prefix))
    (pop prefix)))
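E.g., given the priority table above (and Arc's macex1):

arc> (macex1 '(infix 1 + 2 * 3))
(+ 1 (* 2 3))
arc> (infix 1 + 2 * 3)
7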
(def zero (n) (is n 0))

(def sq (n) (* n n))

(def fib-iter (a b p q n)
  (if (zero n) b
      (even n) (fib-iter a
                         b
                         (infix sq.p + sq.q)
                         (infix sq.q + 2 * p * q)
                         (/ n 2))
      (fib-iter (infix b * q + a * q + a * p)
                (infix b * p + a * q)
                p
                q
                (- n 1))))
But in the limit, you'd still want/need something built into the parser to deal with things like parens, whitespace, infix/prefix operator mixing, etc.
What does leap out at me, though, is the if where each clause is a function repeatedly applied to the same argument. I know I've seen versions around on the forum before with different names, but it's essentially just a variant of case:
(mac where (expr . args)
  (withs (var (uniq)
          ex (afn (args)
               (if (no (cdr args))
                   (car args)
                   `(if (,(car args) ,var)
                        ,(cadr args)
                        ,(self (cddr args))))))
    `(let ,var ,expr ,(ex args))))
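which, if I trace it right, expands along these lines (gensym name illustrative):

(where n zero b even c d)
==>
(let gs1 n
  (if (zero gs1) b
      (if (even gs1) c d)))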
(def fib-iter (a b p q n)
  (where n
    zero b
    even (fib-iter a
                   b
                   (infix sq.p + sq.q)
                   (infix sq.q + 2 * p * q)
                   (/ n 2))
    (fib-iter (infix b * q + a * q + a * p)
              (infix b * p + a * q)
              p
              q
              (- n 1))))
Beyond that, I'm not sure there's much more to simplify. I mean, the function already takes 5 variables---there's only so much you can do.
</ Really, I just wanted an excuse to write an infix-to-prefix macro :) >
:) Yeah, that's a nice example where infix-with-precedence looks nicer than curly infix (without precedence). http://sourceforge.net/p/readable/wiki/Rationale points out that precedence breaks homoiconicity. And without precedence you end up with a lot more redundant whitespace and tokens to do nested infix. Hmm, perhaps if the infix tokens are limited (Nulan's idea http://arclanguage.org/item?id=16487), violating homoiconicity for just them may be more reasonable. What if user code can extend and locally override the infix ops, but not change their relative precedence? I'm going to look for more examples from numerical methods and graphics.
Also, I think you underestimate how generally useful perform can be :) What if I called it poly for polynomial? I'll keep an eye out for more examples. Adding precedence doesn't change how uniform complex expressions have to look if you can't group them. It isn't as big a deal here, but I think more complex expressions with the infix macro would get less readable. What I really want is to use juxtaposition to indicate higher precedence:
(+ b.p a.q a.p)
---
Whoa, whacky idea alert. What if we use whitespace to indicate operator precedence?
(fib-iter {b * q   +   a * q   +   a * p}
          {b * p   +   a * q}
          p
          q
          {n - 1})
Here the + operators have three spaces rather than one around them, so are at a lower precedence and get translated into (+ (* b q) (+ (* a q) (* a p))). Assume that operators at the same precedence are always applied right-associatively.
Whoa, whacky idea alert. What if we use whitespace to indicate operator precedence?
Ha! File that under "never would've thought of it". Still think prefix does just fine, but I like that idea for its quirkiness.
Also, I think you underestimate how generally useful perform can be :)
Superficially contemplating perform (namely the fact that it's a macro) made me think of one thing I wish I had while writing infix. pop is a macro, so I couldn't pass it to map, proper. What I needed was a macro to map macros:
; Goofy-looking name, but bear with me
(mac macmap (m . exprs)
`(list ,@(map [list m _] exprs)))
(push (macmap pop ops (cdr prefix) prefix) prefix)
Which got me thinking: why did perform need to be a macro again? Well, obviously
(sum [apply * _] (list (q q) (2 p q)))
tries to eval (q q) and (2 p q), and writing (list q q) defeats the purpose of brevity. (Side note: in vanilla Arc, you couldn't use q.q, because ssyntax isn't expanded before macros.)
Then I actually double-checked your original definition of perform and realized (macmap pop ...) = (perform list pop ...). So I've found my own example! But I think we might could do one better.
What if I called it poly for polynomial?
poly is too specific, if this operator hopes to be general. The operation itself smacks more of distributivity---almost like distributing an operator over cons. Emphasis on "an" operator, not a pair of them (lest it read like "distribute addition over multiplication"). But because (macmap ...) = (perform list ...), maybe it's just a better name for macmap:
(mac distribute (op . exprs)
  `(list ,@(map [list op _] exprs)))

; Thus
(distribute pop ops (cdr prefix) prefix)
If we were to make perform a single-operator macro, a similarity I should've noticed earlier appears: it's almost the same as macmap, except
(mac distribute (op . exprs)
  `(list ,@(map [cons op _] exprs))) ; cons vs list!

; Thus
(apply + (distribute * (q q) (2 p q)))
Not that the above is the best rewrite of the original, but if it were...is there a way to reconcile the cons vs list difference, whether by different names or by merging it into a super-distribute?
I don't have much to contribute in response except tangents.
Well, birds of a feather...
1. Still, way too clever for my taste
Oh, certainly. It was a result of me golfing an earlier version. :)
6. Can I prevail on you to switch from zero to zero??
In a heartbeat. I wish Arc used question marks (judiciously, mind you; don't need names like Scheme's char<?). But it doesn't, so I kowtow to its conventions.
I've doodled this idea independently a couple of times, but I see a few issues:
1. Line breaks throw a monkey wrench in the works (though it might be reasonable to prohibit line breaks within { ... }).
2. In many contexts, it can be tough to count spaces. :)
3. Sometimes it's nice to tabulate similar lines of code using extra spaces, and in those cases this feature would be a minor annoyance, since it would force some subexpressions to bail out to parentheses.
// Horizontal layout interferes:
var wargle        =  2 * x + 0.03 * y + 200;
var tricennarious = 13 * x +  0.4 * y +  50;

// Fix:
var wargle        =  2 * x + (0.03 * y) + 200;
var tricennarious = 13 * x + ( 0.4 * y) +  50;
Adding tabulation in addition to whitespace precedence would end up making expressions even more loose and baggy than otherwise. The writer would have to count spaces, but it's fairly clean when reading.
---
Thanks for the pointers! I'm going to go look at those now.
Of course, because Arc uses linked lists, this is actually considerably slower than linear iteration through the list, because every list access is itself O(n).
True that. ;)
An interesting exercise nonetheless.
To this end, here are a few Arc tricks.
- Instead of
(is (len stack) 1)
you can use
(single stack)
which is defined as
(def single (x) (and (acons x) (no (cdr x))))
I.e., a linked list has length 1 if it doesn't have a cdr. This makes (single stack) read nicely, but we already know stack is a linked list, so we could also just say
(~cdr stack) ; == (no:cdr stack) == (no (cdr stack))
- Instead of
(is needle (car stack))
you could say
(caris stack needle)
which reads a little nicer, though the definition once more has an acons check:
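(def caris (x val)
  (and (acons x) (is (car x) val)))

(That's its definition in arc.arc.) Applying both tricks to bsearch: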
(def bsearch (stack needle (o f <) (o offset 0))
  (if (single stack)
      (and (caris stack needle)
           offset)
      (let (fst lst) (split stack (round (/ (len stack) 2)))
        (if (f (last fst) needle)
            (bsearch lst needle f (+ offset (len fst)))
            (bsearch fst needle f offset)))))
- Since "offset is used internally to keep where we are in the whole list", we could make the parameter internal to an anonymous function, then call that function on the "external" arguments. Using afn (anaphoric fn; http://en.wikipedia.org/wiki/Anaphora_%28linguistics%29), we can recurse in the anonymous function using the name self. This winds up being simple, but kind of ugly.
We wind up needing to supply the initial arguments to afn all the way at the bottom, wrapping the whole thing in another set of parentheses for the call. This prompted utilities like http://arclanguage.org/item?id=10055.
> This makes `(single stack)` read nicely, but we already know stack is a linked list, so we could also just say
(~cdr stack) ; == (no:cdr stack) == (no (cdr stack))
I wouldn't want to go to this optimization until finding out that it would significantly increase performance, but that argument hinges on thinking it's slightly less readable than `(single stack)`.
Either of these is neater than using `and`, which for clarity is better used only as a predicate, in my opinion.
Thanks fallintothis, zck, and shader for your improvements. I'm trying to take everyone's suggestions into account to make a new version, which I'll put in Anarki when I'm satisfied with it.
Can anybody think of any situations where this overloading would cause problems?
Admittedly, I can't think of any, but the scope of the change makes me uncomfortable. For any function f, (sref f v k) turns into (f k v)? Seems heavy-handed.
An alternative is to use defset to special-case on prototypes, but that relies on knowing the name of your prototype object. So, you'd have to make the macro explicitly assign to a name and defset that, which is kind of ugly:
(mac defproto (name parent . attrs)
(w/uniq lookup
`(let ,lookup (obj ,@attrs)
(= ,name (fn (attr . args)
(if args
(= (,lookup attr) (car args))
(or (,lookup attr)
((only .attr) ,parent)))))
(defset ,name (attr)
(w/uniq g
(list (list g attr)
`(,',name ,g)
`(fn (val) (,',name ,g val)))))
,name)))
arc> (defproto a nil x "foo")
#<procedure: a>
arc> (defproto b a y "bar")
#<procedure: b>
arc> (defproto c b z "qux")
#<procedure: c>
arc> (map [list _!x _!y _!z] (list a b c))
(("foo" nil nil) ("foo" "bar" nil) ("foo" "bar" "qux"))
arc> (= b!x "changed")
"changed"
arc> (list a!x b!x c!x)
("foo" "changed" "changed")
By separating out the setter logic, you could address the (a 'x 10) issue:
(mac defproto (name parent . attrs)
  (w/uniq (lookup attr-setter)
    `(let ,lookup (obj ,@attrs)
       (= ,name
          (fn (attr (o default))
            (or (,lookup attr default)
                ((only [_ attr default]) ,parent))))
       (= ,attr-setter
          (fn (attr val)
            (= (,lookup attr) val)))
       (defset ,name (attr)
         (w/uniq g
           (list (list g attr)
                 `(,',name ,g)
                 `(fn (val) (,',attr-setter ,g val)))))
       ,name)))
arc> (defproto a nil x 5)
#<procedure:zz>
arc> (a 'x)
5
arc> (a 'x 10)
5
arc> (a 'y)
nil
arc> (a 'y 10)
10
arc> (a 'y)
nil
I'm not sure how to address the issue of using fns as prototypes. As Arc's type system stands, there may not be a much better way.
Since ac crawls over the s-expression transforming everything into Scheme, whenever it sees a fn (which is what creates lexical bindings), it keeps track of what variables are being introduced in that scope with the env argument. E.g., see the last line of the function
(define (ac-complex-fn args body env)
  (let* ((ra (ar-gensym))
         (z (ac-complex-args args env ra #t)))
    `(lambda ,ra
       (let* ,z
         ,@(ac-body* body (append (ac-complex-getargs z) env))))))
So it's entirely possible to compile things as function calls (instead of macro calls) when a variable's lexically bound. But this relies on how macros can't be lexically bound in Arc (no macrolet equivalent). On the other hand, first-class macros might be defined at runtime, and the compiler can't figure out whether something that looks like a function call is actually a macro call, hence forcing interpretation: http://arclanguage.org/item?id=11517.
It is theoretically possible to do a form of JIT compilation, where you compile the normal function (presuming function calls), but keep the source around and dynamically compile and link in branches of code for when the argument is a macro.
Basically, you'd have a jump table based on the argument. If it was a function, you'd jump to the function version. If it was a macro, you'd look it up in a hash table to see if it had been used before. If not, compile the function again and add the new one to the hash table.
Obviously, this wouldn't be as fast as a normal function call, since you'd have to do runtime checking of one of the arguments, but if you were careful the common case could be quite fast, with only the macro calls being slow and only new macro calls taking more than a small amount of time.
Of course, this is only one possible implementation. It is true that any language that supports macros could not be entirely statically compiled, but then again as far as I know any higher-order functional language requires runtime compilation as well. It partly depends on how you define "compilation."
It is true that any language that supports macros could not be entirely statically compiled
Actually, in a language with no side effects, I don't think there's anything keeping run-time-level code from being used at compile time, as long as there's already enough information to compile it. Who would ever know? ;) This means you keep the most unique benefit of macros, that of being able to program them in the same language and environment as the rest of the program.
This is the basis behind Blade, an extensible statically compiled language I was making before I realized the lack of side effects made it really hard for me to write code for. :-p
> before I realized the lack of side effects made it really hard for me to write code for. :-p
So that's why you stopped working on Blade. I was curious! ^_^ Can you speak more about your realization that side effect-free programming is impractical? I've sometimes found myself tempted by pure FP, you see.
[PRO TIP: Click "link" (or "parent") on this comment, so you can view it (mostly) by itself rather than squashed against the right side of the page.]
For me, when I was writing my spider solitaire program, I had a couple of global variables, the-deck and the-piles, which represented the state of the game. the-deck was a list of randomly permuted cards, which were integers from 0 to 51 (or 0-25 with two suits, or 0-12 with one suit). the-piles was a list of piles, which were lists of cards (the zeroth element was the top of the pile), and the cards were of the form (list card revealed?), where "card" = 0-51 and "revealed" = t/nil. (Yes, a card is an integer in the deck, but a list of an integer and t/nil on the board. This does not need to be changed.)
To flip over the top card of the nth pile, I went:
(= (cadar the-piles.n) t)
To deal a card (face up) to the top of each pile from the deck, I said:
(forlen i the-piles
  (push (list pop.the-deck t) the-piles.i))
To move a chain of n cards from (the top of) one pile to another, I went:
(let (a b) (split the-piles.i n)
  (= the-piles.i b
     the-piles.j (join a the-piles.j)))
Also, I created a global variable called "prev-states", which got a backup of the game state pushed onto it after every move; undoing a move went:
(let (a b) pop.prev-states
  (= the-deck a the-piles b))
Now, what would you do in a "pure functional programming" language? If there were no assignment whatsoever, not even to global variables, I'd probably effectively implement global state by making every function take an extra argument representing global state, and shifting code around in other ways. If you could assign to global variables but all data structures were immutable (I think Clojure is like that, correct me if I'm wrong), then I'd have to keep rebinding the entire "the-piles" list when all I wanted to do was alter one or two piles. Revealing the top card of the nth pile would look something like this:
(= the-piles
   (join (take (- n 1) the-piles)
         (list:cons (list (car the-piles.n) t) (cdr the-piles.n))
         (drop n the-piles)))
(Note that "push" and "pop" are legal because they just do global assignment; they don't alter any data structures). And then moving a chain of n cards from i to j would look like this:
(= the-piles
   ; ... oh god I have to know whether i>j
   ;(join (take (dec:min i j) the-piles)
   ; ... egad I may as well just say (if (< i j) ...
   ;(if (< i j)
   ;    (join (take dec.i the-piles)
   ;          (list:drop n the-piles.i)
   ;          (cut the-piles i dec.j)
   ;          (list:join (take n the-piles.i) the-piles.j)
   ;          (drop j the-piles))
   ;    (join ... egad the repetition is intolerable
   ; hmm, this'll work
   (with (new-i (list:drop n the-piles.i)
          new-j (list:join (take n the-piles.i) the-piles.j))
     (if (< i j)
         (join (take dec.i the-piles)
               new-i
               (cut the-piles i dec.j)
               new-j
               (drop j the-piles))
         (join (take dec.j the-piles)
               new-j
               (cut the-piles j dec.i)
               new-i
               (drop i the-piles)))))
On the plus side, I would no longer need to use "deepcopy" when creating a backup of the game state.
(push (list the-deck the-piles) prev-states)
Which is pretty much the only benefit I'd get from outlawing modification of data structures, which I assume is what "pure functional programming" would mean. Don't even talk to me about how this would look if I had to simulate global variables...
Functional programming is a useful tactic in some cases. For example, I use a "chain-length" function, which might, say, take the list (7♥ 8♥ 9♣) and return 2, the number of revealed cards of consecutive rank of the same suit. I first thought about making it take "i" as an argument, referring to the ith pile in the-piles, but I made it take the list "xs" instead, and that allowed me to use it for other purposes: e.g. determining whether a move of n cards from pile i to j will create a longer single-suited chain. (Like, moving that 7♥ 8♥ chain onto a 9♥. I have a couple of AI-assistance procedures that look for moves satisfying conditions like that.) The expression is:
(is (+ n chain-length:the-piles.j)
    (chain-length:join (take n the-piles.i) the-piles.j))
Had I written "chain-length" to refer just to the ith pile in the global "the-piles" variable, I'd... basically have to rewrite most of it. So factoring out things into individual functions (which, btw, is, I think, in the category of things called "functional programming", but isn't usually the same as making things side-effect-free) is frequently a good thing--it makes the program easier to write, to read, to understand, and to maintain.
But the goal, the reason you'd want to do it, is not because it's called "functional programming"--it's to make the program easier to write/read/understand/maintain. And if something called "functional programming" makes it harder to do those things, then I would advocate throwing it out. I think preventing all side effects, or all side effects except those on global variables, falls firmly into the latter category; I think my example demonstrates this.
; Move a chain of n cards from i to j.
(zap [let (taken leftovers) (split _.i n)
       (copy _ i leftovers j (join taken _.j))]
     the-piles)

; Reveal the top card of the nth pile.
(def copdate (orig . kvs)
  (apply copy orig (mappend [list _.0 (_.1:orig _.0)] pair.kvs)))

(zap [copdate _ n [cons (list _.0 t) cdr._]] the-piles)
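If I've traced copdate right, a quick example:

arc> (copdate '(1 2 3) 0 [* _ 10])
(10 2 3)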
In principle I agree: disabling the programmer's ability to use mutation without reason is just frustrating. But I don't expect an example to help show this; most examples we could come up with would probably have easy-in-hindsight solutions like this. I think the point is that mutation gives us more easy ways to do things, so that we can quickly experiment and stuff.
I see you want to push the example to its limit a bit. ^_^ Well, I assume this hypothetical language would be stocked with utilities that made up for what its pureness missed out on. How believable are these?
That's definitely what I meant, and I was going to just say so, but when I saw the ":)" I realized akkartik probably knew that and meant something like "that may not be 'oh god' complicated, but it's still not nearly as convenient as the mutating code." I don't know if that's a correct interpretation, but it made it more interesting to respond. ^_^
Yeah, I like the way you handle the-deck and the-piles. And I like this style of programming in general (i.e. a few global variables to keep state, functions corresponding to verbs in the problem space that manipulate the globals in an intuitive way). You've reminded me of how preventing all side effects could get awkward by ruling out this style of programming entirely.
To clarify, I'm not mostly interested in eliminating side effects because of some obsession with purity, but rather for performance's sake. I've been pursuing so many lang design choices like fexprs/first-class macros that are supposed to really cripple performance that I'm starting to worry my hobby implementations aren't even going to run. I'd heard that Haskell and Clojure get a big performance boost from enforced immutability/no side effects because it guarantees that most operations can safely be run concurrently, so I've just considered it an option.
I'm a real novice when it comes to optimization, and I'm certainly interested in techniques that could help me get decent performance without having to sacrifice mutability. TCO, strategic memoization, compiler hints? Any tips on this front could be really useful to me.
Aww, you know I can't resist talking about my hobby horses. XD
Blade is written in Groovy right now, and to make partial evaluation possible, everything is done in continuation-passing style. Partial calculations' continuations are held in stasis until the definitions they're waiting for become available. I guess it's related to cooperative multithreading and single-assignment variables, but I'm kinda playing it by ear.
Back when my work on Blade was petering off, the main reason was that Groovy's syntax wasn't very friendly for continuation-passing style. I really missed macros, and I wasn't in the mood for writing a complicated Groovy AST transformation. I wanted to just port it to Arc or something, but I had trouble writing Arc code itself without getting hung up in all the abstraction leaks (particularly unhygienic macros) and absent features (particularly OO-style type-based dispatch and weak references).
Meanwhile, thanks to ranting about Blade here, I realized it wouldn't be a good REPL language. Single-assignment has gotta paint people into corners at a REPL, and the alternative would be to compile the whole application over and over, using some meta-runtime to orchestrate the compiles. But what runtime?
Also, I began to suspect Blade's design itself would result in something ugly to use and possibly even unusable. A declaration's behavior would depend on definitions, and a definition's value would be what the declarations said it was, and naturally there'd be deadlocks. The deadlocks were easy to avoid in toy examples, but I didn't know what limitations and gotchas Blade would develop once it matured. (I still wonder about that.)
So a plan fell into place: I could pursue a language that had similar high hopes for extensibility as Blade, but in a REPL-friendly form. That language's evaluation model would be much more familiar to me, and it would be closer to any of the languages I might choose to implement it in, so I'd probably be able to implement it faster and more confidently. Then I could use the resulting language--and the lessons I learned from it--to experiment with different variations on CPS and finish Blade.
So I started working on that language, and I called it Penknife. ^^
Altogether, I don't think this was based on any bad experience with pure functional programming. It was more about my expectation of those bad experiences, together with the frustration of writing CPS code in longhand Groovy.
> Meanwhile, thanks to ranting about Blade here, I realized it wouldn't be a good REPL language. Single-assignment has gotta paint people into corners at a REPL,
> So a plan fell into place: I could pursue a language that had similar high hopes for extensibility as Blade, but in a REPL-friendly form.
So the REPL was a large part of it. I've heard too that single-assignment is problematic for REPLs, but I don't quite understand why. Can't you just use let everywhere you'd use = and def ordinarily?
; Scribbling at the REPL
arc> (= x 5)
5
arc> (def add1 (n) (+ n 1))
#<procedure: add1>
arc> (add1 x)
6
; Single-assignment version
arc> (let x 5)
nil
arc> (let add1 [+ _ 1])
nil
arc> (let x 5
       (let add1 [+ _ 1]
         (add1 x)))
6
REPLs should provide a convenient way to grab commands from the history anyway, so you can just keep building on your previous let. It is more cumbersome, but you also avoid some of the nasty surprises that come after you've accumulated a complex state in your REPL.
Could this style of REPL'ing address the problems posed to REPLs by single-assignment, or are there larger problems I'm ignoring?
Update: In the particular REPL example I've shown, there's no need for the style transformation since nothing gets redefined. Hopefully you can still gather what I mean from it! ^_^
Can't you just use let everywhere you'd use = and def ordinarily?
Well, single-assignment can have variables that are introduced without values and then assigned to later. It's a bit of a relaxation of pure FP that (I think) doesn't sacrifice referential transparency. Hmm... actually, a lambda that assigns to a binding outside itself is not referentially transparent, so oh well. :-p
In any case, Blade isn't directly supposed to be a single-assignment or pure sort of language. I just want to view the space of Blade declarations as a completely orderless set, letting the compilation of each one run in parallel semantically, and I still want it to have an intuitive kind of determinism. Inasmuch as I can orderlessly orchestrate side effects in ways I find bearably intuitive, I'll probably add side effects to Blade.
Blade side effects would also be limited by the fact that I want to build up to an IDE that rewinds the compile when you edit. In fact, I'm almost certain my current plan won't work well with that. I expect custom declaration syntaxes to be supported by core-syntax declarations that happen to branch based on the declarations they find by reading the project tree, but then editing any part of a file would invalidate the branching itself and cause recompilation of all the custom-syntax declarations in that file. Seems there'll need to be a more edit-resistant kind of branching. I already expect a bit of IDE cooperation with this--it doesn't have to be as hackish as diffing plain text--but that's where my ideas leave off.
...Actually, I've got a bit of an idea already. XD A declaration that observes a file's contents in order to branch can limit its observation to some method that doesn't rely on the contents of the individual declarations, and then each of the branches can observe just its own part of the file. The core syntax already works like this, essentially splitting the file based on appearances of "\n\n[".
Anyway, you've gotten me thinking and ranting. ^_^
---
Could this style of REPL'ing address the problems posed to REPLs by single-assignment, or are there larger problems I'm ignoring?
The kind of problem I expect to happen is that I'll define something once, I'll define something based on that, and then I'll realize the first thing had a bug, but I won't be able to redefine it as simply as I can in Arc. Once I do enter the correct version, I'll have to paste in all the things that depend on it too.
To draw a connection, it's similar to how when you want to redefine a macro in Arc, you have to re-enter all the code that used the macro in order for it to use the new one, whereas fexprs wouldn't have this problem.