Now it's my turn to comment back after a while, ha. Thanks for the replies!
I don't have much to add. Your explanation of the type system, such as it is, makes more sense now. Could sit here splitting hairs all day over whether to call it "weakly typed" or "untyped" or whatever other term, but it makes sense at the end of the day when you put it in terms of the (val 0xXX 0xXX) stuff.
Am I right in basically reading that like (val [value] [tag])? And then, the idea is that all operators (not just +) operate on the "value" part of these "value x tag" tuples? Destructive operators preserve the tag, because they're also just operating on the "value" part, but of the same structure. Whereas nondestructive operators need to create a separate (val ...) structure, and don't set the tag. So would
{'b' 'a' +}
evaluate to 0xc3, since that's 0x61 + 0x62 and the type tag is unpreserved, despite being (presumably) the same (since they're both strings)?
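To make my reading of the (val [value] [tag]) idea concrete, here is a minimal Python sketch. The names Val, add_destructive, and add_nondestructive are made up for illustration, and this models only the interpretation asked about above, not how Bipedal is actually implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Val:
    value: int
    tag: Optional[str] = None  # hypothetical: None means "no tag set"


def add_destructive(a: Val, b: Val) -> Val:
    # Overwrites the value slot of `a` in place; the tag is never touched.
    a.value += b.value
    return a


def add_nondestructive(a: Val, b: Val) -> Val:
    # Builds a fresh Val and does not set its tag.
    return Val(a.value + b.value)


a = Val(ord("a"), "string")  # 0x61
b = Val(ord("b"), "string")  # 0x62
result = add_nondestructive(b, a)
print(hex(result.value), result.tag)  # 0xc3 None
```

Under that reading, the sum is 0xc3 and the result carries no tag, even though both operands were tagged as strings.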
Sorry to ask all these questions that would probably be better answered by me playing with the language myself or looking at the docs. It's an easy---but surely annoying---way to make conversation. Feel free to "RTFM" me. :)
P.S.:
"cloud" computing (scare-quotes because cloud is a particularly obtuse buzzword, IMO).
Ugh, that's awful. I want to print out error messages that, in part, can be copied and pasted into the REPL. Right now, since double quotes are escaped, I'm printing single quotes, which obviously don't work. (https://bitbucket.org/zck/unit-test.arc/issue/40/make-error-...)
I guess this is a reason to switch to Anarki. But even that doesn't help too much given that I want unit-test.arc to work with arc3.1.
To play devil's advocate (and/or nihilist), why do you need that? Error messages in Arc are already universally this way, so it's expected behavior that the messages look crappy. :P
On a more constructive note, is there some part of your code (or could there be, with some rewriting) where you could specifically catch the exception whose message you want to display nicely? Seems like you could probably work out some way to wrap something like:
arc> (on-err (fn (c)
(pr "Error: ")
(prn (details c)))
(fn () (err "a \"b\" c")))
Error: a "b" c
It's a bit of a hack, but if you need to subvert the language's defaults in an implementation-conforming way, them's the breaks.
Side note: paging through your source code, I notice you have a few functions to support to-readable-string. I believe it could be greatly simplified, because write's job is already to print out values in ways that read can parse back in. Thus, save for the single-quote stuff, I think you could boil it all down to tostring:write. Play around with it and see if it's what you want:
arc> (tostring:write (list 'a "b" (obj c 'd)))
"(a \"b\" #hash((c . d)))"
> To play devil's advocate (and/or nihilist), why do you need that? Error messages in Arc are already universally this way, so it's expected behavior that the messages look crappy. :P
Because I'm not satisfied with crappy behavior. It makes readability worse, and debugging harder.
I'll take a look to see if I can catch it better; reading over my code, some of it does seem to be convoluted, if working. I've never quite been happy with how the failure messages are calculated; if that changes, I can easily change how we handle printing the errors.
Other than the double-quote issue, I don't recall other big reasons to use to-readable-string. I'd certainly be a fan of removing code, if I can make it work. It would bring my macro:function ratio below 1:1 again, which would be awesome. :)
The reason to use 'write rather than 'display is precisely so that things can be pasted into the repl. All you should have to do is 'read and 'eval the entire error string.
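As a loose analogy only (this is Python, not Arc), repr plays roughly the role of 'write and str the role of 'display. The sketch below assumes nothing beyond the standard library and just shows the round trip that makes the "paste it back into the REPL" idea work.

```python
import ast

s = 'a "b" c'
print(str(s))   # a "b" c      (display-like: raw text)
print(repr(s))  # 'a "b" c'    (write-like: a readable literal)

# A write-like form can be read back to an equal value.
data = ["a", 'b "c"', {"d": 1}]
round_tripped = ast.literal_eval(repr(data))
print(round_tripped == data)  # True
```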
But there's definitely something busted about this. In anarki:
arc> (= x "abc")
arc> (err x)
Error: "abc" ; ok, looks good
arc> (= x 'abc)
arc> (err x)
Error: "error: abc" ; whaa..
There's also a long-standing issue that's bothered me before:
arc> (= x "abc")
arc> (err x " shouldn't be a string")
Error: "abc \" shouldn't be a string\""
You're right, that's pretty ghastly. It should print:
Error: "abc" shouldn't be a string
or:
Error: "\"abc\" shouldn't be a string"
I'm still mulling what the culprit is here, but I don't think it's write vs display.
The culprit is that 'err is defined to be Racket's 'error. It looks like every single use case of 'error is discouraged for one reason or another in the Racket reference:
- (error sym) creates a message string by concatenating "error: " with the string form of sym. Use this form sparingly.
- (error msg v ...) creates a message string by concatenating msg with string versions of the vs (as produced by the current error value conversion handler; see error-value->string-handler). A space is inserted before each v. Use this form sparingly, because it does not conform well to Racket’s error message conventions; consider raise-arguments-error, instead.
- (error src frmat v ...) creates a message string equivalent to the string created by
(format (string-append "~s: " frmat) src v ...)
When possible, use functions such as raise-argument-error, instead, which construct messages that follow Racket’s error message conventions.
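Those first two rules are enough to reproduce the odd output above. Here is a rough Python model of them; write_form, error_message, and error_message_sym are hypothetical names, and the assumption that strings are converted in their written (quoted) form is mine, based on the behavior seen in the transcripts.

```python
def write_form(v):
    # Approximates a written (readable) string; other values fall back to str().
    if isinstance(v, str):
        return '"' + v.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return str(v)


def error_message(msg, *vs):
    # Models (error msg v ...): msg, then a space and the string form of each v.
    return msg + "".join(" " + write_form(v) for v in vs)


def error_message_sym(sym):
    # Models (error sym): "error: " followed by the symbol's name.
    return "error: " + sym


print(error_message("abc", " shouldn't be a string"))
# abc " shouldn't be a string"
print(error_message_sym("abc"))
# error: abc
```

Writing the first message out once more as a string literal gives "abc \" shouldn't be a string\"", which matches the ghastly REPL output.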
Er, I knew it was weird for me to say "'err is defined to be Racket's 'error," but I just realized, that factoid was in the original post of this thread. :-p
Then of course, some languages only have two distinct boolean values, and it's an error to ask the truth value of anything else (think Java).
I dunno. When using Python, I lament that the overloading is too inexplicit. But languages with exact boolean types feel too restrictive. (Such is the curse of using multiple languages: while programming in X I want to selectively cherry-pick features from !X.) I get by fine in Python, though. It can certainly help with code golf, but often it just makes me paranoid about my if conditions.
Usually I find myself leaning towards the "one universal false value, everything else is true" camp, just because it's highly predictable yet flexible. With distinct boolean types, you have the consistency/predictability, but not so much flexibility. In Arc (and similar), there happens to be the falsehood-creep into the empty list. I'm not sure I really like that, because it isn't maximally consistent: why aren't other empty sequences false, too? Just do away with the question by having a canonical false value all its own. Then you still get some of the code-golf benefits of having everything else be true.
"In Arc (and similar), there happens to be the falsehood-creep into the empty list. I'm not sure I really like that, because it isn't maximally consistent: why aren't other empty sequences false, too? Just do away with the question by having a canonical false value all its own. Then you still get some of the code-golf benefits of having everything else be true."
This is exactly what my preference would be too. Thanks for saying it first. :)
Well, this ended up leading in different directions than I expected, so I'll be more specific about my opinions here.
I like the idea of the main (if ...) semantics being just another equality check or dynamic type check: "Is this nil?" If falsiness overlaps with multiple other dynamic types, then we end up having confusing crosshatching where one extension wants to do X with any falsy value and another extension wants to do Y with any list.
Secondarily, I also see some benefit in distinguishing between () and #f, because then it's possible to dispatch on whether something is a list or a boolean. But I'm also happy if we don't have booleans at all, because then "Is this nil?" can just be a special case of "Is this a list?"
An interesting turnaround happens with this philosophy, too: instead of treating "the" empty sequence as false, you can treat false as though it's an empty sequence. This is what Factor does: http://docs.factorcode.org/content/article-sequences-f.html
So maybe if Arc spelled the empty list like () and nil was the singleton false value (so that (is nil ()) was nil), then map/each/etc. could still work on nil just fine. It's just that (if () 'a 'b) would evaluate to 'a instead. Not saying it's the best way, but it's certainly an option.
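A small Python sketch of that option, with NIL, EMPTY, truthy, as_seq, and map_seq all being stand-in names I made up for illustration (None plays the singleton false value, and a tuple plays the empty list):

```python
NIL = None   # stand-in for the singleton false value
EMPTY = ()   # stand-in for the empty list, distinct from NIL


def truthy(x):
    # Only NIL is false; everything else, including EMPTY, is true.
    return x is not NIL


def as_seq(x):
    # Treat false as though it were an empty sequence.
    return () if x is NIL else x


def map_seq(f, xs):
    return tuple(f(x) for x in as_seq(xs))


print(map_seq(str, NIL))               # ()
print(map_seq(str, (1, 2)))            # ('1', '2')
print("a" if truthy(EMPTY) else "b")   # a
```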
Interesting. One quibble with this idea: it doesn't matter as much that map et al work on nil if nil isn't at the end of each list.
So perhaps the reason for empty list to be special is that so many list algorithms are recursive in nature, and it's nice to be able to say "if x recurse" rather than "if !empty.x recurse". Hmm, the empty array or empty string isn't included in every array/string respectively, so perhaps it's worth distinguishing from nil in some situations..
I just ran into a case where I wished the empty list wasn't the same as the false value. When implementing infix in wart (http://arclanguage.org/item?id=16775) I said: "Range comparisons are convenient as long as they return the last arg on success (because of left-associativity) and pass nils through."
(a < b < c)
=> (< (< a b) c) ; watch out if b is nil
(< nil x) ; should always return nil
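In Python terms the rule looks roughly like this. It's only a sketch: None stands in for nil, and lt and between are illustrative names rather than anything from wart.

```python
def lt(a, b):
    # Pass nil (None here) straight through; on success return the last arg.
    if a is None or b is None:
        return None
    return b if a < b else None


def between(a, b, c):
    # (a < b < c) read left-associatively as (< (< a b) c)
    return lt(lt(a, b), c)


print(between(1, 2, 3))  # 3
print(between(3, 2, 5))  # None
```

The inner comparison hands its last argument to the outer one on success, and a failed inner comparison propagates as nil.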
Ok, I'm now experimenting with a new keyword in wart called false.
a) There's still no boolean type. The type of false is symbol. (The type of nil has always been nil; maybe I'll now make it list.)
b) if treats both nil and false as false-y values.
c) nil and false are not equal.
d) Comparison operators now short-circuit on false BUT NOT nil.
I can mostly use either in place of the other. But I'm trying to be disciplined about returning false from predicates and nil from functions returning lists.
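Here is how rules (b) through (d) might be modeled in Python. FALSE, NIL, is_falsy, and lt are stand-ins chosen for this sketch (an empty tuple plays nil), so treat it as an approximation of the rules as listed, not as wart's implementation.

```python
class _False:
    def __repr__(self):
        return "false"


FALSE = _False()  # stand-in for the new false keyword
NIL = ()          # stand-in for nil, the empty list


def is_falsy(x):
    # (b) `if` treats both nil and false as false-y.
    return x is FALSE or x == NIL


def lt(a, b):
    # (d) comparisons short-circuit on false but NOT on nil.
    if a is FALSE or b is FALSE:
        return FALSE
    return b if a < b else FALSE


print(is_falsy(NIL), is_falsy(FALSE))  # True True
print(NIL == FALSE)                    # False, per (c)
print(lt(lt(1, 2), 3))                 # 3
print(lt(lt(2, 1), 3))                 # false
print(lt(NIL, (1,)))                   # (1,)
```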
Wart now has four hard-coded symbols: nil, object, caller_scope and false.[1]
Thoughts? It was surprisingly painless to make all my tests pass. Can anybody think of bugs with this kinda-unconventional framework? If you want to try it out:
$ git clone http://github.com/akkartik/wart
# Optionally "git checkout 0ff47b6bce" if I later revert this experiment.
$ cd wart
$ ./wart
ready! type in an expression, then hit enter twice. ctrl-d exits.
[1] fn is just a sym with a value:
let foo fn (foo () 34)
=> (object function {sig, body})
Technically, my first thought was that something was broken. Hitting C-d as soon as I got the prompt:
$ time ./wart
ready! type in an expression, then hit enter twice. ctrl-d exits.
=> nil
real 0m29.200s
user 0m27.602s
sys 0m0.000s
Anyway, I was going to test to see if you had Arc's t; but it doesn't look like it:
(if t 'hi 'bye)
020lookup.cc:28 no binding for t
=> bye
Note that it's trivial to add:
(<- t 't)
=> t
(if t 'hi 'bye)
=> hi
The reason I thought to try this was because I initially balked at maintaining false and nil at the same time with the same truth values. Then I thought of t, and suddenly the pieces clicked together: at least in part, it seems like you just want a Python-like system anyway.
Once I got the landscape laid out in my head, I started objecting to it less, because I could make sense of it. You're most of the way there:
- false is a separate, canonical false value.
- t (if you chose to have it) is a separate, canonical truth value.
- nil is an empty list, but empty lists are false.
Compare to Python's True, False, and []. The major differences being:
1. No first class boolean type. In wart, this produces more of a disconnect between t and false. t (i.e., 't) is just a normal symbol whose truth value is incidental. But false is a special, unassignable keyword.
(<- false 'hi)
=> hi
false
=> false
Python lacks symbols (you can't just say True = 'True), so this disconnect between symbolic value and keyword doesn't exist. There is still, however, a different sort of disconnect in Python because the "first class" boolean type gets contaminated by the int type:
2. You don't take Python's next logical leap. Since you already make the empty list false, other values become fair game, such as the thread's original idea (make 0 false), the empty string, other empty data structures, etc. But like I said before, I make do in such systems. Keeping nil falsy is really just your prerogative, if you want to avoid calls to empty? that much. ;)
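For the int contamination mentioned under point 1, a few lines at a Python 3 prompt (my choice of interpreter, not something from the thread) show what I mean:

```python
print(isinstance(True, int))   # True
print(True == 1, False == 0)   # True True
print(True + True)             # 2
print(["no", "yes"][True])     # yes
```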
Thanks for trying it out, and for the comments! Yeah it's gotten slow :(
I hadn't realized how close to python I've gotten. Seems right given how the whitespace and keyword args are inspired by it. On rosetta code I found a cheap way to get syntax highlighting was to tag my wart snippets with lang python :)
I've been using 1 as the default truth value, and it's not assignable either. I was trying to avoid an extra hard-coded symbol, but now that I've added false perhaps I should also add true.. I'm not averse to going whole-hog on a boolean type, I'd just like to see a concrete use case that would benefit from them. pos seems a reasonable case for keeping 0 truth-y, and the fact that lists include the empty list seems a reasonable case so far to keep nil false-y. But you're right, I might yet make empty strings and tables false-y.
(True, False = 0, 1 :( That's the ugliest thing I've ever seen python allow. At least throw a warning, python! Better no booleans than this monstrosity.)
"pos seems a reasonable case for keeping 0 truth-y"
While I personally like 0 being truthy, I don't see this as a convincing reason.
I'd treat 'pos exactly the same way as 'find. They're even conceptually similar, one finding the key and the other finding the value. For 'find, the value we find might be falsy, so truthiness isn't enough to distinguish success from failure. The same might as well be true for 'pos.
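Python happens to make this point for me, since 0 is falsy there. In the sketch below, pos is a hypothetical helper that returns None on failure, so success has to be checked explicitly instead of by truthiness.

```python
def pos(pred, seq):
    # Index of the first element satisfying pred, or None if there is none.
    for i, x in enumerate(seq):
        if pred(x):
            return i
    return None


i = pos(lambda x: x == "a", ["a", "b"])
print(bool(i))        # False, even though the search succeeded
print(i is not None)  # True
```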
---
"But you're right, I might yet make empty strings and tables false-y."
What if the table is mutable? That's an interesting can of worms. :)
JavaScript has 7 falsy values, all of which are immutable. If we know something's always falsy, we also know it encodes a maximum of ~2.8 bits of information--and usually much less than that. It takes unusual effort to design a program that uses all 7 of those values as distinct cases of a single variable.
This means if we have a variant of Arc's (and ...) or (all ...) that short-circuits when it finds a truthy value, we don't usually have to worry about skipping over valuable information in the falsy values.
If every mutable table is falsy as long as it's empty, then a falsy value can encode some valuable information that a practical program would care about, namely the reference to a particular mutable table.
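Both halves of that can be checked in Python, where empty dicts really are falsy. This is just an illustration with made-up variable names: the first line is the arithmetic behind "~2.8 bits", and the rest shows an `or` quietly skipping over a reference that still matters.

```python
import math

print(round(math.log2(7), 2))  # 2.81

first, second = {}, {}
chosen = first or second   # `first` is empty, hence falsy, hence skipped
print(chosen is second)    # True

first["k"] = 1             # the skipped table was still a live, distinct object
print(chosen)              # {}
```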
---
"(True, False = 0, 1 :( That's the ugliest thing I've ever seen python allow. At least throw a warning, python! Better no booleans than this monstrosity.)"
The PEP describes the design and rationale of introducing booleans to Python this way. Version 2.3 implements this. Version 2.2.1 preemptively implements bool(), True, and False to simplify backporting from 2.3.
Notably, the variable names "True" and "False" were chosen to be similar to the variable name "None", and all three of these are just variables, not reserved words.
Later, version 2.4 made it an error to assign to None:
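For comparison, here is a quick check that assumes a Python 3 interpreter, where True and False have since become keywords as well. The helper name assignable is made up for this sketch.

```python
def assignable(name):
    # Try to compile an assignment to `name`; keywords are rejected at compile time.
    try:
        compile(f"{name} = 0", "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(assignable("True"), assignable("None"), assignable("x"))
# False False True
```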
I've added some messages to at least set expectations on how slow it is:
$ wart
g++ -O3 -Wall -Wextra -fno-strict-aliasing boot.cc -o wart_bin # (takes ~15 seconds)
starting up... (takes ~15 seconds)
ready! type in an expression, then hit enter twice. ctrl-d exits.
This is interesting, but my first reaction is that push wouldn't necessarily need to be a macro at all with the right reference-passing/modifying semantics. For instance, Arc's scar & scdr manipulate the contents at an address:
Coming up with a simultaneously clean, terse, and simple pointer model is a different exercise, I suppose. And not one I'm particularly qualified to participate in (I'm no C guru).
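To sketch what I mean by manipulating the contents at an address, here is a Python model with a hypothetical Cons class. push_in_place mutates the cell itself, scar/scdr style, and is my own illustration rather than anything in Arc.

```python
class Cons:
    def __init__(self, car, cdr=None):
        self.car = car
        self.cdr = cdr


def to_list(cell):
    out = []
    while cell is not None:
        out.append(cell.car)
        cell = cell.cdr
    return out


def push_in_place(x, cell):
    # Move the old head into a new second cell, then overwrite the head.
    cell.cdr = Cons(cell.car, cell.cdr)
    cell.car = x


xs = Cons(1, Cons(2))
ys = xs                  # a second name for the same cell
push_in_place(0, xs)
print(to_list(xs))       # [0, 1, 2]
print(to_list(ys))       # [0, 1, 2]
```

Note that ys changes too, because it shares the cell, and that this version has no way to push onto an empty list (None here), since there is no cell to mutate.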
Have you found any other use-cases for this qcase idea?
Thanks for that idea! No I don't have a second use case that requires dispatching on the value of macro args. I'll keep an eye out and report back.
I'm a little leery of using pointer-based semantics. Probably because of my C background :p Your analysis was very thorough, and I like that push doesn't silently modify the values of other variables besides the one requested. I hadn't noticed that subtlety before. Yet another reason to avoid scar and co.
> I'm a little leery of using pointer-based semantics. Probably because of my C background :p
Yeah, it's an interesting problem. One that I should probably understand better... I mean, I "get" it: the RAM machine model is simple. But I've not programmed enough C (or similar) to understand how far the semantics would permeate day-to-day programs.
There's the part of me that's like "oh, just have primitives to reference & dereference addresses; easy". I seem to recall Lisps that use something like (ref x) and (deref x), and the explicitness pleases me, coming from "modern" languages where you have to remember odd rules about whether/where things are passed as pointers. But then I read C code and see all these pointers-to-pointers all over the place---in spots of code where I normally wouldn't think of pointers at all. Then again, that might be endemic to C, where you're probably just being destructive by default.
I've always wondered how ML works out in this respect (http://homepages.inf.ed.ac.uk/stg/NOTES/node83.html). Its system seems elegant on first glance, but I have no experience in it. I seem to keep finding myself regretting this lately...I should pick up a project to scratch my ML-learning itch.
It's not often the case, but when eval is right, it's right.
Good work! Looks like you have all the angles covered in your point-by-point breakdown at the end, and evaluating is really all the CL version is doing too. Honestly, the biggest "gotcha" for me is point 2, but you neatly deal with that. The tradeoff is totally worth it for having on-line "redefinition" of the macros you're testing. Just make sure 2 is documented as a caveat, and you're good to go.
Certainly a shorter answer to the whole discussion than the naysaying I launched into. ;)
Very interesting! Particularly the <- and -> operators. It's refreshing to see new approaches to stack shuffling that don't come from the kind of established zeitgeist of stack shufflers (dup, swap) & combinators (dip, bi, tri). I was confused by the notions of the "downstack" and "upstack", because it looks to me that you really only have one stack, and a cursor into it, like:
-- (bottom) n
-- ^
<- -- (bottom) n
-- ^^^^^^^^
{ '*' << } -- (bottom) { '*' << } n
-- ^^^^^^^^^^
-> -- (bottom) { '*' << } n
-- ^
times
or
-- (bottom) a b c
-- ^
<- -- (bottom) a b c
-- ^
<- -- (bottom) a b c
-- ^
operate_on_a -- (bottom) whatever operate_on_a produces b c
-- ^^^^^^^^
-> -- (bottom) whatever operate_on_a produces b c
-- ^
etc.
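The one-stack-plus-cursor reading can be written down as a short Python sketch. CursorStack and its method names are invented for this illustration; it models my mental picture of <- and ->, not Bipedal's actual implementation.

```python
class CursorStack:
    def __init__(self, items=()):
        self.items = list(items)        # bottom ... top
        self.cursor = len(self.items)   # items[:cursor] is the "downstack"

    def push(self, x):
        # New items land just below the cursor position.
        self.items.insert(self.cursor, x)
        self.cursor += 1

    def left(self):   # models <-
        if self.cursor > 0:
            self.cursor -= 1

    def right(self):  # models ->
        if self.cursor < len(self.items):
            self.cursor += 1


s = CursorStack(["n"])
s.left()                  # cursor moves below n
s.push("{ '*' << }")      # the block lands underneath n
s.right()                 # cursor returns to the top
print(s.items, s.cursor)  # ["{ '*' << }", 'n'] 2
```

After those three steps the block sits under n, which is the arrangement the first trace ends with before times runs.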
Also, it seems to me the language isn't really "typeless"; rather, the programs aren't generally type-directed. The system appears similar to Arc's, where you have a few already-used type tags (string, int, num), but in principle the tags are arbitrary. I suppose one distinction could be the treatment of the data under certain operations: Bipedal has syntax for strings and integers, but what happens when you call + on a string and an integer? Is it just treated like adding the string's address with the integer, perhaps?
Finally, I'm interested by the crypto angle. Do you think this is just a generally important (or interesting) feature, or is there a particular application you have in mind? [I said, hoping you're still checking this forum. :)]