Currently, + can be used to add numbers or to concatenate strings or lists. PG mentioned this in his early essays on Arc. I believe this was his last word on the issue [0]:

"Using + to concatenate sequences is a lose. This kind of overloading is just a pun. I found that it actually made programs harder to read, not easier, because I kept thinking I was looking at math code when I wasn't. As several people have pointed out, concatenation isn't addition. It's not commutative, for example. Ok, you were right; we're tossing it."

But, for some reason, this feature is still part of Arc and is used liberally in arc.arc. For an illustration I find amusing:

$ grep -c '(+ (list' arc.arc
4
So here's what I did. I replaced the definition of + in ac.scm so that it (like -, /, expt, and so on) was exported verbatim from mzscheme; I redefined join as a replacement, so that it concatenated strings as well as lists (duct-taping it together, with liberal use of $ to call Scheme functions); and I haphazardly went through the files ending in ".arc", replacing instances of + that looked like they were supposed to concatenate rather than add, until Arc was able to load without errors. I had another Arc process already running from before I made these changes (and of course I backed up the files before I changed them).

I ran a couple of tests. The gist is that addition seems to work about twice as fast as it did before [1]. Additionally, I noticed a while back that calling + on small numbers consumes memory (i.e. produces garbage), while calling * on similar numbers doesn't. I was sure this was due to the type-checking, and this experiment of mine seems to confirm it. With my mem-use macro [2]:

arc> (mem-use (+ 1 2)) ; Original; + concatenates
32
arc> (mem-use (+ 1 2)) ; New; + does not concatenate
0
arc> (mem-use (* 3 4)) ; Either one
0
I do a lot of math with Arc, and finding that addition is working half as fast as it could be because of some feature I plan never to use, and which was supposed to be dropped a long time ago... it really grinds my gears. So I am taking the trouble to write this up and announce it to the Arc Forum. I want to convince everyone (especially PG, who will presumably write the next official release of Arc) to agree to drop that feature. So here are some questions answered in advance:

"What should we use instead, to concatenate strings and lists?"

I think join; at the moment it only touches lists, so we can extend it to handle strings without breaking any existing code.

"Won't that just make join run slower?"

Slightly, yes; joining the lists (1 2 3) and (4 5 6) ten thousand times now takes ~90 msec instead of ~80. That's a far smaller difference than the one for +, and I don't think joining lists is at all likely to be the performance-critical part of your app. If you want a function that only joins lists, that function has historically been called append in other Lisps.

"Why are you coding things like that, where performance on numerical computations is apparently important, in Arc?"

Because I like Arc. And its design goals (see "The Hundred-Year Language") certainly allow for making it possible to write incredibly fast code in Arc. If you wonder how Lisp code can be made to run fast, take a look at Common Lisp and its 'declare syntax.

"If someone makes an Arc implementation that does type analysis and/or allows type declarations (like in Common Lisp) so the compiler can speed things up, then wouldn't that fix the slowdown problem?"

Yeah, well, a type analysis system like that is pretty nontrivial to implement. I don't think anyone's done it for Arc so far. I think most of us will be using some version of the mzscheme implementation for as far ahead as I care to look, and I think type analysis would be pretty hard to tack onto that implementation. Even if someone had already done it, it would be an obstacle for future implementers of Arc on other platforms. It is so much easier to simply guarantee that + is supposed to work on numbers only.

"I like that feature and use it."

Well, I don't like it and I don't use it. Look again at PG's stated reasons for dropping +-concatenation; I agree with them and could repeat them myself.

I think I've made my case.

For those who want to hack their own Arc to fix + like I did, here is a rough description of the changes (if someone feels like showing me how to use a diff program to display the changes as a nice patch, please feel free):

- In ac.scm, replace (xdef + ...) with (xdef + +), and replace
the body of ar-+2 with (+ x y).
- In arc.arc, insert two lines so that the first four lines of
the body of 'join are:
    (if (no args)
        nil
        ($.string? (car args))
        (apply $.string-append (map1 [coerce _ 'string] args))
Duct tape and liberal use of $. Feel free to improve it.
- In arc.arc, strings.arc, html.arc, and srv.arc, look at
every instance of the character + and decide whether it
should be a 'join. It's not hard, just tedious. The
toughest decision involved a call to 'respond in html.arc
(which, it seems, should be replaced).
- This is just enough to make Arc load without errors. It may
not be everything. Ideally PG/RTM would do it themselves,
but they seem to take their time between releases of Arc.
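
The dispatch that the join change above adds can be modelled outside Arc. What follows is only a rough Python sketch of that logic, not the Arc code itself: join_model is a made-up name, Python lists stand in for Arc lists, an empty list stands in for nil, and str() is an approximation of (coerce _ 'string).

```python
def join_model(*args):
    """Model of the patched join: dispatch on the type of the first argument."""
    if not args:
        return []  # stands in for Arc's nil
    if isinstance(args[0], str):
        # Mirrors (apply $.string-append (map1 [coerce _ 'string] args)):
        # every argument is coerced to a string, then all are concatenated.
        return "".join(str(a) for a in args)
    # Otherwise fall through to the list-only behaviour join already had.
    result = []
    for a in args:
        result.extend(a)
    return result


print(join_model("ab", "cd"))
print(join_model([1, 2, 3], [4, 5, 6]))
```

The point of the sketch is that the extra type test happens once per call to join and only looks at the first argument, which fits with the small slowdown reported for joining lists.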
[0] http://paulgraham.com/arclessons.html

[1] Here are the tests.

; Original; + concatenates
arc> (repeat 10 (time:do (= i 0) (repeat 1000000 (++ i)) i))
time: 1367 msec.
time: 1291 msec.
time: 1252 msec.
time: 1276 msec.
time: 1254 msec.
time: 1259 msec.
time: 1267 msec.
time: 1260 msec.
time: 1282 msec.
time: 1285 msec.
; New; + does not concatenate
arc> (repeat 10 (time:do (= i 0) (repeat 1000000 (++ i)) i))
time: 629 msec.
time: 622 msec.
time: 647 msec.
time: 644 msec.
time: 704 msec.
time: 641 msec.
time: 642 msec.
time: 623 msec.
time: 638 msec.
time: 630 msec.
The difference is more pronounced when we increment lexical variables rather than globals; the former is significantly faster. This time I collect all the millisecond counts into one list:

; Original; + concatenates
arc> (keep [isa _ 'int] (readall:tostring:repeat 30 (time:let
       i 0 (repeat 1000000 (++ i)) i)))
(673 686 674 668 682 677 667 688 676 669 687 766 671 693 674
671 685 669 670 698 671 675 685 671 719 685 673 672 685 670)
; New; + does not concatenate
arc> (keep [isa _ 'int] (readall:tostring:repeat 30 (time:let
       i 0 (repeat 1000000 (++ i)) i)))
(263 259 255 255 258 273 256 262 266 256 254 254 266 270 262
256 257 342 262 261 275 258 263 264 258 259 256 262 268 262)
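
For anyone who would rather not eyeball the numbers, here is a small Python sketch that averages the timings listed above and takes the ratio. It is Python only because that is handy for the arithmetic; it is not part of Arc, and the variable names and the speedup helper are invented for this illustration.

```python
from statistics import mean

# Millisecond timings copied from the transcripts above.
global_old = [1367, 1291, 1252, 1276, 1254, 1259, 1267, 1260, 1282, 1285]
global_new = [629, 622, 647, 644, 704, 641, 642, 623, 638, 630]
lexical_old = [673, 686, 674, 668, 682, 677, 667, 688, 676, 669,
               687, 766, 671, 693, 674, 671, 685, 669, 670, 698,
               671, 675, 685, 671, 719, 685, 673, 672, 685, 670]
lexical_new = [263, 259, 255, 255, 258, 273, 256, 262, 266, 256,
               254, 254, 266, 270, 262, 256, 257, 342, 262, 261,
               275, 258, 263, 264, 258, 259, 256, 262, 268, 262]


def speedup(old, new):
    """Ratio of mean run times: how many times faster the new build is."""
    return mean(old) / mean(new)


print(f"global:  {speedup(global_old, global_new):.2f}x")
print(f"lexical: {speedup(lexical_old, lexical_new):.2f}x")
```

On these figures the ratio comes out at about 2.0x for the global-variable loop and about 2.6x for the lexical one. Note that the loops time the whole (++ i) step, not + in isolation, so the ratios understate the change in + itself.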
Also, amusingly, (-- i -1) runs slightly faster than (++ i) in the original, +-concatenating version; in the first example, performing this replacement results in an average of around 1130 msec, rather than 1280 (these are eyeballed averages). Worst optimization trick ever. I'm tempted to say that this alone is so bad that even if the concatenation were a nice feature, it should be dropped immediately in horror.

[2] http://arclanguage.org/item?id=12255