I did some trivial benchmarks, computing fibonacci(34). Arc took 40 sec, Python took 6.4 seconds, and mzscheme took 1 second.
I used the straightforward recursive algorithm in Arc, the near-identical code in Scheme, and the corresponding code in Python.
arc> (def fib (n) (if (< n 2) 1 (+ (fib (- n 1)) (fib (- n 2)))))
arc> (time (fib 34))
time: 39811 msec.
Obviously this isn't a real-world benchmark, but a factor of 40 difference between Arc and mzscheme is much more than I expected. (I was expecting maybe a factor of 2 due to the function call overhead.) Any idea where the time is getting spent? Did I totally mess up the measurement somehow? I'm using the same mzscheme instance for both measurements.
I hate to say it, but Arc is starting to look like an informally-specified, bug-ridden, slow implementation of half of Common Lisp :-)
At this stage of language design, where everything is so fluid, there's little point to take such benchmarks, IMO.
What you might want to look out is design flaws which make future optimization very difficult. (But considering the 100-year language goal, pg can argue that any optimization specialized to today's compiler and processor technology shall be premature optimization.)
Damn, it must be all the Fibonacci calculations that are making News.YC so unusably slow...
Seriously, though, if you're curious about why the Arc version is slower, just look at the Scheme code that the function ac produces as its output. Maybe there's some obvious optimization we're missing.
I thought from looking at the internals before that the ar_funcall overhead would be the main factor. However, it turns out that of the 40 seconds, about 25 are spent in _<, _+ takes about 9 seconds, ar-funcalls about 2.5 seconds, ar-false? and _- about 1 second each.
So it looks like arc< is the main time sink, although the others all contribute.
I think checking the type of all the arguments is costly.
In case anyone is interested, the Scheme code assigned to _fib is:
(if (not (ar-false? (ar-funcall2 _< n 2))) 1
(ar-funcall2 _+ (ar-funcall1 _fib (ar-funcall2 _- n 1))
(ar-funcall1 _fib (ar-funcall2 _- n 2)))))
I don't know much about MzScheme internals. But from my experience of implementing Gauche Scheme, inlining basic operators such as <, +, and - is a big performance boost, and I suspect MzScheme does similar thing as far as these ops are not redefined. Calling them many times via 'wrapper' function like ac.scm does diminishes that effect.
That said, one possible overhead is that, although arc compiler tries to use ar-funcall2 to avoid consing the arguments, they are consed after all since _+ and _< are defined to take a list of arguments.
The original (time (fib 34)) took 79.8s on my machine.
Changing arc< in ac.scm with this:
(cond [(and (number? a) (number? b)) (< a b)]
[(and (string? a) (string? b)) (string<? a b)]
[(and (symbol? a) (symbol? b)) (string<? (symbol->string a)
[(and (char? a) (char? b)) (char<? a b)]
[else (< a b)])]
(cond ((all number? args) (apply < args))
((all string? args) (pairwise string<? args #f))
((all symbol? args) (pairwise (lambda (x y)
(string<? (symbol->string x)
((all char? args) (pairwise char<? args #f))
(#t (apply < args)))]))
made (time (fib 34)) 72.8s, and further changing _+ with this:
(cond [(and (string? a) (string? b)) (string-append a b)]
[(and (arc-list? a) (arc-list? b))
(ac-niltree (append (ar-nil-terminate a)
[else (+ a b)])]
(cond [(all string? args)
(apply string-append args)]
[(all arc-list? args)
(ac-niltree (apply append (map ar-nil-terminate args)))]
[else (apply + args)])]))
made (time (fib 34)) 49.5s.
But generally, these kind of tuning for microbenchmarks only have small effect in real applications, for microbenchmarks magnifies inefficiency of very specific parts.
(Afterthought: It won't be too difficult to write a Scheme macro that expands variable-arity type-dispatching functions like _+ or _+ into case-lambda form. Then this kind of optimization can be applied without cluttering the source too much.)
I don't know how this slowdown comes about, but myself, I don't worry about speed.
The impression I got from Arc is that, currently, almost the entirety of its substance is in its idioms, or how your write programs in it, rather than its implementation.
What I don't understand is why you say it's informally-specified. Isn't its specification the source itself, and isn't that more formal than the english-language specification of Common Lisp; I thought Paul Graham said something to that effect in one of his essays.
I hope I'm not talking above myself, being a noob and all.
The problem of source-is-the-spec approach is that it tends to overspecify things. Especially it is difficult to represent something is "unspecified" in the form of the source code.
What's the benefit of not specifying something in the specification? It leaves a room for different implementation strategies, each of which is specialized for certain situations. It allows putting off design decisions when there's not enough experience to decide which choice is better. It can warn the user that the existing choice is purely arbitrary and it may be changed when circumstances change (note: the big point is that, a concrete implementation has to choose one, even when from the specification point of view the choice doesn't matter.)
For example, Arc's current map is restart-safe for list arguments but restart-unsafe for string arguments (restart-safe means, if you evaluate (map f xs), and a continuation is captured within f, and later the continuation is restarted and the restart doesn't affect the result of the previous return value of map.) Is it a part of "spec" so that one can rely on that exact behavior? Can I rely on that the restart actually modifies the string returned from the first call? Or is it just an arbitrary implementation choice and one should assume the result of mixing map and call/cc is unspecified?
(BTW: In R5RS Scheme it was unspecified. In R6RS Scheme it is explicitly specified that map must be restart safe.)
I think, having a prose spec that leaves things unspecified may not be particularly useful in Arc's case; unspecified behaviour in a prose spec is difficult to automatically test for.
Generally, the benefit of a prose, incomplete spec, to program writers, would be compatibility to an abundance of implementations (in either space or time). It helps program writers approach any number of implementers, without actually meeting any particular one. But to make this happen both must read and follow the spec. This is hard on the program writer in particular; he can't automatically test for a spec in the prose, he must read the spec and test his program with an abundance of implementations.
Will Arc's target audience follow a prose spec? Will anyone?
I don't think that it is. It's a way of saying that the Arc language satisfies two constraints: it is suitable for writing programs, and suitable for writing specifications. For "exploratory programming" this might be the right thing.
I don't know if Arc actually satisfies that, but I can certainly see the point. Part of this would be a lack of optimization in implementation: implement something in the most straightforward manner, and it is its own specification.
On the other hand, in a separate english-language specification you could say that a certain thing is unspecified and leave it up to the implementation to decide on a certain behaviour. I have been reading a bit in the scheme report (that's a specification right?) and there are a certain number of unspecifications in there.
i don't arc is intended to have more than one implementation, or at least more than one popular implementation. if this is the case, nothing is unspecified. whatever the implementation does, that is the language.
It's rather a dishonest argument to use an example like that, because it's an artifact of bootstrapping the prototype off MzScheme. A more convincing argument would be strange behavior resulting from the way something was defined in arc.arc.
Is annotate a general mechanism, or specifically for defining macros? Is ellipsize supposed to limit its output to 80 characters or display at most 80 characters of the input? Is median of an even-length list supposed to return the lower of the middle two? Is cdar supposed to be missing? Is (type 3.0) supposed to be an int? Is support for complex numbers supposed to be in Arc? Is client-ip supposed to be there, or is it left over from something? Does afn stand for "anonymous fn"? What does "rfn" stand for?
These are all real questions that I've faced while trying to write documentation. Let me make it clear that these are not rhetorical questions; I'm more interested in getting actual answers than arguing about using the code as the spec.
Wasn't the point keeping the names car and cdr so you can compose them? (I remember reading such in one of pg's essays.) Then it seems to me to take full advantage of that you need to provide those names for use.
I don't think it is unreasonable to do the following, but it is currently not provided in Arc:
I didn't mean Arc will never have cdar. But to avoid having a language designed according to arbitrary rules rather than the actual demands of the task, I've been trying to be disciplined about not adding operators till I need them.
It doesn't matter whether features are deliberate or not. It's very common in math for someone to discover something that has interesting properties they never imagined. In fact, it's probably closer to the truth to say that if a mathematical concept doesn't have properties the discoverer never imagined, it's not a very interesting one.
Lisp itself is an example of this phenomenon. JMC didn't expect to use s-expressions in the real language, but they turned out to be way more useful than he envisioned.
I'm not just splitting hairs here, or trying to defend myself. In design (or math), questions of deliberateness are not binary. I'll often decide on the design of an operator based on what looks elegant in the source, rather than to meet some spec, just as Kelly Johnson used beauty as a heuristic in designing aircraft that flew well.
It's a good argument in general sense, but I doubt it is applicable in library API.
If you're delivering a final product, users don't care if some design is deliberate or not; they care it is good or bad. If you're deriving mathematic proof, others don't care if some choice is deliberate or not; they care if it is correct, beautiful, or useful to prove other theorems.
That's because changing your choice afterwards won't break existing products or proofs that relied on the previous choices.
In the case of library API, changing your choice does break existing software relying on the library. In the current Arc case it is forewarned so it's perfectly fine, but at some point (50years from now, maybe?) you have to let more people write software on it; by that moment it should be clear that what they can rely on and what they cannot.
by that moment it should be clear that what they can rely on and what they cannot
The only difference if the implementation is the spec is how they know what they can rely on. If the implementation is the spec, they decide by reading the source; if it's a document writen in English, they decide by reading that.
Implementation can be, and will be, changed, inevitably. Then does the language change as well, or the languages remains the same but just implementation is improved? How can you tell the difference purely from the source?
Some Scheme implementation evaluates arguments left to right. You can see that by reading the source. In future, it may switch it right to left, maybe for better optimization. The spec in natural language, or more formal and abstract form like in Appendix A of R6RS, can explicitly say the order of evaluation is unspecified. How you tell your users that they should not rely on the evaluation order purely by the source code, given the state of most programming languages?
Ideally I like to think the source only describes the spec and the compiler and runtime figure out the details, so maybe spec-by-source and spec-by-other-notation will converge in future. Is that what you are talking?
(Please don't argue that unspecified evaluation order is bad or not; I just use that example to illustrate the point. And incidentally, since Arc is defined in terms of Scheme, the order of argument evaluation order is just unspecified as well. But it's just delegating the spec to a lower level.)
One point everybody else is missing: since arc explicitly makes no claims of backwards compatibility, the notion of a spec is meaningless.
If the goal of a language is to be readable there's nothing wrong in the implementation being the spec. Consider it a form of self-hosting, or of eating your own dogfood.
An implementation in a reasonably high-level declarative language is a more reasonable 'spec' than one in C. More features are amenable to checking just by reading rather than by execution.
When something is obviously a bug it doesn't matter if it's a bug in the spec or the implementation.
Those two categories -- obvious bugs, and questions about what is or is not in the language that should be answered by reading the code rather than executing it -- seem to cover all the objections people have raised.
At least I'm talking about the attitude of spec-by-source in general, not particularly about Arc, FYI.
Edit: I agree that more abstract, declarative language is closer to spec-by-source. If somebody says Prelude.hs is the spec of Haskell's standard library I agree. But the core language semantics is still not in Haskell itself, is it? (I'm still learning. Correct me if I'm wrong.)
The problem is that as people learn the language they will build mental maps of what works and what doesn't and in the process will write code that depends on things that could legitimately be considered bugs or arbitrary side effects of the current implementation.
Whether or not this matters to you or even should matter is another concern, but this has been a spot of contention for languages like Python and OCaml whose spec is the code.
This is entirely orthogonal to benchmarking, but I thought I'd point out how memoization is a huge win for the recursive Fibonacci:
arc> (defmemo fib (n) (if (< n 2) 1 (+ (fib (- n 1)) (fib (- n 2)))))
arc> (time (fib 10000))
time: 957 msec.
...many digits of output omitted...
For those unfamililar with memoization, it makes the function magically remember the results for previous calls. So once you've computed (fib 100), for instance, subsequent calls to (fib 100) return the memoized value immediately. Obviously this only makes sense for functions that depend only on their arguments and don't have side effects. (You pay a space cost, of course, to hold all these results.)
(Even with memoization, this is a silly way to compute Fibonacci numbers, but I think it's an interesting demonstration.)
I benchmarked your benchmark against all the other benchmarks I could think of including the idea that maybe it did not make sense to time something sitting on top of something else and it came in last.
Ignoring the merits of repeated function calls as a benchmark, this is surprisingly good. A naive scheme implementation I wrote was about 100X slower than Python. Since Arc hasn't yet plucked the low hanging fruit of optimization, I think you folks are fine.
That is not actually possible without memoization or using another fib function.
The complexity of the 'standard' fib function for 'benchmarks' is exponential.
2^20000 is kind of big for the number of computational steps.
so my guess is you have a serious bug, or are cheating.