"what if you're dealing with hundreds of millions records?"
Then just write it in C :)
Let me try to rephrase my argument. Sometimes you care about every last cycle, most of the time you care about making it a little bit faster and then you're done. Sometimes your program needs large scale changes to global data structures that change the work done, sometimes it needs core inner loops to just go faster. Sometimes your program is still evolving, and sometimes you know what you want and don't expect changes.
just a little faster + work smarter => rearchitect within arc.
just a little faster + faster inner loops => rewrite them in C
every last cycle + rigid requirements => gradually rewrite the whole thing in C and then do micro-optimizations on the C sources. You could do arc optimizations here, but this is my bias.
every last cycle + still evolving => ask lkml :)
If you're doing optimizations for fun, more power to you. It's pretty clear I'm more enamored with the readability problem than the performance problem.
I don't want to sound more certain than I am about all this. Lately I spend all my time in the 'just a little faster' quadrant, and I like it that way.
"The key, in all this, is to understand these languages well enough to make good judgment calls on where best to invest ones time."
I don't disagree with your thought's, however I don't think they account for the TCQ aspect of everyone's situation.
Let's put put it another way. In my situation, if I have to look a using C then it's already game over[1]. However I can, with my limited amount of time (lunches, evenings and weekends) become proficient enough in Clojure (with a continual thanks to Arc).
What I am suggesting is that knowing the language well enough to easily identify these 50%+ hitters is a matter of finding low hanging fruit and at the same time becoming a better Arc/Clojure programmer. It does not mean I want to change my direction in the heavily-optimized/low-level to high-level/low-maintenance/readable code continuum.
[1] I'm not a programmer[2], I am a business analyst that works with software that simulates oil & gas production from reservoirs that sit several hundred kilometers underground. Simulation scenario's are run for 20 year production periods across hundreds of interdependent wells. The current software tools run the simulations across many cores, they take 8 days to run to completion and they cost about a half a million dollars per seat (+18% per year maintenance fees). This cost can be attributed to the R&D that occurred 8-10 years ago (i.e they required a team(s?) of P.Engs writing software in C to maximize the performance). Eight years ago you couldn't do (pmap[3] (sortby-commonest or whatever.... )) so easily. Nowadays I have the opportunity to create a 70% solution all by my lonesome, costing only a portion of my time. Hence why understanding the language well enough to find the low hanging fruit and not having to use C, is probably bigger deal to me.
[2] Well, maybe I shouldn't say this... rather I should say it's not my day job. I have no formal education in the field. I'm pretty much self taught.
[3] pmap is the same as map only it distributes the load across multiple processors in parallel.