Arc Forumnew | comments | leaders | submitlogin
A confession of stupidity
6 points by akkartik 5141 days ago | 10 comments
Y'all may have watched me complain about performance for the last couple of days. I want to issue a mea culpa and say it was all my fault. I still love you arc, I'm not going to switch to SBCL just yet.

Buried in my 3000 lines of arc was this innocuous-looking 2-liner:

  (def most-recent-unread(user feed)
    (most doc-timestamp (rem [read? user _] feed-docs.feed)))
I was basically sifting through 10-100 element lists over and over again. Worse, doc-timestamp lazily loads and caches metadata files. As I randomly selected feeds for the user I was doing 10-100 disk reads per request.

A little reorganization keeps the lists always sorted, so I can just:

  (def most-recent-unread(user feed)
    (car (rem [read? user _] feed-docs.feed)))
This one-line change takes my server from 50% timeouts at 6 users to 15% timeouts at 100 concurrent users.

You can assess the level of optimization in a codebase by the size of the 'longest pole in the tent'. By that heuristic this past week is basically a testament to my incompetence. The good news with incompetence: it lets me find 10x speedups like this. That (hopefully) won't last.

Thanks to aw for hammering away at me to look at the mote in my own code before blaming my tools (http://arclanguage.org/item?id=11515). I could have saved myself a world of pain[1] if I'd just remembered these manual-profiling tools I wrote a year ago:

  (= times* (table))

  (mac deftimed(name args . body)
    `(do
       (def ,(symize (stringify name) "_core") ,args
          ,@body)
       (def ,name ,args
        (let t0 (msec)
          (ret ans ,(cons (symize (stringify name) "_core") args)
            (update-time ,(stringify name) t0))))))

  (proc update-time(name t0) ; call directly in recursive functions
    (or= times*.name (list 0 0))
    (with ((a b)  times*.name
           timing (- (msec) t0))
      (= times*.name
         (list
           (+ a timing)
           (+ b 1)))))

  (def print-times()
    (prn "gc " (current-gc-milliseconds))
    (each (name time) (tablist times*)
      (prn name " " time)))
(To profile a function replace def with deftimed. Be careful using it with recursive functions.)

This isn't the first time I've 'forgotten' to profile before optimizing. Hopefully it'll be the last.

[1] 2 days of experiments with ab, 2 days of looking at erlang/lfe/termite/tornado/unicorn, 1 fun day of rewriting in SBCL, and 1 day of moving to a new data structure I turned out to not need.



2 points by evanrmurphy 5139 days ago | link

A great reminder for me about the importance of profiling, thanks for posting!

I have a less technical confession of stupidity to make: I was feeling a little disappointed by the rate of submissions on this forum the last few days, when I realized I was only checking the front page and not the new submissions page. Now I look forward to reading what I've missed. :)

-----

3 points by akkartik 5139 days ago | link

Thanks evanrmurphy. It's almost like there are two communities here, with the second one visible to (and probably only interesting to) those willing to poke around a bit. The front page skips the detailed discussions and has become more of a summary of the cool stuff people are doing with arc. The new page is more like a traditional forum.

As a luxury of being so small, this community encourages all sorts of questions and discussion[1] (and angst :) - but without karma. Compare the boost 'Ask HN' posts get on HN; HN has to either encourage or discourage discussion; its size makes it hard to do much else.

[1] http://arclanguage.org/item?id=11315

-----

2 points by evanrmurphy 5138 days ago | link

As a luxury of being so small, this community encourages all sorts of questions and discussion...but without karma. Compare the boost 'Ask HN' posts get on HN; HN has to either encourage or discourage discussion; its size makes it hard to do much else.

I made one Ask HN post that didn't get a single comment. Surprising how a community a tiny fraction of the size can be more responsive in some ways.

Maybe that shouldn't be surprising though, since responsiveness tends to be inversely proportional to size in so many things [1]. When there are many people using the site it's tempting to think, "Well someone has got to respond to my post!" But there is much more posting happening on large sites too, and different rules emerge (e.g. HN culture's greater emphasis on karma, as you suggest).

[1] Committees, web browsers, vehicles, mammals...

-----

2 points by shader 5139 days ago | link

Yeah, I normally check comments first and then new, since comments change the most, and new second most. This forum doesn't have enough traffic right now to need to sort by popularity.

Have fun reading ;)

-----

1 point by evanrmurphy 5138 days ago | link

It's funny, I never would've used the comments page on HN, but its utility is clear here. Traffic is a significant variable in usability on these sites, even though they're clones of one another.

-----

1 point by conanite 5140 days ago | link

shameless plug: tell me if something like http://home.conandalton.net/rainbow-profile-report.html would be useful; rainbow has a nice built-in profiler although the output is a little cryptic. I mentioned it here before at http://arclanguage.org/item?id=10688

-----

1 point by aw 5140 days ago | link

Very nice!

Is "total time" the elapsed time (including time that a function is waiting for locks, I/O, etc.), or CPU time?

-----

1 point by conanite 5140 days ago | link

Glad you like it :)

I haven't figured out how to measure CPU time in java; the profiler just calls System.nanoTime() at entry and exit of each function invocation and notes the difference.

-----

1 point by akkartik 5140 days ago | link

Not cryptic at all! Way better than just entries like gsNNN..

-----

1 point by aw 5141 days ago | link

Welcome :) And thanks for the profiling macro! I'd love to get to the point where we can press a button and have the computer say, "ding! 90% of your CPU time is being spent in foo"

-----