It makes sense to specify the example expr / expected pairs as sexps & strings: run repl-output on exprs, compare those strings to the expected strings.
Thanks for thinking through that! Yes, I'd had a similar though not as fully-formed idea, but was stuck on distinguishing stdout from return value in the examples. But yeah, that's overthinking it -- we have to mentally parse the two at the repl anyway.
Only other quibble: a string isn't very nice to read next to code, even if it looks nice at the repl. Hmm..