akkartik, I was wondering if you might like to test this approach and see how fast it is compared to your version? (I could test the speed myself if you wanted, but without your JSON data my results might not be representative).
I believe the require planet line below needs MzScheme version 4; let me know if you're using version 3...
(def deepcopylist (xs)
  (if (no xs)
      nil
      (cons (deepcopy (car xs)) (deepcopylist (cdr xs)))))

; capture Scheme's #f and #t so we can test for them from Arc
(= scheme-f (read "#f"))
(= scheme-t (read "#t"))

; convert a Scheme value returned by the json module into an Arc value:
; #t -> t, #f -> nil, the null marker -> nil, hash tables -> fresh Arc
; tables, pairs -> fresh lists, anything else passed through unchanged
(def deepcopy (x)
  (if (is x scheme-t)
       t
      (is x scheme-f)
       nil
      (is x #\nul)
       nil
      (case (type x)
        table (w/table new
                (each (k v) x
                  (= (new (deepcopy k)) (deepcopy v))))
        cons  (deepcopylist x)
              x)))

($ (require (planet dherman/json:3:0)))
(= scheme-read-json ($ read-json))

(def read-json (s)
  (deepcopy (scheme-read-json s)))
The require planet line gave me a bunch of warnings about not having scribble installed, but it worked anyway.
The idea is we run the original unmodified Scheme module, and then convert its Scheme return value to Arc.
I expect this would probably be slower than your code. The question is, how much slower? If it turns out to be, say, 1% slower, we might not care! :) And it would be a lot easier than going into every Scheme module we might want to use and modifying it to return correct Arc values.
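If the PLaneT read-json reads from an input port the way Scheme's read does, then a quick check at the REPL might look something like this (the exact shape of the result depends on how the module represents objects, but after the deepcopy the booleans and nulls should come back as Arc's t and nil):

  arc> (read-json (instring "{\"a\": [1, 2, true, null]}"))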
OK, here's a version that uses the lshift parser that you ported. Uncomment the require file line and put in the path to the original json.ss module...
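It's basically the deepcopy code above with the planet require swapped for a file require; roughly like this (the path is a placeholder, and the name of the reader the lshift module exports may differ):

  ; uncomment and fill in the real path to the original json.ss
  ; ($ (require (file "lib/json.ss")))

  (= scheme-read-json ($ read-json))   ; or whatever the module calls its reader

  (def read-json (s)
    (deepcopy (scheme-read-json s)))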
I timed a few runs of your port and this version against your data set; times for both versions varied between 585ms and 831ms on my laptop, but there wasn't a difference that I could see between the two versions given the spread of times for each run.
I'm running Linux on my laptop. There could well be a difference; I simply haven't run the tests enough times to be able to tell, given that I'm getting a pretty wide spread of run times for each version. That spread could be, for example, my web browser sucking up some CPU at random times, or whatever...
Great idea. I thought of that approach but discounted it without attempting.
I tried running it but got a parse error.
default-load-handler: cannot open input file: "/usr/lib/plt/collects/scribble/base.ss" (No such file or directory; errno=2)
default-load-handler: cannot open input file: "/usr/lib/plt/collects/scribble/base.ss" (No such file or directory; errno=2)
setup-plt: error: during making for <planet>/dherman/json.plt/3/0 (json)
setup-plt: default-load-handler: cannot open input file: "/usr/lib/plt/collects/scribble/base.ss" (No such file or directory; errno=2)
setup-plt: error: during Building docs for /home/pair0/.plt-scheme/planet/300/4.1.3/cache/dherman/json.plt/3/0/json.scrbl
setup-plt: default-load-handler: cannot open input file: "/usr/lib/plt/collects/scribble/base.ss" (No such file or directory; errno=2)
Error: "read: expected: digits, got: ."
Yeah, be sure to try each test multiple times. You can get that much variance simply from the garbage collector running at one time but not another, from the file needing to be read from disk vs. already being cached in memory by the operating system, from some other process hitting the CPU, or, on EC2, from the virtual CPU getting fewer cycles at the moment...
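Something like this runs the parse several times back to back so the spread between runs is visible (the filename is a placeholder for your data set, and it assumes read-json reads from a port as in the wrapper above):

  (repeat 5
    (w/infile f "data.json"
      (time (read-json f))))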
Seems to work. (require (file "lib/.../foo.ss")) appears to be the right thing to do for .ss files; a Scheme load uses Arc's readtable which messes up on square brackets, and a plain (require "lib/.../foo.ss") doesn't like periods in directory names.
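For reference, the three forms side by side (the path here is just a made-up example with a period in a directory name):

  ($ (require (file "lib/parsers.v1/foo.ss")))   ; works
  ; ($ (load "lib/parsers.v1/foo.ss"))           ; square brackets in foo.ss trip Arc's readtable
  ; ($ (require "lib/parsers.v1/foo.ss"))        ; complains about the period in the directory name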