I've moved 3 of my 4 domains over to the new server (Ubuntu 10.04 + nginx). While I don't have any real evidence, or even enough traffic, to prove nginx is better than Apache, I am quite happy with the move. I didn't want to start spewing benchmarks, since the configuration settings are different[1], but I will say this: my first run at setting up Apache, as a total newb, took 3+ days of far too much complexity and troubleshooting, and ended in poor results. I have accomplished so much more with nginx in 3-4 hours. I know I am spewing a bunch of stuff without much real evidence, but for anyone else using Apache as a front-end server, I highly recommend considering nginx.
As for my original problem, I stopped using the http-get library and now use wget instead. With one other change (getting all the files downloaded first), I've cut the process time from 2 hours to 48 minutes. So far I have received no errors. My CPU still churns at 96%.
[1] I hadn't applied rewrite rules with my Apache setup, but with nginx it was dead simple to separate what I needed relayed to Arc from what to serve directly:
location / {
    # relay everything else to Arc
    proxy_pass http://127.0.0.1:9000;
}
location /css {
    root home/arc/static/;
    expires max;
}
location /js {
    root home/arc/static/;
    expires max;
}
location /images {
    root home/arc/static/;
    expires max;
}
I'm sure I could have done something like:
location ~ ^/(css|js|images)/ {
    root home/arc/static/;
    expires max;
}
but I've only used it a little bit.
Plus, there are so many more options/tweaks that I have been able to incorporate easily, without any headaches.
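For anyone copying the config from [1]: location blocks have to live inside a server block. Here is a rough sketch of how the pieces might fit together. The listen port and server_name are placeholders, not values from my setup, and the root path is exactly as written in [1].

```nginx
server {
    listen 80;                 # assumed port
    server_name example.com;   # placeholder domain

    # Anything not matched by a more specific location is relayed to Arc.
    location / {
        proxy_pass http://127.0.0.1:9000;
    }

    # Static files are served straight from disk with a far-future expiry.
    location /css {
        root home/arc/static/;   # path as written in [1]
        expires max;
    }
    # /js and /images follow the same pattern.
}
```

Running nginx -t after editing checks the syntax before you reload.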