There's still one open question regarding filenames. So far we've been writing uploaded files to random filenames, but the uploaded filename is actually present in the POST request. I see 3 ways to provide it to the programmer (say when uploading a file in a field called file):
a) (arg req "file") returns just the contents of the uploaded file; the filename is in (arg req "filename"). "filename" is the multipart header, so I could just inline all the various headers into the args table.
b) (arg req "file") returns an alist or table with the various headers for that part of the multipart POST. File contents are in ((arg req "file") 'contents).
c) (arg req "file") returns file contents like in a, and other metadata like filename or creation-date is in some other field of req, say req!multipart.
b generalizes to multiple file uploads in a single form, but it's also a little more work to get at form input values. c addresses this but now you've got stuff for a field scattered in multiple places. What do people prefer?
Does the fnid multipart form test (the second test) and the static multipart test (the third test) work for you?
They do not work for me. I get a "srv thread took too long for <ip address>". I suspect this could be related to nginx again -- though I started the webserver on a different port and ran form-tests through that port.
How come a non-multipart test works and a multipart test does not? Is there something going on with multipart and ports?
I tried having nginx stopped and only one Anarki running, and tests two and three do not work for me. Uploading a 7K file runs forever until the thread is killed (srv thread took too long).
A 150k plaintext file works but a 147k binary pdf does not.
Update: The bug has to do with reading in bytes vs characters. Earlier, srv would readc from the POST body unless type was multipart, and your code (like the webupload example) would readb from the body otherwise. Now srv is always the one reading body, and it always reads characters using readc. As a result it gets confused by binary uploads that don't translate to characters.
Update 2: The sentence beginning 'Earlier' is incorrect, and webupload was always using readc as well. I'm not sure what I was looking at.
$ ls -l bintest
-rw-r--r-- 1 akkartik akkartik 145974 Jun 5 12:39 bintest
$ racket -f as.scm
arc> (w/infile f "bintest" (len:readbytes 200000 f))
145974
arc> (w/infile f "bintest" (len:readchars 200000 f))
141878
readchars does some interpretation but otherwise works fine. However when trying to upload bintest through a socket, it never terminates. Most curious.
Update: I've confirmed that the bytes in the file fail to be encoded as a Unicode string, so presumably that's the issue. Another bit of sloppiness is that we're trying to read n characters from the request body where n is the Content-Length in bytes. webupload.arc has always had this problem.
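To make the byte/character mismatch concrete, here's a small sketch in Python rather than Arc. The payload is made up, and Python's UTF-8 decoder is only standing in for whatever Racket's port decoding does, so treat it as an illustration of the problem, not a trace of srv.

```python
# Hypothetical payload: three ASCII bytes, one two-byte UTF-8 character,
# and one byte (0xff) that can never appear in valid UTF-8.
body = b"abc" + "é".encode("utf-8") + b"\xff"

# Content-Length counts bytes, not characters.
content_length = len(body)

# Strict decoding fails outright on the invalid byte.
try:
    body.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Lenient decoding substitutes U+FFFD for the bad byte and collapses the
# two-byte sequence into a single character, so the character count
# comes out smaller than the byte count.
chars = body.decode("utf-8", errors="replace")

print(content_length, len(chars), strict_ok)
```

If a reader is asked for content_length characters but the body only holds len(chars) of them, it would presumably keep waiting on the socket for characters that never arrive, which fits the hang.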
Your existing code won't work as is. Since it's meaningless to try to convert possibly-binary data to a string, file contents are now returned as a list of bytes. There's a helper called bytes-string for when you're sure you have just ASCII data. Otherwise you'll need to know the encoding of text uploads and convert the bytes appropriately.
I should warn you that it's gotten a lot slower. You might need to temporarily up threadlife. I have some ideas for speeding it up, but let's check first if this works for you.
Ah, this is because the fnid field is being read as a list of bytes. I could convert fnid to string as a special case. Another option is to pass it in with the action url like in aform-multi: http://github.com/nex3/arc/blob/46e3820a6b/lib/srv.arc#L560
Update: Ok, I finally found a way to check when it's safe to convert to string, so now all fields (including fnid) will auto-convert to string when possible. If you get a list of bytes you know it has some sort of non-ASCII data.
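For anyone curious what "convert when it's safe" could look like, here's a rough Python sketch. It assumes "safe" means every byte is 7-bit ASCII; the name maybe_string is made up for illustration, and this is not the code in srv.arc.

```python
def maybe_string(byte_list):
    """Return a str if every byte is 7-bit ASCII, else the list unchanged.

    Hypothetical helper: the ASCII-only rule is an assumption about what
    'safe to convert' means, not a description of the real check.
    """
    if all(b < 128 for b in byte_list):
        return bytes(byte_list).decode("ascii")
    return byte_list

# A form field such as fnid is plain ASCII, so it comes back as a string.
fnid = maybe_string([120, 52, 50])

# A binary upload contains a byte >= 128, so it stays a list of bytes.
upload = maybe_string([37, 80, 68, 70, 255])
```

With this rule the caller can tell the two cases apart just by checking whether it got a string or a list back.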
That is really weird. I find myself momentarily out of ideas :( Maybe some sort of linux setting that controls how ports are opened? Are you running iptables or anything like that?
Done. I came up with a way to get the best of both worlds. Multipart request args are now packaged in a table with all metadata, and with the actual body in key "contents". The arg helper is smart enough to deref "contents" in this table, so you can just say (arg req "file") to get its contents. To get at other fields, use the new function multipart-metadata.
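Here's the shape of that design as a Python sketch, in case it helps to see it outside Arc. The request layout (a dict with an "args" entry) and both function signatures are assumptions for illustration; only the names arg, multipart-metadata and the "contents" key come from the description above.

```python
def arg(req, key):
    """Return a field's value; for a multipart part, return just its contents."""
    value = req["args"].get(key)
    if isinstance(value, dict) and "contents" in value:
        return value["contents"]
    return value

def multipart_metadata(req, key):
    """Return the whole table for a multipart part, or None for a plain field."""
    value = req["args"].get(key)
    return value if isinstance(value, dict) else None

# Hypothetical request: one ordinary field and one uploaded file.
req = {"args": {
    "title": "my upload",
    "file": {"filename": "bintest", "contents": [1, 2, 255]},
}}

file_contents = arg(req, "file")                          # just the body
file_name = multipart_metadata(req, "file")["filename"]   # the rest of the headers
```

Plain fields and uploads are read the same way through arg, and the metadata for a field stays in one place instead of being scattered across req.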