Arc Forumnew | comments | leaders | submitlogin
How to implement a no-side-effects macro?
3 points by lacker 6104 days ago | 19 comments
I'm new to Lisp, so I apologize in advance if the answer should be obvious.

I want to write a macro that prevents code from having any side effects. Using = or zap in the code itself is fine, that should just be undone once the block exits. The motivation is to be able to run untrusted code - I figure I can mock out system primitives, but I also need to keep from running stuff like (= no nil), and to keep different untrusted bits of code from interfering with each other.

E.g. I want to be able to do something like

> (= foo 2)

2

> (no-side-effects (= foo 3) (= bar 4) (+ foo bar) )

7

> foo

2

> bar

Error: "reference to undefined identifier: _bar"

I'd appreciate advice from some of the experts out there!



3 points by kens1 6104 days ago | link

I think you're looking for Java, and its security manager :-)

MzScheme's namespaces (http://download.plt-scheme.org/doc/mzscheme/mzscheme-Z-H-8.h...) may help you with this, since you can have a separate namespace for each "sandbox", and the variables won't collide.

I guess you could also do this with a macro that replaces all the symbols by gensyms, so inside no-side-effects, foo is really a unique variable. But that seems like a pain to implement.

As an aside, watch out for denial-of-service attacks in untrusted code: e.g. code that goes into an infinite loop or infinite memory allocation. Depending on what you're doing, that can be a big problem, or not.

-----

2 points by lacker 6104 days ago | link

"Use Java" is not the answer I was hoping for. ;-)

This seems like such a simple request - I can express it in a few sentences - that it should be at least not extremely difficult to implement in the ideal programming language. So I hope it can be done in Arc, and if not, hopefully this is a use case that can be taken into account while figuring out how to handle arc modules/namespaces/whatever.

Anyways. I thought about replacing symbols by gensyms, but at what point would I do that? If I just check all symbols in the code I will miss things like

(eval `(= ,(sym "foo") 2))

For denial of service, I was thinking about something like (reval n code) would run the code but only allow up to n function calls. Seems like this would require writing eval in Arc then modifying. Is this similar to what you were suggesting by a replacing-symbols-by-gensyms macro?

-----

1 point by kennytilton 6104 days ago | link

"This seems like such a simple request - I can express it in a few sentences"

Awesome. I want to time travel. That's just one sentence. :)

"it should be at least not extremely difficult to implement in the ideal programming language"

Cue the asbestos! No one said Arc was intended to be some abstract ideal satisfying all few-sentence specifications, and as per the Java crack, who says that the ideal language is idiot-proof or attack-proof? To the contrary, the Design Imperatives I have seen trust the programmer to be good and are not hoping to be able to run virus attacks imperviously.

As for your question, it is a hard problem and one reason I think we do not have a Lisp plug-in for browsers.

-----

4 points by lacker 6104 days ago | link

I don't mean to start a flame war - I am certainly not a big Java or JavaScript fan ;-)

I don't want to make the language idiot-proof. I acknowledge this is a hard, rather "meta" thing and I was hoping Arc would be powerful enough to do this.

So, regardless of whether "the ideal language would have it", you agree this would be very useful; it would let us have an Arc plug-in for browsers for example. Or make a site where you could make your own Arc site by entering code and the main servers would run it restrictedly. Arc is trying to be native to the web, and allowing restricted execution is a very webby thing to do.

Anyway, I will still try to do this. ;-)

-----

2 points by kennytilton 6104 days ago | link

"I was hoping Arc would be powerful enough to do this."

OK, asbestos did not work, try the foam!

What you might be missing is that power makes attacks easier, not harder. As I suggested, part of that power is reflection so you have a fighting chance, my concern is that once you have blocked all the exits there will be no way to get in. Whatever that means.

"you agree this would be very useful"

I do? Wow. Actually, I do not know much about security,tho I might have to learn soon. My guess is that probably the best way to go is not to cripple the plugin, rather to allow only registered, vetted, digitally signed code to run. ie, the plugin looks for a digital certificate before launching.

Tilton's Law: Solve the right problem. What is the problem? Losers posting evil code. Solution? Don't run just anyone's code.

-----

3 points by sacado 6104 days ago | link

"I think we do not have a Lisp plug-in for browsers"

Well, we have JavaScript, and it is not really safer by design than Lisp (they are quite close, too). But JavaScript has a security manager, quite restrictive sometimes. And look at rebol : that's also a close relative to Lisp and it's got a cool security manager.

Well, they mainly prevent you from reading/writing the host filesystem or from connecting to undesired remote servers.

Lacker, this might not be as "secure" as Java's model, and not exactly what you were asking as it does not prevent you from overwriting another's code, but it is obviously enough to run untrusted code on one's machine. And to implement it, you only need to overwrite the dedicated axioms in ac.scm.

-----

1 point by kennytilton 6104 days ago | link

"Well, we have JavaScript,"

Then there was that other ILC where the speaker argued we Lispers should rejoice because Javascript was a Lisp and was in all the browsers. Make sure there are no children in the room and I'll tell you what happened next.

-----

1 point by sacado 6104 days ago | link

Compared to Java (i.e., seen from far away), it is. Well, of course, code isn't also data in JavaScript :)

-----

1 point by kennytilton 6104 days ago | link

Many agree. Here is a blog entry on the ILC guy I mentioned:

  http://bc.tech.coop/blog/030920.html
Looks I'll be tearing into ActionScript shortly, I'll let you know. :)

-----

3 points by cchooper 6104 days ago | link

Combining what gaah (Jesin) said with other things you've pointed out, the final solution would work something like this:

1. Write mock versions of all the Arc primitives that are dangerous.

2. Write a function that calls macex on an expression and then scans it for all instances of (set <symbol> ...).

3. Write a macro that takes an expression or a file, and wraps the whole thing in a with form that locally defines/shadows all the mock functions and the symbols detected with the above function.

4. Create a mock version of eval that wraps its argument in the above macro. This should obviously be shadowed in the macro too.

These are probably all the tools you need.

-----

1 point by lacker 6104 days ago | link

Interesting. I think this roughly makes sense.

There are still some parts that are stumping me. It seems like I need to either (a) mock annotate to use a different mechanism, or (b) prevent the code from annotating pre-existing variables. The same goes for accessing sig.

Also, I'm not sure how to handle scar/scdr/sref. It don't think places are first-class, so I can't have a table of untouchable places, unless I start working in mzscheme. (Which I would rather avoid.)

-----

2 points by kens1 6104 days ago | link

If you want the gory details of places, see my recent document: http://arcfn.com/doc/setforms.html

-----

1 point by lacker 6104 days ago | link

Wow... that is gory. :-/

It seems like the most plausible way to keep scar/scdr/sref from mutating variables outside the no-side-effects scope is to keep a table of the variables they are allowed to operate on, which is basically anything the user created with cons or table inside the no-side-effects scope. This should invalidate code like

(= a (table))

(no-side-effects (= b a) (= (b "key") "value"))

Is there any more natural way to track this than keeping a table keyed by symbol?

-----

3 points by eds 6104 days ago | link

This is probably a horribly inefficient way to implement this... but couldn't you just copy all the variables before executing the body?

Assuming

  (= a (table 'foo 'bar))
then a line like this

  (no-side-effects (= a!foo 'baz) a)
might be converted to something roughly like this

  (let a (copy a)
    (= a!foo 'baz)
    a)
a is unchanged after evaluation because the let made a deep copy that overshadowed the global value of a.

You would still have to think about constructs like def and mac, but at least this deals with shared structure of lists and tables.

Maybe something like http://arclanguage.org/item?id=3572 ?

-----

3 points by gaah 6104 days ago | link

I think you want this:

  arc> (= a 3)
  3
  arc> (= b 5)
  5
  arc> (let a (+ a 4)
         (+ a b))
  12
  arc> (with (a 2 b (+ b 2))
         (+ a b))
  9
  arc> (with (a 6 b 2)
         (prn a) (prn b)
         (= b a) (= a 'foo)
         (prn a) b)
  6
  2
  'foo
  6
  arc> a
  3
  arc> b
  5
The title of your post is misleading. I think when people saw "How to implement a no-side-effects macro" they read that as "How to implement macros with no side effects" rather than "macro that hides side effects from the surrounding scope".

You asked a simple question about lexical scoping. You got complex and confusing answers about macro hygiene. It is not that Arc has no easy answer to your lexical-scoping question, it is just that your simple question looks like a much harder one about macro hygiene.

-----

1 point by lacker 6104 days ago | link

Sorry I was unclear - I was indeed actually trying to ask a hard question.

I don't know a priori what variables this code will try to access. I could possibly figure that out with a macro, but some of the details are still unclear to me. I also don't want side effects if annotate, def, mac, or anything like that was run from the no-side-effect'd code.

At any rate, I think just wrapping in a let is insufficient, even if you do know all the variable names. E.g. you can break out of the sandbox with scdr:

> (= a '(1 2))

(1 2)

> (let a a (scdr a '(3)))

(3)

> a

(1 3)

-----

2 points by sjs 6103 days ago | link

This might be crazy, but how about changing ac-global-name (ac.scm) to a unique gensymed prefix? I don't know about doing it per thread, but it could be an interesting way to temporarily create a unique namespace.

edit: To be clear, I mean exposing it in arc somehow so you can effectively (let global-name (uniq) ...).

-----

1 point by lacker 6101 days ago | link

Interesting, I hadn't thought of this. You have to somehow let people still refer to, say, the + function, by its global name. Maybe that could be via some sort of const wrapper thing.

-----

1 point by gaah 6100 days ago | link

I don't have access to the Arc source for various reasons at the moment (maybe in 30 minutes), but assuming an appropriate definition of deepcopy:

  (mac w/nondestructive (names . body)
    (if (acons names)
        (cons `(fn ,names ,@body)
              (map [list 'deepcopy _] names))
        `(let ,names (deepcopy ,names) ,@body)))
There's probably a better way to write that, but that's what I have now.

-----