Arc Forum | absz's comments

Arc Forum

new | comments | leaders | submit | absz's comments

2 points by absz 6357 days ago | link | parent | on: ([+ _ 10]:[* _ 20] 10) = error...

It's actually in-between:

  (f:g a b c)

becomes

  ((compose f g) a b c)

, which normally wouldn't work if f or g were a macro; however, compose is special-cased in ac.scm so that in the head of a list, this becomes

  (f (g a b c))

. Thus, : only works with macros at the head of a list. You can't do

  (f a b macro:function c)

.

-----

1 point by absz 6376 days ago | link | parent | on: What's next?

What about implementing copy-on-write semantics? That would preserve the efficiency of sharing while allowing mutation. Since Arc only has two mutation primitives (set and lset), modifying those would be necessary and sufficient. You would have to have every object on the heap know how many threads referenced it, and if there was more than one, set/lset would copy and then mutate as opposed to simply mutating in place. (Disclaimer: I am not an expert.)

If I recall correctly, Termite's ports are essentially processes, which is what allows them to be passed around.

-----

3 points by almkglor 6376 days ago | link

scar, scdr, sref....

My problem is with this: http://arclanguage.com/item?id=7066

To summarize: Suppose Process A has an object a. It sends this object to process B. This object is a copy, a' (conceptually, in the language level). Now Process A sends object a to process B again. This object is a second copy, a'':

  (def process-A (process-B a)
    (==> process-B a)
    (==> process-B a))

  (def process-B ()
    (givens a
              (<==
                any any)
            a2
              (<==
                any any)
      (if (is a a2)
          (err "I somehow received the same object twice!"))))

So I also need copy-on-write and copy-on-message-pass-if-receiver-already-has-a-copy.

Edit: It is important for them to be separate because process B expects two separate objects, and it might mutate one without expecting the other to be mutated.

There was a solution in my old shared concept, but it involved an O(M log N) search, where M is the message size in number of objects and N is the number of shared objects the receiver has access to, which I decided was simply suboptimal compared to an O(M) copy (where M is number of bytes in the message, but still...).

Edit2: re: passing stuff around. My intent is to not integrate the actual passing-around-stuff-via-network server in the C++-side code; instead, the VM will provide access to plain sockets. Instead an Arc-side server would be defined, with its semantics standardized (but not implementation, so that you can, say, implement a "more secure" server, or use some other protocol, etc.).

Instead, if a server receives a serialized PID over its connection from another computer, it checks a table matching the PID computer and id number. If the table fails to match, it launches a "reflector process" which simply sends every message it receives to the server, for serialization, and inserts that into the table. The PID of the reflector proces is then used as the message sent by the server to the destination process. All PIDs then get this translation (obviously the table has to be bidirectional. so that a PID can be matched to the serialization of the PID.).

Obviously not only processes would require reflectors, but also ports if we want to pass them around ^^.

-----

2 points by shader 6375 days ago | link

Interesting problem. I bet this is why the erlang folks decided to use mutation-less, copy-everything, so that they didn't have to carefully think through every possible case. Maybe you could write your vm to automatically copy everything, and then have a means of coding exceptions. So, if you find that a certain specific case can share, you can add that to the list of exceptions. Otherwise, it copies, so you don't have to worry about unexpected errors.

And you should probably allow the user to require copying if they need to, so that there aren't any surprises.

As an aside, you mentioned a O(M log N) search to determine whether to share or not. Now maybe I don't understand, but you could use a Bloom filter to determine if the object is present on the other process. The only shortcoming is the false positives, which mean you would copy unnecessarily, but they happen so infrequently that it probably isn't an issue. I believe bloom filters were mentioned on Hacker News yesterday. The link on wikipedia is: http://en.wikipedia.org/wiki/Bloom_filter

Using a bloom filter, you'd be able to in constant time, low memory usage, determine if an item is in a set. In this case the items are shared objects and the set is the objects owned by the target process. Now, maybe I don't understand what you're trying to do, but you can clarify that shortly.

How much are you implementing on the c++ side? Couldn't you get a few basic necessities built first, and then just abstract towards usability in arc? I don't know exactly what you're doing, but I don't see why you have to make everything at once.

Maybe we should move this to another thread? :)

-----

1 point by almkglor 6375 days ago | link

http://arclanguage.com/item?id=7142

-----

1 point by absz 6375 days ago | link

That's a good point. But what about having copy-on-write-if-more-than-one-logical-copy semantics? Then every Arc object on the heap would have a shared flag. Then if you send an object to another thread, the object is shared, and it has its internal shared field set to true (or 1, if this is C). And whenever you try to set (or lset, or scar, or scdr, or sref---good catch) an object whose shared is true (1), then you copy it first, modify it, set its shared flag to false (0), and then make the reference through which you modified it point instead to this new object.

Except this doesn't take any shared structure into account---as soon as you send something over the network, all the objects which share it are fragmented. So you need some sort of "versioning." Perhaps you instead have every reference have a version flag, and every copy between threads increments version for the new copy; modifying something checks the reference you are modifying it through, makes a new copy if necessary for that version, and then modifies it. The problem is that I can't see a good way to do versioned indexing, especially if references are really just raw pointers.

Though note that if you do come up with a working "copy-on-write-and-other-things" scheme, the Arc copy function becomes essentially

  (def copy (arg)
    (==> (current-pid) arg)
    (<== (current-pid) any))

.

-----

2 points by almkglor 6375 days ago | link

> Then every Arc object on the heap would have a shared flag....

This was indeed my original idea: http://arclanguage.com/item?id=7037

Incidentally, 'lset is very, very hard to implement if closures are not very mutable, as in the arc2c output.

> and then make the reference through which you modified it point instead to this new object.

  (def hmm (x)
    (let y (cdr x)
      (scar y 42))) ; is x mutated? the reference I used here was *just* y...

> the Arc copy function becomes essentially (...)

Yes, this is true, although I'd prefer to add some sort of 'uniq tag so that other pending messages don't accidentally get taken ('<== does pattern matchimg, obviously implemented via my p-m macro) ^^. It will even correctly handle recursive and shared structure, which I think the current 'copy does not ^^.

  (def copy (arg)
    (w/uniq tag
      (==> (mypid) (list tag arg))
      (<==
        (,tag cpy) cpy)))

Edit:

> That's a good point. But what about having copy-on-write-if-more-than-one-logical-copy semantics? Then every Arc object on the heap would have a shared flag. Then if you send an object to another thread, the object is shared, and it has its internal shared field set to true (or 1, if this is C).

The problem here is: what if the object to be sent is already shared and the receiver already has it? Say:

  (def process-A ()
    (let v (list 42)
       (==> B v)
       (==> C v)))

  (def process-B ()
    (<==
      any  (==> C any)))

  (def process-C ()
    (givens v1 (<== any any)
            v2 (<== any any)
      (if (is v1 v2)
          (err "I somehow received the same object twice!"))))

-----

2 points by shader 6375 days ago | link

I don't think you have to worry about versioning. Because sharing is, as far as I can tell, just to save time and bandwidth over copying, conceptually all messages are copies. This means that you shouldn't be programming as if you can update other processes just by changing a shared object. They would have to detect the change and copy it again anyway, if the other owning processes were remote.

Instead, you should only transfer information over messages. That way you don't need to worry about keeping shared objects up to date for all users.

Good idea about the shared flag; maybe it should be a ref counter? That way you can tell if the other process(es) have stopped using it. So, if the ref count is 1, than just straight up mutate, otherwise, copy?

Also, couldn't the send function be:

  (-> pid message)

? It would save a char ;) Of course, you probably have a reason for not using that.

-----

2 points by absz 6375 days ago | link

You need versioning in a situation like the following:

  (givens str1 "message"        ; 1
          str2 str1             ; 2
    (do                         ; 3
      (-> another-process str1) ; 4
      (= (str1 0) #\M)          ; 5
      (prn str1)                ; 6
      (prn str2))               ; 7

Let's assume you just have the shared flag. First, line 4 will set the shared flag to true. Then, on line 5, we mutate str1. We check the shared flag of the value of str1. OK, it's true, so we make a copy of the value and mutate it. Then we print out our values on lines 6 and 7:

  Message
  message

The problem is that the copy separates the string not only from its copies in other threads, but also its aliases in the thread it comes from.

Versioning would mean that str1 and str2 would both be pointers to "version 0" of some value, which had an internal max-version of 0. Line 4 sends the object over, so max-version is incremented (to 1), and the other process is sent a pointer to "version 1" of the same object (1 because that's the new max-version). However, since no modifications have been made, somehow (I don't know how) both "version 0" and "version 1" resolve to the same underlying object. Then, on line 5, sref is called on "version 0" of this object, so the object is copied: "version 0" is copied away and modified, but "version 1" is left alone. When we print the objects, their reference to "version 0" makes them look at this new, copied objects, so we see

  Message
  Message

The problems here are that (1) I don't know how we'd do versioned addressing, and (2) what we do with the copy of "version 0." It seems like you would have a structure which had a list of the available values, and some sort of mapping from versions to values, e.g.

  ; Available values
  A --> "a string or another object"
  B --> "A string or another object"
  C --> "A string."
  
  ; Version --> value mapping
  0 --> [A]
  1 --> [A]
  2 --> [B]
  3 --> [A]
  4 --> [C]

Then, when we modify 0, to, say, "Another object...", we would add D --> "Another object..." to the list of values and change 0 to 0 --> [D].

Hmm, thinking about the underlying structure of this and writing it out makes me think this makes less and less sense, and that either (1) this is actually trivial/a restatement of another scheme/both and thus won't work, or, if I'm lucky, that (2) there's some more general and less clunky scheme hiding inside it (hey, I can dream, right?). My money's on (1) right now, but I do feel like I understand the problem better.

Oh, and I like your suggestion of -> / <-. Nice and short.

EDIT: Almkglor, is any of the code for your VM available for perusal/editing? School's gotten out and I've gotten interested, so I might be interested in working on it (if I can... I am not all that experienced, after all, so it might be beyond me; regardless, I'd be interested in reading it).

-----

2 points by shader 6375 days ago | link

Maybe you could get versioning to work by making a real alias as opposed to just a copied reference. If somehow (i'm getting really far fetched here) expansion of macros, etc, could happen in any position instead of just application, you could have the alias expand to the other variable name. Then, you wouldn't have to worry about messing with your aliases ;)

So instead of merely 'evaluating' identifiers, you 'expanded' them, and variables expanded to their values, etc. you could have better aliases, and neater macros. Of course, I've probably misunderstood something horribly, but that's why you guys are here.

-----

1 point by absz 6375 days ago | link

The problem is that macro expansion happens at "compile" time, so local variables don't have their values yet. But you reminded me of something in arc2c: to quote the "NOTES" file, "...if a local variable is ever mutated, a 'sharedvar' C-struct is used to represent that variable. These are constructed by %sharedvar primitives, and are accessed by %sharedvar-write and %sharedvar-read primitives."

So, since I believe almkglor said he was planning to use arc2c for this VM, only changing the code generation step, then the solution seems to be to just use the shared flag on values (probably the reference-counting version, as shader suggested). It seems like since we have sharedvars, the problem solves itself: when you send an object to another thread, you don't send the sharedvar, you send the value the sharedvar is wrapping. Then, when you modify something, then if it's a sent copy, that copy is modified, not affecting the original (because of the shared flag); if it's a local with aliases, then it's modified through the sharedvar, and since the sharedvar is still aliased and only its internals are modified, there's no problem!

Well, maybe---that assumes that I actually understand sharedvars well enough. Almkglor? Stefano (since you suggested/invented them)?

-----

1 point by almkglor 6375 days ago | link

> It seems like since we have sharedvars, the problem solves itself: when you send an object to another thread, you don't send the sharedvar, you send the value the sharedvar is wrapping.

You never actually send sharedvars - they are not accessible at the Arc level. However, they are necessary in order to send functions across processes; the generated code expects to see sharedvars, so it has to have (a copy of) sharedvars.

Edit:

> ...if a local variable is ever mutated, a 'sharedvar' C-struct is used to represent that variable. These are constructed by %sharedvar primitives, and are accessed by %sharedvar-write and %sharedvar-read primitives.

This means this case:

  (let foo 1
    (set foo 42))

Importantly, this does not use sharedvars!

  (let foo "hello"
    (sref foo #\H 0))

-----

2 points by shader 6375 days ago | link

So, is that the only situation versioning is required? Updating local aliases? If so then there should be some work around, or we could just forget about it and consider it a "feature" ;)

Also, I agree about the whole perusal / editing bit. I'm probably even less experienced than absz, but no less interested.

And I again recommend moving to a new thread. The window's getting a bit cramped, and we aren't even talking on the original subject of the post.

Is there a news.yc / forum equivalent for Anarki?

-----

2 points by almkglor 6375 days ago | link

> Is there a news.yc / forum equivalent for Anarki?

You're using it ^^

-----

1 point by almkglor 6375 days ago | link

> Almkglor, is any of the code for your VM available for perusal/editing?

Not yet, the only code I've started writing on is the memory management code. There are two versions, one which uses a mark-and-sweep for local heaps and has shared heaps implemented via boost::shared_ptr, and my newer version which uses stop-and-copy and does (will do?) copy-on-message-pass.

I can send you (and anyone interested) some initial tarballs though ^^. WARNING: C++. With RAII. Be scared.

My solution for the multiple-version thing was to simply have the thread copy all its shared objects and drop all references to shared.

Then I began to feel uncomfortable with it and decided that just copying the darn thing might be better after all ^^

My main reason for using ==> and <== was to make them stand out a bit ^^. -> and <- just look so skinny. Although it's not carved into stone yet.

I already did a mock-up for this in plain Anarki-on-mzscheme, which is the main reason why Anarki suddenly got semaphores and other lock primitives suddenly a month back ^^

-----

3 points by absz 6377 days ago | link | parent | on: (delist somelist)

Actually, in looking over the functions again, it turns out that there is already a function that does what you want: random-elt. That should do what you need it to do.

As for your problem (and no, it doesn't work on Anarki[1]), the problem is that rand-choice is a macro, not a function, and apply only works on functions. The only solution is a terrible hack:

  (def mapply (macro . parms)
    " Applies the macro `macro' to `parms', which are not quoted.
      See also [[mac]] [[apply]] "
    (eval:apply (rep macro) (join (butlast parms) (last parms))))

But don't use that!

[1]: But you should use Anarki anyway :)

-----

2 points by bOR_ 6377 days ago | link

Thanks for the reminder not to apply on macro's, and for finding random-elt. I need to learn how to read ;).

random-elt does exactly what I want.

(btw.. small consistency issue in the syntax here?) rand-choice random-elt

-----

1 point by absz 6378 days ago | link | parent | on: (delist somelist)

To further clarify, if you've programmed in Ruby or Python: (apply func ... arglist) is equivalent to func(..., * arglist) in those languages.

-----

2 points by absz 6380 days ago | link | parent | on: Proposal: "." Syntax Sugar

See http://arclanguage.org/item?id=4012 .

Note that you can also use : to compose multiple functions.

The Anarki also has, if you include "ssyntaxes.arc":

11. x..y, which is the same as (range x y), and x..y..z, which is the same as (range x y z) (a range from x to y with step size z).

12. f//g, which is the same as (orf f g); this returns a function where ((orf f g) x y ...) is equivalent to (or (f x y ...) (g x y ...)). This can be used with as many functions as you want.

13. f&&g, which is like f//g but uses andf instead of orf, so the function is equivalent to (and (f x y ...) ...).

14. typ?, which returns the function (fn (x) (isa x 'typ)).

15. User-definable ssyntax, so you can add your own!

Also, note that ssyntax only works with variables and integers, not with literal lists or function calls.

And your example of using . for infix is actually a holdover from mzscheme, and not used anywhere particularly; its existence in Arc seems to be accidental.

-----

1 point by dpara 6380 days ago | link

aah beautiful, forget I ever said anything :)

-----

1 point by absz 6383 days ago | link | parent | on: 2 questions about for

You need to initialize v to nil so that it exists. The with binding creates a v that can be set. But you're right, there doesn't seem to be a reason you need the gensym.

-----

2 points by skenney26 6383 days ago | link

Thanks for the help absz. I'm creating a variety of iterators for a graphics app and I just want to make sure they're being written correctly.

I still don't quite understand why v needs to be initialized. For example, the following definition seems to work fine:

  (mac for (v init max . body)
    (w/uniq gm
     `(let ,gm (+ ,max 1)
        (loop (set ,v ,init) (< ,v ,gm) (set ,v (+ ,v 1))
          ,@body))))

-----

2 points by absz 6383 days ago | link

The problem is that if you define it this way, then your v is no longer lexical; instead, it's in the global namespace. Thus, writing (for ref 0 n (frob ref)) would overwrite the global ref procedure, which is not what you want.

-----

1 point by skenney26 6383 days ago | link

Ah, it finally makes sense. Thanks again.

-----

10 points by absz 6385 days ago | link | parent | on: When is the next Arc-version coming out PG?

I had been under the impression that it was supposed to last for one hundred years, not languish for one hundred years, but I suppose I might have misread something :/

-----

5 points by absz 6385 days ago | link | parent | on: How does hash table work when key is a list?

Or maybe there's a bug somewhere? The default assumption should not be "pg is an elitist jerk," but "pg is a fallible human being."[1] Whether or not pg is an elitist jerk is irrelevant, as the answer is probably that there's a bug/some unexpected behaviour hiding somewhere in arc.arc or ac.scm. In fact, I think that is the case, and the problem has to do with translating lists between Arc and mzscheme: see http://arclanguage.org/item?id=6993. But it's always best to analyze first, since you get more done that way.

[1]: In fact, "so-and-so is a fallible human being" is almost always the best first stop: Hanlon's Razor ("Never attribute to malice that which is adequately explained by stupidity.") and all that.

-----

1 point by sacado 6384 days ago | link

There are bugs in the implementation of hash tables, he confirmed this. Don't try them with strings as keys for example (I mean, strings you modify).

-----

2 points by applepie 6384 days ago | link

I don't think pg is an elitist at all.

-----

1 point by absz 6384 days ago | link

My apologies then—I misread the sentiment behind your comment. In that case, I direct the sentiment of my previous comment away from you :)

-----

6 points by absz 6385 days ago | link | parent | on: How does hash table work when key is a list?

The problem lies in the split between how Arc and mzscheme represent lists. Arc essentially uses iso to compare table keys (iso compares lists by structure, so while (isnt (list 'cons) (list 'cons)), (iso (list 'cons) (list 'cons))). Thus both your examples should work.

The problem lies in the definition of lists. (I'm explaining this from the beginning because I don't know what your skill level is; if you know how lists work, then please skip the rest of this paragraph.) Lists are composed of cons cells, each of which is essentially an ordered pair of a car and a cdr. The car holds the value, and in a list, the cdr must be another list. So what do you use to represent the empty list?

This is where the problem lies: in mzscheme, () is the empty list, and 'nil is just a symbol. In Arc, nil is a symbol that is also the empty list. Thus, we get

  arc> vtable
  #hash(((cons) . a-method))
  arc> (= (vtable '(cons)) 'another-method)
  another-method
  arc> vtable
  #hash(((cons) . a-method) ((cons . nil) . another-method))

Note that the two keys are different: the first key is a Scheme list, ending in (), whereas the second is an Arc list, ending in nil. What's happening becomes clearer when we look at the definition of list (Anarki docstring omitted for clarity):

  (def list args
    args)

So as far as I understand it, when an Arc function with a variadic parameter (like list) is called, the arguments are passed to it as a Scheme list instead of an Arc list. Usually this is transparent, since Arc tries to convert () to nil on the Arc side, and nil to () on the mzscheme side.[1] However, it's not perfect, and here's one of the places where that shows up.

I didn't see an easy fix inside ac.scm, and don't have the time to keep looking. However, now I'm worried that that's going to trip me up sometime in the future... The only solution I can think of is to explicitly call ac.scm's ac-nil function, which performs the translation from () to nil, but this is (a) very ugly, and (b) only works on Anarki. For what it's worth, though, here it is (where $ is Anarki's macro for accessing mzscheme):

  arc> (= vtable (table))
  #hash()
  arc> (= (vtable ($.ac-niltree (list (type '(1))))) 'a-method)
  a-method
  arc> vtable
  #hash(((cons . nil) . a-method))
  arc> (vtable '(cons))
  a-method
  arc> (vtable ($.ac-niltree (list 'cons)))
  a-method
  arc> (vtable (list 'cons))
  nil

I hope that helps, but I hope even more that this bug gets fixed. (I, unfortunately, have external obligations and can't continue chasing it down.)

[1]: It also tries to convert #f and nil, but that's not relevant for your problem.

-----

2 points by zhtw 6385 days ago | link

Thanks for the detailed explanation. I got the idea. Well, in my case this is going to stay deep inside the library. So any workaround would be fine.

But I solved this by adding (join ... nil).

-----

2 points by absz 6385 days ago | link

That is a much nicer in-Arc solution than mine; good catch. In any case, I'm glad I could be of assistance.

-----

4 points by absz 6387 days ago | link | parent | on: Interpreter Manipulation?

It's a very important concern. Now, I don't know the details of the subject, but Steve Yegge just gave a talk on optimizing dynamic languages; he uploaded the transcript at http://steve-yegge.blogspot.com/2008/05/dynamic-languages-st... . Thoughts?

-----

3 points by almkglor 6387 days ago | link

Possibly we can hack apart 'ac and fix the parts where 'ar-eval is randomly inserted, even though they are unnecessary (I think).

However the million dollar question of optimizing an Arc with a redefinable 'eval is the fact that Arc 'eval has TWO steps: compilation and then evaluation. Function definitions are always compiled.

For example:

  ; eval to make everything postfix
  (redef eval (e)
    (if (acons e)
        (given fun (last e)
               arg (cut body 0 -1)
          (old `(,fun ,@arg)))
        (old e)))
  (t 0 5 if)
  (nil 0 5 if)

But how do you handle functions?

  (foo ((x)
      (x
        0
        42
        if)
      fn)
    set)

Do you implicitly insert 'eval to the function body, or not?

-----

2 points by absz 6387 days ago | link

This is actually also the million-dollar semantic question. Right now, arc-eval is not inserted around function or macro bodies, so only top-level forms (those at the repl) are affected. But we can't insert it around bodies as is, because (1) Arc-side eval doesn't take an environment, but they need to pass an environment in, and (2) they don't evaluate things immediately, but instead they create the forms that will be evaluated. What is the best way to do things here? We have no other language to guide us, after all.

-----

2 points by skenney26 6386 days ago | link

I'm curious: So arc-eval translates your arc code into scheme code and then passes it to scheme's eval? If so, would it be possible to print the scheme code to a file before it gets evaluated?

-----

1 point by almkglor 6386 days ago | link

> I'm curious: So arc-eval translates your arc code into scheme code and then passes it to scheme's eval?

Yes, I think so.

> If so, would it be possible to print the scheme code to a file before it gets evaluated?

Yes. sacado did some work on this I think.

-----

1 point by almkglor 6386 days ago | link

Personally, I'm rather squeamish about letting 'eval be redefined. Because the redefined 'eval will not be consistently used.

Perhaps it's better to just define a top-level which preprocesses the code before passing it to 'eval?

-----

2 points by eds 6386 days ago | link

That could be accomplished by redefining 'arc-eval rather than 'eval. 'arc-eval is used internally for evaluating top-level forms but not really anything else since it isn't exposed to arc, whereas 'eval could be used by arbitrary arc code. If you want to redefine the top-level evaluation function, than maybe what you want is some way to influence the results of 'arc-eval.

-----

1 point by almkglor 6386 days ago | link

So much easier:

  (def mytl ()
    (pr "arc> ")
    (eval (my-transformation (read)))
    (mytl))

-----

2 points by eds 6386 days ago | link

I guess I misunderstood your intention. I thought you wanted to be able to redefine the existing toplevel used by the arc interpreter itself, not define a completely new toplevel to be used in user code... (and yes, the latter is quite trivial).

-----

1 point by almkglor 6386 days ago | link

> I thought you wanted to be able to redefine the existing toplevel used by the arc interpreter itself

As opposed to emulating the existing toplevel in user code?

Allowing a redefinition of 'eval to work on the REPL will probably be OK, but still, there would be an impedance mismatch with functions that were defined before 'eval was redefined.

-----