Arc Forum | bogomipz's comments

3 points by bogomipz 5358 days ago | link | parent | on: Ask: PG and Anti-OO

This kind of OO with closures is a fun experiment and looks very elegant at first sight. I love the (x!deposit 50 "meh") version for its simplicity, the use of ssyntax, and the fact that you can pass x!deposit around as a first class function. Thanks to macros, you can of course easily come up with a nice syntax for the definitions:

  (defclass Bank-Account (password)
    (money 0 debt 0)
    (def check-pass (pw)
      (unless (is pw password)
        (err "Wrong password!")))
    (def deposit (x pw)
      (self!check-pass pw)
      (++ money x))
    (def withdraw (x pw)
      (self!check-pass pw)
      (if (< money x)
          (err "Not enough money.")
          (-- money x)))
    (def check (pw)
      (self!check-pass pw)
      money)
    (def change-pw (new-pw pw)
      (self!check-pass pw)
      (= password new-pw)))

However, the approach has some issues in real life use. First, every bank account instance replicates the method table and so takes up more memory the more methods the class defines, and each method is a closure that takes up memory as well. Also, this hash table obviously needs to be built every time an instance is created. Another big problem that follows from the above is that when you add or redefine methods on the class, existing instances are left with the old implementation. And there is no way to implement inheritance here.

I guess it is possible to remedy most or all of those problems by sacrificing methods as closures and instead doing:

  (= bank-account-mt
    (obj check-pass (fn (self o pw)
                      (unless (is o!pw pw)
                        (err "Wrong password!")))
         deposit (fn (self o x pw)
                   (self 'check-pass pw)
                   (++ o!money x))
         withdraw (fn (self o x pw)
                    (self 'check-pass pw)
                    (if (< o!money x)
                        (err "Not enough money.")
                        (-- o!money x)))
         check (fn (self o pw)
                 (self 'check-pass pw)
                 o!money)
         change-pw (fn (self o new-pw pw)
                     (self 'check-pass pw)
                     (= o!pw new-pw))))

  (def Bank-Account (password)
    (let o (obj money 0 pw password)
      (afn (method-name . args)
        (apply (bank-account-mt method-name)
               (cons self (cons o args))))))

Again, a macro could improve readability and writability. Adding inheritance is left as an exercise for the reader.
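For readers who think more easily in Python, here is a rough analogue of the shared method table idea. It is only a sketch, not Arc: the names `bank_account_mt` and `bank_account` mirror the Arc code above, and the dict-based state is an assumption made for illustration. The point it shows is that methods are looked up in one shared table at call time, so a method added later is visible to existing instances.

```python
# Illustrative Python analogue (not Arc): one shared method table,
# with per-instance state kept in a small dict.
bank_account_mt = {}

def check_pass(self, o, pw):
    if o["pw"] != pw:
        raise ValueError("Wrong password!")

def deposit(self, o, x, pw):
    self("check-pass", pw)
    o["money"] += x
    return o["money"]

bank_account_mt["check-pass"] = check_pass
bank_account_mt["deposit"] = deposit

def bank_account(password):
    o = {"money": 0, "pw": password}   # instance state only, no methods
    def self(method_name, *args):
        # Look the method up at call time, so later (re)definitions are seen.
        return bank_account_mt[method_name](self, o, *args)
    return self

acct = bank_account("secret")
acct("deposit", 50, "secret")

# A method added after the instance was created is still reachable.
def check(self, o, pw):
    self("check-pass", pw)
    return o["money"]

bank_account_mt["check"] = check
balance = acct("check", "secret")
```

Each instance now costs one small dict plus one closure, regardless of how many methods the class has.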

-----

2 points by rocketnia 5357 days ago | link

I'm sure this doesn't surprise you, but here's a quick version of 'defclass that uses a syntax similar to your first example and an implementation similar to your second example:

  (mac defclass (name constructed-fields derived-fields . defs)
    (let mt (sym:string name '-mt)
      `(do (= ,mt (obj ,@(mappend
                           [do (case car._
                                 def  (let (name parms . body) cdr._
                                        `(,name (fn ,(cons 'self
                                                       (cons 'o parms))
                                                  ,@body)))
                                  (err:+ "An invalid 'defclass "
                                         "declaration was "
                                         "encountered."))]
                           defs)))
           (def ,name ,constructed-fields
             (let o (withs ,derived-fields
                      (obj ,@(mappend [list _ _]
                               (join constructed-fields
                                     (map car pair.derived-fields)))))
               (afn (method-name)
                 (fn args
                   (apply (,mt method-name)
                          (cons self (cons o args))))))))))
  
  (defclass Bank-Account (password)
    (money 0)
    (def check-pass (pw)
      (unless (is pw o!password)
        (err "Wrong password!")))
    (def deposit (x pw)
      self!check-pass.pw
      (++ o!money x))
    (def withdraw (x pw)
      self!check-pass.pw
      (when (< o!money x)
        (err "Not enough money."))
      (-- o!money x))
    (def check (pw)
      self!check-pass.pw
      o!money)
    (def change-pw (new-pw pw)
      self!check-pass.pw
      (= o!password new-pw)))

-----

1 point by bogomipz 5357 days ago | link

Nice, and you even changed it so x!deposit returns a function again! This does of course add some overhead since a closure is constructed every time you call a method, but still.

One thing I'm not quite happy with is that one has to write o!money. Would it somehow be possible to hide the o? Would it be possible to use !money or .money, or does the parser not allow that? And how to pass the hash table from the afn to the methods without polluting their namespaces? It could be done using a gensym, but then it is not possible to add methods to the method table outside defclass.

Perhaps doing something like this:

  (= bank-account-mt
    (obj check-pass (fn (self pw)
                      (unless (is self!ivars!pw pw)
                        (err "Wrong password!")))
         deposit (fn (self x pw)
                   self!check-pass.pw
                   (++ self!ivars!money x))
         withdraw (fn (self x pw)
                    self!check-pass.pw
                    (if (< self!ivars!money x)
                        (err "Not enough money.")
                        (-- self!ivars!money x)))
         check (fn (self pw)
                 self!check-pass.pw
                 self!ivars!money)
         change-pw (fn (self new-pw pw)
                     self!check-pass.pw
                     (= self!ivars!pw new-pw))))

  (def bank-account (password)
    (let ivars (obj money 0 pw password)
      (afn (selector)
        (if (is selector 'ivars)
            ivars
            (fn args
              (apply (bank-account-mt selector)
                     (cons self args)))))))

Then make defclass turn .foo into self!ivars!foo. Another macro could exist for (re)defining methods after the fact:

  (defmethod bank-account steal-money (x)
    (-- .money x))

Or even redefine Arc's def so you could do:

  (def bank-account!steal-money (x)
    (-- .money x))

since (bank-account 'steal-money) is not an atom and 'def could thus recognize it as different from an ordinary function definition.

-----

2 points by bogomipz 5418 days ago | link | parent | on: Poll: Best representation for JS dot notation

The table-like syntax is nice, but it has the following problem.

Let's say you have the expression;

  a!b!c!d!e!f

If you now want to replace the a!b part with (a!b 4), you end up with;

  (((((a!b 4) 'c) 'd) 'e) 'f)

Unless I'm missing something, there is no way to have ssyntax for the part after the first set of parentheses. If it was the f that gained parentheses, it would not affect the rest of the expression;

  (a!b!c!d!e!f 4)

What do you think about this alternative;

  (a.b 4 . c.d.e.f)

Your example would then be;

  (= document.body.innerHTML (document.getElementById "foo" . value))

I'm not sure if this looks as nice, although it does solve a technical problem.
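To make the nesting problem concrete, here is a small Python sketch of how a chain of `!` accesses unfolds into nested calls. It is an illustration only: `expand_bang` is a hypothetical helper that handles just the `!` case on plain symbols and is not Arc's real ssyntax expander.

```python
def expand_bang(sym):
    """Expand a!b!c into nested one-argument calls, left to right."""
    head, *rest = sym.split("!")
    form = head
    for name in rest:
        # Each ! wraps everything so far as the function position.
        form = "(%s '%s)" % (form, name)
    return form

expanded = expand_bang("a!b!c!d!e!f")
# expanded == "(((((a 'b) 'c) 'd) 'e) 'f)"
```

Because every later access wraps the whole earlier expression, turning the innermost part into a parenthesized call forces all the outer accesses to be written out by hand.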

-----

3 points by rocketnia 5418 days ago | link

(Before you get too excited, I'm not the person you replied to. ^_^ )

First, for your particular example, you could just do this:

  a!b.4!c!d!e!f

If you need (a!b 4 5), it does get more complicated, and I've gotten a bit annoyed about that myself. Nevertheless, there's still a way (albeit a way which requires a bunch of typing to refactor into):

  (!f:!e:!d:!c:a!b 4 5)

You know, I bet this would create an awful lot of JavaScript. XD

In practice, I rarely have more than three chained property accesses in C-like languages, or more than four things connected by ssyntax in Arc, so I think I'd just add one or two sets of parentheses and live with it. I won't pretend my case is typical, though. :-p

-----

2 points by bogomipz 5415 days ago | link

> I bet this would create an awful lot of JavaScript.

Not really, it should create this;

  a.b(4, 5).c.d.e.f

-----

2 points by evanrmurphy 5415 days ago | link

Well, (!f:!e:!d:!c:a!b 4 5) expands to

  ((compose (get (quote f)) 
            (get (quote e)) 
            (get (quote d)) 
            (get (quote c)) 
            (a (quote b))) 4 5)

which should in turn generate

  (function(){
    var g3283=arraylist(arguments);
    return get('f')(get('e')(get('d')(get('c')(apply(a.b,g3283)))));
  })(4,5);

-----

0 points by bogomipz 5414 days ago | link

:-(

-----

1 point by evanrmurphy 5414 days ago | link

Am I mistaken, or what do you mean by that?

-----

1 point by bogomipz 5413 days ago | link

No, no, I trust that your translation is correct. I was just disappointed that it would compile down to this much JS code since my example was designed to model a.b(4).c.d.e.f.

I don't have a running Arc to check it on at the moment because mzscheme 372 does not compile for me (probably my gcc version is too new).

-----

1 point by evanrmurphy 5413 days ago | link

Ah, I see. Yes, at the moment this compiler isn't very good at generating minimal JavaScript, since it's so faithful to arc.arc's macro definitions. A lot of the later work might involve optimizing it to produce smaller, more efficient JS.

Of course, you can still use (((((a!b 4) 'c) 'd) 'e) 'f) to generate a.b(4).c.d.e.f. [1]

> mzscheme 372 does not compile for me

Did you know Arc 3.1 works on the latest MzScheme? [2]

---

[1] Actually, you might be further disappointed to know (((((a!b 4) 'c) 'd) 'e) 'f) is currently compiling to:

  get('f')(get('e')(get('d')(get('c')(get(4)(get('b')(a))))));

get here is a JS function not unlike rocketnia's ref [3]. Its purpose is to disambiguate the Arc form (x y), which may compile to x(y), x[y] or (car (nthcdr y x)), depending on the type of x (function, array/object or cons, respectively).

[2] http://arclanguage.org/item?id=10254

[3] http://arclanguage.org/item?id=12102

-----

2 points by evanrmurphy 5405 days ago | link

> (((((a!b 4) 'c) 'd) 'e) 'f) is currently compiling to:

  get('f')(get('e')(get('d')(get('c')(get(4)(get('b')(a))))));

I wrestled with this disambiguation problem for some time and finally settled (for now ;) on a simple inference system based on the most common use cases. The algorithm is:

1. If the form has a single quoted arg, as in (x 'y), it's compiled to x['y']. This allows object access chains like document!body!innerHTML to be compiled correctly by default.

2. If the form has 0 or 2+ args, or 1 arg that isn't quoted, then it's considered a function call:

  (x) => x()
  (x y) => x(y)
  (x y z) => x(y,z)

I'm still looking into the least kludgy way to pass a single quoted arg to a function. Here are some options:

  (x "y")
  (x `y)        ; quasiquote isn't currently used for anything else
  (x 'y nil)    ; the function can just ignore the nil arg
  (fncall x 'y)
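The two inference rules above can be sketched in a few lines of Python. This is not the compiler's actual code: `compile_form` and the representation of forms (lists for calls, a `("quote", name)` tuple for a quoted symbol) are assumptions chosen for illustration.

```python
def is_quoted(arg):
    # A quoted symbol 'y is modeled as the tuple ("quote", "y").
    return isinstance(arg, tuple) and len(arg) == 2 and arg[0] == "quote"

def emit(arg):
    if is_quoted(arg):
        return "'%s'" % arg[1]
    if isinstance(arg, list):
        return compile_form(arg)
    return str(arg)

def compile_form(form):
    head, *args = form
    head_js = emit(head)
    # Rule 1: exactly one quoted arg means property access.
    if len(args) == 1 and is_quoted(args[0]):
        return "%s[%s]" % (head_js, emit(args[0]))
    # Rule 2: anything else is a function call.
    return "%s(%s)" % (head_js, ",".join(emit(a) for a in args))

# document!body!innerHTML reads as ((document 'body) 'innerHTML)
chain = compile_form([["document", ("quote", "body")], ("quote", "innerHTML")])
# chain == "document['body']['innerHTML']"
```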

-----

2 points by rocketnia 5405 days ago | link

What about something like this?

  callget!y.x

I don't know. If it comes up often enough, I think I'd rather have a special (fncall x 'y) ssyntax. Maybe x!y could expand to (fncall x 'y) and x.`y could expand to (x 'y).

-----

1 point by evanrmurphy 5404 days ago | link

I had assumed that since x.'y was read as two distinct symbols, x.`y would be too, but it's not the case:

  arc> 'x.'y
  x.
  arc> y   ; still evaluating previous expr
  arc> 'x.`y
  |x.`y|

Any idea why these are treated differently? Whatever the reason, it means I can use x.`y without hacking the reader. So, thanks for pointing this out to me! ^_^

I'm currently torn about whether to do

  x!y => (x 'y) => (fncall x 'y) => x('y')
  x.`y => (x `y) => (objref x 'y) => x['y']

as you suggested, or the reverse. Leaning toward your way so that functions are totally normal and objects special, rather than having functions with a single quoted arg be some exception.

-----

1 point by evanrmurphy 5404 days ago | link

So I went ahead and implemented it your way. ^_^ For an example of it in action, check out the following from the Hello JQuery tutorial at http://docs.jquery.com/Tutorials:Getting_Started_with_jQuery:

  $(document).ready(function() {
     $("a").click(function() {
       alert("Hello world!");
     });
   });

To reproduce this now using my arc-to-js compiler:

  ($.document.`ready (fn ()
    ($!a.`click (fn ()
      (alert "Hello world!")))))

"write much less, do more" ^_^

This example works particularly well because the $("a") jQuery selector can be compiled from $!a. A challenge arises with more complex selectors, as in this snippet from the Find Me: Using Selectors and Events tutorial:

  $(document).ready(function() {
     $("#orderedlist").addClass("red");
   });

Since $("#orderedlist") has the special character #, we're unable to compile it from $!#orderedlist. Either most of the ssyntax has to be sacrificed for parens, as in

  ($.document.`ready (fn ()
    ((($ "#orderedlist") `addClass) "red")))

or Arc's get ssyntax must be used:

  ($.document.`ready (fn ()
    (.`addClass!red ($ "#orderedlist"))))

I hope to post updated source for js.arc and arc.js soon so that people who are interested can start trying out the compiler.

-----

1 point by evanrmurphy 5401 days ago | link

Does anyone know why the reader interprets x.'y as two symbols but x.`y as only one?

-----

2 points by fallintothis 5401 days ago | link

Not quite sure (I suspect it's a bug), but it seems like it has to do with the implementation of make-readtable (which brackets.scm uses).

  $ mzscheme
  Welcome to MzScheme v4.2.1 [3m], Copyright (c) 2004-2009 PLT Scheme Inc.
  > (parameterize ((current-readtable #f)) (read))
  x`y ; read in as two items
  x
  > y
  > (parameterize ((current-readtable (make-readtable #f))) (read))
  x`y ; read in as one symbol
  |x`y|

For braver people than me, you might check the source at http://github.com/plt/racket/blob/master/src/racket/src/read....

-----

1 point by waterhouse 5412 days ago | link

In fact arc3.1 even works on Racket, the new PLT Scheme. Only thing is that the command-line "racket" prints a newline after the "arc>" prompts, for some reason. But you can open as.scm with the editor DrRacket (as you could with DrScheme), set the language to be "Pretty Big", and hit Run; it will work.

-----

1 point by bogomipz 5412 days ago | link

Wow, it seems to work fine with Racket 5.0, and I don't notice any issues with the prompt.

This should be mentioned on http://www.arclanguage.org/install

Thanks for the hint, evanrmurphy!

-----

1 point by waterhouse 5412 days ago | link

For some reason, now I don't notice any issues with the "arc>" prompt in "racket" either. And I don't think I'm doing anything differently than I was before. ...I am forced to conclude that, when entering things into the REPL, I held down the return key long enough that it accepted an extra (blank) line of input. This explains the behavior exactly. Strange that I should have done this several times in a row... and how embarrassing. Oh well. At least now I can give racket a clean bill of health.

-----

1 point by prestonbriggs 5412 days ago | link

Not for me. Nor does it work with mzscheme. I get the complaint

ac.scm:1023:0: ffi-obj: couldn't get "setuid" from #f (The specified procedure could not be found.; errno=127)

Preston

-----

1 point by waterhouse 5412 days ago | link

That is a known issue with Windows. (I'm guessing it's the reason arc3 is still the "official" version on the install page.) Simple workaround[1]: Find the line that says:

  (define setuid (get-ffi-obj 'setuid #f (_fun _int -> _int)))

and replace it with

  (define (setuid x) x)

I have done this on at least two Windows computers and Arc ran fine afterwards.

[1]Source: http://arclanguage.org/item?id=10625

-----

1 point by prestonbriggs 5411 days ago | link

Got it, thanks.

-----

2 points by ylando 5410 days ago | link

Why does Arc not have a normal web page? See:

  http://www.ruby-lang.org/en/
  http://www.python.org/
  http://www.perl.org/
  http://www.erlang.org/
  http://clojure.org/

-----

2 points by akkartik 5410 days ago | link

Because it's unfinished (and may remain so). See http://arclanguage.org and more recently http://news.ycombinator.com/item?id=1525323. No point sucking people in with a normal-looking webpage if the language isn't really ready for production use.

-----

1 point by evanrmurphy 5408 days ago | link

Could you talk about your decision to use it for Readwarp then? If Arc's not really ready for production use, might it still be a good choice for a certain minority of developers?

-----

2 points by akkartik 5408 days ago | link

Yeah, I'm not trying to say you shouldn't use it for production use :)

They're opposing perspectives. As a user of arc I'd throw it into production[1]. At the same time, from PG's perspective I'd want to be conservative about calling it production ready.

I suspect arc will never go out of 'alpha' no matter how mature it gets, just because PG and RTM will not enjoy having to provide support, or having to maintain compatibility.

[1] With some caveats: treat it as a white box, be prepared to hack on its innards, be prepared to dive into scheme and the FFI. And if you're saving state in flat files, be prepared for pain when going from 1 server to 2.

Not all kinds of production are made the same.

-----

1 point by evanrmurphy 5415 days ago | link

> The table-like syntax is nice, but it has the following problem. [...] Unless I'm missing something, there is no way to have ssyntax for the part after the first set of parentheses.

Yes, this is sometimes a problem for me too, or at least an annoyance. It's one of those things that's a bug or feature depending upon who you ask, though. [1] Whichever way you classify it, the root issue is with Arc, not the compiler, which just conforms to Arc's ssyntax rules.

  (= document.body.innerHTML (document.getElementById "foo" . value))

Interesting formulation, but the inner parens' inclusion of value makes it look like value is another argument in the function call. It also might be too similar to dotted cons notation, e.g. '("foo" . value).

[1] The explanation at http://arclanguage.org/item?id=2195 has helped me to appreciate that ssyntax only works between symbols.

-----

1 point by bogomipz 5819 days ago | link | parent | on: Multi character matching, and ssyntaxes

Nice idea! This would eliminate both pos and posmatch.

Call the string with an index to get the character at that position, call it with a character or string to find the index.

Although, given your example, it looks like (str "world") should return a range.

-----

1 point by shader 5819 days ago | link

That doesn't sound too hard to do. Assignment and position on strings are handled by string-set! and string-ref. If those were modified to accept a string as input instead of just a numerical index, then Adlai's code would work.

Maybe we should just make two scheme functions str-set! and str-ref and use those instead, as opposed to over-writing the original functions.

This sounds like a good spot for the redef macro ;)

Anyway, because position matching and assignment are handled separately, (= (str "world") "foo") could still work even without (str "world") returning a range.

-----

1 point by bogomipz 5819 days ago | link

Yes, there just seems to be a dilemma of whether (str "world") should return an index or a range. If Arc had multiple return values, it could return the start and end indices, and a client that only uses the start index would just ignore the second value :)

-----

2 points by Adlai 5818 days ago | link

The return value should correspond to what was being searched for.

In other words, searching for one character should return an index, while searching for a substring should return a range.

There are thus four operations which would ideally be possible through ("abc" x):

  arc> (= str "hello arc!")
  "hello arc!"
  arc> (str "arc")
  6..8     ; or some other way of representing a range
  arc> (str #\!)
  9
  arc> (str 5)
  #\space
  arc> (str 4..7)   ; same as previous comment
  "o ar"

A way to take advantage of multiple values, if they were available, could be something like this:

  arc> (str #\l)
  2
  3
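As a rough way to try out these four behaviours, here is a Python sketch. It is an analogue only, with several assumptions: `CallableString` is a hypothetical wrapper, an inclusive range such as 6..8 is modeled as the tuple `(6, 8)`, and a one-character string stands in for an Arc character since Python has no separate character type. It returns only the first match, so the multiple-values idea is not covered.

```python
class CallableString:
    """Hypothetical wrapper; Arc strings are not actually extended this way."""

    def __init__(self, text):
        self.text = text

    def __call__(self, key):
        if isinstance(key, int):       # (str 5): character at that index
            return self.text[key]
        if isinstance(key, tuple):     # (str 4..7): inclusive slice
            start, end = key
            return self.text[start:end + 1]
        if isinstance(key, str):
            start = self.text.find(key)
            if start < 0:
                return None            # no match
            if len(key) == 1:          # (str #\!): index of the character
                return start
            # (str "arc"): inclusive range covering the match
            return (start, start + len(key) - 1)
        raise TypeError("unsupported key: %r" % (key,))

s = CallableString("hello arc!")
```

With this sketch, `s("arc")` gives `(6, 8)`, `s("!")` gives `9`, `s(5)` gives `" "` and `s((4, 7))` gives `"o ar"`.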

-----

1 point by conanite 5818 days ago | link

Just curious - wouldn't it suffice to return the index of the beginning of the matched string when running a substring search?

  arc> (str "arc")
  6

, because you already know the length of "arc", so you don't really need a range result?

Otherwise, these are great ways to occupy the "semantic space" of string in fn position.

-----

1 point by shader 5818 days ago | link

I agree with you. I don't think that returning a range is necessary.

Even if call position and assignment weren't handled separately, it would still be possible to work off of the length of the argument and the index, without needing a range.

The question is whether or not pg agrees with us enough to add it to arc3 ;)

-----

1 point by conanite 5818 days ago | link

If there are 100 arc coders in the world, there are probably 100 versions of arc also. The question is whether you want it in your arc :)

-----

1 point by shader 5818 days ago | link

True. And I do. Unfortunately, I'm busy working on several other things at once right now. If you want to start working on it, be my guest. Hopefully I'll be able to share what I've been doing soon.

-----

1 point by Adlai 5818 days ago | link

I guess that (str "world") could just return an index, because (= (str "world") "arc") has access to the entire call, and can thus calculate

  (+ (str "world") (len "world"))

to figure out what the tail of the string should be after a substitution.

-----

1 point by shader 5818 days ago | link

Well, scheme supports multiple values, so it shouldn't be too hard to get them into arc, right?

-----

1 point by conanite 5818 days ago | link

arc supports multiple values via destructuring

  (let (a b c) (fn-returning-list-of-3-things)
    ...)

In the particular case of returning multiple indexes into a string though, you don't usually know in advance how many matches there will be, so destructuring isn't an option.

-----

1 point by Adlai 5818 days ago | link

Multiple return values from a form are allocated on the stack, not on the heap. I don't 100% understand what that means, though...

One practical consequence is that you don't have to deal with later multiple values if you don't want to, but when values are returned as a list, you have to deal with them.

-----

3 points by bogomipz 6208 days ago | link | parent | on: Multiple Return Values?

What if your function originally just returned one value, but at some later point you realize that a second value would be useful in some situations?

With multiple return values you can just extend it without breaking existing clients. If, on the other hand, you add a list wrapper around the returned values, all call sites must be changed to take car of the list.

-----

3 points by bOR_ 6202 days ago | link

That would be useful indeed. The flip side of the coin might be something that was sort of mentioned in 'on lisp'. If all functions return only one value (be it a list or a single value) by default, you can write a general memoize layer around functions that doesn't have to check how many multiple return values are returned.

I also noticed a carif function in arc. If you are worried about single values that will become lists in the future, you might start using carif in your current clients.
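A minimal Python sketch of that defensive style, assuming carif returns an atom unchanged and otherwise the first element of a list (the helper below is an analogue written for illustration, not Arc's source):

```python
def carif(x):
    """Return x itself if it is not a list, else its first element."""
    if isinstance(x, (list, tuple)):
        return x[0] if x else None   # empty list plays the role of nil
    return x

# A client written against carif keeps working when the function it
# calls later starts returning a list of values instead of one value.
old_style_result = 6
new_style_result = [6, 8]
same = carif(old_style_result) == carif(new_style_result)
```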

-----

2 points by bogomipz 6223 days ago | link | parent | on: Are strings useful ?

I think the strongest reason for separate strings and symbols is that you don't want all strings to be interned - that would just kill performance.

About lists of chars. Rather than analyzing lists every time to see if they are strings, what about tagging them? I've mentioned before that I think Arc needs better support for user defined types built from cons cells. Strings would be one such specialized, typed use of lists.

Also, how do you feel about using symbols of length 1 to represent characters? The number one reason I can see not to, is if you want chars to be Unicode and symbols to be ASCII only.

-----

2 points by sacado 6223 days ago | link

Symbols, ASCII only ? No way, I'm writing my code in French, and I'm now used to calling things the right way, i.e. with accents. "modifié" means "modified", "modifie" means "modifies", that's not the same thing, I want to be able to distinguish between both. Without accents, you can't.

Furthermore, that would mean coercing symbols into strings would be impossible (or at least the 1:1 mapping would not be guaranteed anymore).

-----

2 points by stefano 6223 days ago | link

From the implementation point of view representing characters as symbols is a real performance issue, because you would have to allocate every character on the heap, and a single character would then take more than 32 bytes of memory.

-----

2 points by sacado 6223 days ago | link

I think that's an implementation detail. You could still somewhat keep the character type in the implementation, but write them "x" (or 'x) instead of #\x and making (type c) return 'string (or 'sym).

Or, if you take the problem the other way, you could say "length-1 symbols are quite frequent and shouldn't take too much memory -- let's represent them a special way where they would only take 4 bytes".

-----

1 point by stefano 6222 days ago | link

This would require some kind of automatic type conversions (probably at runtime), but characters-as-symbols seems doable without the overhead I thought it would lead to.

-----

2 points by bogomipz 6277 days ago | link | parent | on: Quick question - un-setting a symbol?

Right indeed. The normal way to do this would be;

  (with (y nil foo nil)
    (= foo (fn (x) (+ x y)))
    (= y 10)
    (foo 5))

You can't use 'def there because, unlike in Scheme, def always makes a global binding in Arc. Including x in the with is not necessary, by the way.

From the sound of it, this does not solve lacker's problem, however, because he does not know up front what variables he needs to declare.

-----

4 points by bogomipz 6277 days ago | link | parent | on: (rev '()) - nil

Simply () works too, and is more than 33% shorter ;)

The following shows four ways to express the empty list, and proves that they mean the same in Arc. What's a little peculiar, however, is that one gets printed differently;

  arc> (is () '() nil 'nil)
  t
  arc> ()
  ()
  arc> '()
  nil
  arc>

-----

1 point by bogomipz 6279 days ago | link | parent | on: Lists as functions on symbol arguments

And a third alternative is to use dotted pairs for the associations, but my point was that by treating a plain list as alternating keys and values, it plays nice with rest arguments in functions.

Generally, a list may be interpreted in different ways in different situations, and a common complaint about lisp is that you can't tell if a cons cell is supposed to be the starting point of a tree, an assoc list, a sequence, or something else. I think the way to tackle this in Arc should be to make better use of annotations.

A rest argument will always be a plain list without a tag. That's the reason for the suggested interpretation of kvp!b.

-----

1 point by nlavine 6278 days ago | link

Why are we assuming that keyword arguments must be passed as flat lists of keywords and values?

  (bar 1 2 ('foo 3) ('baz 4))

I agree the flat way is cleaner, but this is certainly a possibility too.

-----

1 point by cooldude127 6279 days ago | link

well, you could always just use pairs to turn the interleaved list into an alist.
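In Python terms the conversion might look like the sketch below. The helper names `pair` and `lookup` are illustrative assumptions, and the code assumes the flat list has an even number of elements.

```python
def pair(xs):
    """Group a flat list [k1, v1, k2, v2, ...] into [(k1, v1), (k2, v2), ...]."""
    return [tuple(xs[i:i + 2]) for i in range(0, len(xs), 2)]

def lookup(rest_args, key, default=None):
    """Treat a flat rest-argument list as alternating keys and values."""
    for k, v in pair(rest_args):
        if k == key:
            return v
    return default

rest = ["a", 1, "b", 2]
alist = pair(rest)          # [("a", 1), ("b", 2)]
```

The extra pass over the list on every lookup is the cost of keeping the call site flat.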

-----

1 point by bogomipz 6279 days ago | link

Yes, with the overhead of the operation plus a let form.

My suggestion only really applies if pg decides against adding keyword arguments to Arc.

-----

2 points by cchooper 6279 days ago | link

Exactly. It's basically a roundabout way of adding keywords into the language. A better idea would be to just add them, and then list-functional notation could be used for something more generally useful.

-----

3 points by bogomipz 6279 days ago | link | parent | on: Regular expressions

So ruby has a syntax for regular expressions, such as /\D+/. What I've always wondered is, does this have any advantage at all?

I mean, the actual regex operations are done by methods on the string class, which like nex3 mentioned is at the library level.

Is there any reason

  a_string.split(/\D+/)

is better than

  a_string.split("\D+")

Please do enlighten me.

-----

5 points by nex3 6279 days ago | link

A distinction between regexen and strings is actually very handy. I've done a fair bit of coding in Ruby, where this distinction is present, and a fair bit in Emacs Lisp, where it's not.

There are two places where it really matters. First, if regexen are strings, then you have to double-escape everything. /\.foo/ becomes "\\.foo". /"([^"]|\\"|\\\\)+"/ becomes "\"([^\"]|\\\\\"|\\\\\\\\)+\"". Which is preferable?

Second, it's very often useful to treat strings as auto-escaped regexps. For instance,

  a_string.split("\D+")

is actually valid Ruby. It's equivalent to

  a_string.split("D+")

because D isn't an escape char, which will split the string on the literal string "D+". For example

  "BAD++".split("D+") #=> ["BA", "+"]

Now, I'm not convinced that regexen are necessary for nearly as many string operations as they're typically used for. But I think no matter how powerful a standard string library a language has, they'll still be useful sometimes, and then it's a great boon to have literal syntax for them.

-----

3 points by bogomipz 6279 days ago | link

Ok, so what it comes down to, is that you don't want escapes to be processed. Wouldn't providing a non-escapable string be far more general, then?

Since '\D+' clashes with quote, maybe /\D+/ is a good choice for the non-escapable string syntax. Only problem is that using it in other places might trigger some reactions as the slashes make everybody think of it as "regex syntax".

-----

3 points by nex3 6279 days ago | link

Escaping isn't the only thing. Duck typing is also a good reason to differentiate regular expressions and strings. foo.gsub("()", "nil") is distinct from foo.gsub(/()/, "nil"), and both are useful enough to make both usable. There are lots of similar issues - for instance, it would be very useful to make (/foo/ str) return some sort of match data, but that wouldn't be possible if regexps and strings were the same type.

-----

4 points by bogomipz 6278 days ago | link

Now we're getting somewhere :) For this argument to really convince me, though, Arc needs better support for user defined types. It should be possible to write special cases of existing functions without touching the core definition. Some core functions use case forms or similar to treat data types differently. Extending those is not really supported. PG has said a couple of times;

"We believe Lisp should let you define new types that are treated just like the built-in types-- just as it lets you define new functions that are treated just like the built-in functions."

Using annotate and rep doesn't feel "just like built-in types" quite yet.

-----

2 points by almkglor 6278 days ago | link

Try 'redef on nex3's arc-wiki.git. You might also be interested in my settable-fn.arc and nex3's take on it (settable-fn2.arc).

-----

3 points by earthboundkid 6278 days ago | link

You could always do it the Python way: r"\D+" => '\\D+'

There's also u"" for Unicode strings (in Python <3.0) and b"" for byte strings (in Python 2.6 and later).
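A short sketch of the raw-string approach, using only Python's standard `re` module; the sample date string is made up for illustration:

```python
import re

# A raw string leaves the backslash alone, so no double-escaping is needed.
raw = r"\D+"
escaped = "\\D+"
same_pattern = raw == escaped      # both are the 3 characters \ D +

# The pattern is still an ordinary string; the regex library gives it meaning.
parts = re.split(r"\D+", "1984-08-08")
```

Here `parts` is `["1984", "08", "08"]`. Note that this addresses only the escaping problem, not the separate-type argument made elsewhere in the thread.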

-----

2 points by map 6279 days ago | link

If the "x" modifier is used, whitespace and comments in the regex are ignored.

  re =
  %r{
      # year
      (\d {4})
      # separator is one or more non-digits
      \D+
      # month
      (\d\d)
      # separator is one or more non-digits
      \D+
      # day
      (\d\d)
  }x

  p "the 1st date, 1984-08-08, was ignored".match(re).captures

  --->["1984", "08", "08"]
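For comparison, Python's `re` module has a similar mode via the `re.VERBOSE` flag. The sketch below is a rough equivalent of the Ruby example above, not a translation guaranteed to match Ruby's whitespace rules in every case; the name `DATE` is made up for illustration.

```python
import re

# re.VERBOSE ignores whitespace in the pattern and allows # comments.
DATE = re.compile(r"""
    (\d{4})   # year
    \D+       # separator is one or more non-digits
    (\d\d)    # month
    \D+       # separator is one or more non-digits
    (\d\d)    # day
""", re.VERBOSE)

match = DATE.search("the 1st date, 1984-08-08, was ignored")
captures = match.groups()
```

Here `captures` is `("1984", "08", "08")`.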

-----

3 points by bogomipz 6286 days ago | link | parent | on: Bug with voting on the forum

Is this browser dependent? In Firefox 2.0.0.12 on Linux, I can't reproduce what you describe. After the first click, clicking on the same spot has no effect at all.

-----

1 point by absz 6286 days ago | link

Then it probably is browser-dependent (I thought it might be). It's a quirk of setting something hidden, I suppose.

-----