Arc Forum | partdavid's comments
2 points by partdavid 5876 days ago | link | parent | on: How Unicode works

Yes, though unfortunately that article is oversimplified. Maybe these make better links:

http://www.unicode.org/reports/tr15/ http://www.unicode.org/reports/tr9/ http://www.unicode.org/reports/tr10/

-----

2 points by partdavid 5888 days ago | link | parent | on: Arc challenge: 8 Queens Problem

I'm not thrilled with this escript (Erlang), mainly because Erlang provides a perfectly nice way to format the resulting lists, but lojic dictates the output must be exactly the same; hence the fairly ugly business in main():

  #!/usr/bin/env escript
  
  -import(lists, [seq/2, reverse/1]).
  
  main([]) ->
     lists:foreach(fun(L) ->
                         io:format("[ ~s ]~n",
                                   [string:join(
                                      [ integer_to_list(I)
                                        || I <- L ], ", ")])
                   end, queens([], 0)).
                         
  
  threatens(Q, _, [Q|_], _) -> true;
  threatens(_, _, [], 0) -> false;
  threatens(Q, I, [Q2|Queens], I2) ->
     if
        abs(I - I2) =:= abs(Q2 - Q) -> true;
        true -> threatens(Q, I, Queens, I2 - 1)
     end.
  
  is_valid([Q|Queens]) ->
     TL = length(Queens),
     not threatens(Q, TL + 1, Queens, TL).
  
  queens([], _) -> queens([ [X] || X <- seq(1, 8) ], 8);
  queens(Tree, 1) -> Tree;
  queens(Tree, Place) ->
     queens([ [N|Node] || N <- seq(1, 8),
                          Node <- Tree,
                          is_valid([N|Node]) ],
            Place - 1).

Note that threatens/4 there does short-circuit and avoids evaluating needless cases, without "return."
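
For comparison, the same search can be sketched in Ruby, the language the challenge was posed in. This is a loose rendering of the Erlang above, not lojic's solution; the names threatens?, queens and format_board are made up for illustration, and the order in which solutions are printed is not claimed to match. any? stops at the first attacking queen, which plays the role of the short-circuiting clauses in threatens/4.

```ruby
# Hypothetical Ruby rendering of the Erlang solution above.
# A board is an array of row numbers, most recently placed queen first.

# True if a new queen in `row` is attacked by a queen already placed.
# `any?` stops at the first hit, so later queens are not examined.
def threatens?(row, placed)
  placed.each_with_index.any? do |other_row, index|
    # placed[index] sits index + 1 columns away from the new queen
    other_row == row || (other_row - row).abs == index + 1
  end
end

# All solutions for an n x n board, built one column at a time.
def queens(n = 8)
  (1...n).reduce((1..n).map { |row| [row] }) do |boards, _|
    boards.flat_map do |board|
      (1..n).reject { |row| threatens?(row, board) }
            .map { |row| [row] + board }
    end
  end
end

# Produces the spaced "[ 1, 2, 3 ]" layout the challenge asks for.
def format_board(board)
  "[ #{board.join(', ')} ]"
end

queens.each { |board| puts format_board(board) }
```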

-----

2 points by lojic 5888 days ago | link

"but lojic dictates"

I prefer the word encourage :) I know it's nitpicky, but the idea was to not simply use a native print. Of course, had I required the 'pp' Ruby lib, I could have just said:

  pp stack   # => [ 1, 2, 3, ... , 8 ]

-----

2 points by partdavid 5888 days ago | link

I liked the pun in "lojic dictates", though :)

My point was more that having to put spaces in meant doing it myself rather than just io:format("~p~n", [Answer]).

-----

1 point by lojic 5888 days ago | link

Ah, I missed the pun - nice.

Your point is my point. I wanted to see how well Arc handles formatting something that didn't fit neatly into an expected pattern.

Ruby has several formatting techniques that are quite nice:

1) String interpolation. You can insert an expression into a string by using #{expression}

2) String formatting. You can use sprintf style patterns in strings via format or the % operator. For example:

  puts "%4.1f hours" % hours
  puts "%4.1f hours and %4.1f minutes" % [hours, minutes]

So, does Arc have format now?
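
A small sketch of the two Ruby techniques side by side may help. The method names interpolated and formatted, and the sample values in the usage note, are made up for illustration.

```ruby
# 1) Interpolation: any expression can go inside #{...}.
def interpolated(hours)
  "#{hours} hours is #{(hours * 60).round} minutes"
end

# 2) sprintf-style patterns; format(...) and the % operator are equivalent.
def formatted(hours, minutes)
  format("%4.1f hours and %4.1f minutes", hours, minutes)
end

def formatted_with_operator(hours, minutes)
  "%4.1f hours and %4.1f minutes" % [hours, minutes]
end
```

With hours = 7.5 and minutes = 450.0, the %4.1f pattern pads the first number to four characters (" 7.5") and lets the second overflow the width ("450.0").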

-----


I'm not sure what session is, there.

-----

1 point by tiglionabbit 5889 days ago | link

Session is a base64-encoded cookie with a secret key for each sinatra app.
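
As a rough illustration of what a "base64-encoded cookie with a secret key" can mean, here is a minimal sketch of a signed cookie value. It is an assumption-laden toy, not the layout Sinatra or Rack actually use: the helper names, the "--" separator and the choice of HMAC-SHA256 are all invented here.

```ruby
require "openssl"

# Hypothetical helpers; real Sinatra/Rack cookies use their own layout.

# Base64-encode the payload and append an HMAC computed with the app secret.
def encode_session(payload, secret)
  data = [payload].pack("m0") # Base64 without newlines
  signature = OpenSSL::HMAC.hexdigest("SHA256", secret, data)
  "#{data}--#{signature}"
end

# Compare two strings without stopping at the first differing byte.
def same_bytes?(a, b)
  return false unless a.bytesize == b.bytesize
  a.bytes.zip(b.bytes).reduce(0) { |acc, (x, y)| acc | (x ^ y) }.zero?
end

# Returns the payload, or nil if the cookie was not signed with `secret`.
def decode_session(cookie, secret)
  data, signature = cookie.split("--", 2)
  return nil if data.nil? || signature.nil?
  expected = OpenSSL::HMAC.hexdigest("SHA256", secret, data)
  return nil unless same_bytes?(expected, signature)
  data.unpack1("m0")
end
```

The payload is only encoded, not encrypted: anyone can read it, but without the secret they cannot change it and still produce a matching signature.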

-----

1 point by partdavid 5903 days ago | link | parent | on: Docstrings: a contrarian view

Little discursive notes about why one thing works and something more obvious doesn't can be really helpful. Commentary about the code you didn't write can be helpful.

But in my experience on large projects that kind of documentation becomes:

   # do_my_foo() is a function accepts a float and
   # returns a float, performing necessary calculations.
   double do_my_foo(double inarg) {
   .
   .
   .

That is, they lie, they rot, and, since it's a "requirement" the programmer wasn't inclined to perform, they aren't informative, either.

-----


I think the OP's point was, if it's so much slower than Mzscheme, something is "wrong" with Arc itself, not the runtime.

-----


I don't think that it is. It's a way of saying that the Arc language satisfies two constraints: it is suitable for writing programs, and suitable for writing specifications. For "exploratory programming" this might be the right thing.

I don't know if Arc actually satisfies that, but I can certainly see the point. Part of this would be a lack of optimization in implementation: implement something in the most straightforward manner, and it is its own specification.

-----

2 points by cadaver 5909 days ago | link

On the other hand, in a separate english-language specification you could say that a certain thing is unspecified and leave it up to the implementation to decide on a certain behaviour. I have been reading a bit in the scheme report (that's a specification right?) and there are a certain number of unspecifications in there.

-----

1 point by cooldude127 5899 days ago | link

i don't think arc is intended to have more than one implementation, or at least more than one popular implementation. if this is the case, nothing is unspecified. whatever the implementation does, that is the language.

-----

0 points by soegaard 5909 days ago | link

Letting the implementation be the spec rules out bugs in the implementation. If, say, (+ 1 1) returns 3, then it isn't a bug, since that's what the spec says.

-----

6 points by pg 5909 days ago | link

It means you've specified a bad language.

-----

2 points by soegaard 5909 days ago | link

Sure. In general it isn't so easy to figure out whether something was done on purpose.

Example: Was it intentional that (1 . + . 1) evaluated to 2?

-----

4 points by pg 5909 days ago | link

It's rather a dishonest argument to use an example like that, because it's an artifact of bootstrapping the prototype off MzScheme. A more convincing argument would be strange behavior resulting from the way something was defined in arc.arc.

-----

9 points by kens 5908 days ago | link

Is annotate a general mechanism, or specifically for defining macros? Is ellipsize supposed to limit its output to 80 characters or display at most 80 characters of the input? Is median of an even-length list supposed to return the lower of the middle two? Is cdar supposed to be missing? Is (type 3.0) supposed to be an int? Is support for complex numbers supposed to be in Arc? Is client-ip supposed to be there, or is it left over from something? Does afn stand for "anonymous fn"? What does "rfn" stand for?

These are all real questions that I've faced while trying to write documentation. Let me make it clear that these are not rhetorical questions; I'm more interested in getting actual answers than arguing about using the code as the spec.

-----

7 points by pg 5908 days ago | link

The former; the latter; yes; yes; yes (for now); yes; the former (for now); anaphoric; recursive.

-----

6 points by parenthesis 5908 days ago | link

I've matched pg's replies up with the questions, to make this discussion easier to read:

annotate is a general mechanism

ellipsize is supposed to display at most 80 characters of the input

median of an even-length list is supposed to return the lower of the middle two

cdar is supposed to be missing

(type 3.0) is supposed to be an int (for now)

support for complex numbers is supposed to be in Arc

client-ip is supposed to be there (for now)

afn stands for "anaphoric function"

rfn stands for "recursive function"
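
Two of these answers are easy to pin down in code. The following Ruby sketch only illustrates the confirmed behaviour of median and ellipsize; it is not Arc's source, and the "..." suffix is an assumption about how the cut is marked.

```ruby
# Ruby illustrations of the behaviour described above; not Arc's source.

# For an even-length list, returns the lower of the two middle values.
def median(numbers)
  numbers.sort[(numbers.length - 1) / 2]
end

# Shows at most `limit` characters of the input, then marks the cut.
# The "..." marker is an assumption made for this sketch.
def ellipsize(text, limit = 80)
  text.length <= limit ? text : "#{text[0, limit]}..."
end
```

So median([4, 1, 3, 2]) picks 2 rather than 3 or 2.5, and a 100-character string comes back as its first 80 characters plus the marker.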

-----

2 points by drewc 5899 days ago | link

But pg's was so much more concise! ;)

-----

4 points by eds 5908 days ago | link

>> Is cdar supposed to be missing?

> yes

Wasn't the point of keeping the names car and cdr that you can compose them? (I remember reading such in one of pg's essays.) Then it seems to me that to take full advantage of that you need to provide those names for use.

I don't think it is unreasonable to do the following, but it is currently not provided in Arc:

  arc> (cdar '((1 2 3) (4 5 6)))
  Error: "reference to undefined identifier: _cdar"

Maybe this is just me missing CL's four levels of composition of car and cdr.

-----

8 points by pg 5908 days ago | link

I didn't mean Arc will never have cdar. But to avoid having a language designed according to arbitrary rules rather than the actual demands of the task, I've been trying to be disciplined about not adding operators till I need them.

-----

5 points by parenthesis 5908 days ago | link

I suppose you can cdr:car .

On the one hand, it does feel like all the c....r s should be there.

On the other hand, I think cadr is pretty much the only one I ever actually use; and it is there.
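
The idea behind cdr:car, building cdar out of the two primitives instead of naming it, can be sketched in Ruby with Proc composition. The constants below are hypothetical stand-ins for the Arc operators, with arrays playing the role of lists.

```ruby
# Hypothetical stand-ins for the Arc operators, using arrays as lists.
CAR = ->(list) { list.first }
CDR = ->(list) { list.drop(1) }

# Arc's cdr:car means "apply car, then cdr"; Proc#>> composes in that order.
CDAR = CAR >> CDR
# cadr is "apply cdr, then car".
CADR = CDR >> CAR
```

With these, CDAR.call([[1, 2, 3], [4, 5, 6]]) gives [2, 3], which is what eds's (cdar '((1 2 3) (4 5 6))) would return if cdar were defined.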

-----

2 points by drcode 5908 days ago | link

I noticed this contradiction, too... :) If we're not going to use c[ad]r composability, why not just use unique, easily distinguishable names for all of these that don't compose:

  car  --> hd
  cdr  --> tl
  caar --> inner
  cddr --> skip2
  cadr --> second
...or something like that. Unique names would reduce errors.

-----

4 points by soegaard 5909 days ago | link

Never mind the example. What troubles me with the the-code-is-the-spec approach is that for an outsider, it is impossible to tell which decisions were made deliberately and which were accidental.

Just for the record, I find it is fair game to say there is no specification, while the experimentation phase is still going on.

-----

5 points by pg 5908 days ago | link

It doesn't matter whether features are deliberate or not. It's very common in math for someone to discover something that has interesting properties they never imagined. In fact, it's probably closer to the truth to say that if a mathematical concept doesn't have properties the discoverer never imagined, it's not a very interesting one.

Lisp itself is an example of this phenomenon. JMC didn't expect to use s-expressions in the real language, but they turned out to be way more useful than he envisioned.

I'm not just splitting hairs here, or trying to defend myself. In design (or math), questions of deliberateness are not binary. I'll often decide on the design of an operator based on what looks elegant in the source, rather than to meet some spec, just as Kelly Johnson used beauty as a heuristic in designing aircraft that flew well.

-----

1 point by shiro 5908 days ago | link

It's a good argument in a general sense, but I doubt it is applicable to a library API.

If you're delivering a final product, users don't care if some design is deliberate or not; they care whether it is good or bad. If you're deriving a mathematical proof, others don't care if some choice is deliberate or not; they care if it is correct, beautiful, or useful to prove other theorems. That's because changing your choice afterwards won't break existing products or proofs that relied on the previous choices.

In the case of a library API, changing your choice does break existing software relying on the library. In the current Arc case it is forewarned so it's perfectly fine, but at some point (50 years from now, maybe?) you have to let more people write software on it; by that moment it should be clear that what they can rely on and what they cannot.

-----

2 points by pg 5908 days ago | link

by that moment it should be clear that what they can rely on and what they cannot

The only difference if the implementation is the spec is how they know what they can rely on. If the implementation is the spec, they decide by reading the source; if it's a document written in English, they decide by reading that.

-----

4 points by shiro 5908 days ago | link

Implementation can be, and will be, changed, inevitably. Then does the language change as well, or does the language remain the same while just the implementation is improved? How can you tell the difference purely from the source?

Some Scheme implementation evaluates arguments left to right. You can see that by reading the source. In future, it may switch to right to left, maybe for better optimization. The spec in natural language, or in a more formal and abstract form like Appendix A of R6RS, can explicitly say the order of evaluation is unspecified. How do you tell your users, purely by the source code, that they should not rely on the evaluation order, given the state of most programming languages?

Ideally I like to think the source only describes the spec and the compiler and runtime figure out the details, so maybe spec-by-source and spec-by-other-notation will converge in future. Is that what you are talking about?

(Please don't argue about whether unspecified evaluation order is bad or not; I just use that example to illustrate the point. And incidentally, since Arc is defined in terms of Scheme, the order of argument evaluation is unspecified as well. But that's just delegating the spec to a lower level.)
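
To make the evaluation-order example concrete, here is a small Ruby sketch (Ruby rather than Scheme, to match the other examples in the thread). The helper names are made up. The sum is 3 under either order; only the recorded log reveals which argument was evaluated first, and that is the kind of detail a user might come to rely on after reading or running a single implementation.

```ruby
def add(a, b)
  a + b
end

# Returns the order in which the two argument expressions were evaluated.
def evaluation_order
  log = []
  note = lambda do |label, value|
    log << label # side effect that exposes the order
    value
  end
  add(note.call(:left, 1), note.call(:right, 2))
  log
end
```

Current Ruby evaluates arguments left to right, so the log reads [:left, :right]; a language whose spec leaves the order open could legitimately produce either.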

-----

1 point by soegaard 5908 days ago | link

Actually PLT Scheme guarantees left-to-right order, but that doesn't change your point.

-----

2 points by akkartik 5908 days ago | link

One point everybody else is missing: since arc explicitly makes no claims of backwards compatibility, the notion of a spec is meaningless.

If the goal of a language is to be readable there's nothing wrong in the implementation being the spec. Consider it a form of self-hosting, or of eating your own dogfood.

---

An implementation in a reasonably high-level declarative language is a more reasonable 'spec' than one in C. More features are amenable to checking just by reading rather than by execution.

When something is obviously a bug it doesn't matter if it's a bug in the spec or the implementation.

Those two categories -- obvious bugs, and questions about what is or is not in the language that should be answered by reading the code rather than executing it -- seem to cover all the objections people have raised.

-----

1 point by shiro 5908 days ago | link

At least I'm talking about the attitude of spec-by-source in general, not particularly about Arc, FYI.

Edit: I agree that more abstract, declarative language is closer to spec-by-source. If somebody says Prelude.hs is the spec of Haskell's standard library I agree. But the core language semantics is still not in Haskell itself, is it? (I'm still learning. Correct me if I'm wrong.)

-----

1 point by almkglor 5908 days ago | link

Right. And nobody needs comments in source code, ever.

-----

1 point by akkartik 5908 days ago | link

What!?

FWIW, here's how I think comments in source code can be improved, especially in exploratory programming: http://akkartik.name/codelog.html

-----

1 point by almkglor 5908 days ago | link

Oh come on. What are comments but prose descriptions of the code?

Anyway please reparse my post with <sarcastic>...</sarcastic> around it.

-----

1 point by akkartik 5907 days ago | link

I did get the sarcasm, but not how you extrapolate from my comment.

"There's nothing wrong with a code spec in this case" != "prose specs suck"

-----

1 point by almkglor 5907 days ago | link

Hmm. Must be a bug in my reader ^^

-----

2 points by oconnor0 5908 days ago | link

The problem is that as people learn the language they will build mental maps of what works and what doesn't and in the process will write code that depends on things that could legitimately be considered bugs or arbitrary side effects of the current implementation.

Whether or not this matters to you or even should matter is another concern, but this has been a spot of contention for languages like Python and OCaml whose spec is the code.

-----

1 point by partdavid 5912 days ago | link | parent | on: First Priority: Core Language

There's nothing joyful about regular expressions. For one thing, as above, it leaks logic all over your code. Secondly, it's unclear--it would be quite difficult to identify a bug in your expression. A related clarity problem is that you have restricted inputs to a subset of valid inputs, and it's hard to see how or why. Third, they are brittle and hard to instrument for diagnostics.

-----

1 point by lojic 5911 days ago | link

I don't see how it "leaks logic all over your code". But I like to keep an open mind - what are you suggesting as an alternative for the above example?

Regarding the difficulty in identifying a bug in the expression, I tend to agree. That's why I have a lot of unit test cases for each meaningful regular expression.

Regarding restricting the inputs to a subset of valid inputs, which inputs would you like to accept that the regex rejects (for U.S. currency only)? I haven't had any complaints yet, but that's not to say it won't happen in the future.

-----

1 point by partdavid 5903 days ago | link

1) If you have capture patterns, you have code in one place dependent on the expression in another without the coupling being clear. More mildly, you are married to regular expression operators because you have direct references to your (regular-expression-defined) subprogram all over. You can't decide not to make it a regular expression; or to make it two, or make it a much clearer expression and a programmatic dress-up. The alternative is to not use regular expressions.

2) Eh, unit tests can catch the mistakes you anticipated making. Lots of other mistakes are possible. Why write a complex regular expression and a page full of unit testing code for it when you could write more straightforward logic?

3) Like I said, it's not clear why you have the expressions you do.

I'm not saying regular expressions never ever have their place. In particular, they can be a convenient method to offer users to specify search and validation patterns and that kind of thing. But fixing program logic into them is a bad idea.

Now, if it's inconvenient or inefficient to express that textual extraction in some way other than regular expressions, I'm suggesting that is a failure of the language (for example, because pattern matching is weak or specifying grammars is cumbersome), not a point for recommending regular expressions.

-----

2 points by lojic 5887 days ago | link

Sorry, I just now saw this. The Arc forum makes it darn near impossible to realize someone has replied to an older item :(

I think an example of what you're talking about would be great. If you have a better way to validate textual data than regular expressions, then naturally I would want to know about it.

Here's a few regular expressions I've collected. I realize they're not perfect (e.g. the email regex), but they're good enough for my purposes presently.

    REGEX_EMAIL     = /^([-\w+.]+)@(([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})|(([-\w]+\.)+[a-zA-Z]{2,4}))$/
    REGEX_FLOAT     = /^([0-9]+|[0-9]{1,3}(,[0-9]{3})*)?(\.[0-9]*)?$/
    REGEX_INTEGER   = /^([0-9]+|[0-9]{1,3}(,[0-9]{3})*)$/
    REGEX_ISO_8601  = /^(\d{4})-(\d{2})-(\d{2})[Tt](\d{2})[-:](\d{2})[-:](\d{2})[Zz]$/
    REGEX_MONEY     = /^(\$[ ]*)?([0-9]+|[0-9]{1,3}(,[0-9]{3})*)?(\.[0-9]{0,2})?$/
    REGEX_PHONE     = /^(\((\d{3})\)|(\d{3}))[-. ]?(\d{3})[-.]?(\d{4})[ ]*([^\s\d].{0,9})?$/
    REGEX_SSN       = /^(\d{3})-?(\d{2})-?(\d{4})$/
    REGEX_ZIP_CODE  = /^(\d{5})[- ]?(\d{4})?$/

So, what would you use to accomplish the same thing without regular expressions that is as concise? The regular expressions allow an easy way to both validate user input and parse it via groups. They're declarative vs. imperative. I have these in a Ruby module with associated functions for parsing (which primarily uses the groups) etc., so they're encapsulated together.
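
As a sketch of what "validate and parse via groups" might look like for one of these, here is a hypothetical parsing function built on REGEX_MONEY. The name parse_money_cents and the choice to return whole cents are assumptions, not the actual module.

```ruby
REGEX_MONEY = /^(\$[ ]*)?([0-9]+|[0-9]{1,3}(,[0-9]{3})*)?(\.[0-9]{0,2})?$/

# Hypothetical helper: returns the amount in cents, or nil if invalid.
def parse_money_cents(input)
  match = REGEX_MONEY.match(input)
  return nil unless match

  # Group 2 is the whole-dollar part, possibly with thousands separators.
  dollars = (match[2] || "0").delete(",").to_i
  # Group 4 is the decimal point plus up to two digits.
  fraction = (match[4] || "").delete(".")
  dollars * 100 + fraction.ljust(2, "0").to_i
end
```

For example, "$1,234.56" parses to 123456 and "12.5" to 1250, while "abc" and "1,23" are rejected. Every group is optional, so the empty string also validates (as zero), which is the sort of edge case the unit tests mentioned earlier would need to cover.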

I think you mentioned you're an Erlang programmer, so how would the non-regex Erlang code look to validate an email address corresponding to the REGEX_EMAIL expression above?

-----

1 point by partdavid 5877 days ago | link

Ah, you're right, it's a bit hard to see when folks have replied. Yes, I'm an Erlang programmer.

In response to your question, I don't accept your premise that replicating a particular regular expression is a real programming task. You say your email regular expression isn't perfect, but it's not clear to me why you chose that particular set of restrictions beyond what's defined in the RFC, so it's a little hard for me to replicate (for example, the local-part and domain of the address can have more kinds of characters than what you have defined).

Instead, I'll offer this as a non-equivalent but interesting comparison. I've elided the module declarations (as have you), including the imports that allow some of these functions without their module qualifiers:

  email(S) ->
     [User, Domainp] = tokens(S, "@"),
     {User,
      case {address(Domainp), reverse(tokens(Domainp, "."))} of
         {{ok, Addr}, _} -> Addr;
         {_, RDomain = [Tld|_]} when length(Tld) >= 2,
                                     length(Tld) =< 4 ->
            join(reverse(RDomain), ".")
      end
     }.

I don't know how the terseness of this compares with your example, given that it includes some things that yours doesn't (a way to call it, a format for the return value rather than the capture variables). Terseness, of course, in the pg sense of code tree nodes, whatever they are. :)

The Erlang function above returns a tuple of the local-part and the domain part and throws an error if it can't match the address. If this were something I wanted to ship around to other functions or send to another machine or store in a database table or something, I would have email/1 return a fun (function object) instead.
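
For readers more at home in Ruby, the same regex-free approach can be sketched as follows. This is a loose rendering under assumptions, not an exact translation: ipv4? is a hypothetical stand-in for whatever address/1 is imported from, the result is a two-element array rather than a tuple, and an IP domain is returned as a string rather than a parsed address.

```ruby
# Hypothetical Ruby rendering of the Erlang email/1 sketch; no regex used.

# Stand-in for address/1: true for a dotted quad such as 192.168.0.1.
def ipv4?(text)
  octets = text.split(".", -1)
  return false unless octets.length == 4

  octets.all? do |octet|
    octet.length.between?(1, 3) &&
      octet.each_char.all? { |c| c.between?("0", "9") } &&
      octet.to_i <= 255
  end
end

# Returns [local_part, domain] or raises ArgumentError.
def email(address)
  parts = address.split("@").reject(&:empty?)
  raise ArgumentError, "expected local@domain: #{address}" unless parts.length == 2

  user, domain = parts
  return [user, domain] if ipv4?(domain)

  labels = domain.split(".").reject(&:empty?)
  tld = labels.last
  unless tld && tld.length.between?(2, 4)
    raise ArgumentError, "unrecognised domain: #{domain}"
  end
  [user, labels.join(".")]
end
```

As in the Erlang version, failure is signalled by raising rather than by returning a flag, and the restrictions (two parts, a 2 to 4 character final label) are spelled out as ordinary conditions.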

If either one of us wanted something better than what we have (or even if we don't--it seems like coming up with The Right Thing To Do With Email Addresses is worth a bit of time to do only once) I would write a grammar. The applicable RFC 2822 more or less contains the correct one, which is only a few lines.

At the "low" end of text processing power, there are basic functional operations on strings and lists, and at the "high" end there are grammar specifiers and parser generators. In the band in between live regular expressions, and I am not convinced that that band is very wide. I like regular expressions (and, indeed, I would like it very much if Erlang had better support for them) but for me they are a specialized tool, particularly useful (like wildcard globbing) for offering as an input method to users.

But they aren't a general solution to every kind of problem, and for that reason I don't think Arc or any other general-purpose language benefits from baking them into the basic syntax--they belong in a library.

-----

2 points by partdavid 5915 days ago | link | parent | on: The Erlang Challenge

http://arclanguage.org/item?id=1743

When I read the arc challenge at first it seemed like a trivial "win" for arc and a cheap shot. It upset me, to be frank, and that interested me, because usually if you read something upsetting it means there's something there you need to see.

After I got done grumbling (mostly to myself) about how much better in a general case was my favorite language (Erlang), I used the challenge as a jumping off point for thinking about what an Erlang web framework would be (freely admitting my status as a dabbler and dilettante here, where yariv maintains a mature Rails-like web framework for Erlang).

I wrote an (extremely rudimentary) framework to support the code in my Erlang answer to the arc challenge. I noticed that a number of the answers supposed hypothetical or imaginary frameworks, so I suspect that a lot of people only got what was on the surface of the arc challenge (this toy web app is so short!) without understanding what was underlying it (what would it take to extend your favorite language to express something so concisely? And does that concise form take the form of a Greenspunesque arcalike or is it possible to express the same intent in your language's idiom?).

In short, I don't feel there's a need to "retaliate" by posing concurrency-specific challenges where Erlang must win--in fact, such challenges have been posed before (e.g. Joe Armstrong's Java vs. Erlang white paper). I'd rather do what I can to learn from what pg has done with arc.

-----


Maybe arc could also provide online code upgrades as well.

I like the idea of qualified names (namespaces for things) but prefer it to be divorced from the name of the resource where you find it. For example, in Perl and ruby I can require 'coolstrings.rb' which doesn't have anything to do with a "coolstrings" namespace or class but does change the way strings behave. It also provides a way to elegantly implement pragmas.

-----

1 point by ryantmulligan 5915 days ago | link

Live code upgrades is a great feature. Lisps already support this to some degree. If you make an app and expose a REPL on some local port then you can recompile functions on the fly. I'm not sure how it handles the current execution stack when this happens though.

-----

2 points by partdavid 5916 days ago | link | parent | on: Nitpick: Why "rem" and not "rm"?

I would think "remainder" or "remaining", while the "rm" command suggests removal to me.

-----
