Arc Forum | Robert Harper: "Since every value in a dynamic language is classified in this ma...

Arc Forum

2 points by rocketnia 4767 days ago | link | parent

Robert Harper: "Since every value in a dynamic language is classified in this manner, what we are doing is agglomerating all of the values of the language into a single, gigantic (perhaps even extensible) type."

Laurence Tratt: "Bob points out that, conceptually, a dynamically typed program has a single static sum type ranging over all the other types in the program. We could therefore take a dynamically typed program and perform a conversion to a statically typed equivalent. Although Bob doesn't make it explicit how to do this, we can work out a simple manual method (though it's not clear to me that this would be fully automatable for most real programs)."

This part of the debate goes off on an unnecessary tangent. There's not really any need for a "gigantic (perhaps even extensible)" sum type.

  -- Haskell code
  
  {-# LANGUAGE FlexibleInstances #-}
  
  -- A hack to show functions as opaque values at the REPL.
  data ShowableFunc a b = ShowableFunc (a -> b)
  instance Show (ShowableFunc a b) where
    show _ = "[function]"
  
  data Value =
      FunctionValue (ShowableFunc Value Value)
    | SymbolValue String
    | TaggedValue Value Value
    -- etc.
    deriving Show

Having abandoned gigantic and extensible notions, the translation is actually very straightforward. The TaggedValue case is all we need in order to define a so-called dynamic type system in the Arc style.

Admittedly, we may need a basic DSL to help our embedded untyped programs go against the grain of the language.

Here's how we might start to support this kind of programming in Haskell. The only callable things in Haskell are of type (A -> B), so our programs will need to use a "call" function all over the place, but Haskell's type classes (with FlexibleInstances) make it possible to shove aside a lot of other boilerplate.

  -- Some type classes for convenient expression of Value values using
  -- Haskell syntaxes. We could go without these, but we'd need
  -- explicit conversions all over. Consider this weak typing.
  class ToValue a where
    tv :: a -> Value
  instance ToValue Value where
    tv x = x
  instance (ToValue a) => ToValue (Value -> a) where
    tv x = FunctionValue (ShowableFunc (\v -> tv (x v)))
  instance ToValue String where
    tv x = SymbolValue x
  
  call :: (ToValue func, ToValue arg) => func -> arg -> Value
  call func arg =
    let func' = tv func in
    let arg' = tv arg in
    case func' of
      FunctionValue (ShowableFunc haskellFunc) -> haskellFunc arg'
      _ -> error ("Call to non-function " ++ show func')

We can go on to define some built-ins for the dynamic type system:

  annotate :: Value
  annotate = tv $ \tag rep ->
    TaggedValue tag rep
  
  -- I'd name this "type" as in Arc, but that's a reserved word.
  getType :: Value
  getType = tv $ \x -> case x of
    FunctionValue _ -> SymbolValue "fn"
    SymbolValue _ -> SymbolValue "sym"
    TaggedValue tag _ -> tag
    -- etc.

Now we can try this out in GHCi:

  *Main> getType `call` (annotate `call` "my-type" `call` "my-rep")
  SymbolValue "my-type"
  *Main> getType `call` "my-type"
  SymbolValue "sym"
  *Main> tv (\a b -> annotate `call` (b :: Value) `call` (a :: Value)) `call` "bar" `call` "foo"
  TaggedValue (SymbolValue "foo") (SymbolValue "bar")

The ":: Value" annotations are necessary to make the type class instance we're using unambiguous. (As far as Haskell is concerned, we're still allowed to define another instance which works for arguments of type other than Value.) Another few utilities could save us from typing these annotations, but this is an example.

Notice that this technique can faithfully translate dynamic type errors too, without the static type checker getting in the way.

  *Main> "foo" `call` "bar"
  *** Exception: Call to non-function SymbolValue "foo"

I initially coded this stuff without the type classes, but I ended up having to write the conversions explicitly like this:

  *Main> getType `call` (annotate `call` SymbolValue "my-type" `call` SymbolValue "my-rep")
  SymbolValue "my-type"
  *Main> getType `call` (SymbolValue "my-type")
  SymbolValue "my-type"
  *Main> let fn haskellFunc = FunctionValue (ShowableFunc haskellFunc)
  *Main> fn (\a -> fn $ \b -> annotate `call` b `call` a) `call` SymbolValue "bar"
 `call` SymbolValue "foo"
  TaggedValue (SymbolValue "foo") (SymbolValue "bar")

---

Above I quoted both articles, but the rest of the quotes I'm responding to are just from Laurence Tratt.

---

"[If sum types are] Used indiscriminately – particularly where a sum type ranges over an unduly large number of other types – they create an unwieldy mess because of the requirement to check all the types that have been ranged over."

See my use of the catch-all pattern "_ ->" in the definition of "call" above. If the majority of the branches use a default behavior, usually this kind of pattern-match is available to save the day.

---

"The encoding of dynamically typed languages we used above would lead to a huge increase in the size of programs, since a static type system forces us to explicitly express all the checks that a dynamically typed run-time implicitly performs for us."

The kind of encoding I used would lead to only a moderate increase, and then only due to the lack of dedicated language-level support for that style of programming.

In statically typed programming, the "checks" tend to be expressed in the types rather than the expessions. Usually, when code uses a library utility, it doesn't have to reiterate the types of that utility, so the checks are still just as implicit.

(In my code above, the ":: Value" annotations are a counterexample. In a sense, this "check" is used for branching among type class instances.)

---

"It would also remove the possibility of making use of a dynamically typed system's ability to execute and run programs which are deliberately not statically type safe (a useful property when comprehending and modifying large systems)."

Some escape hatches have been investigated for this purpose. See gradual typing or Haskell's "Dynamic" type.

The way I think of it, sometimes types are a meaningful part of the program, and sometimes they're an effective way for a language or library to expose a well-behaving realm of possible uses in the first place.

It's easy to interpret the "well-behaving realm" case as artificial restrictions on an innately wild program space. But restrictive interfaces can frustrate us in untyped languages too.

---

"We could design a sound type system that will deal with this specific case but, because a sound type-system won't be complete, we can always find another similar run-time correct example which the type system will reject."

"It also provides a mirror-image encoding from above: one can always translate statically typed programs to a dynamically typed equivalent."

This section relies on the implicit assumption that for every program in a typed language, the type annotations can be deleted to yield a program in a corresponding untyped language.

That observation does not apply in general. Certain static type system features, such as Haskell's type classes, allow the programmer to specify run time behavior in the type annotations rather than in the expressions. This technique is used extensively in Haskell and Agda as a way to reduce expression boilerplate. I use it above.

---

"In this article, I've tried to show that statically and dynamically typed languages overlap in many ways, but differ in others. Neither is exactly a subset of the other nor, as the underlying theory shows, could they be."

I flatly disagree; untyped languages are exactly a special case of typed languages.

The thing is, integers are exactly a special case of real numbers too, and that's no reason to dismiss integers. :)

---

"One must consider the pros and cons for the tasks one has at hand, and choose accordingly. Personally, I use both styles, and I have no wish to unduly restrict my language toolbox."

Agreed.

2 points by rocketnia 4764 days ago | link

At the risk of talking to myself... :-p

Actually, there's a terminology issue going on here. Laurence Tratt and Pauan find it important to talk about errors, so in the above post I went point by point to address that. But as far as I'm concerned, "{static,dynamic} type error" is nothing more than a way to say "{compile,run} time error that a statically {untyped,typed} style could have avoided," and I take that suggestion with a grain of salt.

Robert Parker and I don't see "dynamic typing" as some kind of alternative strategy with extra features that somehow prevent the use of static typing. For what particular reasons and contexts should static typing be considered harmful?

---

In my opinion, static typing is mostly useful for three reasons: It has nice symmetry with mathematical reasoning[1], it can support static analysis of programs[2], and it can control a particular side effect. I'll focus on that last part, since side effects are what distinguish one computing platform from another.

Static typing controls the side effect of indeterminism when calling a black box library. By default, code that calls a black box can't expect very much about what will happen, due to the fact that it's a black box. By the time the program has run, the box could have been filled with any version of the library. Static typing establishes at least some guarantees about what kind of library code will fit in the box. (Since the intricacy of those guarantees can span a rich mathematical theory, it's nothing to scoff at.)

The black box side effect is ultimately "implemented" as a result of the programmer's manual process of selecting and upgrading libraries. So we can't just write an interpreter that simulates a new variant of this side effect; it needs to fit seamlessly into the programmer's existing habits, or we're veering into new language territory.

Within a statically typed module system, untyped programming seamlessly fits in as a trivial special case--as "unityped" programming. But it isn't as clear how to do typed programming within a module system that doesn't already support that particular static type system (or another that's at least as logically strong). I'm very interested in options here.

---

[1] Symmetry with mathematics is just an aesthetic. Aesthetics has power too, but it doesn't do much to resolve debates. :)

[2] Static analysis is just the design pattern of doing a preliminary computation in order to reason about an upcoming "real" computation. When this pattern comes up, it's useful to design a static type system that meets the needs of the analysis, but static typing in the implementation language doesn't always help.

-----