Thanksgiving wishes
6 points by waterhouse 4747 days ago
Suppose I'm copying data about x86-64 opcodes per instruction signature out of the AMD64 programmer's manual:

http://support.amd.com/us/Processor_TechDocs/APM_V3_24594.pdf

and into a text file:

http://pastebin.com/raw.php?i=TrGLRRMy

Much of this crap is repetitive. In particular, many instructions have the same opcode for their 16-bit, 32-bit, and 64-bit forms (the forms are distinguished in the machine code by a prefix indicating the operand size):

  ADD:
  reg/mem16, reg16 -> 01 /r
  reg/mem32, reg32 -> 01 /r
  reg/mem64, reg64 -> 01 /r
And many of them (usually the same ones) also operate on "reg/mem16, imm16", "reg/mem32, imm32", and "reg/mem64, imm32". Here is that form for the ADD instruction again, this time in the highly abbreviated notation I settled on:

  r16 i16 . 81 /0 iw
  r32 i32 . 81 /0 id
  r64 i32 . 81 /0 id
Even with these abbrevs, it's painful to type that up manually for more than one or two instructions. So, I wrote progressively more advanced Arc functions to do much of it for me. It began with a simple function that would generate three things and put them into the clipboard:

  arc> (def dick (a b c) (pbcopy:tostring:map (fn (x y) (prn a x
  " " b x " . " c " " y)) '(16 32 64) '(iw id id)))
  #<procedure: dick>
  arc> (dick "rm" "i" "83 /4")
  -->
  rm16 i16 . 83 /4 iw
  rm32 i32 . 83 /4 id
  rm64 i64 . 83 /4 id
Note that the "i64" there is wrong; it should be "i32". I corrected that by hand, but eventually decided to automate it. I moved from a "prn"-based command to a token-based symbol replacement thing, so that instead of (prn "rm" n) where n was 16/32/64, I'd write the symbol 'rmn, which would get replaced by "rm16"/"rm32"/"rm64"; the 'in symbol would be replaced by "i16"/"i32"/"i32". By the end, I had a kind of "interpreter" that handled many "commands". Also, of course, I had a wrapping macro so I didn't have to quote anything.

  (def dick4 (xs) ;note that this was all on one line in the REPL
    (mapn (fn (n)
            (withs (a (expt 2 (+ n 4)) b (if (is a 64) 32 a))
              (apply prsn
                     (map [case _
                            in (symb 'i b)
                            rn (symb 'r a)
                            rmn (symb 'rm a)
                            mn (symb 'm a)
                            w ('(iw id id) n)
                            x "."
                            +rx ('(+rw +rd +rq) n)
                            ax ('(ax eax rax) n)
                            _]
                          xs))))
          0 2))
  (mac d4 args `(pbcopy:tostring:dick4 ',args))

  arc> (d4 rn rmn in x 69 /r w)
  -->
  r16 rm16 i16 . 69 /r iw
  r32 rm32 i32 . 69 /r id
  r64 rm64 i32 . 69 /r id
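
For readers who don't use Arc, the substitution idea can be sketched in Python. This is only an illustrative translation of the Arc shown above, not code from the session; the names `expand` and `SIZES` are made up for the sketch, and it skips the clipboard step.

```python
# Hypothetical Python translation of the token-substitution idea.
SIZES = (16, 32, 64)

def expand(tokens):
    """Return one line per operand size, with size-dependent tokens replaced."""
    lines = []
    for n, size in enumerate(SIZES):
        # The 64-bit forms take a 32-bit immediate, so "in" maps to i32 there.
        imm = 32 if size == 64 else size
        table = {
            "in": f"i{imm}",
            "rn": f"r{size}",
            "rmn": f"rm{size}",
            "mn": f"m{size}",
            "w": ("iw", "id", "id")[n],
            "x": ".",
            "+rx": ("+rw", "+rd", "+rq")[n],
            "ax": ("ax", "eax", "rax")[n],
        }
        # Tokens without an entry (opcode bytes, /r, ...) pass through unchanged.
        lines.append(" ".join(table.get(tok, tok) for tok in tokens))
    return lines
```

Calling `expand("rn rmn in x 69 /r w".split())` gives the same three lines as the `d4` example, including `i32` in the 64-bit row.
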
And finally I ran it as a "read-eval-pbcopy loop", so that I didn't have to type the "(d4 ...)" part (or edit a previous command).

  arc> (while t (eval `(d4 ,@(readall:readline))))
  rn rmn x 2B /r
  ax in x A9 w
  rmn in x F7 /0 w
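
The shape of that loop (read a line, expand it, put the result on the clipboard) can be sketched in Python as well. This is a rough analogue rather than the Arc that was run: `read_expand_copy` and `copy_to_clipboard` are hypothetical names, `expand` stands for any function that maps a list of tokens to a list of output lines, and the clipboard helper assumes macOS, where the `pbcopy` command reads from standard input.

```python
import subprocess

def copy_to_clipboard(text):
    # Assumption: macOS, where `pbcopy` copies its stdin to the clipboard.
    subprocess.run(["pbcopy"], input=text.encode(), check=True)

def read_expand_copy(lines, expand, copy=copy_to_clipboard):
    """For each input line, expand its tokens and hand the result to `copy`."""
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # ignore blank lines
        copy("\n".join(expand(tokens)) + "\n")

# Interactive use would look something like:
#   import sys
#   read_expand_copy(sys.stdin, my_expand)   # my_expand is hypothetical
```

Passing the `copy` function as a parameter keeps the loop usable on systems without `pbcopy`, for example by passing `print` instead.
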
Also, I wrote a specialized command to generate all of the conditional move instructions (cmovxx), and modified it to handle the conditional jumps (jxx) and sets (setxx).
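
The conditional families lend themselves to generation because the condition code is simply added to a base opcode byte. A minimal Python sketch of that idea follows; it is not the Arc command from the session. The output format, the names `CONDITIONS`, `FAMILIES`, and `opcode_lines`, and the choice to show only the rel8 form of the jumps are all assumptions made for illustration.

```python
# Condition suffixes in encoding order 0x0..0xF; aliases share one code.
CONDITIONS = [
    ("o",), ("no",), ("b", "c", "nae"), ("nb", "nc", "ae"),
    ("z", "e"), ("nz", "ne"), ("be", "na"), ("nbe", "a"),
    ("s",), ("ns",), ("p", "pe"), ("np", "po"),
    ("l", "nge"), ("nl", "ge"), ("le", "ng"), ("nle", "g"),
]

# family -> (leading byte or "", base opcode byte, trailing notation)
FAMILIES = {
    "cmov": ("0F", 0x40, "/r"),  # cmovcc: 0F 40+cc /r
    "j":    ("",   0x70, "cb"),  # jcc, 8-bit displacement form only
    "set":  ("0F", 0x90, "/0"),  # setcc: 0F 90+cc /0
}

def opcode_lines(family):
    """Return 'mnemonic . opcode' lines for every condition and alias."""
    lead, base, tail = FAMILIES[family]
    lines = []
    for code, names in enumerate(CONDITIONS):
        parts = [p for p in (lead, f"{base + code:02X}", tail) if p]
        for name in names:
            lines.append(f"{family}{name} . {' '.join(parts)}")
    return lines
```

For example, `opcode_lines("cmov")` starts with `cmovo . 0F 40 /r`, and each family yields 30 lines once aliases are counted.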

You can view the whole Arc log here (for some reason, my terminal sometimes drops newlines):

http://pastebin.com/raw.php?i=Qr9n6A71

Anyway, having an Arc REPL made this task easier, faster, and much more pleasant. Thus, I am thankful for it.

Now everyone have a happy Thanksgiving, made more convenient by tools made by others and yourself and by your expertise with them.

(Why was I putting opcodes into a mostly machine-readable format in the first place? You may consider this a teaser of things to come.)