Assembly language for Power Architecture

The first in a planned series of articles that introduces PowerPC ASM.
Assembly language for Power Architecture, Part 1: Programming concepts and beginning PowerPC instructions

Starting with this introduction to assembly language concepts and the PowerPC instruction set, this series of articles introduces assembly language in general and specifically assembly language programming for the POWER5.
It could just be me, but I think the ASM designers could've afforded to make this stuff a bit more human consumable - the preference is on the side of terseness:

li 0, 1
mr 3, 6
ld 6, 0(4)

Could be expressed a little cleaner as:

load R0, #1
load R3, R6
load R6, #0(R4)

No need for separate instruction names for operations that are only different in how they load the data. And a clearer delineation of what is a register and what is a constant value. But then my bias for MCC68k is probably showing through. And the extra character for registers probably just makes higher level PL compiled code larger. (Oh well, easy enough to write a pretty viewer if one is so inclined.)
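For what it's worth, such a pretty viewer could be sketched in a few lines of Python. This is only an illustration covering the three instructions shown above; the function name `prettify` is made up for the example, and anything it does not recognize is passed through untouched.

```python
import re

def prettify(line):
    """Rewrite one PowerPC instruction in the 'load' style proposed above.

    Only li, mr and ld are handled; this is a sketch, not a full disassembler view.
    """
    parts = line.replace(",", " ").split()
    if not parts:
        return line
    op, args = parts[0], parts[1:]
    if op == "li" and len(args) == 2:      # load immediate: register, constant
        return f"load R{args[0]}, #{args[1]}"
    if op == "mr" and len(args) == 2:      # move register: destination, source
        return f"load R{args[0]}, R{args[1]}"
    if op == "ld" and len(args) == 2:      # load doubleword: register, disp(base)
        m = re.fullmatch(r"(-?\d+)\((\d+)\)", args[1])
        if m:
            return f"load R{args[0]}, #{m.group(1)}(R{m.group(2)})"
    return line                            # unrecognized input is left as-is

for src in ["li 0, 1", "mr 3, 6", "ld 6, 0(4)"]:
    print(prettify(src))
```

Run on the three lines from the post, this prints the "cleaner" forms listed above.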

goodie

just in time for some XBox 360 and PS3 programming. :)

Terseness is fine

You get used to it. Assembly language is already verbose enough, so why not go with shorter mnemonics for common instructions? Okay, I wrote an entire video game in PowerPC assembly language back in the 90s, so I'm used to it.

That said, I think most assemblers are poorly designed. They're needlessly primitive. I wrote a postfix assembler years ago (standalone, not part of a Forth) that was super terse, but a joy to use.

As one example, Eric Isaacson's a86 assembler has a conditional return instruction, even though there isn't such an instruction in the x86. It wasn't just a macro, either. The assembler scanned through the instruction stream looking for a "ret" and branched to it. If one couldn't be found, then one was added. Slick.
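A rough Python sketch of that idea, going only by the description above and not by how a86 actually implements it: the `ret.<cc>` pseudo-mnemonic, the `__ret` label and the function name are all invented for the illustration, and branch-range limits are ignored.

```python
def expand_conditional_returns(lines, label="__ret"):
    """Expand hypothetical 'ret.<cc>' pseudo-instructions into 'j<cc> <label>'.

    The label is attached to the first plain 'ret' in the stream; if there is
    no 'ret' at all, one is appended, as described in the comment above.
    """
    out = [ln.strip() for ln in lines]
    if not any(ln.startswith("ret.") for ln in out):
        return out                         # nothing to expand
    if "ret" not in out:
        out.append("ret")                  # no existing ret found, so add one
    out.insert(out.index("ret"), f"{label}:")
    # 'ret.z' becomes 'jz __ret', 'ret.nz' becomes 'jnz __ret', and so on
    return [f"j{ln[4:]} {label}" if ln.startswith("ret.") else ln for ln in out]

print(expand_conditional_returns(["cmp ax, 0", "ret.z", "inc ax", "ret"]))
```

A real assembler would also have to check that the chosen "ret" is within reach of a conditional branch, which this sketch does not attempt.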

gas option

The -mregnames option to gas allows you to say r17 instead of 17.

load R0, #1
load R3, R6
load R6, #0(R4)

I think most people would prefer to use "load" only for load instructions, since this is a load-store architecture. (CISCy architectures such as x86 can do a load as part of many different instructions, such as "add".)

Needlessly primitive

I agree with Yorick, in that I'd prefer that load only be used for instructions that load words from memory, as opposed to register-to-register moves. If you're writing, or even generating, assembly, you have to be painfully aware and painstakingly careful to differentiate between these two types of instructions anyway. (A register move is not going to have anywhere near the impact a memory load might.)

The problem I have with making assemblers more sophisticated is that the most likely purpose of dropping down to assembly is to be very precise about the actual instructions you issue. If you obscure that in the name of abstraction, you obstruct your goal. Abstractions that carry zero penalty (such as labels and compile-time macros with exposed implementation) don't present those problems.

Basically, assembly is a different world, because you always want to know all the implementation details all of the time. If you don't need to know the details, then you probably shouldn't be coding that functionality in assembly.

More primitive in syntax than features

One instruction per line?
No way to interactively assemble and test individual definitions?
Macro syntax that's typically messy and full of special cases?

I wrote a 6502 cross-assembler where I reworked all the mnemonics and used postfix for everything. I don't have the exact syntax handy, but I could write code that looked more or less like this:

: clear-hw-regs 0 lai SCREENBASE sam TIMEOUT sam ;

I can see how someone would initially think this ugly, but it's a simple scheme and was much more pleasant to work with than traditional assemblers. I didn't need labels for loops. Plus I could add new operators in a couple of minutes. The object code for the whole assembler as a standalone package was less than 3K.
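To make the scheme concrete, here is a minimal Python sketch of a postfix assembler for the body of that definition. It rests on assumptions: `lai` is taken to mean "load A immediate" (6502 LDA #imm, opcode 0xA9) and `sam` "store A to memory" (STA absolute, opcode 0x8D), and the addresses given for SCREENBASE and TIMEOUT are invented. The original assembler's exact syntax was not given, so this is only a guess at the flavor.

```python
def assemble(source, symbols):
    """Assemble a postfix token stream into 6502 machine code.

    Operands are pushed on a stack; each operator pops what it needs
    and emits bytes. Only two (assumed) operators are implemented.
    """
    stack, code = [], []
    for tok in source.split():
        if tok == "lai":                   # assumed: LDA #imm
            code += [0xA9, stack.pop() & 0xFF]
        elif tok == "sam":                 # assumed: STA absolute, little-endian address
            addr = stack.pop()
            code += [0x8D, addr & 0xFF, (addr >> 8) & 0xFF]
        elif tok in symbols:               # named constant: push its value
            stack.append(symbols[tok])
        else:                              # numeric literal: push it
            stack.append(int(tok, 0))
    return bytes(code)

# Hypothetical hardware addresses, chosen only for the example.
symbols = {"SCREENBASE": 0xD000, "TIMEOUT": 0xD010}
print(assemble("0 lai SCREENBASE sam TIMEOUT sam", symbols).hex(" "))
```

Adding a new operator is one more `elif` branch, which fits the remark about adding operators in a couple of minutes.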

I agree that more human

I agree that a more human-friendly assembler is good, but the assembler shouldn't obscure the instruction used.

A middle ground is nice:
loadi r0, #1
copy r3, r6 (move register is a strange name)
load.8 r6, #0(r4)
load.16 r6, #0(r4)
load r6, #0(r4)
load.64 r6, #0(r4)

And to be even more readable replacing , by loadi r0 The advantage being that even a guy who doesn't know the assembly language of a CPU have no doubt in which direction the assignment occur in
copy r3

OK let me try this

OK let me try this again:
loadi r0 <- #1
copy r3 <- r6 (move register is a strange name)
load.8 r6 <- #0(r4)
load.16 r6 <- #0(r4)
load r6 <- #0(r4)
load.64 r6 <- #0(r4)

Mine

Analog Devices' SHARC DSPs have, imho, a clearer assembly language.
Instead of silly MOV AX,BX or MOVE or MV or MR or LD or LOAD, you simply write:
R0=R1;
Indirect accesses are written as:
R8=DM(I4); (DM = Data Memory; it's a Harvard architecture!)
As with most DSPs, several operations can occur simultaneously, for example:
R7=BSET R6 BY R0, DM(I0,M3)=R5, PM(I11,M15)=R4;
IF NOT SV F8=CLIP F2 BY F14, F7=PM(I12,M12);

Maybe the parser is a bit more complex, but does it matter?

PL360, HLA

High Level Assembly for x86 and PL360 may be of interest to those searching for inspiration on how to make assembly language look different.

Other infix assemblers

See also CERN's PL-11 and PL-VAX, Cray's CAL, Jim Neil's Terse, Bell Labs' LIL and SMAL. IBM's HLASM and MIT's Midas both have extremely powerful macros.

Assembly with algebraic syntax

As someone who has spent much of his professional and personal life working with assembly languages, I feel that the typical opcode-operand syntax is a silly anachronism.

For example, one DSP I work with has such a bizarre instruction set that code written in the native syntax is essentially unreadable. I designed an assembler that uses an algebraic syntax that almost looks like C:

  a = a/8 + sqrt(2)     ; multiply and add
  a = [myVariable] + 1  ; memory read and add
  a = [myArray + Ib]    ; indirect read using "Ib" register
  [myVariable] = a      ; memory write
  if (a < 0) {          ; conditionals
    a += 1/2
  }
  a = (b + 2)*(b - 2)   ; algebraic simplification to a = b^2 - 4

Each statement is a single instruction (usually) and you have complete low-level control of the code. You just don't have to deal with stupid three-character mnemonics. See bkasm.sf.net.