Thursday, December 29, 2011

Embedded C Tutorial (Chapter 12: Assembly Language Programming)

What's in Chapter 12?
How to insert single assembly instructions
How to compile with a mixture of assembly and ICC11/ICC12 C files
How to compile with a mixture of assembly and Metrowerks C files
ICC11/ICC12 Assembler Directives
How to use assembly to optimize a C function

One of the main reasons for using the C language is to achieve portability. But there are occasional situations in which it is necessary to sacrifice portability to gain full access to the operating system or to the hardware in order to satisfy some interface requirement. If these instances are kept to a minimum and are not replicated in many different programs, the negative effect on portability may be acceptable. There are two approaches to writing assembly language with ICC11 and ICC12. The first method inserts a single assembly instruction directly into a C function using the asm("string"); feature. With Metrowerks, we write asm followed by the instruction, without parentheses or quotes. Everything within the string is assumed to be assembly language code and is sent straight to the output of the compiler exactly as it appears in the input. The second approach is to write an entire file in assembly language, which may include global variables and functions. Entire assembly files can be inserted into our ICC11/ICC12 C programs using the asm(".include 'filename' "); feature. In Metrowerks, we include assembly files by adding them to the project. Entire assembly files can also be assembled separately and linked later with the rest of the program. The simple insertion method is discussed in this chapter.

How to insert single assembly instructions.

To support this capability, C provides for assembly language instructions to be written into C programs anywhere a statement is valid. Since the compiler generates assembly language as output, when it encounters assembly language instructions in the input, it simply copies them directly to the output.
A special directive delimits assembly language code. The following example inserts the assembly language instruction cli (enable interrupts) into the program at that point.
asm(" cli");  /* ICC11/ICC12 syntax*/
asm cli       /* Metrowerks syntax*/
Some of the older versions of ICC11 require a space before the op code, as shown in the examples in this chapter. ICC12 version 5.1 does not need the space before the op code. The extra space is ignored by these newer compiler versions, so experiment with your particular compiler to see whether or not the space is required. Macro substitution is not performed inside the assembly string, but you can define macros that insert assembly. The following ICC11/ICC12 macros are defined in the HC11.H and HC12.H header files.
#define INTR_ON() asm(" cli")
#define INTR_OFF() asm(" sei")

The following macros are defined in Metrowerks syntax.
#define INTR_ON() asm cli
#define INTR_OFF() asm sei

The following function runs with interrupts disabled.
void InitFifo(void){
  INTR_OFF();       /* make atomic, entering critical section */
  PutI=GetI=Size=0; /* Empty when Size==0 */
  INTR_ON();        /* end critical section */
}

Listing 12-1: Example of an assembly language macro
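
One way to keep a single source file usable with both compilers is to select the macro syntax at compile time. The sketch below is only an illustration: USE_ICC12 and USE_METROWERKS are hypothetical switches that you would define yourself (they are not predefined by either compiler), and the variable types are assumed. When neither switch is defined, it falls back to a host-side stand-in that records the mask state in a variable, so the logic can be exercised on a PC.

```c
/* USE_ICC12 and USE_METROWERKS are hypothetical, user-defined switches. */
#if defined(USE_ICC12)
#define INTR_ON()  asm(" cli")
#define INTR_OFF() asm(" sei")
#elif defined(USE_METROWERKS)
#define INTR_ON()  asm cli
#define INTR_OFF() asm sei
#else
/* Host stand-in: 1 means interrupts masked (I bit set), 0 means enabled. */
int InterruptsMasked = 1;
#define INTR_ON()  (InterruptsMasked = 0)
#define INTR_OFF() (InterruptsMasked = 1)
#endif

unsigned char PutI, GetI, Size;  /* FIFO state; 8-bit types are assumed */

void InitFifo(void){
  INTR_OFF();        /* make atomic, entering critical section */
  PutI=GetI=Size=0;  /* Empty when Size==0 */
  INTR_ON();         /* end critical section */
}
```

The C code that uses INTR_ON() and INTR_OFF() stays the same for all three builds; only the macro definitions change.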

Of course, to make use of this feature, we must know how the compiler uses the CPU registers, how functions are called, and how the operating system and hardware work. It will certainly cause a programming error if your embedded assembly modifies the stack pointer, SP, or the stack frame pointer, X. On the other hand, in most situations you should be able to modify the CCR, A, B, or Y without causing a program error. It is good practice to observe the resulting assembly output of the entire function to guarantee that the embedded assembly has not affected the surrounding C code. Unfortunately, this verification must be repeated when you upgrade the compiler.
You can access a global variable directly using its equivalent assembly name (in ICC11/ICC12 the name starts with an underscore). The following function adds one to the 16-bit global time.
short time;
void Add1time(void){
  asm(" ldy _time");
  asm(" iny");
  asm(" sty _time");
}

Listing 12-2: ICC11/ICC12 Example of an assembly language access to a global variable

short time;
void Add1time(void){
  asm ldy time
  asm iny
  asm sty time
}

Listing 12-2b: Metrowerks Example of an assembly language access to a global variable

In ICC11/ICC12 you can access a local variable directly using a % before its name.
void InitFifo(void){ unsigned char SaveSP;
  asm(" tpa");           /* Reg A contains previous CCR */
  asm(" staa %SaveSP");  /* Save previous CCR value */
  asm(" sei");           /* make atomic, entering critical section */
  PutI=GetI=Size=0;      /* Empty when Size==0 */
  asm(" ldaa %SaveSP");  /* Reg A contains previous CCR */
  asm(" tap");           /* end critical section */
}

Listing 12-3: ICC11/ICC12 Example of an assembly language access to a local variable
 
In Metrowerks you can access a local variable directly using just its name, and the compiler will convert it to the appropriate SP relative addressing mode.
void InitFifo(void){ unsigned char SaveSP;
  asm tpa           /* Reg A contains previous CCR */
  asm staa SaveSP   /* Save previous CCR value */
  asm sei           /* make atomic, entering critical section */
  PutI=GetI=Size=0; /* Empty when Size==0 */
  asm ldaa SaveSP   /* Reg A contains previous CCR */
  asm tap           /* end critical section */
}

Listing 12-3b: Metrowerks example of an assembly language access to a local variable

The above method of saving the interrupt status, disabling interrupts, and later restoring the status is a good way to execute critical code. Once the critical code starts it will run to completion (i.e., it is atomic) because interrupts are disabled. At the end of the critical code, the interrupt status is restored to its previous value. This save/restore procedure allows you to nest one critical section inside another. In contrast, if you disable interrupts before the critical code and enable interrupts after it, you are presuming that interrupts were enabled when the critical code started, so the disable/enable method does not allow one critical section to call another. In the following example, InitFifo properly returns with interrupts still disabled.
void InitSystem(void){ unsigned char SaveSP;
  asm(" tpa");           /* Reg A contains previous CCR */
  asm(" staa %SaveSP");  /* Save previous CCR value */
  asm(" sei");           /* make atomic, entering critical section */
  InitFifo();
  InitPort();
  InitTimer();
  asm(" ldaa %SaveSP");  /* Reg A contains previous CCR */
  asm(" tap");           /* end critical section */
}

Listing 12-4: ICC11/ICC12 example of a multiple line assembly language insertion
void InitSystem(void){ unsigned char SaveSP;
  asm tpa           /* Reg A contains previous CCR */
  asm staa SaveSP   /* Save previous CCR value */
  asm sei           /* make atomic, entering critical section */
  InitFifo();
  InitPort();
  InitTimer();
  asm ldaa SaveSP   /* Reg A contains previous CCR */
  asm tap           /* end critical section */
}

Listing 12-4b: Metrowerks example of a multiple line assembly language insertion
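
The difference between the two styles can be checked on a PC with a small model. The sketch below is a host-side simulation, not target code: SimCCR is a made-up variable standing in for the condition code register (the I bit is bit 4, mask 0x10), and the function names are hypothetical. It reports whether interrupts are still masked in the outer critical section right after an inner one returns.

```c
#define I_BIT 0x10            /* interrupt mask bit in the CCR */
unsigned char SimCCR = 0;     /* simulated CCR; 0 means interrupts enabled */

/* Inner critical section, disable/enable style */
void InnerEnableStyle(void){
  SimCCR |= I_BIT;                   /* models sei */
  /* critical work would go here */
  SimCCR &= (unsigned char)~I_BIT;   /* models cli: always enables */
}

/* Inner critical section, save/restore style */
void InnerSaveStyle(void){ unsigned char SaveCCR;
  SaveCCR = SimCCR;                  /* models tpa, staa */
  SimCCR |= I_BIT;                   /* models sei */
  /* critical work would go here */
  SimCCR = SaveCCR;                  /* models ldaa, tap */
}

/* Outer critical section; returns 1 if interrupts are still masked
   immediately after the inner critical section returns */
int MaskedAfterInner(void (*inner)(void)){ unsigned char SaveCCR; int masked;
  SaveCCR = SimCCR;
  SimCCR |= I_BIT;
  inner();
  masked = (SimCCR & I_BIT) != 0;
  SimCCR = SaveCCR;
  return masked;
}
```

In this model, MaskedAfterInner(InnerSaveStyle) returns 1 because the inner section puts back the masked state it found, while MaskedAfterInner(InnerEnableStyle) returns 0 because the inner section re-enables interrupts while the outer section is still running.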
 
If you don't like the above style of writing each line separately, there is a shorthand in ICC11/ICC12 for multiple-line assembly as shown in the following implementation.
void InitFifo(void){ unsigned char SaveSP;
  asm(" tpa\n"           /* Reg A contains previous CCR */
      " staa %SaveSP\n"  /* Save previous CCR value */
      " sei");           /* make atomic, entering critical section */
  PutI=GetI=Size=0;      /* Empty when Size==0 */
  asm(" ldaa %SaveSP\n"  /* Reg A contains previous CCR */
      " tap");           /* end critical section */
}

Listing 12-5: ICC11/ICC12 a second example of a multiple line assembly language insertion

There is yet another style of writing multiple-line assembly in ICC11/ICC12, but I don't recommend it because it is harder to read.
void InitFifo(void){ unsigned char SaveSP;
  asm(" tpa\n staa %SaveSP\n sei");  /* make atomic, entering critical section */
  PutI=GetI=Size=0;      /* Empty when Size==0 */
  asm(" ldaa %SaveSP\n tap");        /* end critical section */
}

Listing 12-6: ICC11/ICC12 A third example of a multiple line assembly language insertion

This last example suggests the ICC11/ICC12 macro definitions:
#define START_CRITICAL() asm(" tpa\n staa %SaveSP\n sei")
#define END_CRITICAL() asm(" ldaa %SaveSP\n tap")

The use of these two macros requires the definition of an 8-bit local variable called SaveSP.

How to compile with a mixture of assembly and ICC11/ICC12 C files

The following C program embeds an assembly language file (programs and data). In this example the C program accesses a global variable (lowGlobal) and calls a function (lowSub) defined in the assembly file, and the assembly function accesses a global variable (highGlobal) and calls a function (highSub) defined in the C file. To access an assembly function, the C program simply calls it, following the standard ICC11/ICC12 parameter passing rules. To access an assembly level global variable, the C program declares it with extern. Notice, however, that the assembly function (lowSub) does not need a prototype in the high level C program.
/* C level program    file="high.C" */
int highGlobal;
extern int lowGlobal;       // typed here but defined in low.s
asm(".include 'low.s' ");   // insert assembly here
void main(void){
  lowSub(5);     // call to assembly routine
  lowGlobal=6;   // access of assembly global
}
int highSub(int input){return(input+1);}

Listing 12-7: A high-level C program that calls a low-level assembly function

The following assembly program is embedded into the above high level C program. The double colon, ::, specifies the label as external, so it will be available in the *.map file. The .area text is the standard place for programs (in ROM), and the .area bss is the standard area for globals (in RAM). Assembly level functions (e.g., _lowSub) and variables (e.g., _lowGlobal) are defined beginning with an underscore, "_". Notice that in the assembly file the names have the underscore, but the same names in the C file do not. To access a C function, the assembly program simply calls it (the name begins with an underscore). The assembly program has full access to high level global variables (the name begins with an underscore).

; assembly language program file="low.s"
    .area text
_lowSub::              ; definition of low level subroutine
    jsr _highSub       ; call to high level function
    std _highGlobal    ; access to high level global
    rts
    .area bss
_lowGlobal::        ; definition of low level global
    .blkb 2

Listing 12-8: A low-level assembly program that calls a high-level C function

Again, parameter passing with both functions (the assembly calls to the C and the C calls to the assembly) must adhere to the standard ICC11/ICC12 parameter passing rules:
The output parameter, if it exists, is passed in Register D,
The first input parameter is passed in Register D,
The remaining input parameters are passed on the stack,
8-bit parameters are promoted to 16 bits.

Chapter 10 presented some examples of the assembly code generated by the compiler when calling a function with parameters. If you are writing an assembly language function that is to be called from C, one method to get the parameter passing correct is to write a trivial C function that simply passes the parameters. Compile this C function with your other C code, and observe the assembly language created by the compiler for it. Next, draw a picture of the stack as it exists at the start of the function. The C compiler will do some weird things within the function (like pushing register D on the stack, and shifting some 8-bit parameters around), which you do not have to duplicate. One difficulty with mixing assembly with C is that when the compiler is upgraded, this compatibility matching must be redone.
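
The probe technique described above might look like the following. The names ParamProbe and CallProbe and the particular mix of parameters are made up for illustration; the point is to compile the file with your own compiler and read the generated listing to see where each argument ends up at the call and at function entry.

```c
/* Hypothetical probe: mixes 16-bit and 8-bit parameters so the listing
   shows how each kind is passed */
short ParamProbe(short first, unsigned char second, short third){
  return (short)(first + second + third);
}

/* A call site to inspect: look at how 1, 2, and 3 are loaded or pushed */
short CallProbe(void){
  return ParamProbe(1, 2, 3);
}
```

Using distinct, recognizable constants at the call site makes it easier to match each argument to a register load or a stack push in the listing.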

How to compile with a mixture of assembly and Metrowerks C files

The following C program is linked with an assembly language file (programs and data). In this example the Metrowerks C program accesses a global variable (lowGlobal) and calls a function (lowSub) defined in the assembly file, and the assembly function accesses a global variable (highGlobal) and calls a function (highSub) defined in the C file. To access an assembly function, the C program simply calls it, following the standard Metrowerks parameter passing rules. To access an assembly level global variable, the C program declares it with extern. Notice, however, that the assembly function (lowSub) does need a prototype in the high level C program.
/* C level program    file="high.C" */
short highGlobal;
extern short lowGlobal;       // typed here but defined in low.asm
short lowSub(short);          // prototype
void main(void){
  lowSub(5);     // call to assembly routine
  lowGlobal=6;   // access of assembly global
  EnableInterrupts;
  for(;;) {} /* wait forever */
}
short highSub(short input){return(input+1);}


 

Listing 12-7b: A high-level Metrowerks C program that calls a low-level assembly function

The following assembly program is linked to the above high level C program. The absentry pseudo-op specifies the label as external, so it will be available in the *.map file. The EEPROM: section is the standard place for programs (in ROM), and the RAM: section is the standard area for globals (in RAM). Assembly level functions (e.g., lowSub) and variables (e.g., lowGlobal) are defined in this file. The xref pseudo-op allows the assembly program to access C functions and C globals.

; assembly language program file="low.asm"
RAM: section
  absentry lowGlobal
lowGlobal: ; definition of low level global
  ds.w 1

EEPROM: section
  absentry lowSub
  xref highSub,highGlobal
lowSub:          ; definition of low level subroutine
  jsr highSub    ; call to high level function
  std highGlobal ; access to high level global
  rts

Listing 12-8b: A low-level Metrowerks assembly program that calls a high-level C function

Again, parameter passing with both functions (the assembly calls to the C and the C calls to the assembly) must adhere to the standard Metrowerks parameter passing rules:
The output parameter, if it exists, is passed in Register D,
The first input parameter is passed in Register D,
The remaining input parameters are passed on the stack,
8-bit parameters are promoted to 16 bits.

Chapter 10 presented some examples of the assembly code generated by the compiler when calling a function with parameters. If you are writing an assembly language function that is to be called from C, one method to get the parameter passing correct is to write a trivial C function that simply passes the parameters. Compile this C function with your other C code, and observe the assembly language created by the compiler for it. Next, draw a picture of the stack as it exists at the start of the function. The C compiler will do some weird things within the function (like pushing register D on the stack, and shifting some 8-bit parameters around), which you do not have to duplicate. One difficulty with mixing assembly with C is that when the compiler is upgraded, this compatibility matching must be redone.

ICC11/ICC12 Assembler Directives

An assembler directive (or pseudo-op) is not executed by the 6811/6812, but rather affects the assembler in certain ways. The assembly pseudo-ops supported by the ICC11 and ICC12 assemblers are described in this section.
The first set of directives affects where in memory the subsequent assembly lines will be stored. The .org pseudo-op takes an expression, and changes the memory storage location to the value of the expression. This directive can only be used within an absolute area. Example:
.org 0xF000 ; put subsequent object code at $F000
The .area pseudo-op specifies into which segment the subsequent code will be loaded.
  .area text   ; code in the ROM segment
  .area data   ; code in the initialized RAM segment
  .area idata  ; code in ROM used to initialize the data segment
  .area bss    ; code in the uninitialized RAM segment
  .text        ; same as .area text
  .data        ; same as .area data

When writing assembly language programs, I suggest allocating variables in the .area bss segment and fixed constants/programs in the .area text. In other words, I suggest that you not use .area data and .area idata in your assembly programs. Other names can be used for segments. If (abs) follows the name, the segment is considered absolute and can contain .org pseudo-ops. For example, to set the reset vector in an assembly file, you could write
  .area VectorSegment(abs)  
  .org  0xFFFE ; reset vector address
  .word Start  ; place to begin

The next set of directives allocate space in memory. The .blkb pseudo-op will set aside a fixed number of 8-bit bytes without initialization. Similarly, the .blkw pseudo-op will set aside a fixed number of 16-bit words without initialization.
  .blkb 10 ; reserve 10 bytes
  .blkw 20 ; reserve 20 words

The next three directives load specific data into memory. The .byte pseudo-op will set aside a fixed number of 8-bit bytes and initialize the memory with the list of 8-bit bytes. The size of the allocated storage is determined by the number of data values in the list. The .word and .ascii pseudo-ops work in a similar way for 16-bit words and ASCII strings. The .asciz pseudo-op is similar to .ascii except that an extra byte is allocated and set to null (0). Examples include:
  .byte 10      ; reserve 1 byte initialized to 10
  .byte 1,2,3   ; reserve 3 bytes initialized to 1,2,3
  .word 20      ; reserve 1 word initialized to 20
  .word 10,200  ; reserve 2 words initialized to 10,200
  .ascii "JWV"  ; reserve 3 bytes initialized to "J" "W" "V"
  .asciz "JWV"  ; reserve 4 bytes initialized to "J" "W" "V" 0
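
For readers who think in C, the data directives above correspond roughly to initialized constant arrays. The sketch below is an analogy only: the variable names are invented, and it assumes unsigned short is 16 bits, which holds for these compilers and most PC compilers but is not guaranteed by the C standard.

```c
const unsigned char  OneByte      = 10;             /* like .byte 10 */
const unsigned char  ThreeBytes[] = {1, 2, 3};      /* like .byte 1,2,3 */
const unsigned short OneWord      = 20;             /* like .word 20 */
const unsigned short TwoWords[]   = {10, 200};      /* like .word 10,200 */
const char Initials[3]  = {'J', 'W', 'V'};          /* like .ascii "JWV", no terminator */
const char InitialsZ[]  = "JWV";                    /* like .asciz "JWV", null appended */
```

As with the directives, the storage size follows from the number of initializers: sizeof(Initials) is 3, while sizeof(InitialsZ) is 4 because the string literal carries a terminating null.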

Because the 6812 is more efficient when accessing 16-bit data from even addresses, sometimes we wish to skip a memory byte to force the subsequent code to be loaded at an even or odd address. To do this we can use:
  .even  ; force next code to be at an even address
  .odd   ; force next code to be at an odd address

There are two ways to make an assembly language label global (accessible outside the file). The first way is to use double colons as in Listing 12-8. The second way is to use the .global pseudo-op:
  .global Start  ; make this label global
We can create assembly constants using the = pseudo-op. One application of this directive is defining symbolic names for the I/O ports. Instead of writing code like this:
; read a byte from the SCI port
getchar:: ldaa 0x00C4   ; wait for new character available
          bita #$20
          beq  getchar
          clra
          ldab 0x00C7   ; new character from SCI
          rts

Listing 12-9: A subroutine that reads a character from the 6812 SCI0 port
 
we can add symbols to make it more readable:
; read a byte from the SCI port
SC0SR1=0x00C4
SC0DRL=0x00C7
RDRF=0x20
getchar:: ldaa SC0SR1   ; wait for new character available
          bita #RDRF
          beq  getchar
          clra
          ldab SC0DRL   ; new character from SCI
          rts

Listing 12-10: A better subroutine that reads a character from the 6812 SCI0 port
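
For comparison, the same polling loop can be written in C. This is a sketch rather than code from the compiler's libraries: the name SciGetChar is hypothetical, and the register addresses are passed in as pointers so the function can be tried on a PC with ordinary variables. On a 6812 the arguments might be (volatile unsigned char *)0x00C4 and (volatile unsigned char *)0x00C7, matching the symbols above.

```c
#define RDRF 0x20   /* receive data register full flag, as in Listing 12-10 */

/* Wait until the status register shows RDRF set, then return the data byte.
   volatile tells the compiler the registers can change on their own. */
unsigned char SciGetChar(volatile unsigned char *status,
                         volatile unsigned char *data){
  while((*status & RDRF) == 0){}  /* wait for new character available */
  return *data;                   /* new character from SCI */
}
```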
 
NOTE: the assembly directive = is not a macro substitution. Rather, the expression is evaluated once, and the numeric value is used in place of the symbol.
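
The distinction in the note can be illustrated with a C analogy (this shows C behavior, not the assembler itself). A #define is pasted as text wherever it is used, whereas an enum constant is evaluated once to a number, which is how the note describes the = directive. The names below are invented for the example.

```c
#define SUM_MACRO 1+2        /* text substitution: pasted as written */
enum { SUM_CONST = 1+2 };    /* evaluated once; the value 3 is used */

int TwiceMacro(void){ return SUM_MACRO*2; }  /* expands to 1+2*2, which is 5 */
int TwiceConst(void){ return SUM_CONST*2; }  /* 3*2, which is 6 */
```

Because the symbol stands for a value rather than for text, operator precedence at the point of use cannot change what it means.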
Conditional assembly can be implemented using the .if <exp> .else .endif construction. If the <exp> is true (not zero) then the assembly code up to the .else is included. If the <exp> is false (0) then the assembly code between the .else and .endif will be included. For example
IS6812=1 ; means it is a 6812
.if IS6812
SCSR=0x00C4
SCDR=0x00C7
.else
SCSR=0x102E
SCDR=0x102F
.endif
RDRF=0x20
getchar:: ldaa SCSR   ; wait for new character available
          bita #RDRF
          beq  getchar
          clra
          ldab SCDR   ; new character from SCI
          rts

Listing 12-11: A flexible subroutine that reads a character from the 6811 or 6812 SCI port
 
The last pseudo-op is used to include other assembly files. For example
; read a byte from the SCI port
.include "HC12.S"
getchar:: ldaa SC0SR1   ; wait for new character available
          bita #RDRF
          beq  getchar
          clra
          ldab SC0DRL   ; new character from SCI
          rts

Listing 12-12: The .include pseudo-op allows you to divide software into separate files

How to use assembly to optimize a C function

In almost all situations, when faced with a time-critical constraint it is better to solve the problem in other ways than to convert C code to assembly. Those alternatives include using a faster CPU clock speed, upgrading to a more efficient compiler, and upgrading to a more powerful processor. On the other hand, sometimes we need to write and link assembly functions. One good reason to code in assembly is to take advantage of computer-specific operations. The enabling and disabling of interrupts is an example of an important operation that cannot be performed in standard C. Another example is the use of specialized functions on the 6812 like fuzzy logic and table look-up. Although you could develop a fuzzy logic control system in standard C, there are compelling speed advantages to implementing the core fuzzy logic controller in assembly.
In this example we will optimize the add3() function presented previously in Chapter 10. The assembly generated by ICC11 and ICC12 for this example was discussed back in Chapter 10. The C code from Listing 10-8 is repeated:
int x1;
static int x2;
const int x3=1000;
int add3(int z1, int z2, int z3){ int y;
    y=z1+z2+z3;
    return(y);}
void main(void){ int y;
    x1=1000;
    x2=1000;
    y=add3(x1,x2,x3);
}

Listing 10-8: Example function call with local variables
 
The assembly output (Listing 10-10) generated by the ImageCraft ICC12 version 5.1 is also repeated.
    .area text
_x3:: .word 1000
    .area text
; y -> -2,x
; z3 -> 8,x
; z2 -> 6,x
; z1 -> 2,x
_add3:: pshd
    pshx
    tfr s,x
    leas -2,sp
    ldd 2,x
    addd 6,x
    addd 8,x
    std -2,x
    ldd -2,x
    tfr x,s
    pulx
    leas 2,sp
    rts
; y -> -2,x
_main:: pshx
    tfr s,x
    leas -8,sp
    movw #1000,_x1
    movw #1000,_x2
    movw _x3,2,sp
    movw _x2,0,sp
    ldd _x1
    jsr _add3
    std -4,x
    tfr d,y
    sty -2,x
    tfr x,s
    pulx
    rts
.area bss
_x2:   .blkb 2
_x1::  .blkb 2

Listing 10-10: ICC12 assembly of function call with local variables
 
Next we draw a stack picture at the point of the first instruction of the function add3().

Figure 12-1 Stack frame at the start of add3()
 
The next step in optimization is to copy and paste the ICC11/ICC12 compiler code from the *.s file into a new assembly file. We will name the file add3.s. Using the stack frame picture as our guide, we optimize the function. One possible optimization is shown below. Notice that I created a new local variable stack binding based on SP instead of Reg X.
; ****filename is add3.s *******
; z3 -> 4,sp
; z2 -> 2,sp
; z1 in Reg D
_add3:: addd 2,sp    ; z1+z2
    addd 4,sp        ; z1+z2+z3
    rts

Listing 12-13 Optimized add3 function
 
Now this new function is linked into the original program.
int x1;
static int x2;
const int x3=1000;
asm(".include 'add3.s' ");
int add3(int, int, int);
void main(void){ int y;
    x1=1000;
    x2=1000;
    y=add3(x1,x2,x3);
}

Listing 12-14: Use of the new optimized function
 
Embedding the assembly function (add3) into C seems to work with or without the int add3(int,int,int); prototype.
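
Since a hand-optimized routine has to be rechecked whenever the compiler changes, it can help to keep the original C version around as a reference. The following is one possible sketch of such a check; add3_ref and CheckAdd3 are hypothetical names, and the test values are arbitrary small numbers chosen to fit in a 16-bit int.

```c
/* Reference version in C, same arithmetic as Listing 10-8 */
int add3_ref(int z1, int z2, int z3){
  return z1+z2+z3;
}

/* Compare a candidate implementation against the reference.
   Returns 1 if every test case matches, 0 otherwise. */
int CheckAdd3(int (*candidate)(int, int, int)){
  static const int cases[][3] = {
    {1000, 1000, 1000}, {0, 0, 0}, {-5, 2, 3}, {1, 2, 3}
  };
  unsigned int i;
  for(i = 0; i < sizeof(cases)/sizeof(cases[0]); i++){
    if(candidate(cases[i][0], cases[i][1], cases[i][2]) !=
       add3_ref(cases[i][0], cases[i][1], cases[i][2])){
      return 0;   /* mismatch found */
    }
  }
  return 1;       /* all cases agree */
}
```

On the target you would call CheckAdd3(add3) with the assembly version linked in, and rerun the check after any compiler upgrade.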
For more information about assembly language programming see the Freescale Microcomputer Manuals and the help system of the application TExaS that is included with the book Embedded Microcomputer Systems: Real Time Interfacing by Jonathan W. Valvano published by Brooks-Cole.
