Processor setup via co-processor 15

The setup is controlled by co-processor 15 registers, accessed with MRC and MCR in non-user mode.
These registers are particular to the processor specified.
  Bits 0 - 7    Revision of processor
  Bits 8 - 15   Should be '3', identifying processor as an ARM3
  Bits 16 - 23  Manufacturer code (&56 = VLSI Technology Inc.)
  Bits 24 - 31  Designer code (&41 = ARM Ltd)

  Bit 0 - Turns the cache on (1) or off (0)
  Bit 1 - Determines if user mode and non-user mode use the same address
          mapping. 1 if they do, or 0. Should be 1 for use with MEMC.
  Bit 2 - 0 for normal operation, 1 for special monitor mode (processor
          runs at memory speed and address/data always put on external
          pins even if data fetched from cache - for logic analyser
          to trace the program properly).
  Other bits reserved.

  Bit 0  - 1 if virtual addresses &0000000-&01FFFFF are cacheable, 0 if not
  Bit 1  - 1 if virtual addresses &0200000-&03FFFFF are cacheable, 0 if not
  ...
  Bit 31 - 1 if virtual addresses &3E00000-&3FFFFFF are cacheable, 0 if not

  Bit 0  - 1 if virtual addresses &0000000-&01FFFFF are updateable, 0 if not
  Bit 1  - 1 if virtual addresses &0200000-&03FFFFF are updateable, 0 if not
  ...
  Bit 31 - 1 if virtual addresses &3E00000-&3FFFFFF are updateable, 0 if not

  Bit 0  - 1 if virtual addresses &0000000-&01FFFFF are disruptive, 0 if not
  Bit 1  - 1 if virtual addresses &0200000-&03FFFFF are disruptive, 0 if not
  ...
  Bit 31 - 1 if virtual addresses &3E00000-&3FFFFFF are disruptive, 0 if not

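Each bit in these three registers covers one 2 MB slice of the 64 MB virtual address space, so finding the bit for a given address is a simple division. Here is a minimal Python sketch of that arithmetic. The function names are made up for illustration and are not part of any ARM or RISC OS interface.

```python
REGION_SIZE = 0x200000   # 2 MB per bit: &0000000-&01FFFFF belongs to bit 0
ADDRESS_LIMIT = 0x3FFFFFF  # top of the 64 MB virtual address space

def region_bit(address):
    """Return the bit number (0-31) that governs the given virtual address."""
    if not 0 <= address <= ADDRESS_LIMIT:
        raise ValueError("address outside the 64 MB virtual address space")
    return address // REGION_SIZE

def is_flagged(register_value, address):
    """True if the bit covering 'address' is set in a 32 bit register value."""
    return bool((register_value >> region_bit(address)) & 1)

print(region_bit(0x0200000))   # 1
print(region_bit(0x3E00000))   # 31
```

The same helper works for the cacheable, updateable and disruptive registers, since all three share the one-bit-per-2-MB layout.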
  Bits 0 - 7    Revision of processor (&1x)
  Bits 8 - 15   Processor identity
  Bits 16 - 23  Manufacturer code (&56 = VLSI Technology Inc.)
  Bits 24 - 31  Designer code (&41 = ARM Ltd)

  Bit 0 - On-chip MMU turned off (0) or on (1)
  Bit 1 - Address alignment fault disabled (0) or enabled (1)
  Bit 2 - Instruction/data cache turned off (0) or on (1)
  Bit 3 - Write buffer turned off (0) or on (1)
  Bit 4 - 26 bit program space if 0, 32 bit program space if 1
  Bit 5 - 26 bit data space if 0, 32 bit data space if 1
  Bit 6 - Early abort mode if 0, late abort mode if 1
  Bit 7 - Little-endian operation if 0, big-endian if 1
  Bit 8 - System bit - controls the ARM610 permission system

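If you have read this control register into an ARM register and printed it, picking the value apart by hand gets tedious. The following Python sketch lists which features a given value switches on, using only the bit assignments given above. The table name and function name are invented for this example.

```python
# Bit number -> meaning when the bit is set, taken from the list above.
CONTROL_BITS = {
    0: "MMU on",
    1: "alignment fault enabled",
    2: "cache on",
    3: "write buffer on",
    4: "32 bit program space",
    5: "32 bit data space",
    6: "late abort mode",
    7: "big-endian",
    8: "system bit set",
}

def describe_control(value):
    """Return the meanings of every set bit, lowest bit first."""
    return [name for bit, name in sorted(CONTROL_BITS.items())
            if value & (1 << bit)]

print(describe_control(0b000001101))
```

A clear bit means the opposite setting (cache off, 26 bit program space, and so on), so anything missing from the returned list is switched off.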
  00  No Access - Domain fault generated if tried to access
  01  Client    - Accesses are checked against permission bits in
                  section/page descriptor
  10  Reserved  - Currently behaves like no access mode
  11  Manager   - Accesses are NOT checked, permission faults cannot
                  be generated
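
The four access values above can be decoded with a couple of lines of Python. This sketch assumes the register holds sixteen two-bit fields, with domain 0 in bits 0 - 1 and domain 15 in bits 30 - 31. That packing is not spelled out in the text above, so treat it as an assumption. The names are made up for this example.

```python
# Two-bit field value -> access type, as listed above.
ACCESS = {
    0b00: "No Access",
    0b01: "Client",
    0b10: "Reserved",
    0b11: "Manager",
}

def domain_access(register_value, domain):
    """Return the access type for one domain (assumed layout: 2 bits each)."""
    if not 0 <= domain <= 15:
        raise ValueError("domain must be in the range 0-15")
    return ACCESS[(register_value >> (2 * domain)) & 0b11]

print(domain_access(0x0000000D, 0))   # Client
print(domain_access(0x0000000D, 1))   # Manager
```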

  Bits 0 - 3    Status
  Bits 4 - 7    Domain
  Bits 8 - 11   Set to zero
  Bits 12 - 31  Whatever was the last value on the internal data bus

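Since the top bits of this register hold leftover bus data, it is worth masking them off before looking at the value. A small Python sketch, using only the field positions given above (the function name is invented for illustration):

```python
def decode_fault_status(value):
    """Extract the status and domain fields, ignoring bits 8 - 31."""
    return {
        "status": value & 0xF,          # bits 0 - 3
        "domain": (value >> 4) & 0xF,   # bits 4 - 7
    }

# The upper bits here are deliberately junk to show that they are ignored.
print(decode_fault_status(0xDEAD1035))
```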
  Bits 0 - 3    Revision of processor?
  Bits 4 - 15   Processor identity - &710
  Bits 16 - 23  Manufacturer code
  Bits 24 - 31  Designer code (&41 = ARM Ltd)

  Bit 0 - On-chip MMU turned off (0) or on (1)
  Bit 1 - Address alignment fault disabled (0) or enabled (1)
  Bit 2 - Instruction/data cache turned off (0) or on (1)
  Bit 3 - Write buffer turned off (0) or on (1)
  Bit 4 - 26 bit program space if 0, 32 bit program space if 1
  Bit 5 - 26 bit data space if 0, 32 bit data space if 1
  Bit 6 - Early abort mode if 0, late abort mode if 1
  Bit 7 - Little-endian operation if 0, big-endian if 1
  Bit 8 - System bit - controls the ARM710 permission system
  Bit 9 - ROM bit - controls the ARM710 permission system

  00  No Access - Domain fault generated if tried to access
  01  Client    - Accesses are checked against permission bits in
                  section/page descriptor
  10  Reserved  - Currently behaves like no access mode
  11  Manager   - Accesses are NOT checked, permission faults cannot
                  be generated

  Bits 0 - 3    Status
  Bits 4 - 7    Domain
  Bits 8 - 11   Set to zero
  Bits 12 - 31  Whatever was the last value on the internal data bus

&41077100.
Bits 0 - 3 Processor revision number
  Bit 0  - On-chip MMU turned off (0) or on (1)
  Bit 1  - Address alignment fault disabled (0) or enabled (1)
  Bit 2  - Data cache turned off (0) or on (1)
  Bit 3  - Write buffer turned off (0) or on (1)
  Bit 7  - Little-endian operation if 0, big-endian if 1
  Bit 8  - System bit - controls the MMU permission system
  Bit 9  - ROM bit - controls the MMU permission system
  Bit 12 - Instruction cache turned off (0) or on (1)

  Bits 0 - 3    Status
  Bits 4 - 7    Domain
  Bit 8         Zero
  Bits 9 - 31   Undefined on read, ignored on write

  The OPC_2 and CRm co-processor fields select which cache
  operation should occur:
    Function         OPC_2    CRm    Data
    Flush I + D      %0000    %0111  -
    Flush I          %0000    %0101  -
    Flush D          %0000    %0110  -
    Flush D single   %0001    %0110  Virtual address
    Clean D entry    %0001    %1010  Virtual address
    Drain write buf. %0100    %1010  -
       
  The OPC_2 and CRm co-processor fields select which cache
  operation should occur:
    Function         OPC_2    CRm    Data
    Flush I + D      %0000    %0111  -
    Flush I          %0000    %0101  -
    Flush D          %0000    %0110  -
    Flush D single   %0001    %0110  Virtual address
       
  The OPC_2 and CRm co-processor fields select the following...
    Function                                 OPC_2    CRm
    Enable odd word loading of Icache LFSR   %0001    %0001
    Enable even word loading of Icache LFSR  %0001    %0010
    Clear Icache LFSR                        %0001    %0100
    Move LFSR to R14,Abort                   %0001    %1000
    Enable clock switching                   %0010    %0001
    Disable clock switching                  %0010    %0010
    Disable nMCLK output                     %0010    %0100
    Wait for interrupt                       %0010    %1000
    10 DIM code% 32
    20 P% = code%
    30 [ OPT     3
    40   SWI     "OS_EnterOS"
    50   MRC     CP15, 0, R0, C0, C0
    60   TSTP    PC, #&F0000000
    70   MOV     R0, R0
    80   MOV     PC, R14
    90 ]
   100 PRINT ~USR(code%)
When run, this would print:
   >RUN
   00008FAC                    OPT     3
   00008FAC EF000016           SWI     "OS_EnterOS"
   00008FB0 EE100F10           MRC     CP15, 0, R0, C0, C0
   00008FB4 E31FF20F           TSTP    PC, #&F0000000
   00008FB8 E1A00000           MOV     R0, R0
   00008FBC E1A0F00E           MOV     PC, R14
     41077100
   >
Note that this code must run in a privileged mode.
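To make sense of the number printed, you can split it into fields. The Python sketch below assumes the ARM7-style layout (a 4 bit revision, a 12 bit identity, then the manufacturer and designer bytes). Whether that layout applies to the processor you ran the test on is an assumption you should check against its own register description. The function name is made up for this example.

```python
def decode_id(value):
    """Split an ID register value into fields (assumed 4/12/8/8 bit layout)."""
    return {
        "revision": value & 0xF,               # bits 0 - 3
        "identity": (value >> 4) & 0xFFF,      # bits 4 - 15
        "manufacturer": (value >> 16) & 0xFF,  # bits 16 - 23
        "designer": (value >> 24) & 0xFF,      # bits 24 - 31
    }

fields = decode_id(0x41077100)
for name, number in fields.items():
    print(f"{name:12} &{number:X}")
print("Designer as ASCII:", chr(fields["designer"]))
```

The designer code &41 is the ASCII letter "A", which is an easy way to spot that you have read a sensible value.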
Here is a short exercise for you:
    10 DIM code% 16
    20 P% = code%
    30 [ OPT     3
    40   CDP     CP1, 0, C0, C1, C2, 0
    50   ADFS    F0, F1, F2
    60   MOV     PC, R14
    70 ]
   >RUN
   00008F78                    OPT     3
   00008F78 EE010102           CDP     CP1, 0, C0, C1, C2
   00008F7C EE010102           ADFS    F0, F1, F2
   00008F80 E1A0F00E           MOV     PC, R14
   >
What do you notice? :-)
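If you would like to check your answer, the instruction word from the listing can be pulled apart in Python. This is only a sketch of the standard co-processor data operation layout. The function name and dictionary keys are invented for illustration.

```python
def decode_cdp(word):
    """Split a 32 bit co-processor data operation into its fields."""
    if (word >> 24) & 0xF != 0b1110 or (word >> 4) & 1 != 0:
        raise ValueError("not a co-processor data operation")
    return {
        "cond": (word >> 28) & 0xF,  # condition code (&E = always)
        "op": (word >> 20) & 0xF,    # first operation code
        "CRn": (word >> 16) & 0xF,   # first operand register
        "CRd": (word >> 12) & 0xF,   # destination register
        "cp": (word >> 8) & 0xF,     # co-processor number
        "op2": (word >> 5) & 0x7,    # second operation code
        "CRm": word & 0xF,           # second operand register
    }

print(decode_cdp(0xEE010102))
```

Both lines of the listing assemble to &EE010102, so they decode to the same fields: co-processor 1, destination register 0, operand registers 1 and 2.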
When the ARM executes a co-processor instruction, or an undefined instruction, it will offer it
to any co-processors which may be presently attached. If hardware is available to process the
given instruction, then it is expected to do so. If it is busy at the time the instruction is
offered, the ARM will wait for it.
If there is no co-processor capable of executing the instruction, the ARM will take its
undefined instruction trap, in which case the following will happen:
To return, restore the saved PC and PSR (how you do this depends on whether the system is 26 or
32 bit) into the current PC and PSR, for example with MOVS PC, R14 on 26 bit systems. This will
pick up with the instruction following the one which caused the trap.
All of the co-processor instructions can be executed conditionally. Please note that the
conditions relate to the status of the ARM processor, and not the status of any of the
co-processors. This is because the ARM always evaluates the instruction first, then offers it
around, and may finally take the undefined instruction trap, so the conditions are ARM related.
To make this clearer:
    10 DIM code% 32
    20 P% = code%
    30 [ OPT     3
    40   FLTS    F0, R0
    50   FLTS    F1, R1
    60   FMLS    F2, F0, F1
    70   FIX     R0, F2
    80   MOVS    PC, R14
    90 ]
   100 INPUT "First number : "A%
   110 INPUT "Second number: "B%
   120 PRINT USR(code%)
This probably won't assemble without an enhanced BASIC assembler.
Anyway, you might think the ARM will hand over to the floating point co-processor to do the four
FP instructions, then hand back afterwards.
If you did, you would be incorrect!
What actually is executed is:
    MCR     CP1, 0, R0, C0, C0
    MCR     CP1, 0, R1, C1, C0
    CDP     CP1, 9, C2, C0, C1
    MRC     CP1, 0, R0, C0, C2
It is worth pointing out that objasm specifies co-processor registers using the CR
notation (ie, CR0 - CR15), which is first defined with the CN directive. It does not
appear as if default co-processor instructions are defined in Nick Roberts' ASM, though I've only
looked in the instructions at the "defined symbols" section...
Darren Salt's ExtBASICasm provides the register names C0 - C15 to refer to the
co-processor registers. So if any of these examples fail when you try to assemble them, please
check which format your assembler expects for these instructions.
MRC transfers a co-processor register to an ARM register. It takes
the form:
    MRC <co-pro>, <op>, <ARM reg>, <co-pro reg>, <co-pro reg2>, <op2>
The co-processor is denoted in most assemblers by CPx.
<co-pro reg> is written to <ARM reg>, using
operation <op>. This may, possibly, be further modified by
<co-pro reg2> and <op2>. For an idea of the sorts of times
when this might be necessary, consider instructions of the form LDR Ra, [Rb], #x.
<op2> may be omitted, as it is in the example, but the other parts
of the MRC instruction must be supplied.
MCR transfers an ARM register to a co-processor register. It takes
the form:
    MCR <co-pro>, <op>, <ARM reg>, <co-pro reg>, <co-pro reg2>, <op2>
The co-processor is free to interpret the fields as it desires, but the standard interpretation is that the contents of the ARM register are written to the co-processor register using the operation code given, which may be further modified by the second co-processor register and/or the second operation code.
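MRC and MCR share one instruction layout and differ only in the direction bit, so a single routine can build either. The Python sketch below packs the operands in the same order as the assembler syntax and reproduces the &EE100F10 shown in the earlier listing for MRC CP15, 0, R0, C0, C0. The function name and parameter names are made up for this example, and the condition defaults to "always".

```python
def encode_transfer(to_arm, cp, op, arm_reg, cp_reg, cp_reg2, op2=0, cond=0xE):
    """Build an MRC (to_arm=True) or MCR (to_arm=False) instruction word."""
    limits = (("cond", cond, 15), ("cp", cp, 15), ("op", op, 7),
              ("arm_reg", arm_reg, 15), ("cp_reg", cp_reg, 15),
              ("cp_reg2", cp_reg2, 15), ("op2", op2, 7))
    for name, number, limit in limits:
        if not 0 <= number <= limit:
            raise ValueError(f"{name} out of range")
    return ((cond << 28) | (0b1110 << 24) | (op << 21)
            | (int(bool(to_arm)) << 20)    # 1 = MRC, 0 = MCR
            | (cp_reg << 16) | (arm_reg << 12) | (cp << 8)
            | (op2 << 5) | (1 << 4)        # bit 4 marks a register transfer
            | cp_reg2)

print(f"{encode_transfer(True, 15, 0, 0, 0, 0):08X}")   # EE100F10
print(f"{encode_transfer(False, 1, 0, 0, 0, 0):08X}")   # EE000110
```

The second line is MCR CP1, 0, R0, C0, C0. Leaving out op2 gives zero, which matches what the assembler does when you omit it.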
LDC loads data from memory into the co-processor register, while
STC saves data from a co-processor register to memory.
    LDC  <co-pro>, <co-pro reg>, <address>
    LDCL <co-pro>, <co-pro reg>, <address>
    STC  <co-pro>, <co-pro reg>, <address>
    STCL <co-pro>, <co-pro reg>, <address>
If the 'L' flag is specified, a long transfer is performed. Otherwise a short transfer is performed. The 'L' flag follows the condition code, like LDCEQL.
The address may be given as:
    [Rx]
    [Rx, #x] !
    [Rx], #x
These are like those used for the LDR instruction. However the offsets are only eight bits wide and specify word offsets (the ARM LDR offsets are 12 bits wide and count bytes).
For example: STC CP0, CR1, [R2, #16]!.
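Because the offset is held as eight bits of words, the byte offset you write in the source must be a multiple of four and can reach no further than 1020 bytes. Here is a rough Python sketch of the check an assembler has to make. The function name is invented, and the separate add/subtract flag is a detail of the instruction encoding that the text above does not cover, so take it as an assumption.

```python
def ldc_offset(byte_offset):
    """Convert a byte offset into (add, words) as held by LDC/STC."""
    if byte_offset % 4 != 0:
        raise ValueError("offset must be a multiple of 4 bytes")
    words = abs(byte_offset) // 4
    if words > 0xFF:
        raise ValueError("offset does not fit in eight bits of words")
    # add is True for a positive offset, False for a negative one
    return byte_offset >= 0, words

print(ldc_offset(16))      # (True, 4)
print(ldc_offset(-1020))   # (False, 255)
```

So the #16 in an address such as [R2, #16]! is stored in the instruction as the value 4.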
CDP instructs the co-processor to do some processing. It takes the
form:
    CDP <co-pro>, <op>, <co-pro reg1>, <co-pro reg2>, <co-pro reg3>, <op2>
This tells the co-processor to do something. The ARM will not wait for it to finish, nor is any sort of status sent back to the ARM. It is possible for a co-processor to maintain a queue of instructions, allowing it and the ARM to process in parallel.
So instead of:
    FLTE    F0, R0
    FLTE    F1, R1
    MUFE    F2, F0, F1
    FIX     R0, F2
    MOV     R1, #0
you could save a small amount of time with:
    FLTE    F0, R0
    FLTE    F1, R1
    MUFE    F2, F0, F1
    MOV     R1, #0
    FIX     R0, F2
as the FPU could be finishing the MUF while you MOV. The hardware FPU (as in the 7500FE) runs asynchronously, though you can switch it to synchronous operation by setting a bit in the FPSR. The software emulation always runs synchronously, and as it uses the ARM in order to emulate the FP instructions, there is no possible advantage to be gained.
There are no rules for the register types and/or the operation codes. These depend upon the co-processor.