fault rework studies
I am concerned that the implementation of RCU has too much entropy because we understand the hardware poorly; specifically, we lack design assertions of the form "if the operation is x and the state is y, then z is true (or false)".
CPU flow (ignoring EIS and RPx)
FETCH:
  Fetch instruction into CU
  Decode instruction
EXEC:
  If not MIIF
    If (PREPARE_CA || READ_OP)
      Do CAF
    If (READ_OP)
      Do readOperand
  Execute instruction
  If (WRITE_OP)
    If not (PREPARE_CA || READ_OP)
      Do CAF
    Do writeOperand
  Go to FETCH
Fault handling:
Page faults during 'Fetch instruction into CU' set the FIF bit, which signals a restart at 'FETCH:'.
Page faults during the read-side CAF and readOperand restart at 'EXEC:' with MIIF off.
Faults during and after instruction execution restart at 'EXEC:' with MIIF on. These can be page faults from the CAF/write cycle or non-page faults during instruction execution.
(Although the MIIF bit seems fairly trivial here, it plays a pivotal role in RPx.)
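The restart rules above can be sketched as a small C function. This is only an illustration of the rules as stated here, not the emulator's code; the type and function names (fault_phase_t, restart_t, current_restart) are made up for this note.

```c
#include <stdbool.h>

/* Hypothetical names; the phases mirror the steps of the CPU flow. */
typedef enum {
    PHASE_FETCH,         /* Fetch instruction into CU */
    PHASE_READ_CAF,      /* CAF done for PREPARE_CA or READ_OP */
    PHASE_READ_OPERAND,  /* readOperand */
    PHASE_EXECUTE,       /* Execute instruction */
    PHASE_WRITE_CAF,     /* CAF done only for WRITE_OP */
    PHASE_WRITE_OPERAND  /* writeOperand */
} fault_phase_t;

typedef enum { RESTART_FETCH, RESTART_EXEC } restart_point_t;

typedef struct {
    restart_point_t where;
    bool fif;   /* set when the fault hit the instruction fetch */
    bool miif;  /* when set, EXEC skips the read-side CAF/readOperand */
} restart_t;

/* Current model: where does the instruction restart after a fault? */
static restart_t current_restart(fault_phase_t phase)
{
    restart_t r = { RESTART_EXEC, false, false };
    switch (phase) {
    case PHASE_FETCH:
        r.where = RESTART_FETCH;
        r.fif = true;
        break;
    case PHASE_READ_CAF:
    case PHASE_READ_OPERAND:
        break;          /* restart at EXEC with MIIF off */
    case PHASE_EXECUTE:
    case PHASE_WRITE_CAF:
    case PHASE_WRITE_OPERAND:
        r.miif = true;  /* restart at EXEC with MIIF on */
        break;
    }
    return r;
}
```

Note that a fault in the write cycle and a fault in 'Execute instruction' produce the same restart state, so the instruction is executed again in both cases.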
The instruction restart model requires each instruction to be "invariant"; that is, restarting it must not change the output.
A hypothetical problem instruction would be 'Add 1 to rA and store the result'; if the 'store the result' step faults, the instruction will be restarted and rA will be incremented a second time, producing an incorrect result.
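A toy C model of that hypothetical instruction shows the double increment. Everything here (toy_cpu_t, add1_and_store, run_with_restart) is invented for illustration; it models a single operand word whose page is absent on the first attempt.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t rA;        /* accumulator */
    uint64_t memory;    /* the single operand word */
    bool page_present;  /* false: the store page-faults */
} toy_cpu_t;

/* 'Add 1 to rA and store the result'. Returns false if the store faulted. */
static bool add1_and_store(toy_cpu_t *cpu)
{
    cpu->rA += 1;               /* execute: side effect on rA */
    if (!cpu->page_present)
        return false;           /* writeOperand page fault */
    cpu->memory = cpu->rA;      /* writeOperand succeeds */
    return true;
}

/* Restart model: re-run the whole instruction until it completes. */
static void run_with_restart(toy_cpu_t *cpu)
{
    while (!add1_and_store(cpu))
        cpu->page_present = true;  /* fault handler pages the word in */
}
```

Starting with rA = 5 and the page absent, run_with_restart leaves rA = 7 and stores 7, where 6 was intended; with the page present from the start the result is 6.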
I am reasonably confident that the non-EIS instructions are invariant with respect to operands, but I am not sure about the indicator registers.
My concern is that page faults in the CAF/write cycle cause instruction re-execution, which is inefficient and unnecessarily stresses the invariant condition.
I am currently considering a change:
FETCH:
  Fetch instruction into CU
  Decode instruction
EXEC:
  If not MIIF
    If (PREPARE_CA || READ_OP)
      Do CAF
    If (READ_OP)
      Do readOperand
  Execute instruction
rewrite:
  If (WRITE_OP)
    If not (PREPARE_CA || READ_OP)
      Do CAF
    Do writeOperand
  Go to FETCH
and a change to RCU:
Define WriteFault as the various recoverable faults that can occur during writeOperand (page fault and ?)
If WriteFault and MIIF, go to rewrite.
This relies on the following assertion, which I think is true but have not yet proven by documentation and code review:
"Execute instruction" cannot raise a WriteFault.
There is a case in the conditional transfers that is problematic, but I think they are wrong anyway, and am prepared to fix them.
The idea is that if RCU can reliably decide whether the fault occurred during or after 'execute instruction', it can be smarter about the restart.
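A minimal sketch of the proposed RCU decision, assuming the fault has already been classified. The names (rcu_state_t, write_fault, rcu_resume_point) are hypothetical, and the sketch is only valid if the assertion above holds, i.e. a WriteFault can never come from 'Execute instruction'.

```c
#include <stdbool.h>

typedef enum { RESUME_FETCH, RESUME_EXEC, RESUME_REWRITE } resume_point_t;

typedef struct {
    bool fif;          /* fault hit the instruction fetch */
    bool miif;         /* fault hit during or after instruction execution */
    bool write_fault;  /* hypothetical: fault classified as a WriteFault */
} rcu_state_t;

/* Proposed model: pick the label at which RCU resumes the instruction. */
static resume_point_t rcu_resume_point(const rcu_state_t *s)
{
    if (s->fif)
        return RESUME_FETCH;
    if (s->write_fault && s->miif)
        return RESUME_REWRITE;  /* skip 'Execute instruction' */
    return RESUME_EXEC;         /* unchanged from the current model */
}
```

Only the WriteFault-with-MIIF case changes; every other fault restarts exactly as it does today.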

This list was generated by inspection and may contain errors and omissions.
// decimal octal
// fault   fault    mnemonic  name                    priority  group  handler                    raised in
// number  address
// 0       0        sdf       Shutdown                27        7
// 1       2        str       Store                   10        4                                 getBARaddress, instruction execution
// 2       4        mme       Master mode entry 1     11        5      JMP_SYNC_FAULT_RETURN      instruction execution
// 3       6        f1        Fault tag 1             17        5      (JMP_REFETCH/JMP_RESTART)  doComputedAddressFormation
// 4       10       tro       Timer runout            26        7      JMP_REFETCH                FETCH_cycle
// 5       12       cmd       Command                 9         4      JMP_REFETCH/JMP_RESTART    instruction execution
// 6       14       drl       Derail                  15        5      JMP_REFETCH/JMP_RESTART    instruction execution
// 7       16       luf       Lockup                  5         4      JMP_REFETCH                doComputedAddressFormation, FETCH_cycle
// 8       20       con       Connect                 25        7      JMP_REFETCH                FETCH_cycle
// 9       22       par       Parity                  8         4
// 10      24       ipr       Illegal procedure       16        5                                 doITSITP, doComputedAddressFormation, instruction execution
// 11      26       onc       Operation not complete  4         2                                 nem_check, instruction execution
// 12      30       suf       Startup                 1         1
// 13      32       ofl       Overflow                7         3      JMP_REFETCH/JMP_RESTART    instruction execution
// 14      34       div       Divide check            6         3                                 instruction execution
// 15      36       exf       Execute                 2         1      JMP_REFETCH/JMP_RESTART    FETCH_cycle
// 16      40       df0       Directed fault 0        20        6      JMP_REFETCH/JMP_RESTART    getSDW, doAppendCycle
// 17      42       df1       Directed fault 1        21        6      JMP_REFETCH/JMP_RESTART    getSDW, doAppendCycle
// 18      44       df2       Directed fault 2        22        6      (JMP_REFETCH/JMP_RESTART)  getSDW, doAppendCycle
// 19      46       df3       Directed fault 3        23        6      JMP_REFETCH/JMP_RESTART    getSDW, doAppendCycle
// 20      50       acv       Access violation        24        6      JMP_REFETCH/JMP_RESTART    fetchDSPTW, modifyDSPTW, fetchNSDW, doAppendCycle, EXEC_cycle (ring alarm)
// 21      52       mme2      Master mode entry 2     12        5      JMP_SYNC_FAULT_RETURN      instruction execution
// 22      54       mme3      Master mode entry 3     13        5      (JMP_SYNC_FAULT_RETURN)    instruction execution
// 23      56       mme4      Master mode entry 4     14        5      (JMP_SYNC_FAULT_RETURN)    instruction execution
// 24      60       f2        Fault tag 2             18        5      JMP_REFETCH/JMP_RESTART    doComputedAddressFormation
// 25      62       f3        Fault tag 3             19        5      JMP_REFETCH/JMP_RESTART    doComputedAddressFormation
// 26      64                 Unassigned
// 27      66                 Unassigned
// 28      70                 Unassigned
// 29      72                 Unassigned
// 30      74                 Unassigned
// 31      76       trb       Trouble                 3         2                                 FETCH_cycle, doRCU
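In every row of the table the octal fault address is twice the decimal fault number (for example, fault 20 has address 050 octal, which is 40 decimal). A one-line C helper makes that relationship explicit; the function name is made up for this note.

```c
/* Fault address for a fault number, as listed in the table above:
 * address = 2 * number. */
static unsigned fault_address(unsigned fault_number)
{
    return 2u * fault_number;
}
```

For instance, fault_address(4) is 010 octal (tro) and fault_address(31) is 076 octal (trb).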
So let's sort by usage:
Not relevant: only fire at the begin state of the CPU.
  CON TRO
Not relevant: not restartable
  EXF TRB
Only during instruction execution
  MME MME2 MME3 MME4 OFL DIV CMD DRL
Instruction execution or FETCH cycle
  LUF -- I suspect that it is not restartable and so not relevant.
Unused
  SDF PAR SUF
CAF, getSDW, doAppendCycle
  F1 F2 F3 DF0 DF1 DF2 DF3
doAppendCycle or ring alarm in EXEC_cycle
  ACV -- make sure that the ring alarm is compatible with the new logic
Special cases:
ABSA
ABSA can generate {ACV,ACV15} boundary violation faults and DF faults;
the new logic would have difficulty distinguishing these DF faults from
writeOperand DF faults.
Need review
IPR -- doAppendCycle, CAF, instruction execution; I suspect that it is not restartable and so not relevant.
ONC -- I suspect that it is not restartable and so not relevant.
STR -- getBARaddress, instruction execution; I suspect that it is not restartable and so not relevant.
doABSA; this routine reaches into the SDW/PTW logic; make sure that its fault logic is correct.
ABSA can generate {ACV,ACV15} boundary violation faults and DF faults; the new logic would have difficulty distinguishing these DF faults from writeOperand DF faults.
ABSA generates the faults because its business logic is to do all of the readOperand steps except actually reading the operand.
The CAF and APPEND unit code is greatly improved since ABSA was written; it may be possible to greatly simplify it by making it READ_OPERAND, and letting it use iefpFinalAddress. T&D extensively tests it.