Tag Archives: macintosh

Carrera in an SE/30 – the code part 2

3rd handler

Next up is the 3rd handler, MacII_3rd: (0x3F94) in our case. Actually it’s called with JSR 8(A6), but that’s an 8 byte offset to the ‘base-address’ of any handler. Clever stuff, huh (Google for ‘pointer-table’)?

This subroutine contains serious magic and was a real hard nut to crack. Especially because it tricked me into believing that I’ve found the ‘crashsite’… which, to spoil the tension, isn’t.
It just kept on killing  Macsbug, because it’s so low-level.

What this routine does is replacing the Vector Base Register (VBR) which ‘lives’ at address 0x00000000. Evil stuff.

  • After disabling interrupts and switching to 32bit-mode a field with 6 long-words (data107) will be populated with data generated in other routines.
    For now I can only guess what these entries are (Values from my SE/30 given in brackets). We’ll discuss all that further down.
  • 0x3FC6 to 0x3FD8 calculates the size of the chunk of code starting at data106 (0x4008) to the beginning of MacII_4th (i.e. the end of Mac_3rd), which is 180 bytes.
  • Using this length, the routine first saves the current VBR onto the stack using the system call _BlockMove.
    Then the original VBR (+some more) will be replaced by the new version beginning at data106. (Killing Macsbug – more on that later)
  • BSR 53_cmd_1x is been called. This brings the Carrera040 into life most likely using the just copied VBR (This is discussed in much detail further down).
  • Now the contents of the stack (= copy of the original VBR) will be copied back into its place, this time using a classic DBRA loop (0x3FF4). My guess, no Toolbox call possible at the moment.
  • Adjust the stack, back to 16bit mode, restore Registers and return-from-subroutiene. Done.

Here’s the code doing all this:

3F94:MacII_3rd: MOVE    SR,-(A7)     ; 3rd call from MacII handler
3F96:           ORI     #$700,SR       ; Set bit 9-11 of SR (disable Interrupts)
3F9A:           MOVEM.L D0-D2/A0-A2,-(A7)
3F9E:           MOVEQ   #1,D0
3FA0:           _SwapMMUMode  
3FA2:           PUSH.B  D0
3FA4:           SUBA.L  A2,A2         ; faster movea.l #0,a2
3FA6:           LEA     data107,A0    ; Filling the data into the 6x32 field
3FAA:           MOVE.L  96(A5),D0
3FAE:           MOVE.L  D0,(A0)+      ; SE30: 9FE00
3FB0:           LEA     data69,A1
3FB4:           MOVE.L  A1,(A0)+      ; SE30: 9D6E2 (User/Supervisor Rootpointer?)
3FB6:           MOVE.L  $64(A5),(A0)+ ; 807FC040
3FBA:           MOVE.L  $6C(A5),(A0)+ ; 807FC040
3FBE:           MOVE.L  $68(A5),(A0)+ ; 00000000
3FC2:           MOVE.L  $70(A5),(A0)+ ; 00000000
3FC6:           LEA     MacII_4th,A0
3FCA:           MOVE.L  A0,D2
3FCC:           LEA     data106,A0
3FD0:           SUB.L   A0,D2         ; 'distance' from data106 to MacII_4th
3FD2:           SUBA.L  D2,A7
3FD4:           MOVEA.L A2,A0
3FD6:           MOVEA.L A7,A1
	  	; save the current VBR to the stack
3FD8:           MOVE.L  D2,D0
	  	; A0 = SE30: 00000000 (src)  - IIci: $FBB08000
	  	; A1 = SE30: 027ff34c (dest) - IIci: $3BF9FC6
	  	; D0 = B4    (count)   - SAME on the IIci!    
3FDA:           _BlockMove ; (A0/srcPtr,A1/destPtr:Ptr; D0/byteCount:Size) 
	  	; write my own VBR...
                ; This copies 180 bytes into 0x000000000 replacing the original VBR. 
                ; ... and kills Macsbug if not circumvented properly.
3FDC:           LEA     data106,A0
3FE0:           MOVEA.L A2,A1
3FE2:           MOVE.L  D2,D0
	  	; A0 = 9F900 (src)   - IIci 10C4EA (data88)    
	  	; A1 = 00000 (dest)  - IIci FBB08000
	  	; D0 = B4    (count) - IIci same  
3FE4:           _BlockMove ; (A0/srcPtr,A1/destPtr:Ptr; D0/byteCount:Size) 
3FE6:           BSR     53_cmd_1x  ; Bring the C040 to life
3FEA:           MOVEA.L A7,A0  ; SP to A0
3FEC:           MOVEA.L A2,A1  ; SE30: 00000000
3FEE:           MOVE.L  D2,D0  ; the code length (B4 again)
3FF0:           BRA.S   lae_163
3FF2:   lae_162 MOVE.B  (A0)+,(A1)+ ; Write the VBR back from the stack
3FF4:   lae_163 DBRA    D0,lae_162
3FF8:           ADDA.L  D2,A7 ; adjust the stack
3FFA:           POP.B   D0
3FFC:           _SwapMMUMode  
3FFE:           MOVEM.L (A7)+,D0-D2/A0-A2
4002:           MOVE    (A7)+,SR
4004:           MOVEQ   #0,D0
4006:           RTS     
 
; Start of VBR replacement- and 040-Code being copied to 0x0 the by line 0x3FE4 
; /if/ theses are the Vectors 0-17, then their meaning would be:
 
4008: data106:  DC.L    #$00001000 ; Reset initial Stack Pointer
400C:           DC.L    #$00000050 ; Reset initial Program Counter
; - ALL of these Vectors point to addr 4050 (offset 0x48) -
4010:           DC.L    #$00000048 ; Buserror  
4014:           DC.L    #$00000048 ; Adress Error
4018:           DC.L    #$00000048 ; Illegal Instruction
401C:           DC.L    #$00000048 ; Zero Divide
4020:           DC.L    #$00000048 ; CHK, CHK2 instruction
4024:           DC.L    #$00000048 ; cpTRAPcc, TRAPcc, TRAPV instruction
4028:           DC.L    #$00000048 ; Privilige Violation
402C:           DC.L    #$00000048 ; Trace
4030:           DC.L    #$00000048 ; LINE 1010 Emulation
4034:           DC.L    #$00000048 ; LINE 1111 Emulation
 
; THESE are definitely no vectors, they are dynamically written by the code above
; and to be used to setup the 040 MMU registers.
4038: data107:  DC.L    #$0009FE00 ;  
403C:           DC.L    #$0009D6E2; 
4040:           DC.L    #$807FC040 ; 
4044:           DC.L    #$807FC040 ; SE30: 
4048:           DC.L    #$00000000 ; SE30: 00000000 
404C:           DC.L    #$00000000 ; SE30: 00000000 
 
4050:           CLR.L   $53000000  ; Poke 0 to $53000000
4056:           BRA     lae_164    ; This points to itself... I'm lost at the moment.
4058:           LEA     data107,A0 ; SE30: 9F900
405C:           MOVE.L  (A0)+,D1   ; SE30: 0009FE00 (User/Supervisor Rootpointer)
405E:           MOVEA.L (A0)+,A1   ; 0009D6E2
4060:           MOVE.L  (A0)+,D4   ; 807FC040
4062:           MOVE.L  (A0)+,D5   ; 807FC040
4064:           MOVE.L  (A0)+,D6   ; 00000000
4066:           MOVE.L  (A0)+,D7   ; 00000000
4068:           MOVE.L  #$C000,D0
406E$           MOVEC   D0,ITT0   ; Set Instruction Transparent Translation
4072$           MOVEC   D0,DTT0   ; Set Data Transparent Translation
4076$           MOVEC   D1,SRP    ; Set Supervisor Rootpointer
407A$           MOVEC   D1,URP    ; Set User Rootpointer
407E:           MOVE.L  #$C000,D0
4084$           PFLUSHA           ; Invalidates all entries in the address translation cache
4086$           MOVEC   D0,TC 
408A:           LEA     data108,A0
408E:           ADDA.L  #$53002000,A0 ; (=0x530A1900)
4094:           JMP     (A0)      ; JuMP to data108 code (below) in C040 RAM range?     
 
4096:  data108: MOVEQ   #0,D0
4098$           MOVEC   D0,ITT0 
409C$           MOVEC   D0,DTT0 
40A0$           MOVEC   D4,ITT0 
40A4$           MOVEC   D5,DTT0 
40A8$           MOVEC   D6,ITT1 
40AC$           MOVEC   D7,DTT1 
40B0$           CINVA   BC
40B2:           NOP     
40B4:           MOVEQ   #0,D0
40B6$           MOVEC   D0,CACR
40BA:           JMP     (A1)    ; 0009D6E2
 
; END 040 Code being copied to somewhere by line 3FE4 
40BC: MacII_4th: MOVEM.L D1-D7/A0-A4,-(A7)  ; 4th subroutine called my MacII_handler
[...]

The Vector Base Register

I wasn’t precise when I initially said “replacing the VBR”. What actually happens is that this routine uses what I’d call an interim-VBR for the moment it initializes the 68040 on the C040. You’ve probably saw the link referring to what the VBR is in the 1st post of this series, but let me go a bit more into detail.

The VBR is a list of addresses (aka vectors) the CPU refers to in case of an exception – and this is true for every 68k system out there, e.g. Mac, SUN, NeXT, Amiga or Atari. Some of them might do some relocation using their MMU, but even the virtual address will be 0x00000000 and the order is the same.  There are 16 basic vectors as listed here:

If for example a divide-by-zero happens, the CPU would call a handler which address is stored in 0x14.  Pretty simple.
So let’s have a look what MacII_3rd left in the VBR (and below that) when the ‘interim VBR’ is in place:

0000: data106:  DC.L    #$00001000 ; Reset initial Stack Pointer
0004:           DC.L    #$00000050 ; Reset initial Program Counter
     ; - ALL of these Vectors point to addr 0x48 -
0008:           DC.L    #$00000048 ; Buserror  
000C:           DC.L    #$00000048 ; Adress Error
0010:           DC.L    #$00000048 ; Illegal Instruction
0014:           DC.L    #$00000048 ; Zero Divide
0018:           DC.L    #$00000048 ; CHK, CHK2 instruction
001C:           DC.L    #$00000048 ; cpTRAPcc, TRAPcc, TRAPV instruction
0020:           DC.L    #$00000048 ; Privilige Violation
0024:           DC.L    #$00000048 ; Trace
0028:           DC.L    #$00000048 ; LINE 1010 Emulation
002C:           DC.L    #$00000048 ; LINE 1111 Emulation
     ; - THESE are definitely no vectors, they are dynamically written by the 
     ;   code above and to be used to setup the 040 MMU registers.
0030:  data107: DC.L    #$00000000 ; SE30: 0009FE00  (12)
0034:           DC.L    #$00000000 ; SE30: 0009D6E2  (13)
0038:           DC.L    #$00000000 ; SE30: 807FC040  (14) 
003C:           DC.L    #$00000000 ; SE30: 807FC040  (15)
0040:           DC.L    #$00000000 ; SE30: 00000000 
0044:           DC.L    #$00000000 ; SE30: 00000000 
 
0048:           CLR.L   $53000000  ; Poke 0 to $53000000 ; C040 off
004C:  blocker3 BRA     blocker3    ; Points to itself... probably a "blocker"
0050:           LEA     data107,A0 ; initial Program Counter (SE30: 9F900)
0054:           MOVE.L  (A0)+,D1   ; SE30: 0009FE00 (User/Supervisor Rootpointer)
0058:           MOVEA.L (A0)+,A1   ; 0009D6E2
005C:           MOVE.L  (A0)+,D4   ; 807FC040
0060:           MOVE.L  (A0)+,D5   ; 807FC040
0064:           MOVE.L  (A0)+,D6   ; 00000000
0068:           MOVE.L  (A0)+,D7   ; 00000000
006C:           MOVE.L  #$C000,D0
0070:           MOVEC   D0,ITT0    ; Set Instruction Transparent Translation
0074:           MOVEC   D0,DTT0    ; Set Data Transparent Translation
0078:           MOVEC   D1,SRP     ; Set Supervisor Rootpointer
007C:           MOVEC   D1,URP     ; Set User Rootpointer
0080:           MOVE.L  #$C000,D0
0084:           PFLUSHA            ; Invalidates all entries in the address translation cache
0088:           MOVEC   D0,TC 
008C:           LEA     data108,A0
0090:           ADDA.L  #$53002000,A0 ; (=0x530A1900)
0094:           JMP     (A0)       ; JuMP to data108 code (below) in C040 RAM range? 
 
009C:  data108: MOVEQ   #0,D0
00A0:           MOVEC   D0,ITT0 ; 0
00A4:           MOVEC   D0,DTT0 ; 0
00A8:           MOVEC   D4,ITT0 ; 807FC040
00AC:           MOVEC   D5,DTT0 ; 807FC040
00B0:           MOVEC   D6,ITT1 ; 00000000
00B4:           MOVEC   D7,DTT1 ; 00000000
00B8:           CINVA   BC
00BC:           NOP     
00C0:           MOVEQ   #0,D0
00C4:           MOVEC   D0,CACR
00C8:           JMP     (A1)    ; 0009D6E2
Farewell, old friend

At this point, my SE/30 always froze and I thought this must be the point where to find incompatibilities between the IIci and SE/30.
But after understanding, what’s really going on, it was clear that overwriting the TRAP exception (Nr.7), Macsbug was simply kicked out of the game as this exception is triggered after every step/trace you do in a debugger…
So to get beyond this point, I had to modify the program counter to skip the point where TRAP is copied-over… which is done inside the Toolbox’ _BlockMove call. So I had to single-step into that and find the right call/time to do a ‘pc=pc+2’ 😉 (Good thing you can define a macro for that).

Okayyyyy. After that’s been written, 53_cmd_1x is called, presumably telling the C040 to come to life.
And keen as it is, it’ll look up the “Reset initial Program Counter” (VBR: 0xC) and starts executing code from 0x50. Any other occurring exception will call the ‘handler’ at 0x48, simply switching the C004 off and sit in an endless loop (0x4C) – probably making the 68030 to take over again.

EmEmYou!

Given everything’s fine, the code at 0x50 will start reading the previously populated data from data107 into several registers.
Then some serious 68040 MMU table setup happens – so this is some kind of ‘040 initialization routine… and the ‘040 is actually running. Woohoo!

Time for some special register explanation:
As we all know, the 68040 has two in-build 4k caches and an MMU. The latter can be programmed how and what to cache. This is defined in 4 registers of which only 2 are of interest here: ITT0 and DTT0, the Instruction and Data Transparent Translation registers, both sharing the same bit-fields following this pattern:

BBBBBBBBMMMMMMMMESS000UU0CC00W00

  • BLogical Address Base – compared with address bits A31-A24. Addresses that match in this comparison are transparently translated
  • MLogical Address Mask – setting a bit in this field causes corresponding bit in Base field to be ignored
  • E – Enable Bit – 1 – translation enabled; 0 – disabled
  • SSupervisor Mode – 00 – match only in user mode 01 – match only in supervisor mode 1x – ignore mode when matching
  • U – User Page Attributes – ignored by 040
  • CCache mode – 00 – Cacheable, Write-through 01 – Cacheable, Copyback 10 – Noncacheable, Serialized 11 – Noncacheable
  • WWrite protect – 0 – write permitted; 1 – write disabled

Here’s an example:

807FC040 = 10000000011111111100000001000000 
           BBBBBBBBMMMMMMMMESS000UU0CC00W00

which means:

  • a bit more than 2GB transparently translated (2063MB)
  • translation enabled
  • Supervisor Mode: ignore mode when matching
  • Cache mode: Noncacheable, Serialized
  • Write permitted

So let’s have a look at the code again:

At 0x70/0x74 the MMU is set to 0xC000, i.e. Enable translation, apply for user & supervisor mode, write-though cache, for logical address space 0x00000000-0x00ffffff (16MB).
Then Supervisor & User Rootpointer are set to 0x9FE00, then the address translation cache is flushed to finally set the Translation Control register to Enable & 8K page size (0x88)… up to here this was pretty much ‘by the book’ of how to set-up MMU tables.

Having its MMU all set, the 68040 now gets something to chew on:
The address of data108 is added to 0x53002000 and jumped to!
💡  Does 0x53002000 equal 0x00000000 for the C040?

Let’s assume the C040 executes the code at data108 for now. That is:

  • Clear the ITT/DTT registers
  • Set the MMU to 0x807FC040 (see decoding example above)
  • invalidate caches and wait’a’NOP to have that happened
  • then disable all caches
  • and jump to where A1 points to. In my SE/30 that’s 0x9D6E2, previously loaded from data107 in 0x58

Writing all this from the top of my head, I’m not 100% sure where this address is pointing to. I must be somewhat back into MacII_3rd (0x3FEA), because this is where the program execution resumes (Need to check this with Macsbug and will update).

For now, I’m tempted to call MacII_3rd something like ‘C040_MMU_setup‘… but I’d love to have this confirmed  💡 by somebody who knows more than me 😉

Next up will be continuing working further through the main:  procedure again… stay tuned.

Carrera in an SE/30 – the code part 1

The disassembled code of the Micromac Carrera 040 control panel is quite big: 6000+ lines of 68030/40 assembly…

While these posts might be entertaining and giving you an insight into classic MacOS driver code, they are also meant as a notebook to myself to get into the source quickly – especially after some weeks or months of distraction 😉

That said, I will not discuss each and every line of code. There are many parts which aren’t important (for now) or just not reached yet.
Still, it will take several parts/chapters to cover everything I worked on.

The complete code is available over here on GitHub and will updated every time I’m working on it.
Whenever I’m mentioning addresses I’m referring to this code on GitHub. NB: I will never use line-numbers as these might change during editing the source.
Also, when you’ll see a light-bulb  💡  somewhere, this is where I’m not sure and happy about enlightenment or comments from you 😉

This article is totally work-in-progress. E.g. every now and then my theories about what a certain code does changes, I learn new things and all the sudden whole blocks of code make sense… so this post will change/grow, too.

Approaching… difficulties.

What’s the main job of this code? From a 30000ft perspective the simple answer is “switching the Carrera040 on and off”, i.e. toggling between the hosts slower on-board 68030 and the insanely fast 68040 on the C040. At boot-time… as well as during the system is running (by user interaction).

Sounds pretty simple, huh? Lowering our flight altitude to 3000ft more things come into play:
Identify the hosting Macintosh. As mentioned in the previous chapter, the C040 was able to run in a Mac II, IIx, IIcx, IIvx, IIvi, IIvm, IIsi, IIci, LC and LCII… all of them different in many places. These differences have to be handled…
Down at 30ft we have to admit that there are differences between a 68030 and his younger brother 68040, mainly concerning caches, FPU and the MMU.
Finally hitting the ground, it’s becoming clear that it is everything but trivial to halt a running processor, save its complete context and start another (slightly different) processor with that. And back again…

Some given things before we start:

  • We will concentrate on the IIx “branch” as this machine is closest to the SE/30 like not-32bit-clean, memory-map, the GLUE chip, two real VIAs with the same register layout etc.
  • I learned from the code that the C040 is memory-mapped at 0x53000000 in some of the supported models, especially the IIx and IIci. This means 32bit addressing is a must (-> need “mode32” INIT or clean ROM)
  • I tried to comment as much as possible/understood inline (i.e in the code) – a good bit of 68k machine language knowledge is still required 😉
  • If something needs more explanation, I’ll try to provide this before the code quote or afterwards.

So this is the main routine (at 0x21FC):

main     MOVEM.L A4-A6,-(A7)
         MOVE.L  D0,D7
         MOVE.L  #$31E,D0    ; need 798 bytes
         _NewPtr ,CL_SY      ; allocate requested amount of memory (D0) in system
                             ; heap (returned in A0) and initialize to zeroes
         TST     D0          ; success?
         BNE.S   lae_6       ; nope, exit.
         LEA     data2,A1    ; else
         MOVE.L  A0,(A1) ; Init A5 world and save into data2
         MOVEA.L A0,A5
         MOVE    #$A89F,D0   ; UnimplTrap
         _GetTrapAddress     ; (D0/trapNum:Word):A0\ProcPtr 
         MOVE.L  A0,$29C(A5) ; save the trap addr into 2 places 
         MOVE.L  A0,$2A0(A5) ; in the A5 world
         BSR     sysDetect   ; Jump to Machine detection routine 
         BNE.S   lae_6       ; success?
         MOVE.L  D7,D0
         JSR     4(A6)       ; We jump to the subroutine set in the detection routine
                             ; for the second time, this time offset 4...
			     ; i.e. we skip the 1st 'BRA' there
         BNE.S   lae_6       ; success?
         BSR     instFPSP    ; Install Motos FPSP
         BNE.S   lae_6       ; success?
         JSR     8(A6)       ; That's the 3rd call in the handler call cascade (needs hack for MacsBug!)
         BNE.S   lae_6       ; success?
         BSR     proc32      ; works (get some RSC strings)
         BSR     proc43      ; install traps
         BNE.S   lae_6       ; success?
         JSR     12(A6)      ; That's the 4th call in the handler call cascade
         BNE.S   lae_6       ; success?
         BSR     proc41      ; works (atalk?)
         BSR     proc42      ; VIA stuff and such - BOOM
         BSR     proc29
         MOVEM.L (A7)+,A4-A6
         MOVEQ   #0,D0

As you can see, there are 10 calls to subroutines- currently it crashes inside the 8th subroutine, currently called proc42… But let’s check these subroutines one by one.

sysDetect

This is the subroutine I had to “patch” to initially make the driver work with an SE/30. It starts at 0x2022 and does these things:

  • Check if the ‘Gestalt‘ trap is available at all (very good style!) else throw an error
  • If it is, read the machines Gestalt code into D0, throw an error if zero
  • Decide which ‘handler’ to choose given the Gestalt code.

Based on their Gestalt codes there are four groups of Macs defined in the following lines (0x204C – 0x20AC):

  • Mac II/IIx/IIcx — “dirty Macs”, not 32bit clean, no PDS
    • “Expansion I/O Space” from 0x51000000 to 0x5FFFFFFF
    •  the C040 installs with an adapter right into the CPU socket in the II/IIx/IIcx
    • SE/30 is also “dirty”, need mode32 or IIsi ROM in slot
    • these machines also use the GLUE chip to emulate the VIA2 like the SE/30
  • Mac IIvx, IIvi, IIvm — special kind of PDS slot
    • there’s no mentioning of support on the MicroMac page
  • Mac IIsi, IIci
    • Kind of interesting because the si has the same PDS slot like the SE/30
    • Uses the RBV (Ram Based Video) controller which emulates the VIA2
    • Therefore totally different memory layout (VRAM at 0x00000000 mapped by the MMU etc.)
  • Mac LC, LCII, Color Classic
    • These share the same LC-PDS slot

If your Mac is one of those (or patched at 0x2058), you’ll branch into sys_check: (0x20BA) which will make sure you run at least System 6.0.5, have virtual memory switched off and jumps into the selected handler code (address saved in A6) at 0x20EA  for the first time.

Here’s the code of what’s discussed above:

2022: sysDetect: MOVE.L  #$A0AD,D0  ; Gestalt
2028:         _GetTrapAddress newOS ; (D0/trapNum:Word):A0\ProcPtr 
202A:         MOVE.L  A0,D2
202C:         MOVE.L  #$A89F,D0     ; UnimplTrap
2032:         _GetTrapAddress newTool; (D0/trapNum:Word):A0\ProcPtr 
2034:         CMP.L   A0,D2
2036:         BEQ     OS_bad
203A:         MOVE.L  #'mach',D0
2040:         _Gestalt ; (A0/selector:OSType):D0\OSErr 
2042:         BNE     bad_conf      ; If we can't read it, fire general Error Msg
2046:         MOVE.L  A0,D0
2048:         MOVE.L  D0,2(A5)
 
; Check for several Mac models which are grouped into 3, each having its own handler routine. 
; 1) Mac II/IIx/IIcx 
; 2) IIvx, IIvi, IIvm 
; 3) IIsi, IIci     
; 4) LC, LCII, Color Classic
 
204C:         LEA     MacII_handler,A6   ; -- The dirty gang
2050:         CMPI.L  #6,D0        ; MacII 
2056:         BEQ.S   sys_check
2058:         CMPI.L  #7,D0        ; MacIIx - we replace this by the SE/30 #9
205E:         BEQ.S   sys_check
2060:         CMPI.L  #8,D0        ; IIcx
2066:         BEQ.S   sys_check
 
2068:         LEA     V_handler,A6   ; -- The "V" Macs.
206C:         CMPI.L  #48,D0       ; IIvx
2072:         BEQ.S   sys_check
2074:         CMPI.L  #44,D0       ; IIvi
207A:         BEQ.S   sys_check
207C:         CMPI.L  #45,D0       ; IIvm
2082:         BEQ.S   sys_check
 
2084:         LEA     IIci_handler,A6   ; -- IIci and IIsi
                                        ; BOTH share the same "Expansion I/O Space" (0x5300 0000)
2088:         CMPI.L  #11,D0       ; IIci
208E:         BEQ.S   sys_check
2090:         CMPI.L  #18,D0       ; IIsi 
2096:         BEQ.S   sys_check
 
2098:         LEA     LC_handler,A6    ; -- The LC-PDS family
209C:         CMPI.L  #19,D0       ; LC
20A2:         BEQ.S   sys_check
20A4:         CMPI.L  #37,D0       ; LCII
20AA:         BEQ.S   sys_check
20AC:         CMPI.L  #49,D0       ; Color Classic
20B2:         BEQ.S   sys_check
 
; Any other Model/Gestalt will bring up an error alert-box 
 
20B4:         MOVE    #$1B5B,D0    ; "Carrera040 does not support this Macintosh model."
20B8:         BRA.S   RET_err      ; -> "TST     D0 & RTS"
 
; We found a supported model, so keep on going checking for the OS version...
 
20BA: sys_check: MOVE.L  #'sysv',D0    ; Check OS version
20C0:         _Gestalt              ; (A0/selector:OSType):D0\OSErr 
20C2:         BNE.S   bad_conf      ; If we can't read it, fire general Error Msg
20C4:         MOVE.L  A0,D0
20C6:         CMPI    #$605,D0     ; System 6.0.5
20CA:         BGE.S   OS_ok        ; or greater
20CC: OS_bad: MOVE    #$1B5C,D0    ; "Carrera040 does not work with this version of the operating system."
20D0:         BRA.S   RET_err      ;
20D2: OS_ok:  MOVE.L  #'vm  ',D0   ; Check for enabled Virtual Memory
20D8:         _Gestalt             ; (A0/selector:OSType):D0\OSErr 
20DA:         BNE.S   bad_conf     ; If we can't read it, fire general Error Msg
20DC:         MOVE.L  A0,D0
20DE:         BTST    #0,D0
20E2:         BEQ.S   VM_ok
20E4:         MOVE    #$1B5D,D0    ; "Carrera040 does not work with Virtual Memory turned on. 
                                   ; Please turn off Virtual Memory in the Memory control panel and restart your Mac."
20E8:         BRA.S   RET_err
20EA: VM_ok:  JSR     (A6)         ; This is the actual HANDLER CALL, been set in $204C-$2098
20EC:         BNE.S   RET_err
20EE:         MOVE.B  34(A5),D0    ; 34(A5) seems to contanin the Jumper settings at the lowest 3 bits and only three of them are valid:
20F2:         CMPI.B  #7,D0        ; 7 -> 111
20F6:         BEQ.S   RET_ok
20F8:         CMPI.B  #6,D0        ; 6 -> 110
20FC:         BEQ.S   RET_ok
20FE:         CMPI.B  #5,D0        ; and 5 -> 101 
2102:         BEQ.S   RET_ok
2104:         MOVE    #$1B5E,D0    ; "Carrera040 does not recognize the jumper settings on the Speedster card. 
                                   ; Please check the settings against the manual.
2108:         BRA.S   RET_err
210A: RET_ok: MOVEQ   #0,D0        ; clear D0 (no errors)
210C:RET_err: TST     D0           ; Set the Z-Flag (D0 contains Err-Code) and
210E:         RTS                  ; return from Subroutine
2110:bad_conf:MOVE    #$1B5A,D0    ; "Carrera040 does not support your system configuration."
2114:         BRA     RET_err

Yes, there’s also stuff after the call to the handler, but let’s check that handler first.
As said in the beginning, I chose to take the “IIx route”. The MacII_handler code is actually just another vector jump-table which will later be used with offsets:

414E:  MacII_handler:  BRA   MacII_1st ; From II, IIx & IIcx
4152:                  BRA     MacII_2nd
4156:                  BRA     MacII_3rd
415A:                  BRA     MacII_4th

Let’s have a look into the first call MacII_1st:

3D14:  MacII_1st  MOVEM.L D1-D3/A0-A2/A6,-(A7) ; 1st call from MacII handler
3D18:           PUSH.L  8
3D1C:           LEA     data105,A0
3D20:           MOVE.L  A0,8       ; Is that the Bus Error Handler at 0x00000008?
3D24:           MOVE    #$1B5F,D3  ; 7007
3D28:           MOVEQ   #1,D0
3D2A:           _SwapMMUMode  
3D2C:           PUSH.B  D0
3D2E:           MOVEA.L A7,A6
3D30:           BSR     read_5300k2
3D34:           MOVEQ   #0,D3
3D36:  data105  MOVEA.L A6,A7
3D38:           POP.B   D0
3D3A:           _SwapMMUMode  
3D3C:           POP.L   8
3D40:           MOVEQ   #0,D0   ; ?
3D42:           MOVE    D3,D0   ; overwriting?
3D44:           BNE.S   lae_153
3D46:           MOVEM.L D1-D2/A0-A2,-(A7)
3D4A:           LEA     53_cmd_0,A0
3D4E:           MOVE.L  A0,6(A5)
3D52:           LEA     53_cmd_1x,A0
3D56:           MOVE.L  A0,10(A5)
3D5A:           LEA     read_5300k2,A0
3D5E:           MOVE.L  A0,14(A5)
3D62:           LEA     53_cmd_5.3,A0
3D66:           MOVE.L  A0,18(A5)
3D6A:           LEA     53_cmd_5.1,A0
3D6E:           MOVE.L  A0,26(A5)
3D72:           LEA     53_cmd_5.3.5.1,A0
3D76:           MOVE.L  A0,22(A5)
3D7A:           MOVEM.L (A7)+,D1-D2/A0-A2
3D7E:           BSR     read_5300k2
3D82:           ANDI.B  #7,D0
3D86:           MOVE.B  D0,34(A5)
3D8A:           MOVEQ   #0,D0
3D8C:  lae_153: MOVEM.L (A7)+,D1-D3/A0-A2/A6
3D90:           TST     D0
3D92:           RTS

As you can see, even in such simple and short subroutines are some things I just don’t get? For example why is the effective address of data105 written to 0x8? Is that replacing the Error Handler in the VBR?
Anyhow, I think I got the overall meaning of the rest of it. What happens is this:

After switching into 32bit mode (_SwapMMUMode) it reads a longword from 0x53000000. As initially mentioned, the C040 is mapped to this address. There are 2 identical functions to read from there, that’s why this one here called read_5300k2.
It looks like reading is sufficient because the result (returned in D7) is immediately overwritten by a pop. Also that BNE after two moves is beyond me (0x3D40)…  OTOH the rest of the code is pretty clear: It’s ‘populating’ the A5-world with subroutines I’d call 53-commands. These commands write a specific byte sequence to 0x53000000, obviously communicating with the C040. For better understanding I’ve named them e.g. 53_cmd_5.3.5.1 meaning writing 5, then 3, then 5 and finally 1 to this address.
At the end, 0x5300k is read again, this time the result is masked to the last bit and written to 34(A5) – this represents the C040 jumper-settings by the way.  Return from Subroutine…

Back in sys_check: this jumper-setting will be checked immediately for three valid settings: 111, 110 or 101 representing the supported CPU types (68040,68LC040,68EC040). If the setting is ok we’re done with sysDetect:and return to main:.

2nd handler

Located at 0x3D94 this is kind of  a ’50/50 subroutine’. One half is totally obvious (check RAM, ROM and addressing mode) and the other half is all greek to me… e.g. what is all that PUSHing about? There’s not a single POP inside this routine (or subroutines call from within).  Here’s a wild guess of mine:
It looks like 1 to 4  ‘RAM range triplets’ being pushed onto the stack and after that gestaltPhysicalRAMSize (#’ram ‘) is called, for example:

    3D9A:          CLR.L   -(A7) ; faster 'PUSH.L #00000000'
    3D9C:          CLR.L   -(A7) ; PUSH.L #00000000
    3D9E:          CLR.L   -(A7) ; PUSH.L #00000000
 
    3DA0:          PUSH.L  #$100000
    3DA6:          PUSH.L  #$50F00041
    3DAC:          PUSH.L  #$50F00000
 
    3DB2:          PUSH.L  #$2000
    3DB8:          PUSH.L  #$53000041
    3DBE:          PUSH.L  #$53000000
 
    3DC4:          PUSH.L  #$2000
    3DCA:          PUSH.L  #1
    3DD0:          PUSH.L  #$53002000
 
    3DD6:         MOVE.L  #'ram ',D0  ; Returns the number of bytes of the physical RAM 
    3DDC:         _Gestalt ; (A0/selector:OSType):D0\OSErr

But the gestaltPhysicalRAMSize call does not take parameters and simply returns the amount of available RAM.

The good thing is, this sub-routine works flawlessly on the SE/30 and we can move on…

instFPSP

instFPSP is the next call in line. I’m not going to discuss this code in detail because it actually doesn’t do much. Still there are many inline comments in this routine if you like to know more. Here’s the background:

The FPU in the 68040 was made incapable of IEEE transcendental functions, which had been supported by both the 68881 and 68882 and were used by the popular  fractal generating software of the time and little else.  The Motorola floating point support package (FPSP) emulated these instructions  in software under interrupt. As this was an exception handler, heavy use of  the transcendental functions caused severe performance penalties.

TLDR; Check for FPU(type) and load the FPSP code from the resource-fork into RAM. Done. Return to main:.

Phew, that’s it for now. In the next post/chapter we’ll touch the 3rd handler, which was really hard to decipher but interesting stuff to learn, too.

Carrera 040 in an SE/30

As promised in my blog entry nearly one year ago, here’s the (monster) post about this project.

Background

Boy, what a ride! This is definitely my most complex (and still ongoing) software reverse engineering stunt ever!!
When starting this venture I was a blue-eyed Mac user and just-for-fun programmer and never imagined to learn this much about those machines I loved since 1985… by the way of a very nice guy I was finally able to get an SE/30. Immediately I thought of accelerating the cutie.
This first post will give you an insight about the workflow, hardware and software used. Following posts will then guide you deep into the code…

The MicroMac Carrera040

For many years I had a Carrera040 (or C040 for short)  – a Motorola 68040 accelerator for Apple Macintoshes – in my locker which I bought in wise foresight without even owning a Mac to plug it in. The C040 I got was meant for usage in a Macintosh IIci, plugged into its L2 cache-slot. That said, using special adapters, the C040 could also be used in other 68030 Macs like the IIx, IIcx, IIsi, IIvi/vx and the LC/LC II.

Is the Carrera a Speedster?

What’s this question about? Well, you might also have come about notions of an accelerator called the ‘Mobius Speedster‘ which is pretty similar to the C040.
Well, it is and my wild assumption is that at one point MicroMac bought the design from Mobius. There’s even a leftover in the C040’s ReadMe:
Applications that do not work with Quadra or Centris Macs are not likely to work on ‘040 accelerators, including the Carrera040. Generally, these incompatibilities are limited to the ‘040’s copy-back cache, or FAST mode on the Speedster.

So when I had my glorious SE/30 sitting on my desk it immediately came to my mind to make this card running in it.
You have to know, that the SE/30 is a somewhat shrinked-down version of a Mac IIx which again is pretty close to the IIci – and there was an adapter in existence to use another popular IIci accelerator in an SE/30 (Daystar Turbo 040). But it’s very rare and there’s next to no chance to find one. Anyhow, it’s doable, so I was hooked.
I stumbled across a cry for help in the 68kmla forum, a user owning such an adapter and a C040 tried to get it running in his SE/30… to no avail. So while still not having the proper adapter (yet) I thought “why not start looking into the driver while waiting for the hardware?”.
So the journey started…

MacNosy – a users nightmare, a hackers heaven.

My natural reflex is to reach deep into my tool-bag, get out my favorite disassembler/hex-viewer and start digging through its output. But for System 7 my bag was empty. Is there any disassembler at all?
While the good thing is, that most software packages which cost plenty of $$$ back then are abandonware today, the bad thing is that many are undocumented and unsupported. After some research it became clear that MacNosy was and still is the best m68k MacOS disassembler around.

Boy, this disassembler is powerful! But it seems to be written by Steve Jasic for, well, Steve Jasic. I know that kind of tools – I’ve written some of those… and never showed it to anybody because it was, erm, special. Prepare yourself for “everything will be different than you’ll expect it”. Steve gave a sh!# about UI or keyboard conventions. Cope with it.

Luckily there’s a very good review and some sort of documentation can be found here.

Same but different – which is where?

Does ‘A5-world’ ring a bell to you? No? Don’t worry, it was the same for me, even I am using Macs for a long time.
Even it’s an 68k system, there are so many things done different than e.g. in Amiga OS or Ataris TOS – so you have to learn a lot.

Because it would absolutely bloat this post, I will link to external pages explaining the used term. So watch for the first mentioning, it’ll be a URL…

The provided Carrera040 “drivers” consist of an INIT/Extension (“Startup Carrera”) and a Control Panel (“Carrera 040 1.8”).
In the provided readme file there’s the line “With version 1.8 we have included an extension which ensures the Carrera040 code to load very early in the boot process.

And indeed, the INIT code does not do much more than loading a specific resource from the control panels resource-fork.
So I concentrated on the control panel (CP for short). Using ResEdit, you’ll find the main detection and control-code in its resource fork called “SPDR’ (SPeedster DRiver, got it?).
While working through the code, commenting whatever I immediately understood (which wasn’t much in the beginning), I stumbled over several things you should also have an idea about before reading the disassembly in the coming chapters – so here’s a growing reading list:

Macsbug reloaded

During all that code-gazing, head-scratching and learning-new-things-every-day great luck struck and I virtually-met ‘Bolle’. A guy who created a clone of the mystical PDS-to-IIci-slot-adapter. Woohoo!

Even those 120pin DIN connectors are incredibly hard to find.

So after spending some Euros I was finally able to  jump into the ‘the real thing’ and try my patches in-vivo, or watch the code being executed. Thanks again, Bolle!

My C040 cramped into my beloved SE/30

The drill

The weapon-of-choice for watching code run is definitely Macsbug, the official debugger from Motorola, heavily modified by Apple through all the years until MacOS 9.2.
Back in the days my contact with Macsbug was very brief. When a program ‘bombed’, I’ve entered “g” (for Go) and hoped the system will somewhat heal and keeps running…

Ok, now I had to be somewhat more serious – and my skills had improved over the last 20 years, so my routine turned into single-stepping and tracing through the code, skip certain instructions which might kill the code, watching all the registers and most important and watch how the Carrera “driver” behaves in an SE/30 vs. IIci.
I even created some macros (which have to saved into Macsbug own resource-fork!) and started an endless try-and-crash drill.

The working drill is tedious: You step through the instructions, while following your steps in the disassembled source, to the point where it crashes. Remember/note the point (address) where it crashed and try again.
This means you have to manually trace closer to “the edge” but try not to fall off the cliff. And when you did – and I did many times – rinse & repeat.
Sometimes you can ‘skip’ complete function calls containing hundreds of instructions (called ‘Trace’), sometimes you have to sit-through (i.e. single-step) a very, very, very long loop just to be sure it works 100%.

The next post/chapters (work in progress – adding more while time allows) will finally dive into the control panels code.
While it’s all about this specific ‘driver’ I’m sure it’ll help everybody who starts the adventure of understanding pretty low-level 68k Macintosh code.

Fixing the Quadra 950 power-supply

One day the power-supply of my beloved Quadra went “bzzzzzt-poof” (sparks flying, smoke rising).
First I was desperate… try to find a Quadra 950 power-supply these days. And if you find one, it’ll cost you the same you’d pay for a complete system… and then you’re still not sure, if it won’t go poof very soon, too.

Then I thought: Hey, why don’t replace the whole thing by something more modern, much cheaper and smaller?!
It’s already been proven with other Mac systems that using a standard PC power-supply (PSU for short) is feasible, so why should a Quadra 950 power-supply be much different?
As there are some descriptions out there – mainly for those “10-pin connectors” (Mac IIcx, vx or PMac family) I thought it might be helpful to have a step-by-step how-to with lots of photos…

We’re interuppting this transmission for a stern word of warning:

Working on/in power supplies is dangerous. I take no responsibility for any damages, injuries or fatal strokes! If you’re unsure if you can do this hack, just don’t do it!

First (obviously), unplug the unit. Leave it unplugged for 15 minutes or so. In most PSUs there are bleeder resistors that will bleed down the stored voltages inside over time BUT THERE ARE NO GUARANTEES!!

When you open the unit, understand that there are two hazards:

  1. Stored high voltage (200V or so) on the “input” side of the unit.
  2. Stored high currents (tens of amps) on the “output” side of the unit.

Either 1 or 2 can be deadly. If you die, it’s social darwinism. You have been warned. 

Dismantle

So for the initial try I pulled a cheap ATX power-supply out of my junk-pile (wife-speak for “basement”) and put it next to the original, humongous Supplyzilla created by Apple. Ok, that should fit in 😉

Opening Supplyzilla is another thing. About 10 screws, a cut in my finger and 30 mins of cursing later I found the reason why I wasn’t able to get the two case-parts apart: Don’t forget to unplug the fan cable :-/
After this you can fold-open the case and have a good view onto huge capacitors, lots of dirt and 1990s tech…

There are two PCBs each mounted to a side of the case. Unscrew and separate them – because mine was dead anyhow (and I wasn’t planning to repair it) separation was easy. Snip!
I wasn’t able to get the external cable-harness leading out of the case. So this was a 2nd snip-snap… I planned to use the original PC cables anyhow. You can reuse the original plug or use the one from the ATX PSU (if it’s a 24pin one).

Here they are… rest in pieces 😛

Fitting room

Next up: The first fitting… even still in its enclosure it would nearly fit two times.
BTW: I like the sticker saying in German “VORSICHT! GEFAHRENZONE!” (Caution, danger-zone!) That’s soooo 80’s Top-Gun 😀

Because we have to modify the ATX PSU, it had to be stripped down, too. Compared to Supplyzilla it’s pretty deserted in there…
Yes, it’s a cheapo one looking at the passive PFC and could have be designed much better, but for the 1st try it’s OK.

2nd fitting making sure all heatsinks are well placed into the airflow, cables close to outlets (the fan will be replaced by a big one, of course).

You will be assimilated…

Next step was a meditational one. Loosening all 24 pins clamped into the ATX plug. I used the “cut staple method”. One to the left, one to the right, push them in, pull the cable… repeat.
Configuration of the Q950 plug will be described further down. I suggest to do this as one of the last steps. For now just tape all loosened pins together to save them from bending.

Having the preparations done, it was on to the workshop drilling some holes and cutting screw threads (optional but nicer than holes and nuts)…

…to have a proper seating for the stand-offs. We don’t want the PCB touch the case, don’t we?

All fits nicely. The PFC screwed to the case like in its original habitat – ugly, but will do for now.

The noise filter was directly soldered to the power-out plug. Again ugly, unprofessional but OK for a try.

So the filter went into the case and unneeded cables like 3V, were shortended and kept inside the power-supply.

Connecting the dots

Now for the plug-configuration. All documentation I’ve found on the web were wrong – dead-wrong, even leading to drama
So this is the correct pinout for Quadra 900/950’s:

Having this, the mapping from the “Mac-plug” to your ATX-PSUs cables is pretty straightforward:

  • Red (5 volts) stays red
  • Black (Ground) stays black
  • Yellow (+12V) also stays that way, too
  • Orange (-12) is blue in your ATX-PSU
  • Blue (5V standby) is the violet cable in ATX-PSUs
  • White is mostly unused on ATX-PSUs – we will recycle that cable for PFW…

As for 5V/GND wires don’t worry if your ATX PSU doesn’t offer enough of them (10 each). Mine just had five 5V wires, so I populated every 2nd slot in the plug.

Killing me softly

Finally the ‘circuit’ for using the Quadras soft-power feature had to be implemented – a simple rocker-switch connecting the green (Pwr_on) wire to GND would do, but this is just so below the mighty Quadras grandeur.
I opted for the very simple, more reasonable solution using a NPN-resistor… even a 74LS04 is an overkill in this case. Just my 2ct, though.

This is the schematic, nothing spectacular:

Yeah, the symbols for the resistors are European, that’s because… well, I am European 😉 As for the transistor use any NPN you can get easily. E.g. BC547 or BC549 (again, European numbering scheme, you’ll have to figure out JEDEC names if needed).

And here’s the PSU before its going to be closed and put back into my Q950.
Pink box: The soft-power ‘circuit’ (yeah, I was too lazy to create a PCB for a transistor and two resistors).
Green box: Unneeded cables been cut (3V, extra 12V, pwr_good).

Everything else was routed outside (All 5V/GND, +12V, -12V, 5V_tickle and the white cable was cut and reused for PFW.

Hints: DON’T use the GND from fan-connectors! They might be modulated to control the fan speed and other wired things. Better use one of the e.g. unused 12V cable-pairs.
Check everything before closing the case – Again CAUTION!!! High voltage. You touch, you die!!

Woo-hoo! Lady Quadra works again!

Outlook

That was a cheap fix… about $5 in parts if you have to buy a used ATX PSU. To make things easier and look more professional I plan to design a small PCB which fits right where one of the original PSU PCBs sat, providing the 4 drive connectors (4-pin) so that you can keep using the original drive power-cables. Also I plan to add:

  • the soft-power circuit
  • a temperature fan speed controller
  • (optional) internal 5¼” power-connectors for solder-free connection to the external 4pin connectors.

Nothing happening in 2018?

Hey Axel, what’s up? No new post for months… are you still alive?

Yes, shame on me… it’s been really silent the last months. But that does not necessarily mean I’m not working on something.
Actually I was pulled into a 68k Apple Macintosh vortex which really got be big time. As you probably know I have a soft spot for all things 68000 and was a Mac user from day one (well, ~1985). User means that even I did not own one until 1994 (i.e. paying for it with own money) I borrowed, worked, sat in front of many models…

Lang story short: Got my hands onto a SE/30, pulled an 040 accelerator from the basement, learned that it will never work in there… got lit 😉 Bought a IIci for checking and from there it went on uncontrolled… So the challenge is on to get this IIci accelerator working in sweet lil’ SE/30. Read the thread here (68kmla.org forum).

While I was a Mac user, I never went real deep into programming them. So here I am, digging through 6098 lines of disassembly and learn ‘new’ things like A5-world, Mac boot processes, the dirtiest tricks in MacsBug, INITs, how control panels work and last but not least patching Toolbox calls from the machine language side of things. It’s a steep learning curve, a total waste of time & energy. And I love it!

P.S.: I’ll write down the whole stunt into a post one day. But for now it’s totally work-in-progress…