Are you sure about the former? This is what I get with my Blizzard:
BusSpeedTest 0.19 (mlelstv) Buffer: 262144 Bytes, Alignment: 32768
========================================================================
memtype addr op cycle calib bandwidth
fast $68518000 readw 67.5 ns normal 29.6 * 10^6 byte/s
fast $68518000 readl 112.3 ns normal 35.6 * 10^6 byte/s
fast $68518000 readm 113.0 ns normal 35.4 * 10^6 byte/s
fast $68518000 writew 82.3 ns normal 24.3 * 10^6 byte/s
fast $68518000 writel 166.3 ns normal 24.1 * 10^6 byte/s
fast $68518000 writem 163.8 ns normal 24.4 * 10^6 byte/s
chip $00080000 readw 906.6 ns normal 2.2 * 10^6 byte/s
chip $00080000 readl 908.0 ns normal 4.4 * 10^6 byte/s
chip $00080000 readm 907.9 ns normal 4.4 * 10^6 byte/s
chip $00080000 writew 709.3 ns normal 2.8 * 10^6 byte/s
chip $00080000 writel 709.0 ns normal 5.6 * 10^6 byte/s
chip $00080000 writem 709.0 ns normal 5.6 * 10^6 byte/s
My write speeds in particular are three times as fast as yours, and yours are also worse than on the A3000/25:
BusSpeedTest 0.19 (mlelstv) Buffer: 262144 Bytes, Alignment: 32768
========================================================================
memtype addr op cycle calib bandwidth
fast $07A50000 readw 249.7 ns normal 8.0 * 10^6 byte/s
fast $07A50000 readl 330.2 ns normal 12.1 * 10^6 byte/s
fast $07A50000 readm 302.3 ns normal 13.2 * 10^6 byte/s
fast $07A50000 writew 246.8 ns normal 8.1 * 10^6 byte/s
fast $07A50000 writel 244.9 ns normal 16.3 * 10^6 byte/s
fast $07A50000 writem 219.6 ns normal 18.2 * 10^6 byte/s
chip $00018000 readw 817.3 ns normal 2.4 * 10^6 byte/s
chip $00018000 readl 812.9 ns normal 4.9 * 10^6 byte/s
chip $00018000 readm 636.1 ns normal 6.3 * 10^6 byte/s
chip $00018000 writew 574.0 ns normal 3.5 * 10^6 byte/s
chip $00018000 writel 573.4 ns normal 7.0 * 10^6 byte/s
chip $00018000 writem 577.2 ns normal 6.9 * 10^6 byte/s
Amazing how much faster chip mem is on the 3000. Note that I ran my chip mem test in PAL 8 color mode after a few tests with other modes, since the screen mode makes a significant difference. (Anything with 16 colors or fewer should be fine.)
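For anyone wondering how the cycle and bandwidth columns relate: the bandwidth looks like it is simply the bytes moved per access divided by the cycle time. Here is a minimal sketch of that arithmetic. The function name and the bytes-per-operation table are my own, and I'm assuming readm/writem are counted per longword, which is what the figures suggest.

```python
# Bytes moved per access for each operation. The 4 bytes for readm/writem
# is an assumption inferred from the numbers, not taken from BusSpeedTest.
BYTES_PER_OP = {
    "readw": 2, "readl": 4, "readm": 4,
    "writew": 2, "writel": 4, "writem": 4,
}

def bandwidth_mb_s(op, cycle_ns):
    """Bandwidth in 10^6 byte/s for one access of `op` taking `cycle_ns`."""
    return BYTES_PER_OP[op] / cycle_ns * 1000.0

# Chip mem longword reads, cycle times taken from the two runs above
a1200 = bandwidth_mb_s("readl", 908.0)
a3000 = bandwidth_mb_s("readl", 812.9)
print(f"A1200 chip readl: {a1200:.1f}, A3000 chip readl: {a3000:.1f}")
```

Feeding in the other cycle times reproduces the bandwidth column to within rounding, so the cycle time is the figure worth comparing between machines.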
With a small buffer that fits in the cache our systems perform pretty much identically, one clock cycle per cache read/write, it seems:
BusSpeedTest 0.19 (mlelstv) Buffer: 2048 Bytes, Alignment: 32768
========================================================================
memtype addr op cycle calib bandwidth
fast $684F8000 readw 20.3 ns normal 98.3 * 10^6 byte/s
fast $684F8000 readl 20.2 ns normal 198.0 * 10^6 byte/s
fast $684F8000 readm 20.3 ns normal 197.1 * 10^6 byte/s
fast $684F8000 writew 20.6 ns normal 97.3 * 10^6 byte/s
fast $684F8000 writel 20.0 ns normal 199.6 * 10^6 byte/s
fast $684F8000 writem 20.3 ns normal 197.2 * 10^6 byte/s
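The "one clock cycle" reading can be checked with a quick calculation. This is only a sketch, assuming the 68060 runs at its stock 50 MHz; the helper names are mine.

```python
def clock_period_ns(mhz):
    """Length of one CPU clock cycle in nanoseconds."""
    return 1000.0 / mhz

def cycles_per_access(cycle_ns, mhz):
    """How many CPU clocks one measured access takes."""
    return cycle_ns / clock_period_ns(mhz)

CPU_MHZ = 50.0  # assumed: stock clock of the 68060 on the Blizzard

# Cycle times from the runs above: cached (2 KB buffer) vs. 256 KB buffer
for label, ns in [("readl, 2 KB buffer", 20.2),
                  ("writel, 2 KB buffer", 20.0),
                  ("readl, 256 KB buffer", 112.3)]:
    print(f"{label}: {cycles_per_access(ns, CPU_MHZ):.2f} clocks")
```

At 50 MHz one clock is 20 ns, so the cached accesses come out at almost exactly one clock each, while the uncached fast mem read takes several.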
Hm, I opened the trapdoor to swap out my DIMM to see if that would make a difference with bustest (it didn't) and took a photo of the 68060:
So it seems that I have the rev 2, not the rev 5. And it gets pretty hot, even at the stock 50 MHz!
Not sure how much use it is to increase the CPU speed in an A1200 much more, as my impression is that it's the chip mem and the I/O that slow everything down. Although I guess some on-accelerator I/O like with the Vampire will take care of that.
Back to the FPU: I got "flops" from GitHub to test floating point performance on my 3000 and 1200. I also made the mistake of trying it on my Mac first...
Anyway, the A3000's 68030 can muster about 50 kiloflops with the generic version, which I assume targets the 68000. (I used GCC with -O2 optimizations in all cases. Note that you need -DAmiga rather than -DUNIX.) The Blizzard is a bit more than ten times faster at a little over half a megaflop. Compiling for 68020 or 68030 didn't make a noticeable difference (and those two versions are the same size, so probably identical).
Things get more interesting with -m68030 -m68881, so 68030 and FPU. Now the A3000 results shot up to over a quarter megaflop! So that's 4 - 5.5 times faster than without the FPU. On the 68060 this version was also dramatically faster, with 3 - 13 megaflops, so 5 - 20 times faster than the no-FPU version.
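To make the ratios concrete, here is a rough sketch of the speedup arithmetic. The figures are rounded ballpark values read off the text, not exact benchmark output: I'm taking the A3000's software-float result as roughly 50 kiloflops (a tenth of the Blizzard's half megaflop). The individual tests vary, so the per-test ratios spread out around these.

```python
def speedup(before_flops, after_flops):
    """How many times faster `after_flops` is than `before_flops`."""
    return after_flops / before_flops

# Rounded ballpark figures from the prose (assumptions, not measured output)
A3000_SOFT = 50e3      # 68030, generic build, roughly 50 kiloflops
A3000_FPU = 250e3      # 68030 + FPU build, "over a quarter megaflop"
BLIZZARD_SOFT = 500e3  # 68060, generic build, "a little over half a megaflop"

print(f"A3000, FPU vs. no FPU: {speedup(A3000_SOFT, A3000_FPU):.1f}x")
print(f"Blizzard vs. A3000, no FPU code: {speedup(A3000_SOFT, BLIZZARD_SOFT):.1f}x")
```

With these round numbers the A3000's FPU gain lands at about 5x, inside the 4 - 5.5 range seen across the individual tests.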
So then I compiled for 68040 and 68060, which automatically enables the use of FPU code. Although those two executables weren't identical, the results were identical to those of the 68030+68881 version. I can say truly identical because the results for the eight tests also show the size of the error in the calculation, which differs depending on the type of code used. 68030+68881, 68040 and 68060 all had the same error.
And now what we've all been waiting for: 68060 but no FPU (-m68060 -msoft-float). Those results are identical to the generic 68000 code: a bit over half a megaflop and at least 3 x slower than a 25 MHz 68030 + 68882. (Using the 68030 + 68881 version.)
I'm still not sure what kind of software uses enough floating point math that you'd actually see a difference. But unless the 68LC060 emulates the FPU instructions in software, I'm pretty sure the big issue is going to be that 68030-and-up versions of programs assume an FPU is present and then don't work.