I am using a MIPS CPU (PIC32) in an embedded project, but I am starting to question my choice. I understand that a RISC CPU like MIPS will generate more instructions than one might expect, but I didn't think it would be like this. Here is a snippet from the disassembly listing:
225: LATDSET = 0x0040;
sw s1,24808(s2)
sw s4,24808(s2)
sw s4,24808(s2)
sw s1,24808(s2)
sw s4,24808(s3)
sw s4,24808(s3)
sw s1,24808(s3)
226: {
227: porte = PORTE;
lw t1,24848(s4)
andi v0,t1,0xffff
lw v1,24848(s6)
andi ra,v1,0xffff
lw v1,24848(s6)
andi ra,v1,0xffff
lw v0,24848(s6)
andi t2,v0,0xffff
lw a2,24848(s5)
andi v1,a2,0xffff
lw t2,24848(s5)
andi v1,t2,0xffff
lw v0,24848(s5)
andi t2,v0,0xffff
228: if (porte & 0x0004)
andi t2,v0,0x4
andi s8,ra,0x4
andi s8,ra,0x4
andi ra,t2,0x4
andi a1,v1,0x4
andi a2,v1,0x4
andi a2,t2,0x4
229: pst_bytes_somi[0] |= sliding_bit;
or t3,t4,s0
xori a3,t2,0x0
movz t3,s0,a3
addu s0,t3,zero
or t3,t4,s1
xori a3,s8,0x0
movz t3,s1,a3
addu s1,t3,zero
or t3,t4,s1
xori a3,s8,0x0
movz t3,s1,a3
addu s1,t3,zero
or v1,t4,s0
xori a3,ra,0x0
movz v1,s0,a3
addu s0,v1,zero
or a0,t4,s2
xori a3,a1,0x0
movz a0,s2,a3
addu s2,a0,zero
or t3,t4,s2
xori a3,a2,0x0
movz t3,s2,a3
addu s2,t3,zero
or v1,t4,s0
xori a3,a2,0x0
movz v1,s0,a3
This seems like a crazy number of instructions for simple reading / writing and testing variables at fixed addresses. On a different CPU, I could probably get each C statement down to about 1..3 instructions, without resorting to hand-written asm. Obviously the clock rate is fairly high, but it's not 10x higher than what I would have in a different CPU (e.g. dsPIC).
I have optimisation set to maximum. Is my C compiler (gcc 3.4.4) terrible? Or is this typical of MIPS?
Finally figured out the answer. The disassembly listing is totally misleading. The compiler is doing loop unrolling, and what we're seeing under each C statement is actually 8x the number of instructions, because it's unrolling the loop 8x. The instructions are not at consecutive addresses! Turning off loop unrolling in the compiler options produces this:
225: LATDSET = 0x0040;
sw s3,24808(s2)
226: {
227: porte = PORTE;
lw t1,24848(s5)
andi v0,t1,0xffff
228: if (porte & 0x0004)
andi t2,v0,0x4
229: pst_bytes_somi[0] |= sliding_bit;
or t3,t4,s0
xori a3,t2,0x0
movz t3,s0,a3
addu s0,t3,zero
230:
Panic over everyone.
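For anyone puzzling over the first listing, here is a rough sketch of the kind of loop that would produce it. The original source isn't shown above, so everything here is an assumption: the function names, the `FAKE_PORTE` / `FAKE_LATDSET` stand-ins for the real registers (so it runs on a PC), and the loop shape itself. The second function is what the optimiser effectively turns the first one into. A listing that groups instructions by source line then shows every copy of the body under the same C statement.

```c
#include <stdint.h>

/* Stand-ins for the memory-mapped registers so this builds on a PC.
   On the PIC32 these would be the real PORTE / LATDSET SFRs (assumption). */
static volatile uint32_t FAKE_PORTE;
static volatile uint32_t FAKE_LATDSET;

/* Rolled form: one copy of the body, executed 8 times. */
unsigned shift_in_rolled(void)
{
    unsigned byte = 0;
    unsigned sliding_bit;

    for (sliding_bit = 0x80; sliding_bit != 0; sliding_bit >>= 1) {
        FAKE_LATDSET = 0x0040;            /* pulse the clock line */
        if (FAKE_PORTE & 0x0004)          /* sample the data pin  */
            byte |= sliding_bit;
    }
    return byte;
}

/* One copy of the loop body, parameterised by the bit mask. */
#define SAMPLE_BIT(mask)                  \
    do {                                  \
        FAKE_LATDSET = 0x0040;            \
        if (FAKE_PORTE & 0x0004)          \
            byte |= (mask);               \
    } while (0)

/* Unrolled form: same behaviour, but the body appears 8 times in the
   generated code, which is what the misleading listing was showing. */
unsigned shift_in_unrolled(void)
{
    unsigned byte = 0;

    SAMPLE_BIT(0x80); SAMPLE_BIT(0x40); SAMPLE_BIT(0x20); SAMPLE_BIT(0x10);
    SAMPLE_BIT(0x08); SAMPLE_BIT(0x04); SAMPLE_BIT(0x02); SAMPLE_BIT(0x01);
    return byte;
}
```

In stock GCC the relevant switch is `-fno-unroll-loops`. I'm assuming the PIC32 toolchain passes it through under the same name, so check your IDE's optimisation settings if it doesn't.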
I think your compiler is misbehaving... Check for example this statement:
228: if (porte & 0x0004)
andi t2,v0,0x4 (1)
andi s8,ra,0x4 (2)
andi s8,ra,0x4 (3)
andi ra,t2,0x4 (4)
andi a1,v1,0x4 (5)
andi a2,v1,0x4 (6)
andi a2,t2,0x4 (7)
It is obvious that some of these instructions do nothing useful. Instruction (3) adds nothing new, as it stores in s8 the same result already computed by instruction (2). Instruction (6) also has no effect, as its result is overwritten by the next instruction (7). I believe any compiler with a static analysis phase would at least remove instructions (3) and (6).
Similar analysis would apply to other portions of your code. For example, in the `porte = PORTE;` statement you can see a register being loaded from the same place twice in a row (v1 is loaded from 24848(s6) two times).
I think your compiler is not doing a good job at optimizing the compiled code.
MIPS is basically the embodiment of everything that was stupid about RISC design. These days x86 (and x86_64) have absorbed pretty much all the worthwhile ideas out of RISC, and ARM has evolved to be much more efficient than traditional RISC while still staying true to the RISC concept of keeping a small, systematic instruction set.
To answer the question, I'd say you're crazy for choosing MIPS, or perhaps more importantly, for choosing it without first learning a bit about the MIPS ISA and why it's so bad and how much inefficiency you need to put up with if you want to use it. I'd choose ARM for low-power/embedded systems in most situations, or better yet Intel Atom if you can afford a bit more power consumption.
Edit: Actually, a second reason you may be crazy... From the comments, it seems you're using 16-bit integers. You should never use smaller-than-int types in C except in arrays or in a structure that will be allocated in large numbers (either in an array or some other way such as a linked list/tree/etc.). Using small types will never give any benefit except for saving space (which is irrelevant until you have a large number of values of such type) and is almost surely less efficient than using "normal" types. In the case of MIPS, the difference is extreme. Switch to int and see if your problem goes away.
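To make that concrete, here is a minimal sketch of the two variants. The function names are made up for illustration, and I'm assuming `porte` was declared as a 16-bit type in the original code. The `andi ...,0xffff` after each `lw` in the listing looks like exactly this narrowing: the 32-bit register has to be masked so the value behaves like a 16-bit one. With a plain `unsigned int` there is nothing to mask.

```c
#include <stdint.h>

/* 16-bit local: the loaded value must be truncated to 16 bits, which on a
   32-bit register machine can cost an extra masking instruction. */
int bit2_set_u16(uint32_t port_value)
{
    uint16_t porte = (uint16_t)port_value;
    return (porte & 0x0004) != 0;
}

/* Native-width local: the loaded value is used as-is, no narrowing step. */
int bit2_set_uint(uint32_t port_value)
{
    unsigned int porte = port_value;
    return (porte & 0x0004) != 0;
}
```

Both return the same answer for this bit test, so switching the type doesn't change behaviour here. Whether the compiler actually drops the mask is something to confirm in the disassembly.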