It's a common view that faster code will consume less power, because the CPU can spend more time in an idle state. But when we talk about energy consumption, consider the following possibility:
Suppose there's an instruction sequence that executes in 1 ms, during which the average current consumption is, say, 40 mA, and your Vdd is 3.3 V.
Then the total energy consumed = V*I*t = 3.3 * 40*10^-3 * 1*10^-3 Joules = 132*10^-6 Joules.
In another case, there's an instruction sequence that executes in 2 ms, during which the average current consumption is 15 mA, and Vdd is again 3.3 V.
Then the total energy consumed = V*I*t = 3.3 * 15*10^-3 * 2*10^-3 Joules = 99*10^-6 Joules.
So the question becomes: is there any architecture that offers different instruction sequences for performing the same task, with different current consumptions?
And if there is, is there any compiler that takes this into account and generates energy-efficient code?
There is none that I know of, but I think this should be possible using a compiler framework like LLVM, by adapting the instruction scheduler's weighting algorithm.
Edit: there was a talk about Energy Consumption Analytics in LLVM at FOSDEM.
Virtually any "code optimization" done by a compiler that computes the answer more quickly than the non-optimized code is also "energy saving". (As another poster observed, avoiding cache misses is a big win.) So the real question is: which optimizations are explicitly intended to save energy, as opposed to reducing execution time? (Note: some "optimizations" reduce code footprint, by abstracting repeated sequences of code into subroutines, etc.; these may actually cost more energy.)
An unusual one, which I have not seen in any compiler, is changing the representation of the data. It turns out that the cost of storing or transmitting a zero bit differs from the cost of storing a one bit. (My experience with TTL and CMOS is that zeros are more expensive, because they are implemented in hardware as a kind of active pull-down through a resistor from the power supply, causing current flow and thus heat, whereas ones are implemented by letting a signal float high through the same pull-down.) If there is such a bias, one should arrange the program code and data to maximize the number of one bits rather than zero bits.
For data, this should be relatively straightforward to do. See this paper for a very nice survey and analysis of the values found in memory; it contains some pretty wonderful charts. A common theme is that a large number of memory locations are occupied by members of a small set of distinct values. In fact, only a very small number of values (up to 8) occupy up to 48% of memory locations, often being very small numbers (the paper shows, for some programs, that a significant fraction of data transfers involve small values, e.g., 0 to 4, with zero being by far the most common value). If zeros are truly more expensive to store and transfer than ones, these small common values suggest storing values in ones' complement format. This is a pretty easy optimization to implement. Given that the frequent values are not always the smallest N naturals, one could replace the Nth most frequent value in memory with N, store the complement of N, and look up the actual value closer to the processor. (The paper's author suggests a hardware "value reuse" cache, but that's not a compiler optimization.)
This is a bit harder to arrange for program code, since the instruction set determines what you can say, and instruction sets are usually designed independently of any energy measurements. Yet one could choose among alternative instruction sequences (that's what optimizers do) and maximize the number of one bits in the instruction stream. I doubt this is very effective with conventional instruction set opcodes. One certainly could place variables at addresses with large numbers of one bits, and prefer higher-numbered registers over lower-numbered ones (on the x86, EAX is binary register number 000 and EDI is register number 111). One could go so far as to design an instruction set according to instruction execution frequencies, assigning opcodes with larger numbers of one bits to the most frequently executed instructions.