Software pipelining is a loop transformation that restructures the initial loop so that parts of different iterations execute at the same time. This scheduling technique exploits the instruction-level parallelism of the architecture.
It may also produce better loop schedules when stalls, hazards or latencies exist between instructions in the initial loop, if they can be avoided in the transformed loop.
Note that the DSP56800e architecture provides only limited parallelism, by means of parallel move instructions. This limitation narrows the applicability of the transformation.
An example of the software pipelining transformation:
#include "intrinsics_56800e.h"

int x[100], y[100], i;
long res;

void main()
{
    long t = 0;

    for (i = 0; i < 100; i++)
    {
        t = L_mac(t, x[i], y[i]);
    }
    res = t;
}
The compiler schedules the loop body of this code as a single one-cycle instruction:
rep R1 mac Y0,X0,A X:(R0)+,Y0 X:(R3)+,X0
where the mac instruction from the first iteration of the initial loop executes in parallel with the load instructions from the second iteration.
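The overlap described above can be sketched by hand in portable C. This is only an illustration of the idea, not the compiler's actual output: the names mac, dot_plain and dot_pipelined are hypothetical, and mac is a plain multiply-accumulate standing in for L_mac (it does not model the fractional shift or saturation of the intrinsic). The loads for the first iteration are hoisted into a prologue, each pass of the kernel pairs the multiply-accumulate of one iteration with the loads of the next, and an epilogue finishes the last multiply-accumulate.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in for L_mac. Plain multiply-accumulate,
   without the fractional shift or saturation of the intrinsic. */
static long mac(long acc, int a, int b)
{
    return acc + (long)a * (long)b;
}

/* Initial loop shape: loads and multiply-accumulate belong to
   the same iteration. */
long dot_plain(const int *x, const int *y, size_t n)
{
    long t = 0;
    for (size_t i = 0; i < n; i++) {
        t = mac(t, x[i], y[i]);
    }
    return t;
}

/* Hand-pipelined shape: the mac of iteration i shares a kernel pass
   with the loads of iteration i+1. */
long dot_pipelined(const int *x, const int *y, size_t n)
{
    long t = 0;

    if (n == 0) {
        return t;               /* nothing to load or accumulate */
    }
    assert(x != NULL && y != NULL);

    /* Prologue: loads for iteration 0. */
    int x0 = *x++;
    int y0 = *y++;

    /* Kernel: n-1 passes, each combining the loads of the next
       iteration with the mac of the current one. */
    for (size_t i = 1; i < n; i++) {
        int xn = *x++;          /* load for iteration i   */
        int yn = *y++;
        t = mac(t, x0, y0);     /* mac for iteration i-1  */
        x0 = xn;
        y0 = yn;
    }

    /* Epilogue: mac for the last iteration. */
    t = mac(t, x0, y0);
    return t;
}
```

Both functions return the same value for the same inputs. On a target with parallel moves, the two loads and the mac inside the kernel are the operations that can share one instruction.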
This transformation applies to the innermost loops of a program, and is currently enabled only for DO loops.
It is controlled by the -[no]swp command-line switch, and is enabled by default for optimization levels higher than 2. Otherwise, #pragma swplevel on/off may be used to control the transformation. When optimizing for size, software pipelining is disabled, as it usually increases program size.