Instruction Scheduler in LLVM

Instruction Scheduling in LLVM
Hsiangkai Wang
Andes Technology
Agenda
Introduction to Instruction Scheduling
Scheduler in LLVM
Pipeline Modeling
Scheduler Customization
Instruction Scheduling
Different operations take different lengths of time.
Instruction scheduling is the process of reordering operations in an attempt to decrease the running time of the code.
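Example: a basic block in source order, with the cycle at which each instruction issues.
Latencies: load, store: 3; add: 1; mul: 2
a( 1): load $x5_32, $x8_32, @a
b( 4): add $x5_32, $x5_32, $x5_32
c( 5): load $x6_32, $x8_32, @b
d( 8): mul $x5_32, $x5_32, $x6_32
e(10): load $x7_32, $x8_32, @c
f(13): mul $x5_32, $x5_32, $x7_32
g(15): load $x9_32, $x8_32, @d
h(18): mul $x5_32, $x5_32, $x9_32
i(20): store $x5_32, $x8_32, @a
Dependency graph (edge: register carrying the dependency):
a -> b: x5_32
b -> d: x5_32
c -> d: x6_32
d -> f: x5_32
e -> f: x7_32
f -> h: x5_32
g -> h: x9_32
h -> i: x5_32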
Priority of each node: the latency-weighted length of the longest path from the node to the end of the block.
a: 13, b: 10, c: 12, d: 9, e: 10, f: 7, g: 8, h: 5, i: 3
Schedule order: a c e b d g f h i
Schedule after list scheduling (issue cycle in parentheses):
a( 1): load $x5_32, $x8_32, @a
c( 2): load $x6_32, $x8_32, @b
e( 3): load $x7_32, $x8_32, @c
b( 4): add $x5_32, $x5_32, $x5_32
d( 5): mul $x5_32, $x5_32, $x6_32
g( 6): load $x9_32, $x8_32, @d
f( 7): mul $x5_32, $x5_32, $x7_32
h( 9): mul $x5_32, $x5_32, $x9_32
i(11): store $x5_32, $x8_32, @a
The store now issues at cycle 11 instead of cycle 20.
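The example above can be reproduced with a small list scheduler. The following is a minimal sketch, not LLVM code: it assumes a single-issue machine, uses the latencies from the slide (load, store: 3; add: 1; mul: 2), and the names Node, makeExampleDAG, computePriorities and listSchedule are hypothetical helpers introduced only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One instruction of the example block (hypothetical helper type).
struct Node {
  char Name;
  int Latency;            // load/store: 3, mul: 2, add: 1
  std::vector<int> Succs; // indices of instructions that depend on this one
};

// Example DAG: a->b, b->d, c->d, d->f, e->f, f->h, g->h, h->i.
// Indices: a=0 b=1 c=2 d=3 e=4 f=5 g=6 h=7 i=8 (a topological order).
inline std::vector<Node> makeExampleDAG() {
  return {{'a', 3, {1}}, {'b', 1, {3}}, {'c', 3, {3}},
          {'d', 2, {5}}, {'e', 3, {5}}, {'f', 2, {7}},
          {'g', 3, {7}}, {'h', 2, {8}}, {'i', 3, {}}};
}

// Priority = latency-weighted longest path from the node to the end of the
// block. Nodes are stored in topological order, so one reverse walk suffices.
inline std::vector<int> computePriorities(const std::vector<Node> &DAG) {
  std::vector<int> Prio(DAG.size(), 0);
  for (int I = static_cast<int>(DAG.size()) - 1; I >= 0; --I) {
    int Longest = 0;
    for (int S : DAG[I].Succs) {
      assert(S > I && "nodes must be in topological order");
      Longest = std::max(Longest, Prio[S]);
    }
    Prio[I] = DAG[I].Latency + Longest;
  }
  return Prio;
}

// Single-issue list scheduling: each cycle, issue the ready instruction with
// the highest priority. Returns the 1-based issue cycle of every node.
inline std::vector<int> listSchedule(const std::vector<Node> &DAG) {
  const std::vector<int> Prio = computePriorities(DAG);
  const int N = static_cast<int>(DAG.size());
  std::vector<int> PredsLeft(N, 0), ReadyAt(N, 1), Issue(N, 0);
  for (const Node &Nd : DAG)
    for (int S : Nd.Succs)
      ++PredsLeft[S];

  int Scheduled = 0;
  for (int Cycle = 1; Scheduled < N; ++Cycle) {
    int Best = -1;
    for (int I = 0; I < N; ++I) {
      bool Ready = Issue[I] == 0 && PredsLeft[I] == 0 && ReadyAt[I] <= Cycle;
      if (Ready && (Best < 0 || Prio[I] > Prio[Best]))
        Best = I;
    }
    if (Best < 0)
      continue; // nothing is ready: the machine stalls this cycle
    Issue[Best] = Cycle;
    ++Scheduled;
    for (int S : DAG[Best].Succs) {
      --PredsLeft[S];
      ReadyAt[S] = std::max(ReadyAt[S], Cycle + DAG[Best].Latency);
    }
  }
  return Issue;
}
```

Calling listSchedule(makeExampleDAG()) yields the issue cycles in index order a to i. Ties in priority are broken here by source order, which is a simplification; a production scheduler applies further heuristics.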
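Scheduler in LLVM
Class hierarchy:
ScheduleDAG
ScheduleDAGSDNodes (2008), derived from ScheduleDAG: unit latency or Itineraries
ScheduleDAGInstrs (2008), derived from ScheduleDAG: Itineraries (2008, -scheditins) or SchedModel (2012, -schedmodel)
SchedulePostRATDList (2008), derived from ScheduleDAGInstrs
ScheduleDAGMI (2012), derived from ScheduleDAGInstrs
ScheduleDAGMILive (2013), derived from ScheduleDAGMI: adds LiveInterval and RegPressure
Schedulers in the code generation pipeline:
Instruction Selector (DAG) -> MI -> Register Allocator -> MI
SelectionDAGISel: ScheduleDAGSDNodes::Run, selected with -pre-RA-sched=<value>
(=fast, =linearize, =list-burr, =source, =list-hybrid, =list-ilp, =vliw-td)
MachineScheduler (2013): ScheduleDAGMILive, before register allocation, -enable-misched
PostRAScheduler (2008): SchedulePostRATDList, after register allocation, -post-RA-scheduler
PostMachineScheduler (2013): ScheduleDAGMI, after register allocation, -enable-post-misched
A target can replace PostRAScheduler with PostMachineScheduler:
TargetPassConfig::substitutePass(&PostRASchedulerID, &PostMachineSchedulerID)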
Data Dependency Graph
a: $x10_32 = LUI @Arr
b: $x8_32 = CLI 10
c: SW $x8_32, $x10_32, @Arr
d: $x10_32 = ADDI $x8_32, 0
e: Call @foo, implicit $x10_32 (ExitSU)
Edges:
a -> c: x10_32, data
a -> d: x10_32, output
b -> c: x8_32, data
b -> d: x8_32, data
c -> d: x10_32, anti
d -> e: x10_32, artificial
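The data, anti and output edges in this graph follow from which registers each instruction defines and uses. The sketch below shows that classification for one pair of instructions and one register. It is an illustration only: RegAccess, DepKind and classify are hypothetical names, not LLVM types, and the sketch ignores any redefinition of the register between the two instructions.

```cpp
#include <cassert>
#include <set>
#include <string>

enum class DepKind { None, Data, Anti, Output };

// Registers written (Defs) and read (Uses) by one instruction.
// Hypothetical helper type for this illustration.
struct RegAccess {
  std::set<std::string> Defs;
  std::set<std::string> Uses;
};

// Classify the dependency carried by register Reg from an earlier
// instruction to a later one.
inline DepKind classify(const RegAccess &Earlier, const RegAccess &Later,
                        const std::string &Reg) {
  const bool EarlierDef = Earlier.Defs.count(Reg) != 0;
  const bool EarlierUse = Earlier.Uses.count(Reg) != 0;
  const bool LaterDef = Later.Defs.count(Reg) != 0;
  const bool LaterUse = Later.Uses.count(Reg) != 0;
  if (EarlierDef && LaterUse)
    return DepKind::Data;   // read after write
  if (EarlierDef && LaterDef)
    return DepKind::Output; // write after write
  if (EarlierUse && LaterDef)
    return DepKind::Anti;   // write after read
  return DepKind::None;
}
```

For instance, with a defining x10_32 and d both defining x10_32 and using x8_32, classify reports an output dependency from a to d on x10_32, matching the edge in the graph.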
Data Dependency Graph
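a: SW $x10_32, $x27_32, 12
b: $x10_32 = LW $x9_32, 0
c: SW $x10_32, $x27_32, 0
d: $x10_32 = LW $x8_32, @Glob
...
Register edges:
a -> b: x10_32, anti
b -> c: x10_32, data
b -> d: x10_32, output
c -> d: x10_32, anti
Memory edges: the figure also marks three MayAliasMem edges, which order memory operations that may access the same location.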
Pipeline Modeling for Target
Use the target description (TableGen) to describe the pipeline model.
For the architecture:
Create scheduling categories for operands. (<Target>Schedule.td)
Associate scheduling categories with instructions. (<Target>InstrInfo.td)
For each processor:
Associate pipeline information with scheduling categories. (<Target>Schedule<Processor>.td)
Associate with instructions
Sched (+SchedRW): Associate per-operand SchedReadWrite types with instructions by modifying the instruction definition to inherit from Sched. SchedRW lists the per-operand types that map to the machine model of an instruction.
Associate with target
SchedReadWrite: base class of SchedRead and SchedWrite.
SchedRead: Define a scheduler resource associated with a use operand.
SchedWrite: Define a scheduler resource associated with a def operand.
Associate with subtargets
ProcReadAdvance (+Cycles, +ValidWrites)
ProcWriteResources (+ProcResources, +ResourceCycles, +Latency)
SchedReadAdvance (inherits SchedRead and ProcReadAdvance): For use with InstRW or ItinRW.
SchedWriteRes (inherits SchedWrite and ProcWriteResources): For use with InstRW or ItinRW.
ReadAdvance (+ReadType, inherits ProcReadAdvance) and WriteRes (+WriteType, inherits ProcWriteResources): Define WriteRes and ReadAdvance to associate processor resources and latency with each SchedReadWrite type.
InstRW: Map a set of opcodes to a list of SchedReadWrite types. This allows the subtarget to easily override specific operations.
ItinRW: Map a set of itinerary classes to SchedReadWrite resources.
Create scheduling categories
def ALUOut : SchedWrite; // For define operands of ALU op
def ALUIn : SchedRead; // For use operands of ALU op
def MULOut : SchedWrite; // For define operands of MUL op
def MULIn : SchedRead; // For use operands of MUL op
Associate instructions with scheduling categories

class ALU_ri<bits<3> funct3, string opcodestr>
: RVInstI<funct3, OPC_OP_IMM,
(outs GPR:$rd),
(ins GPR:$rs1, simm12:$imm12),
opcodestr, "$rd, $rs1, $imm12">,
Sched<[ALUOut, ALUIn]>;
Associate pipeline information to scheduling categories
def UnitALU : ProcResource<1> { let BufferSize = 0; }
def UnitMDU : ProcResource<1> { let BufferSize = 0; }
def : WriteRes<ALUOut, [UnitALU]> { let Latency = 2; }

def : WriteRes<MULOut, [UnitMDU]> { let Latency = 4; }
def : ReadAdvance<ALUIn, 1>;
def : ReadAdvance<MULIn, 1>;
Figure: pipeline with an ISSUE stage, ALU and MDU execution units, and a WB stage.
Example:
a: MUL r3, r3, r2
b: ADD r4, r3, r2
a -> b: r3, data
Latency = MUL's Latency - ADD's Advance = 4 - 1 = 3
Pipeline occupancy per cycle (stage order: ISSUE, ALU, MDU, WB):
t0: MUL
t1: ADD, MUL
t2: stall, MUL
t3: stall, MUL
t4: ADD, stall
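The latency on a data edge combines the producer's WriteRes latency with the consumer's ReadAdvance. A minimal sketch of that arithmetic follows; effectiveLatency is a hypothetical helper, and clamping the result at zero is an assumption of this sketch rather than a statement about LLVM's exact implementation.

```cpp
#include <algorithm>
#include <cassert>

// Latency seen by a consumer operand: the producer's write latency minus the
// number of cycles by which the consumer reads its operand late (ReadAdvance).
// Clamped at zero, since an edge cannot have negative latency in this sketch.
inline int effectiveLatency(int WriteLatency, int ReadAdvance) {
  assert(WriteLatency >= 0 && "write latency must be non-negative");
  return std::max(0, WriteLatency - ReadAdvance);
}
```

With the definitions above, a MUL result (Latency = 4) consumed through ALUIn (ReadAdvance of 1) gives effectiveLatency(4, 1), which is the edge latency of 3 in the example.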
GenericScheduler::tryCandidate
Physical register copies
Register pressure (Excess, CriticalMax)
Acyclic Latency
Cluster
Weak edges
Register pressure (CurrentMax)
Resource
Latency
Source order
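The heuristics listed above are compared in order, and the first one that distinguishes two candidates decides between them. The sketch below illustrates that lexicographic comparison with a reduced set of criteria. It is not the LLVM implementation: Candidate, its fields, and isBetter are hypothetical names chosen for this example, and only five of the listed criteria are modeled.

```cpp
#include <cassert>

// Hypothetical summary of a scheduling candidate; not an LLVM type.
struct Candidate {
  int ExcessPressure; // registers over the limit if scheduled; lower is better
  bool Clustered;     // keeps a cluster of instructions adjacent
  int ResourceStalls; // cycles waiting on a busy unit; lower is better
  int PathLatency;    // remaining critical-path latency; higher is better
  int SourceOrder;    // position in the original order; lower is better
};

// Returns true if Try should replace Best. Criteria are checked in priority
// order; a later criterion is consulted only when all earlier ones tie.
inline bool isBetter(const Candidate &Try, const Candidate &Best) {
  if (Try.ExcessPressure != Best.ExcessPressure)
    return Try.ExcessPressure < Best.ExcessPressure;
  if (Try.Clustered != Best.Clustered)
    return Try.Clustered;
  if (Try.ResourceStalls != Best.ResourceStalls)
    return Try.ResourceStalls < Best.ResourceStalls;
  if (Try.PathLatency != Best.PathLatency)
    return Try.PathLatency > Best.PathLatency;
  return Try.SourceOrder < Best.SourceOrder;
}
```

Because register pressure is checked before latency in this ordering, a candidate on a longer critical path still loses to one that avoids excess register pressure.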
Customize Scheduler for Target
Define your scheduling policy.
Define your scheduling strategy.
Add DAG mutations.

Implement overrideSchedPolicy
struct MachineSchedPolicy {
bool ShouldTrackPressure = false;
bool ShouldTrackLaneMasks = false;
bool OnlyTopDown = false;
bool OnlyBottomUp = false;
bool DisableLatencyHeuristic = false;
};
void
<Target>Subtarget::overrideSchedPolicy(MachineSchedPolicy &Policy,
unsigned NumRegionInstrs) const {
Policy.OnlyTopDown = false;
Policy.OnlyBottomUp = false;
Policy.ShouldTrackPressure = true;
}
Derive MachineSchedStrategy
Class hierarchy: MachineSchedStrategy <- GenericSchedulerBase <- GenericScheduler <- YourStrategy
class YourStrategy : public GenericScheduler {
  ...
  SUnit *pickNode(bool &IsTopNode) override {
    // ...
    // Your heuristic algorithm.
    //
    return GenericScheduler::pickNode(IsTopNode);
  }
};
DAG mutations
// Implement your mutation.
class CustomMutation : public ScheduleDAGMutation {
  const <Target>Subtarget *STI;

public:
  explicit CustomMutation(const <Target>Subtarget *STI) : STI(STI) {}
  void apply(ScheduleDAGInstrs *DAGInstrs) override;
};

std::unique_ptr<ScheduleDAGMutation>
llvm::createCustomMutation(const <Target>Subtarget *STI) {
  return std::make_unique<CustomMutation>(STI);
}

// Install the mutation in the target's TargetPassConfig.
ScheduleDAGInstrs *
createMachineScheduler(MachineSchedContext *C) const override {
  const <Target>Subtarget &STI = C->MF->getSubtarget<<Target>Subtarget>();
  ScheduleDAGMILive *DAG = createGenericSchedLive(C);
  DAG->addMutation(createCustomMutation(&STI));
  return DAG;
}
Figure: the dependency graph of nodes a, b, c, d before and after applying a mutation.
Reference
Engineering a Compiler, 2nd Edition
2017 LLVM Developers’ Meeting: “Writing Great Machine Schedulers”
