# Internal CPU Plugin Optimizations

The CPU plugin supports several graph optimization algorithms, such as fusing or removing layers. Refer to the sections below for details.

> **NOTE**: For layer descriptions, see the IR Notation Reference.

## Fusing Convolution and Simple Layers

The CPU plugin merges a convolution layer with any of the simple layers listed below:

* Activation: ReLU, ELU, Sigmoid, Clamp
* Depthwise: ScaleShift, PReLU
* FakeQuantize

> **NOTE**: The pattern may include any number of simple layers, in any order.

A combination of a convolution layer and simple layers results in a single fused layer called Convolution:

```mermaid
flowchart TD
    subgraph subgraphA1[Runtime Graph]
    direction TB
    nodeA1(Input) --> nodeA2(Convolution)
    nodeA2(Convolution) --> nodeA3(Output)
    end
    subgraph subgraphB1[Original Graph]
    direction TB
    nodeB1(Input) --> nodeB2(Convolution)
    nodeB2(Convolution) --> nodeB3(Simple Layer)
    nodeB3(Simple Layer) --> nodeB4(...)
    nodeB4(...) --> nodeB5(Simple Layer)
    nodeB5(Simple Layer) --> nodeB6(Output)
    end
classDef no-bg-color fill:none,stroke-width:0px
classDef moss1 fill:#D7F3A2, stroke: #B1D272, color: #262626
classDef steel1 fill:#B9D6E5, stroke: #86B3CA, color: #262626
classDef daisy1 fill:#FFE17A, stroke: #FEC91B, color: #262626
class subgraphA1,subgraphB1,nodeB4 no-bg-color
class nodeA2 daisy1
class nodeB1,nodeB6,nodeA1,nodeA3 moss1
class nodeB2,nodeB3,nodeB5 steel1
```
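The transformation is purely algebraic: the fused Convolution produces the same values by applying the simple layers as post-ops to its output. Below is a minimal NumPy sketch of that equivalence, using a 1x1 convolution and invented shapes for brevity; the real fused kernel applies post-ops in registers rather than as the separate array passes modelled here.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 3, 4, 4)).astype(np.float32)   # NCHW input
w = rng.standard_normal((8, 3, 1, 1)).astype(np.float32)   # 1x1 conv weights
scale = rng.standard_normal(8).astype(np.float32)          # ScaleShift scale
shift = rng.standard_normal(8).astype(np.float32)          # ScaleShift shift

def conv1x1(x, w):
    # A 1x1 convolution is a per-pixel matrix multiply over channels.
    return np.einsum('nchw,oc->nohw', x, w[:, :, 0, 0])

def scale_shift(t):
    return t * scale[None, :, None, None] + shift[None, :, None, None]

# Original graph: Convolution -> ReLU -> ScaleShift as three separate nodes.
reference = scale_shift(np.maximum(conv1x1(x, w), 0.0))

# Runtime graph: a single Convolution node with a chain of post-ops.
def fused_convolution(x, w, post_ops):
    out = conv1x1(x, w)
    for op in post_ops:        # simple layers: any number, any order
        out = op(out)
    return out

fused = fused_convolution(x, w, [lambda t: np.maximum(t, 0.0), scale_shift])
assert np.allclose(reference, fused)
```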

## Fusing Pooling and FakeQuantize Layers

A combination of Pooling and FakeQuantize layers results in a single fused layer called Pooling:

```mermaid
flowchart TD
    subgraph subgraphA1[Runtime Graph]
    direction TB
    nodeA1(Input) --> nodeA2(Pooling)
    nodeA2(Pooling) --> nodeA3(Output)
    end
    subgraph subgraphB1[Original Graph]
    direction TB
    nodeB1(Input) --> nodeB2("Pooling [Average]")
    nodeB2("Pooling [Average]") --> nodeB3(Fake Quantize)
    nodeB3(Fake Quantize) --> nodeB4(Output)
    end
classDef no-bg-color fill:none,stroke-width:0px
classDef moss1 fill:#D7F3A2, stroke: #B1D272, color: #262626
classDef steel1 fill:#B9D6E5, stroke: #86B3CA, color: #262626
classDef daisy1 fill:#FFE17A, stroke: #FEC91B, color: #262626
class subgraphA1,subgraphB1 no-bg-color
class nodeA2 daisy1
class nodeB1,nodeB4,nodeA1,nodeA3 moss1
class nodeB2,nodeB3 steel1
```
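In this pattern the FakeQuantize arithmetic becomes a post-op of the pooling kernel: every averaged value is quantized before it is stored. A NumPy sketch of the fused semantics, with hypothetical quantization ranges and `levels = 256` modelling an 8-bit grid:

```python
import numpy as np

def fake_quantize(x, in_low, in_high, out_low, out_high, levels):
    # FakeQuantize: clip to the input range, snap to `levels` evenly
    # spaced values, then rescale to the output range.
    q = np.round((np.clip(x, in_low, in_high) - in_low)
                 / (in_high - in_low) * (levels - 1))
    return q / (levels - 1) * (out_high - out_low) + out_low

def avg_pool_2x2(x):
    n, c, h, w = x.shape
    return x.reshape(n, c, h // 2, 2, w // 2, 2).mean(axis=(3, 5))

def fused_pooling(x, fq_args):
    # The fused Pooling node quantizes each averaged value before the store.
    return fake_quantize(avg_pool_2x2(x), *fq_args)

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 3, 4, 4)).astype(np.float32)
fq_args = (-1.0, 1.0, -1.0, 1.0, 256)

# Two-node original graph vs. the single fused node: identical outputs.
assert np.allclose(fake_quantize(avg_pool_2x2(x), *fq_args),
                   fused_pooling(x, fq_args))
```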

## Fusing FullyConnected and Activation Layers

A combination of FullyConnected and Activation layers results in a single fused layer called FullyConnected:

```mermaid
flowchart TD
    subgraph subgraphA1[Runtime Graph]
    direction TB
    nodeA1(Input) --> nodeA2(FullyConnected)
    nodeA2(FullyConnected) --> nodeA3(Output)
    end
    subgraph subgraphB1[Original Graph]
    direction TB
    nodeB1(Input) --> nodeB2(FullyConnected)
    nodeB2(FullyConnected) --> nodeB3("Activation [ReLU]")
    nodeB3("Activation [ReLU]") --> nodeB4(Output)
    end
classDef no-bg-color fill:none,stroke-width:0px
classDef moss1 fill:#D7F3A2, stroke: #B1D272, color: #262626
classDef steel1 fill:#B9D6E5, stroke: #86B3CA, color: #262626
classDef daisy1 fill:#FFE17A, stroke: #FEC91B, color: #262626
class subgraphA1,subgraphB1 no-bg-color
class nodeA2 daisy1
class nodeB1,nodeB4,nodeA1,nodeA3 moss1
class nodeB2,nodeB3 steel1
```
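The fused node is a GEMM whose epilogue applies the activation before results leave registers; functionally it is just the composition of the two layers. A short NumPy sketch with invented shapes:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16)).astype(np.float32)   # batch of feature vectors
w = rng.standard_normal((8, 16)).astype(np.float32)   # FullyConnected weights
b = rng.standard_normal(8).astype(np.float32)         # FullyConnected bias

def fused_fully_connected(x, w, b, activation=None):
    # One node: GEMM plus an optional activation epilogue.
    out = x @ w.T + b
    return activation(out) if activation else out

# Original graph: FullyConnected node followed by an Activation [ReLU] node.
reference = np.maximum(x @ w.T + b, 0.0)

assert np.allclose(reference,
                   fused_fully_connected(x, w, b, lambda t: np.maximum(t, 0.0)))
```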

## Fusing Convolution and Depthwise Convolution Layers Grouped with Simple Layers

> **NOTE**: This pattern is possible only on CPUs that support Streaming SIMD Extensions 4.2 (SSE4.2) and the Intel AVX2 instruction set architecture (ISA).

A group consisting of a Convolution (or Binary Convolution) layer and simple layers, followed by a group consisting of a Depthwise Convolution layer and simple layers, results in a single fused layer called Convolution (or Binary Convolution):

> **NOTE**: The Depthwise Convolution layer must have equal values for its group, input channels, and output channels parameters.

```mermaid
flowchart TD
    subgraph subgraphA1[Runtime Graph]
    direction TB
    nodeA1(Input) --> nodeA2(Convolution)
    nodeA2(Convolution) --> nodeA3(Output)
    end
    subgraph subgraphB1[Original Graph]
    direction TB
    nodeB1(Input) --> nodeB2(Convolution)
    nodeB2(Convolution) --> nodeB3(Simple Layer)
    nodeB3(Simple Layer) --> nodeB4(...)
    nodeB4(...) --> nodeB5(Simple Layer)
    nodeB5(Simple Layer) --> nodeB6(Depthwise \n Convolution)
    nodeB6(Depthwise \n Convolution) --> nodeB7(Simple Layer)
    nodeB7(Simple Layer) --> nodeB8(...)
    nodeB8(...) --> nodeB9(Simple Layer)
    nodeB9(Simple Layer) --> nodeB10(Output)
    end
classDef no-bg-color fill:none,stroke-width:0px
classDef moss1 fill:#D7F3A2, stroke: #B1D272, color: #262626
classDef steel1 fill:#B9D6E5, stroke: #86B3CA, color: #262626
classDef daisy1 fill:#FFE17A, stroke: #FEC91B, color: #262626
class subgraphA1,subgraphB1,nodeB4,nodeB8 no-bg-color
class nodeA2 daisy1
class nodeB1,nodeA1,nodeA3,nodeB10 moss1
class nodeB2,nodeB3,nodeB5,nodeB6,nodeB7,nodeB9 steel1
```
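The fused kernel runs the depthwise step on each tile of convolution output while it is still in cache. The NumPy sketch below demonstrates only the numerical equivalence being exploited, using a 1x1 convolution feeding a stride-1, padding-1 3x3 depthwise convolution (shapes invented, simple layers omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
x  = rng.standard_normal((1, 4, 6, 6)).astype(np.float32)
w1 = rng.standard_normal((8, 4, 1, 1)).astype(np.float32)  # Convolution, 1x1
wd = rng.standard_normal((8, 3, 3)).astype(np.float32)     # one 3x3 kernel per channel

def conv1x1(x, w):
    return np.einsum('nchw,oc->nohw', x, w[:, :, 0, 0])

def depthwise3x3(x, w):
    # group == input channels == output channels: each channel is convolved
    # with its own 3x3 kernel (stride 1, padding 1).
    n, c, h, width = x.shape
    xp = np.pad(x, ((0, 0), (0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += xp[:, :, i:i + h, j:j + width] * w[:, i, j][None, :, None, None]
    return out

# Original graph: Convolution feeds Depthwise Convolution as two nodes.
reference = depthwise3x3(conv1x1(x, w1), wd)

# Runtime graph: one Convolution node. Plain composition stands in for the
# interleaved fused kernel, which never spills the intermediate to memory.
def fused_convolution(x, w1, wd):
    return depthwise3x3(conv1x1(x, w1), wd)

assert np.allclose(reference, fused_convolution(x, w1, wd))
```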

## Fusing Convolution and Sum Layers

A combination of a convolution layer, simple layers, and an Eltwise layer with the sum operation results in a single fused layer called Convolution:

```mermaid
flowchart TD
    subgraph subgraphA1[Runtime Graph]
    direction TB
    nodeA1(Input) --> nodeA4(Any Layer)
    nodeA4(Any Layer) --> nodeA2(Convolution)
    nodeA5(Input2) ---> nodeA2(Convolution)
    nodeA2(Convolution) --> nodeA3(Output)
    end
    subgraph subgraphB1[Original Graph]
    direction TB
    nodeB1(Input1) --> nodeB7(Any Layer)
    nodeB7(Any Layer) -----> nodeB2("Eltwise[op=sum]")
    nodeB8(Input) --> nodeB9(Convolution)
    nodeB9(Convolution) --> nodeB10(Simple Layer)
    nodeB10(Simple Layer) --> nodeB11(...)
    nodeB11(...) --> nodeB12(Simple Layer)
    nodeB12(Simple Layer) --> nodeB2("Eltwise[op=sum]")
    nodeB2("Eltwise[op=sum]") --> nodeB3(Simple Layer)
    nodeB3(Simple Layer) --> nodeB4(...)
    nodeB4(...) --> nodeB5(Simple Layer)
    nodeB5(Simple Layer) --> nodeB6(Output)
    end
classDef no-bg-color fill:none,stroke-width:0px
classDef moss1 fill:#D7F3A2, stroke: #B1D272, color: #262626
classDef steel1 fill:#B9D6E5, stroke: #86B3CA, color: #262626
classDef daisy1 fill:#FFE17A, stroke: #FEC91B, color: #262626
classDef coral1 fill:#FFB6B9, stroke: #FF848A, color: #262626
class subgraphA1,subgraphB1,nodeB4,nodeB11 no-bg-color
class nodeA2 daisy1
class nodeB1,nodeA5,nodeA1,nodeA3,nodeB6,nodeB8 moss1
class nodeB3,nodeB5,nodeA4,nodeB7,nodeB9,nodeB10,nodeB12 steel1
class nodeB2 coral1
```
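The second Eltwise input becomes an extra argument of the fused Convolution, which accumulates into it (a sum post-op) and then applies the remaining simple layers before the final store. A NumPy sketch with a 1x1 convolution and ReLU standing in for the simple layers (shapes invented):

```python
import numpy as np

rng = np.random.default_rng(0)
x        = rng.standard_normal((1, 3, 4, 4)).astype(np.float32)
residual = rng.standard_normal((1, 8, 4, 4)).astype(np.float32)  # other Eltwise input
w        = rng.standard_normal((8, 3, 1, 1)).astype(np.float32)

def conv1x1(x, w):
    return np.einsum('nchw,oc->nohw', x, w[:, :, 0, 0])

# Original graph: Convolution -> simple layers -> Eltwise[op=sum] -> simple layers.
reference = np.maximum(conv1x1(x, w) + residual, 0.0)

def conv_with_sum_post_op(x, w, dst):
    # Sum post-op: accumulate conv results onto the second input, then apply
    # the trailing simple layers (ReLU here) before the store.
    acc = dst + conv1x1(x, w)
    return np.maximum(acc, 0.0)

assert np.allclose(reference, conv_with_sum_post_op(x, w, residual))
```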

## Fusing a Group of Convolutions

If a topology contains the following pipeline, the CPU plugin merges the Split, Convolution, and Concatenation layers into a single Convolution layer with the group parameter:

```mermaid
flowchart TD
    subgraph subgraphA1[Runtime Graph]
    direction TB
    nodeA1(Input) --> nodeA2(Convolution)
    nodeA2(Convolution) --> nodeA3(Output)
    end
    subgraph subgraphB1[Original Graph]
    direction TB
    nodeB1(Input) --> nodeB2(Split)
    nodeB2(Split) --> nodeB6(Convolution1)
    nodeB6(Convolution1) --> nodeB4(Concatenation)
    nodeB2(Split) --> nodeB3(Convolution3)
    nodeB2(Split) --> nodeB7(Convolution2)
    nodeB7(Convolution2) --> nodeB4(Concatenation)
    nodeB3(Convolution3) --> nodeB4(Concatenation)
    nodeB4(Concatenation) --> nodeB5(Output)
    end
classDef no-bg-color fill:none,stroke-width:0px
classDef moss1 fill:#D7F3A2, stroke: #B1D272, color: #262626
classDef steel1 fill:#B9D6E5, stroke: #86B3CA, color: #262626
classDef daisy1 fill:#FFE17A, stroke: #FEC91B, color: #262626
classDef coral-tint-2 fill:#FFB6B9, stroke: #FF848A, color: #262626
class subgraphA1,subgraphB1 no-bg-color
class nodeB4,nodeB2 coral-tint-2
class nodeA2 daisy1
class nodeB1,nodeA1,nodeA3,nodeB5 moss1
class nodeB3,nodeB6,nodeB7 steel1
```

> **NOTE**: Parameters of the convolution layers must coincide.
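The rewrite relies on the identity that a channel-wise Split, per-branch convolutions with identical parameters, and a Concatenation compute exactly a grouped convolution. A NumPy check with 1x1 convolutions (spatial kernel dimensions dropped for brevity; all shapes invented):

```python
import numpy as np

rng = np.random.default_rng(0)
groups, cin_g, cout_g = 3, 2, 4
x = rng.standard_normal((1, groups * cin_g, 4, 4)).astype(np.float32)
w = rng.standard_normal((groups, cout_g, cin_g)).astype(np.float32)  # per-branch weights

def conv1x1(x, w2d):
    return np.einsum('nchw,oc->nohw', x, w2d)

# Original graph: Split across channels, one Convolution per branch, Concatenation.
branches = np.split(x, groups, axis=1)
reference = np.concatenate(
    [conv1x1(b, w[g]) for g, b in enumerate(branches)], axis=1)

# Runtime graph: a single Convolution with the `group` parameter; each group
# reads only its own slice of input channels, so Split and Concat disappear.
def grouped_conv1x1(x, w):
    n, c, h, width = x.shape
    xg = x.reshape(n, groups, cin_g, h, width)
    out = np.einsum('ngchw,goc->ngohw', xg, w)
    return out.reshape(n, groups * cout_g, h, width)

assert np.allclose(reference, grouped_conv1x1(x, w))
```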

## Removing a Power Layer

The CPU plugin removes a Power layer from a topology if it has the following parameters:

* power = 1
* scale = 1
* offset = 0
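With those values the layer computes `(1 * x + 0) ** 1 = x`, an exact identity, so dropping the node cannot change results. A one-line check of the semantics, assuming the legacy Power formula `out = (scale * x + offset) ** power`:

```python
import numpy as np

def power_layer(x, power, scale, offset):
    # Power layer semantics: out = (scale * x + offset) ** power.
    return (scale * x + offset) ** power

x = np.linspace(-2.0, 2.0, 9, dtype=np.float32)
# power=1, scale=1, offset=0 is an exact identity, so the node can be removed.
assert np.array_equal(power_layer(x, power=1, scale=1, offset=0), x)
```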
