ONE - On-device Neural Engine
luci::QuantizeOnnxFakeQuantModelPass Class Reference

Pass to create a quantized graph from a graph fake-quantized on ONNX.

#include <QuantizeOnnxFakeQuantModelPass.h>


Data Structures

struct  Context
 

Public Member Functions

 QuantizeOnnxFakeQuantModelPass (std::unique_ptr< Context > &&ctx)
 
virtual const char * name (void) const
 
bool run (loco::Graph *graph)
 
Public Member Functions inherited from logo::Pass
virtual ~Pass ()=default
 

Detailed Description

Pass that converts a graph fake-quantized on ONNX (fp32 nodes wrapped in OnnxQuantizeLinear/OnnxDequantizeLinear) into a quantized graph.

Definition at line 32 of file QuantizeOnnxFakeQuantModelPass.h.
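
To make "fake-quantized" concrete: in such a graph the tensors stay in fp32 but pass through QuantizeLinear/DequantizeLinear pairs, so every value is snapped to a quantization grid. The sketch below illustrates that round trip for uint8 with a per-tensor scale and zero point, following the ONNX operator formulas. The function names are hypothetical and are not part of luci; the uint8 range and per-tensor parameters are assumptions chosen for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helpers (not luci API). Requires C++17 for std::clamp.

// ONNX QuantizeLinear for uint8: q = saturate(round(x / scale) + zero_point)
inline uint8_t quantize_linear(float x, float scale, int32_t zero_point)
{
  // std::nearbyint rounds half to even under the default rounding mode
  const float shifted = std::nearbyint(x / scale) + static_cast<float>(zero_point);
  // Saturate to the uint8 range before narrowing
  const float clamped = std::clamp(shifted, 0.0f, 255.0f);
  return static_cast<uint8_t>(clamped);
}

// ONNX DequantizeLinear: x = (q - zero_point) * scale
inline float dequantize_linear(uint8_t q, float scale, int32_t zero_point)
{
  return static_cast<float>(static_cast<int32_t>(q) - zero_point) * scale;
}

// Fake quantization: float in, float out, but the value is snapped to the
// grid representable with (scale, zero_point)
inline float fake_quantize(float x, float scale, int32_t zero_point)
{
  return dequantize_linear(quantize_linear(x, scale, zero_point), scale, zero_point);
}
```

With scale 0.5 and zero point 128, the value 1.0 maps to the integer 130 and back to 1.0, while values outside the representable range saturate to 0 or 255. The pass documented here removes the float round trip and keeps the integer tensor together with its scale and zero point.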

Constructor & Destructor Documentation

◆ QuantizeOnnxFakeQuantModelPass()

luci::QuantizeOnnxFakeQuantModelPass::QuantizeOnnxFakeQuantModelPass ( std::unique_ptr< Context > &&  ctx)
inline

Definition at line 41 of file QuantizeOnnxFakeQuantModelPass.h.

  : _ctx{std::move(ctx)}
{
  // The caller must pass a non-null context ...
  assert(_ctx); // FIX_CALLER_UNLESS
  // ... whose default activation dtype has been set
  assert(_ctx->default_activation_dtype); // FIX_CALLER_UNLESS
}
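
The constructor contract above (a non-null context with a default activation dtype) can be sketched with stand-in types. Everything below is hypothetical: `DataType`, `Context`, and `SketchPass` only mimic the shape of the real declarations, and the sketch assumes the second assertion means "not the Unknown dtype".

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Stand-in types, NOT the real loco/luci declarations
enum class DataType { Unknown, U8, S16 };

struct Context
{
  DataType default_activation_dtype = DataType::Unknown;
};

class SketchPass
{
public:
  explicit SketchPass(std::unique_ptr<Context> &&ctx) : _ctx{std::move(ctx)}
  {
    assert(_ctx);                                                 // context is required
    assert(_ctx->default_activation_dtype != DataType::Unknown);  // dtype must be chosen
  }

  DataType activation_dtype() const { return _ctx->default_activation_dtype; }

private:
  std::unique_ptr<Context> _ctx; // the pass owns its context
};

// Build a valid context and hand ownership to the pass
inline SketchPass make_sketch_pass(DataType dtype)
{
  auto ctx = std::make_unique<Context>();
  ctx->default_activation_dtype = dtype;
  return SketchPass{std::move(ctx)};
}
```

Taking the context as `std::unique_ptr< Context > &&` makes the ownership transfer explicit at the call site: the caller has to write `std::move(ctx)` and cannot keep using the context afterwards.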

Member Function Documentation

◆ name()

virtual const char * luci::QuantizeOnnxFakeQuantModelPass::name ( void  ) const
inline virtual

Reimplemented from logo::Pass.

Definition at line 47 of file QuantizeOnnxFakeQuantModelPass.h.

{ return "luci::QuantizeOnnxFakeQuantModelPass"; }

◆ run()

bool luci::QuantizeOnnxFakeQuantModelPass::run ( loco::Graph * g)
virtual

How QuantizeOnnxFakeQuantModel works

  1. Activations are quantized as below.

Before

[node(fp32)] -> [OnnxQuantizeLinear] -> [OnnxDequantizeLinear]

After

[node(q)]

  2. Weights (constants) are quantized as below.

Before

[Const(q w/o qparam)] -> [OnnxDequantizeLinear]

After

[Const(q)]

  3. Quantize constant activations
  4. Quantize with predecessors' qparams
  5. Update qparams of special operators
  6. Insert a Quantize Op if an Op's input dtype and output dtype mismatch
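
The two Before/After patterns above can be sketched as a rewrite over a toy linear chain of nodes. This is only an illustration under simplifying assumptions: the `Node`, `Kind`, `DType`, and `fold_fake_quant` names are hypothetical, a real loco graph is not a linear chain, and the real qparam (scale and zero point) is reduced here to a boolean flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model, not the luci IR
enum class Kind { Op, Const, Quantize, Dequantize };
enum class DType { FP32, U8 };

struct Node
{
  Kind kind;
  DType dtype;
  bool has_qparam; // stands in for scale/zero-point metadata
};

// chain[i] feeds chain[i + 1]
using Chain = std::vector<Node>;

// Pattern 1: [Op(fp32)] -> [Quantize] -> [Dequantize]  ==>  [Op(q)]
// Pattern 2: [Const(q w/o qparam)] -> [Dequantize]     ==>  [Const(q)]
inline Chain fold_fake_quant(const Chain &in)
{
  Chain out;
  std::size_t i = 0;
  while (i < in.size())
  {
    const Node &n = in[i];
    if (n.kind == Kind::Op && n.dtype == DType::FP32 && i + 2 < in.size() &&
        in[i + 1].kind == Kind::Quantize && in[i + 2].kind == Kind::Dequantize)
    {
      // The Q/DQ pair is absorbed; the producer becomes a quantized node
      out.push_back(Node{Kind::Op, DType::U8, true});
      i += 3;
    }
    else if (n.kind == Kind::Const && n.dtype == DType::U8 && !n.has_qparam &&
             i + 1 < in.size() && in[i + 1].kind == Kind::Dequantize)
    {
      // The constant takes over the qparam carried by the Dequantize node
      out.push_back(Node{Kind::Const, DType::U8, true});
      i += 2;
    }
    else
    {
      out.push_back(n); // anything else is copied through unchanged
      ++i;
    }
  }
  return out;
}
```

In both patterns the quantization parameters that lived on the Q/DQ nodes end up attached to the surviving node, which is why the result is written as `[node(q)]` and `[Const(q)]`.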

Implements logo::Pass.

Definition at line 65 of file QuantizeOnnxFakeQuantModelPass.cpp.

{
  LOGGER(l);
  INFO(l) << "QuantizeOnnxFakeQuantModelPass Start" << std::endl;

  // Quantize Onnx QuantizeLinear-DequantizeLinear pattern
  {
    QuantizeOnnxQDQPass pass;
    pass.run(g);
  }

  // Quantize Onnx const-DequantizeLinear pattern
  {
    QuantizeOnnxDequantizeLinearPass pass;
    pass.run(g);
  }

  // Quantize const input activation
  for (auto node : loco::active_nodes(loco::output_nodes(g)))
  {
    auto circle_node = loco::must_cast<luci::CircleNode *>(node);

    QuantizeConstInputActivation qcia(_ctx->default_activation_dtype);
    circle_node->accept(&qcia);
  }

  // Quantize nodes using their predecessors' qparams
  {
    logo::Phase phase;
    phase.emplace_back(std::make_unique<QuantizeWithPredecessorPass>());

    // NOTE: source line 96, which declares phase_runner, is missing from this listing
    phase_runner.run(phase);
  }

  // Backward propagation of activation qparam
  {
    PropagateQParamBackwardPass pqbp(_ctx->default_activation_dtype);
    pqbp.run(g);
  }

  // Update qparam of output of special Ops
  for (auto node : loco::active_nodes(loco::output_nodes(g)))
  {
    auto circle_node = loco::must_cast<luci::CircleNode *>(node);

    if (is_quantized(circle_node))
    {
      QuantizeSpecialActivation qsa(circle_node->dtype());
      circle_node->accept(&qsa);
    }
  }

  // Insert QuantizeOp if input/output dtype does not match
  for (auto node : loco::active_nodes(loco::output_nodes(g)))
  {
    auto circle_node = loco::must_cast<luci::CircleNode *>(node);

    InsertQuantizeOpOnDTypeMismatch iqoodm;
    circle_node->accept(&iqoodm);
  }

  // Update output dtype
  auto graph_outputs = g->outputs();
  for (auto node : loco::output_nodes(g))
  {
    auto circle_node = loco::must_cast<luci::CircleOutput *>(node);
    auto from = loco::must_cast<luci::CircleNode *>(circle_node->from());
    circle_node->dtype(from->dtype());

    auto graph_output = graph_outputs->at(circle_node->index());
    graph_output->dtype(circle_node->dtype());
  }

  INFO(l) << "QuantizeOnnxFakeQuantModelPass End" << std::endl;
  return false; // one time run
}

References loco::active_nodes(), INFO, luci::is_quantized(), LOGGER, loco::output_nodes(), luci::PropagateQParamBackwardPass::run(), luci::QuantizeOnnxDequantizeLinearPass::run(), and luci::QuantizeOnnxQDQPass::run().
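
The `return false; // one time run` at the end of run() is easier to read with the pass-driver convention in mind. The sketch below assumes the usual logo convention that a pass returns true when it changed the graph, and that a saturating driver repeats its passes until a full sweep reports no change. `ToyPass` and `run_until_saturated` are hypothetical stand-ins, not the logo API, and nothing in this page states which driver actually runs this pass.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Stand-in for a pass: returns true if it changed something
using ToyPass = std::function<bool()>;

// Saturate-style driver: sweep over all passes until a full sweep
// reports no change. Returns the number of sweeps performed.
inline int run_until_saturated(const std::vector<ToyPass> &passes)
{
  int sweeps = 0;
  bool changed = true;
  while (changed)
  {
    changed = false;
    for (const auto &pass : passes)
    {
      // Call the pass first so it always runs, then accumulate the flag
      changed = pass() || changed;
    }
    ++sweeps;
  }
  return sweeps;
}
```

Under such a driver, a pass that always returns false is executed exactly once per invocation of the driver, even though it rewrites the graph. A pass that reports changes is re-run until it settles.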


The documentation for this class was generated from the following files: