ONE - On-device Neural Engine
luci::QuantizeOnnxFakeQuantModelPass Class Reference

Pass to create a quantized graph from a graph fake-quantized on onnx.

#include <QuantizeOnnxFakeQuantModelPass.h>

Data Structures

struct  Context
 

Public Member Functions

 QuantizeOnnxFakeQuantModelPass (std::unique_ptr< Context > &&ctx)
 
virtual const char * name (void) const
 
bool run (loco::Graph *graph)
 
Public Member Functions inherited from logo::Pass
virtual ~Pass ()=default
 

Detailed Description

Pass to create a quantized graph from a graph fake-quantized on onnx.

Definition at line 32 of file QuantizeOnnxFakeQuantModelPass.h.

Constructor & Destructor Documentation

◆ QuantizeOnnxFakeQuantModelPass()

luci::QuantizeOnnxFakeQuantModelPass::QuantizeOnnxFakeQuantModelPass ( std::unique_ptr< Context > &&  ctx)
inline

Definition at line 41 of file QuantizeOnnxFakeQuantModelPass.h.

  : _ctx{std::move(ctx)}
{
  assert(_ctx);                           // FIX_CALLER_UNLESS
  assert(_ctx->default_activation_dtype); // FIX_CALLER_UNLESS
}
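
The ownership transfer and the two preconditions can be tried in isolation with stand-in types. This is a minimal sketch, not the luci code: `DataType`, `Context`, `FakePass`, and `make_and_query` are hypothetical names, the real `Context` may carry more fields, and treating the zero-valued `Unknown` enumerator as "not set" is an assumption about what the second assertion checks.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for illustration only (not the luci types).
enum class DataType { Unknown = 0, U8, S16 };

struct Context
{
  DataType default_activation_dtype = DataType::Unknown;
};

class FakePass
{
public:
  // Takes ownership of the context; the caller's pointer is left empty.
  explicit FakePass(std::unique_ptr<Context> &&ctx) : _ctx{std::move(ctx)}
  {
    assert(_ctx);                                                // non-null context required
    assert(_ctx->default_activation_dtype != DataType::Unknown); // dtype must be set (assumed meaning)
  }

  DataType activation_dtype() const { return _ctx->default_activation_dtype; }

private:
  std::unique_ptr<Context> _ctx;
};

// Builds a context, hands it to the pass, and reports the stored dtype.
inline DataType make_and_query(DataType dtype)
{
  auto ctx = std::make_unique<Context>();
  ctx->default_activation_dtype = dtype;
  FakePass pass(std::move(ctx));
  assert(ctx == nullptr); // ownership moved into the pass
  return pass.activation_dtype();
}
```

Because both checks are plain `assert` calls, they vanish in builds that define `NDEBUG`, so callers should not rely on them to reject a bad context in release builds.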

Member Function Documentation

◆ name()

virtual const char * luci::QuantizeOnnxFakeQuantModelPass::name ( void  ) const
inline virtual

Reimplemented from logo::Pass.

Definition at line 47 of file QuantizeOnnxFakeQuantModelPass.h.

{ return "luci::QuantizeOnnxFakeQuantModelPass"; }

◆ run()

bool luci::QuantizeOnnxFakeQuantModelPass::run ( loco::Graph * g )
virtual

How does QuantizeOnnxFakeQuantModel work?

  1. Activations are quantized as below.

Before

[node(fp32)] -> [OnnxQuantizeLinear] -> [OnnxDequantizeLinear]

After

[node(q)]

  2. Weights (constants) are quantized as below.

Before

[Const(q w/o qparam)] -> [OnnxDequantizeLinear]

After

[Const(q)]

  3. Quantize constant activations
  4. Quantize with predecessors' qparams
  5. Update qparams of special operators
  6. Insert Quantize Op if an Op's input dtype and output dtype mismatch
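
The QuantizeLinear -> DequantizeLinear pair in the first pattern is a float round trip through the quantized grid, which is why it can be folded into a single quantized node that carries the scale and zero point. The sketch below reproduces that arithmetic for an asymmetric uint8 tensor, following the ONNX operator definitions. `QParam` and the three helper functions are illustrative names, not luci APIs, and how luci attaches qparams to a node is not shown.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Quantization parameters for an asymmetric uint8 tensor (illustrative type).
struct QParam
{
  float scale;
  int32_t zero_point;
};

// ONNX QuantizeLinear for uint8: saturate(round(x / scale) + zero_point).
// std::nearbyint rounds half to even under the default rounding mode.
inline uint8_t quantize_linear(float x, const QParam &qp)
{
  const float q = std::nearbyint(x / qp.scale) + static_cast<float>(qp.zero_point);
  return static_cast<uint8_t>(std::min(255.0f, std::max(0.0f, q)));
}

// ONNX DequantizeLinear: (q - zero_point) * scale.
inline float dequantize_linear(uint8_t q, const QParam &qp)
{
  return static_cast<float>(static_cast<int32_t>(q) - qp.zero_point) * qp.scale;
}

// Fake quantization is the Q -> DQ round trip carried out in float.
inline float fake_quantize(float x, const QParam &qp)
{
  return dequantize_linear(quantize_linear(x, qp), qp);
}
```

Applying `fake_quantize` twice gives the same value as applying it once, so a fake-quantized fp32 tensor loses nothing when it is stored as integers plus qparams.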

Implements logo::Pass.

Definition at line 63 of file QuantizeOnnxFakeQuantModelPass.cpp.

{
  LOGGER(l);
  INFO(l) << "QuantizeOnnxFakeQuantModelPass Start" << std::endl;

  // Quantize Onnx QuantizeLinear-DequantizeLinear pattern
  {
    QuantizeOnnxQDQPass pass;
    pass.run(g);
  }

  // Quantize Onnx const-DequantizeLinear pattern
  {
    QuantizeOnnxDequantizeLinearPass pass;
    pass.run(g);
  }

  // Quantize const input activation
  for (auto node : loco::active_nodes(loco::output_nodes(g)))
  {
    auto circle_node = loco::must_cast<luci::CircleNode *>(node);

    QuantizeConstInputActivation qcia(_ctx->default_activation_dtype);
    circle_node->accept(&qcia);
  }

  // Quantize nodes using their predecessors' qparams
  {
    QuantizeWithPredecessorPass pass;
    pass.run(g);
  }

  // Update qparam of output of special Ops
  for (auto node : loco::active_nodes(loco::output_nodes(g)))
  {
    auto circle_node = loco::must_cast<luci::CircleNode *>(node);

    if (is_quantized(circle_node))
    {
      QuantizeSpecialActivation qsa(circle_node->dtype());
      circle_node->accept(&qsa);
    }
  }

  // Insert QuantizeOp if input/output dtype does not match
  for (auto node : loco::active_nodes(loco::output_nodes(g)))
  {
    auto circle_node = loco::must_cast<luci::CircleNode *>(node);

    InsertQuantizeOpOnDTypeMismatch iqoodm;
    circle_node->accept(&iqoodm);
  }

  // Update output dtype
  auto graph_outputs = g->outputs();
  for (auto node : loco::output_nodes(g))
  {
    auto circle_node = loco::must_cast<luci::CircleOutput *>(node);
    auto from = loco::must_cast<luci::CircleNode *>(circle_node->from());
    circle_node->dtype(from->dtype());

    auto graph_output = graph_outputs->at(circle_node->index());
    graph_output->dtype(circle_node->dtype());
  }

  INFO(l) << "QuantizeOnnxFakeQuantModelPass End" << std::endl;
  return false; // one time run
}
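
The `return false; // one time run` at the end matters if the pass is driven by a loop that repeats passes while any of them reports a change. The toy driver below is a sketch under that assumption. `Graph`, `Pass`, `OneShotPass`, and `run_until_stable` are hypothetical stand-ins and not the logo or luci classes.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-ins for illustration only.
struct Graph
{
  int rewrites = 0; // counts how many times a pass touched the graph
};

struct Pass
{
  virtual ~Pass() = default;
  // Assumed convention: return true if the graph changed and another sweep is wanted.
  virtual bool run(Graph *g) = 0;
};

// Does its work once and always reports "no further change".
struct OneShotPass final : Pass
{
  bool run(Graph *g) override
  {
    ++g->rewrites;
    return false; // one time run
  }
};

// Repeats sweeps over all passes until none reports a change; returns the sweep count.
inline int run_until_stable(Graph *g, const std::vector<std::unique_ptr<Pass>> &passes)
{
  int sweeps = 0;
  bool changed = true;
  while (changed)
  {
    changed = false;
    for (const auto &p : passes)
      changed = p->run(g) || changed;
    ++sweeps;
  }
  return sweeps;
}
```

With such a driver, a pass that returned true after rewriting the graph would be scheduled again. Returning false unconditionally keeps the whole quantization sequence to a single execution.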
References loco::active_nodes(), INFO, luci::is_quantized(), LOGGER, loco::output_nodes(), luci::QuantizeOnnxDequantizeLinearPass::run(), luci::QuantizeOnnxQDQPass::run(), and luci::QuantizeWithPredecessorPass::run().

Referenced by package.infer.session::inference().


The documentation for this class was generated from the following files: