ONE - On-device Neural Engine
onert::exec::MultiModelExecutors Class Reference

Class to gather executors. More...

#include <MultiModelExecutors.h>

Collaboration diagram for onert::exec::MultiModelExecutors:

Public Member Functions

 MultiModelExecutors (void)=delete
 
 MultiModelExecutors (std::unique_ptr< ir::ModelEdges > model_edges)
 
 MultiModelExecutors (const MultiModelExecutors &)=delete
 
 MultiModelExecutors (MultiModelExecutors &&)=default
 
 ~MultiModelExecutors ()=default
 
void emplace (const ir::ModelIndex &model_index, const ir::SubgraphIndex &subg_index, std::unique_ptr< IExecutor > exec) override
 Insert executor in executor set.
 
IExecutor * at (const ir::ModelIndex &model_index, const ir::SubgraphIndex &subg_index) const override
 Return executor of index.
 
uint32_t inputSize () const override
 Return the executor set's number of inputs.
 
uint32_t outputSize () const override
 Return the executor set's number of outputs.
 
const ir::OperandInfo & inputInfo (const ir::IOIndex &index) const override
 Return NN package input tensor info.
 
const ir::OperandInfo & outputInfo (const ir::IOIndex &index) const override
 Return NN package output tensor info.
 
void execute (const ExecutionContext &ctx) override
 Execute NN package executor set.
 
- Public Member Functions inherited from onert::exec::IExecutors
virtual ~IExecutors ()=default
 Virtual IExecutors destructor.
 
IExecutor * entryExecutor () const
 

Detailed Description

Class to gather executors.

Definition at line 47 of file MultiModelExecutors.h.
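
In typical use, a MultiModelExecutors instance is constructed from the ir::ModelEdges that describe an NN package, one IExecutor is registered per (model, subgraph) pair via emplace(), and the whole package is then run through execute(). A minimal sketch of that lifecycle; the compileModel() helper is hypothetical and stands in for whatever produces a lowered executor:

#include <memory>
#include <utility>

#include <MultiModelExecutors.h>

using namespace onert;

// Hypothetical helper: yields a lowered executor for one model's primary subgraph.
std::unique_ptr<exec::IExecutor> compileModel(const ir::ModelIndex &index);

void buildAndRun(std::unique_ptr<ir::ModelEdges> edges, const exec::ExecutionContext &ctx)
{
  exec::MultiModelExecutors executors{std::move(edges)};

  // Register the primary-subgraph executor of each model in the package
  executors.emplace(ir::ModelIndex{0}, ir::SubgraphIndex{0}, compileModel(ir::ModelIndex{0}));
  executors.emplace(ir::ModelIndex{1}, ir::SubgraphIndex{0}, compileModel(ir::ModelIndex{1}));

  // Run the package; edge tensors between models are managed internally
  executors.execute(ctx);
}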

Constructor & Destructor Documentation

◆ MultiModelExecutors() [1/4]

onert::exec::MultiModelExecutors::MultiModelExecutors ( void  )
delete

◆ MultiModelExecutors() [2/4]

onert::exec::MultiModelExecutors::MultiModelExecutors ( std::unique_ptr< ir::ModelEdges > model_edges)
inline

Definition at line 51 of file MultiModelExecutors.h.

52 : _executors{}, _model_edges{std::move(model_edges)}, _edge_quant_layers{},
53 _edge_quant_tensors{}, _edge_tensors{}, _is_created_edge_quant_layers{false},
54 _pkg_input_quant_layers{}, _pkg_output_quant_layers{}, _pkg_input_quant_tensors{},
55 _pkg_output_quant_tensors{}, _pkg_input_tensors{}, _pkg_output_tensors{}
56 {
57 for (const auto &edge : _model_edges->edges)
58 {
59 _edge_map[edge.from].emplace_back(edge.to);
60 }
61 }
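
The loop above groups the package's edges by their producing endpoint, so _edge_map can later answer "which consumers read this output?" during execute(). A standalone sketch of the same grouping pattern, using simplified stand-ins for ir::IODesc and the edge struct (illustration only, not the class's actual member types):

#include <map>
#include <tuple>
#include <vector>

// Simplified stand-ins: an IODesc identifies one I/O port as (model, subgraph, io) indices
using IODesc = std::tuple<int, int, int>;
struct Edge
{
  IODesc from; // producing port
  IODesc to;   // consuming port
};

std::map<IODesc, std::vector<IODesc>> buildEdgeMap(const std::vector<Edge> &edges)
{
  std::map<IODesc, std::vector<IODesc>> edge_map;
  for (const auto &edge : edges)
    edge_map[edge.from].emplace_back(edge.to); // same grouping as the constructor body above
  return edge_map;
}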

◆ MultiModelExecutors() [3/4]

onert::exec::MultiModelExecutors::MultiModelExecutors ( const MultiModelExecutors & )
delete

◆ MultiModelExecutors() [4/4]

onert::exec::MultiModelExecutors::MultiModelExecutors ( MultiModelExecutors &&  )
default

◆ ~MultiModelExecutors()

onert::exec::MultiModelExecutors::~MultiModelExecutors ( )
default

Member Function Documentation

◆ at()

IExecutor * onert::exec::MultiModelExecutors::at ( const ir::ModelIndex & model_index,
const ir::SubgraphIndex & subg_index
) const
overridevirtual

Return executor of index.

Parameters
[in] model_index  Model index
[in] subg_index  Subgraph index
Returns
Executor

Implements onert::exec::IExecutors.

Definition at line 62 of file MultiModelExecutors.cc.

64{
65 return _executors.at(std::make_pair(model_index, subg_index)).get();
66}

Referenced by execute(), inputInfo(), and outputInfo().
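
Executors are keyed by the (model index, subgraph index) pair, so a lookup is a single map access; for instance, the package entry point is model 0, subgraph 0. A minimal sketch, assuming executors is an already-populated MultiModelExecutors; note that the underlying container's at() throws std::out_of_range if nothing was registered for that pair:

// Fetch the executor that runs the entry point of the package (model 0, subgraph 0)
onert::exec::IExecutor *entry = executors.at(onert::ir::ModelIndex{0}, onert::ir::SubgraphIndex{0});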

◆ emplace()

void onert::exec::MultiModelExecutors::emplace ( const ir::ModelIndex & model_index,
const ir::SubgraphIndex & subg_index,
std::unique_ptr< IExecutor > exec
)
overridevirtual

Insert executor in executor set.

Parameters
[in] model_index  Model index
[in] subg_index  Subgraph index
[in] exec  Executor to insert

Implements onert::exec::IExecutors.

Definition at line 55 of file MultiModelExecutors.cc.

58{
59 _executors.emplace(std::make_pair(model_index, subg_index), std::move(exec));
60}
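
Ownership of the executor moves into the set, and the (model index, subgraph index) pair becomes its lookup key. A minimal sketch, assuming lowered_exec is a std::unique_ptr<IExecutor> obtained from compilation:

// Register the primary-subgraph executor of the second model; ownership is transferred
executors.emplace(onert::ir::ModelIndex{1}, onert::ir::SubgraphIndex{0}, std::move(lowered_exec));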

◆ execute()

void onert::exec::MultiModelExecutors::execute ( const ExecutionContext & ctx)
overridevirtual

Execute NN package executor set.

Parameters
[in] ctx  Execution context

Implements onert::exec::IExecutors.

Definition at line 353 of file MultiModelExecutors.cc.

354{
355 auto &desc = ctx.desc;
356
357 // Check supported multi model package
358 checkSupportedMultimodel();
359
360 // TODO Move creating type-aware quantization layers for edges in compilation stage
361 createEdgeQuantLayers();
362
363 // TODO Create IOTensors only once and recreate them only if nnpkg info changes
364 CreatePkgIOTensors(desc);
365
366 // TODO Create type-aware quantization layers only once and recreate them only if type changes
367 createPkgIOQuantLayers(desc);
368
369 // TODO Find better way to schedule order of executors
370 auto const model_count = modelCount();
371
372 auto find_from = [&](const ir::ModelIndex &model_index, const ir::SubgraphIndex &subg_index,
373 const ir::IOIndex &io_index) {
374 for (const auto &edge : _model_edges->edges)
375 {
376 if ((std::get<ir::ModelIndex>(edge.to) == model_index) &&
377 (std::get<ir::SubgraphIndex>(edge.to) == subg_index) &&
378 (std::get<ir::IOIndex>(edge.to) == io_index))
379 return edge.from;
380 }
381
382 throw std::runtime_error{"Cannot find edge for model input"};
383 };
384
385 // Execute each model
386 // NOTE May be better to use vector instead of unordered_map for _executors
387 for (auto model_index = ir::ModelIndex{0}; model_index.value() < model_count; model_index++)
388 {
389 // Find executor
390 auto executor = at(model_index, ir::SubgraphIndex{0});
391
392 // Set IOTensors
393 // TODO Set internal IOTensors only once
394 std::vector<backend::IPortableTensor *> inputs_inter;
395 std::vector<backend::IPortableTensor *> outputs_inter;
396 auto const input_size = executor->inputSize();
397 auto const output_size = executor->outputSize();
398 inputs_inter.resize(input_size);
399 outputs_inter.resize(output_size);
400
401 // Set inputs of executor
402 // TODO Create layer to allocate/deallocate buffers of EdgeTensor for each executor
403 for (uint32_t i = 0; i < input_size; i++)
404 {
405 const auto input_pkg_index = find_input_index(_model_edges->pkg_inputs, model_index,
406 ir::SubgraphIndex{0}, ir::IOIndex{i});
407 const auto input_io_desc = ir::IODesc{model_index, ir::SubgraphIndex{0}, ir::IOIndex{i}};
408 if (input_pkg_index != -1)
409 {
410 // Allocate type-aware quantization tensors for nnpkg inputs and set internal tensors
411 if (_pkg_input_quant_tensors.find(input_io_desc) != _pkg_input_quant_tensors.end())
412 {
413 _pkg_input_quant_tensors[input_io_desc]->allocate_buffer();
414
415 inputs_inter[i] = _pkg_input_quant_tensors[input_io_desc].get();
416 }
417 else
418 {
419 inputs_inter[i] = _pkg_input_tensors[input_io_desc].get();
420 }
421 }
422 else
423 {
424 auto from_iodesc = find_from(model_index, ir::SubgraphIndex{0}, ir::IOIndex{i});
425
426 // Only sequential execution of models is supported
427 assert(std::get<ir::ModelIndex>(from_iodesc).value() < model_index.value());
428 assert(std::get<ir::SubgraphIndex>(from_iodesc).value() == 0);
429 const auto to_iodesc = ir::IODesc{model_index, ir::SubgraphIndex{0}, ir::IOIndex{i}};
430 if (_edge_quant_tensors.find(to_iodesc) == _edge_quant_tensors.end())
431 {
432 inputs_inter[i] = _edge_tensors.at(from_iodesc).get();
433 }
434 else
435 {
436 inputs_inter[i] = _edge_quant_tensors.at(to_iodesc).get();
437 }
438 assert(inputs_inter[i]->buffer() != nullptr);
439 }
440 }
441
442 // Set outputs of executor
443 for (uint32_t i = 0; i < output_size; i++)
444 {
445 const auto output_pkg_index = find_output_index(_model_edges->pkg_outputs, model_index,
446 ir::SubgraphIndex{0}, ir::IOIndex{i});
447 const auto output_io_desc = ir::IODesc{model_index, ir::SubgraphIndex{0}, ir::IOIndex{i}};
448 if (output_pkg_index != -1)
449 {
450 // Allocate type-aware quantization tensors for nnpkg outputs and set internal tensors
451 if (_pkg_output_quant_tensors.find(output_io_desc) != _pkg_output_quant_tensors.end())
452 {
453 _pkg_output_quant_tensors[output_io_desc]->allocate_buffer();
454
455 outputs_inter[i] = _pkg_output_quant_tensors[output_io_desc].get();
456 }
457 else
458 {
459 outputs_inter[i] = _pkg_output_tensors[output_io_desc].get();
460 }
461 }
462 else
463 {
464 // Allocate buffer of `from` tensors
465 const auto from_iodesc = ir::IODesc{model_index, ir::SubgraphIndex{0}, ir::IOIndex{i}};
466 _edge_tensors[from_iodesc]->allocate_buffer();
467 outputs_inter[i] = _edge_tensors[from_iodesc].get();
468
469 // Allocate buffer of tensors for type-aware quantization
470 for (const auto &to_iodesc : _edge_map[from_iodesc])
471 {
472 _edge_tensors[from_iodesc]->increase_ref();
473 if (_edge_quant_tensors.find(to_iodesc) != _edge_quant_tensors.end())
474 {
475 auto type_aware_quant_tensor = _edge_quant_tensors.at(to_iodesc).get();
476 type_aware_quant_tensor->allocate_buffer();
477
478 _edge_tensors[from_iodesc]->decrease_ref();
479 }
480 }
481 }
482 }
483
484 _pkg_input_quant_layers[{model_index, ir::SubgraphIndex{0}}]->run();
485
486 executor->execute(inputs_inter, outputs_inter, ctx.options);
487
488 _edge_quant_layers[{model_index, ir::SubgraphIndex{0}}]->run();
489 _pkg_output_quant_layers[{model_index, ir::SubgraphIndex{0}}]->run();
490
491 // Release input buffers that are no longer needed
492 for (uint32_t i = 0; i < input_size; i++)
493 {
494 const auto input_pkg_index = find_input_index(_model_edges->pkg_inputs, model_index,
495 ir::SubgraphIndex{0}, ir::IOIndex{i});
496
497 const auto to_iodesc = ir::IODesc{model_index, ir::SubgraphIndex{0}, ir::IOIndex{i}};
498 if (input_pkg_index == -1)
499 {
500 if (_edge_quant_tensors.find(to_iodesc) != _edge_quant_tensors.end())
501 {
502 // Decrease reference count of tensor for type-aware quantization if input tensor is the
503 // tensor
504 const auto to_iodesc = ir::IODesc{model_index, ir::SubgraphIndex{0}, ir::IOIndex{i}};
505 if (_edge_quant_tensors.find(to_iodesc) != _edge_quant_tensors.end())
506 {
507 _edge_quant_tensors[to_iodesc]->decrease_ref();
508 }
509 }
510 else
511 {
512 // Decrease reference count of `from` tensor if input tensor is the `from` tensor
513 const auto from_iodesc = find_from(model_index, ir::SubgraphIndex{0}, ir::IOIndex{i});
514 _edge_tensors[from_iodesc]->decrease_ref();
515
516 // Decrease reference count of nnpkg inputs
517 if (_pkg_input_quant_tensors.find(to_iodesc) != _pkg_input_quant_tensors.end())
518 {
519 _pkg_input_quant_tensors[to_iodesc]->decrease_ref();
520 }
521 }
522 }
523 }
524
525 // Release output buffers if those buffers are no longer used by other executors because of
526 // type-aware quantization
527 // FIXME if tensors for type-aware quantization unified for the same `from` tensor and same type
528 for (uint32_t i = 0; i < output_size; i++)
529 {
530 auto from_iodesc = ir::IODesc{model_index, ir::SubgraphIndex{0}, ir::IOIndex{i}};
531
532 // Check if other executors will use the buffer of edge tensor
533 const auto &to_list = _edge_map[from_iodesc];
534 if (to_list.size() == 0)
535 {
536 // This condition means `from_iodesc` tensor is an output of nnpkg
537 continue;
538 }
539
540 bool to_be_release =
541 !std::any_of(to_list.begin(), to_list.end(), [&](const ir::IODesc &to_iodesc) {
542 // This condition means another executor uses the buffer of edge tensor
543 return _edge_quant_tensors.find(to_iodesc) == _edge_quant_tensors.end();
544 });
545
546 if (to_be_release)
547 {
548 // This edge tensor's buffer won't be used in other executors
549 // Tensors for type-aware quantization take over the role of this edge tensor instead
550 _edge_tensors[from_iodesc]->decrease_ref();
551 }
552
553 // Decrease reference count of nnpkg outputs
554 if (_pkg_output_quant_tensors.find(from_iodesc) != _pkg_output_quant_tensors.end())
555 {
556 _pkg_output_quant_tensors[from_iodesc]->decrease_ref();
557 }
558 }
559 }
560}

References at(), onert::exec::ExecutionContext::desc, onert::exec::IExecutor::inputSize(), onert::exec::ExecutionContext::options, and onert::util::Index< T, DummyTag >::value().
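
The assertions at source lines 427-428 encode the current scheduling constraint: edges must flow from a lower model index to a higher one, and only each model's primary subgraph (subgraph 0) takes part. A standalone pre-check in the same spirit, written with simplified stand-in types for illustration only:

#include <stdexcept>
#include <vector>

// Simplified stand-in for an inter-model edge description
struct EdgeDesc
{
  int from_model, from_subg;
  int to_model, to_subg;
};

void checkSequentialEdges(const std::vector<EdgeDesc> &edges)
{
  for (const auto &e : edges)
  {
    // Producers must have a smaller model index than consumers; only primary subgraphs participate
    if (e.from_model >= e.to_model || e.from_subg != 0 || e.to_subg != 0)
      throw std::runtime_error{"multi-model package requires sequential, primary-subgraph-only edges"};
  }
}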

◆ inputInfo()

const ir::OperandInfo & onert::exec::MultiModelExecutors::inputInfo ( const ir::IOIndex & index) const
overridevirtual

Return NN package input tensor info.

Parameters
[in] index  Input index
Returns
Tensor info

Implements onert::exec::IExecutors.

Definition at line 72 of file MultiModelExecutors.cc.

73{
74 auto const [model_index, subg_index, io_index] = _model_edges->pkg_inputs[index.value()];
75 auto const executor = at(model_index, subg_index);
76 return executor->inputInfo(io_index.value());
77}

References at(), and onert::util::Index< T, DummyTag >::value().
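
Because a package-level input index is resolved through _model_edges->pkg_inputs to the executor that owns it, callers can query every package input's tensor info without knowing which model it belongs to. A minimal sketch, assuming executors is a populated MultiModelExecutors:

// Walk all package inputs and inspect their tensor info
for (uint32_t i = 0; i < executors.inputSize(); ++i)
{
  const auto &info = executors.inputInfo(onert::ir::IOIndex{i});
  // info describes the expected shape and data type of package input i
  (void)info;
}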

◆ inputSize()

uint32_t onert::exec::MultiModelExecutors::inputSize ( ) const
overridevirtual

Return the executor set's number of inputs.

Returns
Number of inputs

Implements onert::exec::IExecutors.

Definition at line 68 of file MultiModelExecutors.cc.

68{ return _model_edges->pkg_inputs.size(); }

◆ outputInfo()

const ir::OperandInfo & onert::exec::MultiModelExecutors::outputInfo ( const ir::IOIndex & index) const
overridevirtual

Return NN package output tensor info.

Parameters
[in] index  Output index
Returns
Tensor info

Implements onert::exec::IExecutors.

Definition at line 79 of file MultiModelExecutors.cc.

80{
81 auto const [model_index, subg_index, io_index] = _model_edges->pkg_outputs[index.value()];
82 auto const executor = at(model_index, subg_index);
83 return executor->outputInfo(io_index.value());
84}

References at(), and onert::util::Index< T, DummyTag >::value().

◆ outputSize()

uint32_t onert::exec::MultiModelExecutors::outputSize ( ) const
overridevirtual

Return the executor set's number of outputs.

Returns
Number of outputs

Implements onert::exec::IExecutors.

Definition at line 70 of file MultiModelExecutors.cc.

70{ return _model_edges->pkg_outputs.size(); }
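
The output-side accessors mirror the input-side ones, resolving through _model_edges->pkg_outputs instead. A symmetric sketch, again assuming a populated MultiModelExecutors named executors:

// Walk all package outputs; their infos tell callers how to size result buffers
for (uint32_t i = 0; i < executors.outputSize(); ++i)
{
  const auto &info = executors.outputInfo(onert::ir::IOIndex{i});
  (void)info;
}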

The documentation for this class was generated from the following files: