ONE - On-device Neural Engine
arm_compute::NEFullyConnectedLayerEx Class Reference

#include <NEFullyConnectedLayerEx.h>


Public Member Functions

 NEFullyConnectedLayerEx (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 
 NEFullyConnectedLayerEx (const NEFullyConnectedLayerEx &)=delete
 
 NEFullyConnectedLayerEx (NEFullyConnectedLayerEx &&)=delete
 
NEFullyConnectedLayerEx & operator= (const NEFullyConnectedLayerEx &)=delete
 
NEFullyConnectedLayerEx & operator= (NEFullyConnectedLayerEx &&)=delete
 
void configure (const ITensor *input, const ITensor *weights, const ITensor *biases, ITensor *output, FullyConnectedLayerInfo fc_info=FullyConnectedLayerInfo())
 
void run () override
 
void prepare () override
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, FullyConnectedLayerInfo fc_info=FullyConnectedLayerInfo())
 

Detailed Description

Basic function to compute a Fully Connected layer on NEON. This function calls the following NEON kernels:

  1. NEIm2ColKernel (called when the input comes from a convolutional layer)
  2. NETranspose (if are_weights_reshaped is set to false and transpose_weights is set to true) (called once)
  3. NEGEMMMatrixMultiplyKernel or NEGEMMLowpMatrixMultiplyCore (if the input is asymmetrically quantized)
Note
The fully connected layer accepts "weights" tensors only with 2 dimensions.
The difference from NEFullyConnectedLayer is that this class supports non-constant weights tensors as input, at the cost of some performance.

Definition at line 70 of file NEFullyConnectedLayerEx.h.
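For orientation, the computation this function ultimately performs is an ordinary matrix multiply plus bias; the kernels listed above implement the same arithmetic, vectorized on NEON. A minimal reference sketch in plain C++ (not the accelerated path, and using plain vectors rather than arm_compute tensor types):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference fully connected layer: out[o] = sum_i W[o][i] * in[i] + b[o].
std::vector<float> fully_connected(const std::vector<std::vector<float>> &weights,
                                   const std::vector<float> &input,
                                   const std::vector<float> &bias)
{
    std::vector<float> out(weights.size(), 0.0f);
    for (std::size_t o = 0; o < weights.size(); ++o)
    {
        // Dot product of one weights row with the (flattened) input
        for (std::size_t i = 0; i < input.size(); ++i)
            out[o] += weights[o][i] * input[i];
        out[o] += bias[o];
    }
    return out;
}
```

This is why the weights tensor must be 2 dimensional: one row per output neuron, one column per (flattened) input element.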

Constructor & Destructor Documentation

◆ NEFullyConnectedLayerEx() [1/3]

arm_compute::NEFullyConnectedLayerEx::NEFullyConnectedLayerEx ( std::shared_ptr< IMemoryManager >  memory_manager = nullptr)

Constructor

Definition at line 145 of file NEFullyConnectedLayerEx.cpp.

146 : _memory_group(std::move(memory_manager)), _convert_weights(), _flatten_kernel(),
147 _reshape_weights_function(), _mm_gemm(), _mm_gemmlowp(), _flatten_output(),
148 _converted_weights_output(), _reshape_weights_output(), _are_weights_converted(true),
149 _are_weights_reshaped(false), _is_fc_after_conv(false), _is_quantized(false),
150 _is_prepared(false), _original_weights(nullptr)
151{
152}

◆ NEFullyConnectedLayerEx() [2/3]

arm_compute::NEFullyConnectedLayerEx::NEFullyConnectedLayerEx ( const NEFullyConnectedLayerEx &  )
delete

Prevent instances of this class from being copied (as this class contains pointers)

◆ NEFullyConnectedLayerEx() [3/3]

arm_compute::NEFullyConnectedLayerEx::NEFullyConnectedLayerEx ( NEFullyConnectedLayerEx &&  )
delete

Deleted move constructor

Member Function Documentation

◆ configure()

void arm_compute::NEFullyConnectedLayerEx::configure ( const ITensor *  input,
const ITensor *  weights,
const ITensor *  biases,
ITensor *  output,
FullyConnectedLayerInfo  fc_info = FullyConnectedLayerInfo() 
)

Set the input and output tensors.

Parameters
  [in]  input    Source tensor. Data type supported: QASYMM8/F16/F32.
  [in]  weights  Weights tensor. The weights must be 2 dimensional. If this function is called after a Convolution Layer, the (transposed) weights will have as many rows as the product of the input's first 3 dimensions. If it is called after another FullyConnected Layer, the (transposed) weights will have as many rows as the input's first dimension. Data type supported: Same as input.
  [in]  biases   Bias tensor. Can be nullptr. Data type supported: Same as input.
  [out] output   Destination tensor. Its shape should be equal to the output of a matrix multiplication between:
                   • The output of im2col on the input and the (transposed) 2D weights, if the function is called after a Convolution Layer
                   • The input tensor and the (transposed) 2D weights, if the function is called after another FullyConnected Layer.
                 Data type supported: Same as input.
  [in]  fc_info  (Optional) Fully connected layer additional info

Definition at line 243 of file NEFullyConnectedLayerEx.cpp.

246{
247 // Perform validate step
248 ARM_COMPUTE_ERROR_ON_NULLPTR(input, weights, output);
249 ARM_COMPUTE_ERROR_THROW_ON(NEFullyConnectedLayerEx::validate(
250 input->info(), weights->info(), biases != nullptr ? biases->info() : nullptr, output->info(),
251 fc_info));
252
253 _are_weights_converted = true;
254 _are_weights_reshaped = fc_info.transpose_weights ? fc_info.are_weights_reshaped : true;
255 _is_fc_after_conv = true;
256 _is_quantized = is_data_type_quantized_asymmetric(input->info()->data_type());
257 _original_weights = weights;
258
259 // With the Fully Connected layer we can have 4 different cases:
260 // 1) Convolution layer -> Fully Connected layer without batches
261 // 2) Fully Connected layer -> Fully Connected layer without batches
262 // 3) Convolution layer -> Fully Connected layer with batches
263 // 4) Fully Connected layer -> Fully Connected layer with batches
264
265 const ITensor *weights_to_use = weights;
266
267 // Check if we have a fully connected layer with batches
268 const bool is_batched_fc_layer = output->info()->dimension(1) > 1;
269 if (is_batched_fc_layer)
270 {
271 _is_fc_after_conv =
272 (TensorShape::num_max_dimensions >= 4) &&
273 (std::equal(input->info()->tensor_shape().cbegin() + 3, input->info()->tensor_shape().cend(),
274 output->info()->tensor_shape().cbegin() + 1));
275 }
276 else
277 {
278 _is_fc_after_conv = input->info()->num_dimensions() > 1;
279 }
280
281 // Reshape weights if needed
282 if (!_are_weights_reshaped)
283 {
284 // Reshape the weights
285 _reshape_weights_function.configure(weights, &_reshape_weights_output);
286 weights_to_use = &_reshape_weights_output;
287 }
288
289 // Convert weights if needed
290 if (_is_fc_after_conv && (input->info()->data_layout() != fc_info.weights_trained_layout))
291 {
292 // Convert weights
293 _convert_weights.configure(weights_to_use, &_converted_weights_output,
294 input->info()->tensor_shape(), fc_info.weights_trained_layout);
295
296 weights_to_use = &_converted_weights_output;
297 _are_weights_converted = false;
298 }
299
300 if (_is_fc_after_conv)
301 {
302 // Fully Connected layer after a Convolution Layer without batches
303 configure_conv_fc(input, weights_to_use, biases, output, fc_info);
304 }
305 else
306 {
307 // Fully Connected layer after a Fully Connected Layer without batches
308 configure_fc_fc(input, weights_to_use, biases, output, fc_info);
309 }
310
311 _are_weights_reshaped = _are_weights_reshaped || fc_info.retain_internal_weights;
312}

References validate().
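The is_batched_fc_layer / _is_fc_after_conv logic above decides whether the input comes from a convolution by comparing the input's dimensions from index 3 onward against the output's dimensions from index 1 onward. A self-contained sketch of that check, with std::vector standing in for TensorShape (an assumption made purely for illustration; the real code also consults TensorShape::num_max_dimensions):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of the _is_fc_after_conv decision in configure()/validate().
// Batched case: the input's trailing (batch) dimensions, starting at
// index 3, must match the output's dimensions starting at index 1.
bool is_fc_after_conv(const std::vector<std::size_t> &input_shape,
                      const std::vector<std::size_t> &output_shape)
{
    const bool is_batched = output_shape.size() > 1 && output_shape[1] > 1;
    if (is_batched)
    {
        return input_shape.size() >= 4 &&
               std::equal(input_shape.cbegin() + 3, input_shape.cend(),
                          output_shape.cbegin() + 1);
    }
    // Without batches: any multi-dimensional input is assumed to come
    // from a convolution layer and will be flattened first.
    return input_shape.size() > 1;
}
```

For example, a conv output of shape [7, 7, 64, 4] feeding an FC output of shape [10, 4] is treated as the conv-to-FC case, while an FC input of shape [128, 4] is not.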

◆ operator=() [1/2]

NEFullyConnectedLayerEx & arm_compute::NEFullyConnectedLayerEx::operator= ( const NEFullyConnectedLayerEx &  )
delete

Prevent instances of this class from being copied (as this class contains pointers)

◆ operator=() [2/2]

NEFullyConnectedLayerEx & arm_compute::NEFullyConnectedLayerEx::operator= ( NEFullyConnectedLayerEx &&  )
delete

Deleted move assignment operator

References validate().

◆ prepare()

void arm_compute::NEFullyConnectedLayerEx::prepare ( )
override

Definition at line 451 of file NEFullyConnectedLayerEx.cpp.

452{
453 // DO NOTHING
454}

◆ run()

void arm_compute::NEFullyConnectedLayerEx::run ( )
override

Definition at line 399 of file NEFullyConnectedLayerEx.cpp.

400{
401 if (!_is_prepared)
402 {
403 if (!_are_weights_reshaped)
404 _reshape_weights_output.allocator()->allocate();
405 if (!_are_weights_converted)
406 _converted_weights_output.allocator()->allocate();
407 _is_prepared = true;
408 }
409
410 {
411 ARM_COMPUTE_ERROR_ON(!_original_weights->is_used());
412
413 // Reshape of the weights
414 if (!_are_weights_reshaped)
415 {
416 _reshape_weights_function.run();
417 }
418
419 // Convert weights if needed
420 if (!_are_weights_converted)
421 {
422 _convert_weights.run();
423 }
424
425 // Prepare GEMM
426 if (!_is_quantized)
427 {
428 _mm_gemm.prepare();
429 }
430 }
431
432 MemoryGroupResourceScope scope_mg(_memory_group);
433
434 // Linearize input if it comes from a convolutional layer
435 if (_is_fc_after_conv)
436 {
437 _flatten_kernel.run();
438 }
439
440 // Run matrix multiply
441 if (_is_quantized)
442 {
443 _mm_gemmlowp.run();
444 }
445 else
446 {
447 _mm_gemm.run();
448 }
449}

Referenced by package.infer.session::inference().
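run() performs the one-time weight reshape/conversion lazily on its first invocation, guarded by _is_prepared (note that prepare() itself is a no-op in this class). A minimal sketch of that lazy-preparation pattern, with hypothetical counters standing in for the real reshape and GEMM kernels:

```cpp
#include <cassert>

// Sketch of the lazy-prepare pattern used by run(): expensive one-time
// work (weight reshape/conversion, GEMM prepare) happens on the first
// call only; every call then executes the matrix multiply.
class LazyLayer
{
public:
    void run()
    {
        if (!_is_prepared)
        {
            ++prepare_count; // stands in for reshaping/converting weights
            _is_prepared = true;
        }
        ++run_count; // stands in for running the GEMM
    }
    int prepare_count = 0;
    int run_count = 0;

private:
    bool _is_prepared = false;
};
```

Calling run() three times performs the preparation exactly once but the multiply three times.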

◆ validate()

Status arm_compute::NEFullyConnectedLayerEx::validate ( const ITensorInfo *  input,
const ITensorInfo *  weights,
const ITensorInfo *  biases,
const ITensorInfo *  output,
FullyConnectedLayerInfo  fc_info = FullyConnectedLayerInfo() 
)
static

Static function to check if given info will lead to a valid configuration of NEFullyConnectedLayerEx

Parameters
  [in]  input    Source tensor info. Data type supported: QASYMM8/F16/F32.
  [in]  weights  Weights tensor info. The weights must be 2 dimensional. If this function is called after a Convolution Layer, the (transposed) weights will have as many rows as the product of the input's first 3 dimensions. If it is called after another FullyConnected Layer, the (transposed) weights will have as many rows as the input's first dimension. Data type supported: Same as input.
  [in]  biases   Bias tensor info. Can be nullptr. Data type supported: Same as input.
  [out] output   Destination tensor info. Its shape should be equal to the output of a matrix multiplication between:
                   • The output of im2col on the input and the (transposed) 2D weights, if the function is called after a Convolution Layer
                   • The input tensor and the (transposed) 2D weights, if the function is called after another FullyConnected Layer.
                 Data type supported: Same as input.
  [in]  fc_info  (Optional) Fully connected layer additional info
Returns
  a status

Definition at line 314 of file NEFullyConnectedLayerEx.cpp.

317{
318 ARM_COMPUTE_UNUSED(fc_info.retain_internal_weights);
319 ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(input, weights, output);
320 ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(input, 1, DataType::QASYMM8, DataType::F16,
321 DataType::F32);
322 ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(input, weights, output);
323 ARM_COMPUTE_RETURN_ERROR_ON(weights->num_dimensions() > 2);
324
325 bool weights_reshaped = fc_info.transpose_weights ? fc_info.are_weights_reshaped : true;
326 bool is_fc_after_conv = true;
327
328 const ITensorInfo &flatten_input =
329 TensorInfo(input->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(
330 compute_flatten_shape(input)));
331 const ITensorInfo &reshaped_weights =
332 TensorInfo(weights->clone()->set_is_resizable(true).reset_padding().set_tensor_shape(
333 compute_transposed_shape(*weights)));
334 const ITensorInfo &converted_weights =
335 weights_reshaped ? TensorInfo(weights->clone()->set_is_resizable(true).reset_padding())
336 : TensorInfo(*reshaped_weights.clone());
337
338 // With the Fully Connected layer we can have 4 different cases:
339 // 1) Convolution layer -> Fully Connected layer without batches
340 // 2) Fully Connected layer -> Fully Connected layer without batches
341 // 3) Convolution layer -> Fully Connected layer with batches
342 // 4) Fully Connected layer -> Fully Connected layer with batches
343
344 const ITensorInfo *input_to_use = input;
345 const ITensorInfo *weights_to_use = weights;
346
347 // Check if we have a fully connected layer with batches
348 const bool is_batched_fc_layer = output->dimension(1) > 1;
349
350 if (is_batched_fc_layer)
351 {
352 is_fc_after_conv = (TensorShape::num_max_dimensions >= 4) &&
353 (std::equal(input->tensor_shape().cbegin() + 3, input->tensor_shape().cend(),
354 output->tensor_shape().cbegin() + 1));
355 }
356 else
357 {
358 is_fc_after_conv = input->num_dimensions() > 1;
359 }
360
361 if (!weights_reshaped)
362 {
363 // Validate reshape weights kernel
364 ARM_COMPUTE_RETURN_ON_ERROR(NETranspose::validate(weights, &reshaped_weights));
365 weights_to_use = &reshaped_weights;
366 }
367
368 if (is_fc_after_conv && (input->data_layout() != fc_info.weights_trained_layout))
369 {
370 // Validate convert weights kernel
371 ARM_COMPUTE_RETURN_ON_ERROR(NEConvertFullyConnectedWeights::validate(
372 weights_to_use, &converted_weights, input->tensor_shape(), fc_info.weights_trained_layout));
373 weights_to_use = &converted_weights;
374 }
375
376 if (is_fc_after_conv)
377 {
378 // Fully Connected layer after a Convolution Layer without batches
379 ARM_COMPUTE_RETURN_ERROR_ON(
380 (weights_to_use->dimension(1) !=
381 (input->dimension(0) * input->dimension(1) * input->dimension(2))));
382
383 // Validate flatten kernel
384 ARM_COMPUTE_RETURN_ON_ERROR(NEFlattenLayer::validate(input, &flatten_input));
385 input_to_use = &flatten_input;
386 }
387 else
388 {
389 // Fully Connected layer after a Fully Connected Layer without batches
390 ARM_COMPUTE_RETURN_ERROR_ON(input->dimension(0) != weights_to_use->dimension(1));
391 }
392 // Validate matrix multiply kernel
393 ARM_COMPUTE_RETURN_ON_ERROR(
394 validate_mm(*input_to_use, *weights_to_use, biases, *output, fc_info));
395
396 return Status{};
397}

Referenced by configure(), and operator=().
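validate() follows arm_compute's convention of returning a Status so callers can reject an invalid configuration before any tensors are allocated. A self-contained sketch of the same pattern, with a hypothetical Status stand-in (not the real arm_compute::Status) and only two of the checks above, the 2-D weights constraint and the conv-to-FC shape match:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for arm_compute::Status: default-constructed
// means success, otherwise it carries an error description.
struct Status
{
    std::string error; // empty == success
    bool ok() const { return error.empty(); }
};

// Sketch of two early checks in NEFullyConnectedLayerEx::validate():
// weights must be 2-D, and (for the conv-to-FC case) the weights'
// second dimension must equal the flattened input size.
Status validate_fc(std::size_t weights_dims, std::size_t weights_rows,
                   std::size_t flattened_input_size)
{
    if (weights_dims > 2)
        return {"weights tensor must be 2 dimensional"};
    if (weights_rows != flattened_input_size)
        return {"weights rows must match the flattened input size"};
    return {};
}
```

configure() uses the same checks internally via ARM_COMPUTE_ERROR_THROW_ON(validate(...)), so a configuration that passes validate() is safe to configure.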


The documentation for this class was generated from the following files: