ONE - On-device Neural Engine
package.experimental.train.dataloader.DataLoader Class Reference

Public Member Functions

None __init__ (self, Union[List[np.ndarray], np.ndarray, str] input_dataset, Union[List[np.ndarray], np.ndarray, str] expected_dataset, int batch_size, Optional[Tuple[int,...]] input_shape=None, Optional[Tuple[int,...]] expected_shape=None, Any dtype=np.float32)
 
Iterator[Tuple[List[np.ndarray], List[np.ndarray]]] __iter__ (self)
 
Tuple[List[np.ndarray], List[np.ndarray]] __next__ (self)
 
Tuple["DataLoader", "DataLoader"] split (self, float validation_split)
 

Data Fields

 batched_inputs
 
 batched_expecteds
 
 num_samples
 
 batch_size
 
 index
 

Protected Member Functions

List[np.ndarray] _process_dataset (self, Union[List[np.ndarray], np.ndarray, str] data, Optional[Tuple[int,...]] shape, Any dtype=np.float32)
 
np.ndarray _load_data (self, str file_path, Optional[Tuple[int,...]] shape, Any dtype=np.float32)
 
np.ndarray _load_raw (self, str file_path, Tuple[int,...] shape, Any dtype)
 
Tuple[List[List[np.ndarray]], List[List[np.ndarray]]] _create_batches (self)
 

Detailed Description

A flexible DataLoader for managing training and validation data.
It automatically detects whether the supplied datasets are file paths or NumPy arrays.

Definition at line 6 of file dataloader.py.
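
For orientation, a minimal usage sketch assembled from the signatures documented below; the import path is assumed from the package name shown on this page, and the array shapes are illustrative only.

    import numpy as np
    from package.experimental.train.dataloader import DataLoader  # assumed import path

    # One input tensor and one expected tensor, sharing the batch dimension (100 samples).
    inputs = [np.random.rand(100, 3, 32, 32).astype(np.float32)]
    expecteds = [np.random.rand(100, 10).astype(np.float32)]

    loader = DataLoader(inputs, expecteds, batch_size=16)
    for input_batch, expected_batch in loader:
        # Each item is a list of np.ndarray (one per tensor); the first dimension
        # of every array is always exactly batch_size.
        pass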

Constructor & Destructor Documentation

◆ __init__()

None package.experimental.train.dataloader.DataLoader.__init__ (   self,
Union[List[np.ndarray], np.ndarray, str]  input_dataset,
Union[List[np.ndarray], np.ndarray, str]  expected_dataset,
int  batch_size,
Optional[Tuple[int, ...]]   input_shape = None,
Optional[Tuple[int, ...]]   expected_shape = None,
Any   dtype = np.float32 
)
Initialize the DataLoader.

Args:
    input_dataset (list of np.ndarray | np.ndarray | str):
        List of input arrays where each array's first dimension is the batch dimension,
        or a single NumPy array, or a file path.
    expected_dataset (list of np.ndarray | np.ndarray | str):
        List of expected arrays where each array's first dimension is the batch dimension,
        or a single NumPy array, or a file path.
    batch_size (int): Number of samples per batch.
    input_shape (tuple[int, ...], optional): Shape of the input data if raw format is used.
    expected_shape (tuple[int, ...], optional): Shape of the expected data if raw format is used.
    dtype (type, optional): Data type of the raw file (default: np.float32).

Definition at line 11 of file dataloader.py.

11     def __init__(self,
12                  input_dataset: Union[List[np.ndarray], np.ndarray, str],
13                  expected_dataset: Union[List[np.ndarray], np.ndarray, str],
14                  batch_size: int,
15                  input_shape: Optional[Tuple[int, ...]] = None,
16                  expected_shape: Optional[Tuple[int, ...]] = None,
17                  dtype: Any = np.float32) -> None:
18         """
19         Initialize the DataLoader.
20
21         Args:
22             input_dataset (list of np.ndarray | np.ndarray | str):
23                 List of input arrays where each array's first dimension is the batch dimension,
24                 or a single NumPy array, or a file path.
25             expected_dataset (list of np.ndarray | np.ndarray | str):
26                 List of expected arrays where each array's first dimension is the batch dimension,
27                 or a single NumPy array, or a file path.
28             batch_size (int): Number of samples per batch.
29             input_shape (tuple[int, ...], optional): Shape of the input data if raw format is used.
30             expected_shape (tuple[int, ...], optional): Shape of the expected data if raw format is used.
31             dtype (type, optional): Data type of the raw file (default: np.float32).
32         """
33         self.batch_size: int = batch_size
34         self.inputs: List[np.ndarray] = self._process_dataset(input_dataset, input_shape,
35                                                               dtype)
36         self.expecteds: List[np.ndarray] = self._process_dataset(
37             expected_dataset, expected_shape, dtype)
38         self.batched_inputs: List[List[np.ndarray]] = []
39
40         # Verify data consistency
41         self.num_samples: int = self.inputs[0].shape[0]  # Batch dimension
42         if self.num_samples != self.expecteds[0].shape[0]:
43             raise ValueError(
44                 "Input data and expected data must have the same number of samples.")
45
46         # Precompute batches
47         self.batched_inputs, self.batched_expecteds = self._create_batches()
48

References package.experimental.train.dataloader.DataLoader._process_dataset(), package.experimental.train.dataloader.DataLoader.batch_size, package.experimental.train.dataloader.DataLoader.batched_inputs, and package.experimental.train.dataloader.DataLoader.num_samples.
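
A sketch of file-based construction following the argument documentation above; the file names are placeholders, and the shape/dtype arguments are only required for raw (.bin/.raw) files.

    import numpy as np

    loader = DataLoader(
        input_dataset="train_inputs.raw",     # hypothetical raw file: shape and dtype required
        expected_dataset="train_labels.npy",  # hypothetical .npy file: shape inferred on load
        batch_size=32,
        input_shape=(1000, 28, 28),
        dtype=np.float32,
    )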

Member Function Documentation

◆ __iter__()

Iterator[Tuple[List[np.ndarray], List[np.ndarray]]] package.experimental.train.dataloader.DataLoader.__iter__ (   self)
Make the DataLoader iterable.

Returns:
    self

Definition at line 190 of file dataloader.py.

190     def __iter__(self) -> Iterator[Tuple[List[np.ndarray], List[np.ndarray]]]:
191         """
192         Make the DataLoader iterable.
193
194         Returns:
195             self
196         """
197         self.index = 0
198         return self
199

◆ __next__()

Tuple[List[np.ndarray], List[np.ndarray]] package.experimental.train.dataloader.DataLoader.__next__ (   self)
Return the next batch of data.

Returns:
    tuple: (inputs, expecteds) for the next batch.

Definition at line 200 of file dataloader.py.

200     def __next__(self) -> Tuple[List[np.ndarray], List[np.ndarray]]:
201         """
202         Return the next batch of data.
203
204         Returns:
205             tuple: (inputs, expecteds) for the next batch.
206         """
207         if self.index >= len(self.batched_inputs):
208             raise StopIteration
209
210         # Retrieve precomputed batch
211         input_batch = self.batched_inputs[self.index]
212         expected_batch = self.batched_expecteds[self.index]
213
214         self.index += 1
215         return input_batch, expected_batch
216

References package.experimental.train.dataloader.DataLoader.batched_expecteds, package.experimental.train.dataloader.DataLoader.batched_inputs, and package.experimental.train.dataloader.DataLoader.index.
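
A short sketch of the iteration protocol implemented by __iter__() and __next__(), reusing the loader object from the earlier examples:

    it = iter(loader)                         # __iter__ resets loader.index to 0
    first_inputs, first_expecteds = next(it)  # __next__ returns the precomputed batch 0
    num_batches = len(loader.batched_inputs)  # ceil(num_samples / batch_size) batches in total
    # Once index reaches num_batches, __next__ raises StopIteration.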

◆ _create_batches()

Tuple[List[List[np.ndarray]], List[List[np.ndarray]]] package.experimental.train.dataloader.DataLoader._create_batches (   self)
protected
Precompute batches for inputs and expected outputs.

Returns:
    tuple: Lists of batched inputs and batched expecteds.

Definition at line 149 of file dataloader.py.

149     def _create_batches(self) -> Tuple[List[List[np.ndarray]], List[List[np.ndarray]]]:
150         """
151         Precompute batches for inputs and expected outputs.
152
153         Returns:
154             tuple: Lists of batched inputs and batched expecteds.
155         """
156         batched_inputs: List[List[np.ndarray]] = []
157         batched_expecteds: List[List[np.ndarray]] = []
158
159         for batch_start in range(0, self.num_samples, self.batch_size):
160             batch_end = min(batch_start + self.batch_size, self.num_samples)
161
162             # Collect batched inputs
163             inputs_batch = [
164                 input_array[batch_start:batch_end] for input_array in self.inputs
165             ]
166             if batch_end - batch_start < self.batch_size:
167                 # Resize the last batch to match batch_size
168                 inputs_batch = [
169                     np.resize(batch, (self.batch_size, *batch.shape[1:]))
170                     for batch in inputs_batch
171                 ]
172
173             batched_inputs.append(inputs_batch)
174
175             # Collect batched expecteds
176             expecteds_batch = [
177                 expected_array[batch_start:batch_end] for expected_array in self.expecteds
178             ]
179             if batch_end - batch_start < self.batch_size:
180                 # Resize the last batch to match batch_size
181                 expecteds_batch = [
182                     np.resize(batch, (self.batch_size, *batch.shape[1:]))
183                     for batch in expecteds_batch
184                 ]
185
186             batched_expecteds.append(expecteds_batch)
187
188         return batched_inputs, batched_expecteds
189
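
As the loop above shows, a trailing partial batch is padded up to batch_size with np.resize, which repeats samples cyclically. A small, self-contained illustration with made-up numbers:

    import numpy as np

    # 10 samples with batch_size=4 leave a final partial batch of 2 samples.
    leftover = np.arange(2 * 3).reshape(2, 3)
    padded = np.resize(leftover, (4, *leftover.shape[1:]))  # rows 0 and 1 repeat to fill 4 slots
    assert padded.shape == (4, 3)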

◆ _load_data()

np.ndarray package.experimental.train.dataloader.DataLoader._load_data (   self,
str  file_path,
Optional[Tuple[int, ...]]  shape,
Any   dtype = np.float32 
)
protected
Load data from a file, supporting both .npy and raw formats.

Args:
    file_path (str): Path to the file to load.
    shape (tuple[int, ...], optional): Shape of the data if raw format is used.
    dtype (type, optional): Data type of the raw file (default: np.float32).

Returns:
    np.ndarray: Loaded data as a NumPy array.

Definition at line 83 of file dataloader.py.

 83     def _load_data(self,
 84                    file_path: str,
 85                    shape: Optional[Tuple[int, ...]],
 86                    dtype: Any = np.float32) -> np.ndarray:
 87         """
 88         Load data from a file, supporting both .npy and raw formats.
 89
 90         Args:
 91             file_path (str): Path to the file to load.
 92             shape (tuple[int, ...], optional): Shape of the data if raw format is used.
 93             dtype (type, optional): Data type of the raw file (default: np.float32).
 94
 95         Returns:
 96             np.ndarray: Loaded data as a NumPy array.
 97         """
 98         _, ext = os.path.splitext(file_path)
 99
100         if ext == ".npy":
101             # Load .npy file
102             return np.load(file_path)
103         elif ext in [".bin", ".raw"]:
104             # Load raw binary file
105             if shape is None:
106                 raise ValueError(f"Shape must be provided for raw file: {file_path}")
107             return self._load_raw(file_path, shape, dtype)
108         else:
109             raise ValueError(f"Unsupported file format: {ext}")
110

References package.experimental.train.dataloader.DataLoader._load_raw().

Referenced by package.experimental.train.dataloader.DataLoader._process_dataset().
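
For reference, a raw file that _load_data() can consume may be produced with NumPy's tofile(), which writes the bare buffer with no header; the path below is a placeholder.

    import numpy as np

    data = np.random.rand(1000, 28, 28).astype(np.float32)
    data.tofile("train_inputs.raw")  # hypothetical path; load it back with shape=(1000, 28, 28), dtype=np.float32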

◆ _load_raw()

np.ndarray package.experimental.train.dataloader.DataLoader._load_raw (   self,
str  file_path,
Tuple[int, ...]  shape,
Any  dtype 
)
protected
Load raw binary data.

Args:
    file_path (str): Path to the raw binary file.
    shape (tuple[int, ...]): Shape of the data to reshape into.
    dtype (type): Data type of the binary file.

Returns:
    np.ndarray: Loaded data as a NumPy array.

Definition at line 111 of file dataloader.py.

111     def _load_raw(self, file_path: str, shape: Tuple[int, ...], dtype: Any) -> np.ndarray:
112         """
113         Load raw binary data.
114
115         Args:
116             file_path (str): Path to the raw binary file.
117             shape (tuple[int, ...]): Shape of the data to reshape into.
118             dtype (type): Data type of the binary file.
119
120         Returns:
121             np.ndarray: Loaded data as a NumPy array.
122         """
123         # Calculate the expected number of elements based on the provided shape
124         expected_elements: int = int(np.prod(shape))
125
126         # Calculate the expected size of the raw file in bytes
127         expected_size: int = expected_elements * np.dtype(dtype).itemsize
128
129         # Get the actual size of the raw file
130         actual_size: int = os.path.getsize(file_path)
131
132         # Check if the sizes match
133         if actual_size != expected_size:
134             raise ValueError(
135                 f"Raw file size ({actual_size} bytes) does not match the expected size "
136                 f"({expected_size} bytes) based on the provided shape {shape} and dtype {dtype}."
137             )
138
139         # Read and load the raw data
140         with open(file_path, "rb") as f:
141             data = f.read()
142         array = np.frombuffer(data, dtype=dtype)
143         if array.size != expected_elements:
144             raise ValueError(
145                 f"Raw data size does not match the expected shape: {shape}. "
146                 f"Expected {expected_elements} elements, got {array.size} elements.")
147         return array.reshape(shape)
148

Referenced by package.experimental.train.dataloader.DataLoader._load_data().
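
The size check above is plain byte-count arithmetic; a worked sketch for a hypothetical file:

    import numpy as np

    shape = (1000, 28, 28)                   # hypothetical raw tensor shape
    expected_elements = int(np.prod(shape))  # 784000 elements
    expected_size = expected_elements * np.dtype(np.float32).itemsize  # 3136000 bytes
    # A .bin/.raw file must be exactly expected_size bytes, otherwise _load_raw() raises ValueError.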

◆ _process_dataset()

List[np.ndarray] package.experimental.train.dataloader.DataLoader._process_dataset (   self,
Union[List[np.ndarray], np.ndarray, str]  data,
Optional[Tuple[int, ...]]  shape,
Any   dtype = np.float32 
)
protected
Process a dataset or file path.

Args:
    data (str | np.ndarray | list[np.ndarray]): Path to file or NumPy arrays.
    shape (tuple[int, ...], optional): Shape of the data if raw format is used.
    dtype (type, optional): Data type for raw files.

Returns:
    list[np.ndarray]: Loaded or passed data as NumPy arrays.

Definition at line 49 of file dataloader.py.

 49     def _process_dataset(self,
 50                          data: Union[List[np.ndarray], np.ndarray, str],
 51                          shape: Optional[Tuple[int, ...]],
 52                          dtype: Any = np.float32) -> List[np.ndarray]:
 53         """
 54         Process a dataset or file path.
 55
 56         Args:
 57             data (str | np.ndarray | list[np.ndarray]): Path to file or NumPy arrays.
 58             shape (tuple[int, ...], optional): Shape of the data if raw format is used.
 59             dtype (type, optional): Data type for raw files.
 60
 61         Returns:
 62             list[np.ndarray]: Loaded or passed data as NumPy arrays.
 63         """
 64         if isinstance(data, list):
 65             # Check if all elements in the list are NumPy arrays
 66             if all(isinstance(item, np.ndarray) for item in data):
 67                 return data
 68             raise ValueError("All elements in the list must be NumPy arrays.")
 69         if isinstance(data, np.ndarray):
 70             # If it's already a NumPy array and is not a list of arrays
 71             if data.ndim > 1:
 72                 # If the array has multiple dimensions, split it into a list of arrays
 73                 return [data[i] for i in range(data.shape[0])]
 74             else:
 75                 # If it's a single array, wrap it into a list
 76                 return [data]
 77         elif isinstance(data, str):
 78             # If it's a string, assume it's a file path
 79             return [self._load_data(data, shape, dtype)]
 80         else:
 81             raise ValueError("Data must be a NumPy array or a valid file path.")
 82

References package.experimental.train.dataloader.DataLoader._load_data().

Referenced by package.experimental.train.dataloader.DataLoader.__init__().
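
A sketch of the accepted input forms, following the branches above; DataLoader is assumed to be imported as in the earlier examples.

    import numpy as np

    # A list of NumPy arrays is returned unchanged.
    arrays = [np.zeros((100, 10), dtype=np.float32)]

    # A list containing anything else is rejected.
    try:
        DataLoader([[1, 2, 3]], arrays, batch_size=4)
    except ValueError as err:
        print(err)  # "All elements in the list must be NumPy arrays."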

◆ split()

Tuple["DataLoader", "DataLoader"] package.experimental.train.dataloader.DataLoader.split (   self,
float  validation_split 
)
Split the data into training and validation sets.

Args:
    validation_split (float): Ratio of validation data. Must be between 0.0 and 1.0.

Returns:
    tuple: Two DataLoader instances, one for training and one for validation.

Definition at line 217 of file dataloader.py.

217     def split(self, validation_split: float) -> Tuple["DataLoader", "DataLoader"]:
218         """
219         Split the data into training and validation sets.
220
221         Args:
222             validation_split (float): Ratio of validation data. Must be between 0.0 and 1.0.
223
224         Returns:
225             tuple: Two DataLoader instances, one for training and one for validation.
226         """
227         if not (0.0 <= validation_split <= 1.0):
228             raise ValueError("Validation split must be between 0.0 and 1.0.")
229
230         split_index = int(len(self.inputs[0]) * (1.0 - validation_split))
231
232         train_inputs = [input_array[:split_index] for input_array in self.inputs]
233         val_inputs = [input_array[split_index:] for input_array in self.inputs]
234         train_expecteds = [
235             expected_array[:split_index] for expected_array in self.expecteds
236         ]
237         val_expecteds = [
238             expected_array[split_index:] for expected_array in self.expecteds
239         ]
240
241         train_loader = DataLoader(train_inputs, train_expecteds, self.batch_size)
242         val_loader = DataLoader(val_inputs, val_expecteds, self.batch_size)
243
244         return train_loader, val_loader

References package.experimental.train.dataloader.DataLoader.batch_size.
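
A sketch of the split workflow, reusing the loader from the earlier examples; here 20% of the samples are held out for validation, and both resulting loaders keep the original batch_size.

    train_loader, val_loader = loader.split(validation_split=0.2)

    for inputs, expecteds in train_loader:
        pass  # training batches
    for inputs, expecteds in val_loader:
        pass  # validation batches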

Field Documentation

◆ batch_size

package.experimental.train.dataloader.DataLoader.batch_size

Number of samples per batch.

◆ batched_expecteds

package.experimental.train.dataloader.DataLoader.batched_expecteds

Precomputed list of expected-output batches, one list of arrays per batch.

◆ batched_inputs

package.experimental.train.dataloader.DataLoader.batched_inputs

Precomputed list of input batches, one list of arrays per batch.

◆ index

package.experimental.train.dataloader.DataLoader.index

Index of the next batch to return during iteration.

◆ num_samples

package.experimental.train.dataloader.DataLoader.num_samples

Total number of samples, taken from the first (batch) dimension of the input data.

The documentation for this class was generated from the following file:

dataloader.py