datagenerator.pertinent_negatives

Attributes

data

Classes

PertinentNegativesDataset

A dataset designed to investigate the impact of pertinent negative (PN) features on model predictions.

Module Contents

class datagenerator.pertinent_negatives.PertinentNegativesDataset(seed: int = 0, n_features: int = 5, n_samples: int = 10, distribution: str = 'normal', weight_range: Tuple[float, float] = (-1.0, 1.0), weights: torch.Tensor | None = None, pn_features: List[int] | None = None, pn_zero_likelihood: float = 0.5, pn_weight_factor: float = 10, baseline: str = 'zero')

Bases: xaiunits.datagenerator.WeightedFeaturesDataset

A dataset designed to investigate the impact of pertinent negative (PN) features on model predictions by introducing zero values in selected features, which are expected to significantly impact the output.

This dataset is useful for scenarios where the absence of certain features (indicated by zero values) provides important information for model predictions.

Inherits from:: WeightedFeaturesDataset: Class extending BaseFeaturesDataset with support for weighted features

pn_features

Indices of features considered as pertinent negatives.

Type:: list[int]

pn_zero_likelihood

Likelihood of a pertinent negative feature being set to zero.

Type:: float

pn_weight_factor

Weight factor applied to the pertinent negative features to emphasize their impact.

Type:: float

cat_features

Categorical features derived from the pertinent negatives.

Type:: list

labels

Generated labels with optional noise.

Type:: torch.Tensor

features

Name of the attribute representing the input features.

Type:: str

ground_truth_attribute

Name of the attribute considered as ground truth for analysis.

Type:: str

subset_data

List of attributes to be included in subsets.

Type:: list[str]

subset_attribute

Additional attributes to be considered in subsets.

Type:: list[str]

pn_zero_likelihood = 0.5

pn_weight_factor = 10

pn_features = [0]

cat_features = [0]

label_noise

labels

features = 'samples'

ground_truth_attribute = 'ground_truth'

subset_data = ['samples', 'weighted_samples', 'ground_truth']

subset_attribute

_intialize_pn_features(pn_features: List[int] | None) → List[int]

Validates and initializes the indices of features to be considered as pertinent negatives (PN).

Ensures that specified pertinent negative features are within the valid range of feature indices. Falls back to the first feature (index 0) if pn_features is not specified; invalid input raises ValueError, as listed below.

Parameters:: pn_features (list of int, optional) – Indices of features specified as pertinent negatives.
Returns:: The validated list of indices for pertinent negative features.
Return type:: list[int]
Raises:: ValueError – If any specified pertinent negative feature index is out of the valid range or if the input is not a list.
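
The validation described above can be sketched as a standalone function. This is a minimal illustration, not the library's implementation: the name validate_pn_features and the explicit n_features argument are hypothetical, and it assumes the fallback to the first feature applies only when pn_features is None, with every other invalid input raising ValueError as the Raises entry states.

```python
from typing import List, Optional


def validate_pn_features(pn_features: Optional[List[int]], n_features: int) -> List[int]:
    """Hypothetical stand-in for the validation step described above."""
    if pn_features is None:
        # Not specified: fall back to the first feature.
        return [0]
    if not isinstance(pn_features, list):
        raise ValueError("pn_features must be a list of integer indices")
    for idx in pn_features:
        # Every index must address an existing feature column.
        if not isinstance(idx, int) or not 0 <= idx < n_features:
            raise ValueError(f"feature index {idx!r} is outside [0, {n_features})")
    return pn_features


print(validate_pn_features(None, n_features=5))    # [0]
print(validate_pn_features([1, 3], n_features=5))  # [1, 3]
```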

_initialize_zeros_for_PN() → None

Sets the values of pertinent negative (PN) features to zero with a specified likelihood, across all samples in a vectorized manner.

This modification is performed directly on the samples attribute.
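
As a rough sketch of the zeroing step, the example below uses plain Python lists and the standard random module in place of the tensor operations the class presumably relies on. The function name zero_pn_features and its arguments are hypothetical; only the behaviour (each PN entry is independently set to zero with probability pn_zero_likelihood, in place) follows the description above.

```python
import random
from typing import List


def zero_pn_features(
    samples: List[List[float]],
    pn_features: List[int],
    pn_zero_likelihood: float,
    seed: int = 0,
) -> None:
    """Zero PN entries in place; a hypothetical, non-vectorized sketch."""
    rng = random.Random(seed)
    for row in samples:
        for j in pn_features:
            # rng.random() is uniform on [0, 1), so this fires with
            # probability pn_zero_likelihood for each PN entry.
            if rng.random() < pn_zero_likelihood:
                row[j] = 0.0


samples = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
zero_pn_features(samples, pn_features=[0], pn_zero_likelihood=1.0)
print(samples)  # [[0.0, 2.0, 3.0], [0.0, 5.0, 6.0]]
```

With a tensor library the same effect would normally come from one random mask applied to the PN columns, which is what "vectorized" refers to.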

_get_new_weighted_samples() → None

Recalculates the weighted samples considering the introduction of zeros for pertinent negative features in a vectorized manner.

Adjusts the weight of features set to zero to emphasize their impact by using the pn_weight_factor. Updates the weighted_samples attribute with the new calculations.
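
The exact formula is not given here, so the following is only one plausible reading, written with plain lists. It assumes that an ordinary entry contributes weight * value, while a PN entry that has been zeroed contributes weight * pn_weight_factor instead of zero. The function name weighted_with_pn and this formula are assumptions, not the library's confirmed behaviour.

```python
from typing import List


def weighted_with_pn(
    samples: List[List[float]],
    weights: List[float],
    pn_features: List[int],
    pn_weight_factor: float,
) -> List[List[float]]:
    """Hypothetical per-feature contributions with a boosted PN term."""
    pn = set(pn_features)
    result = []
    for row in samples:
        new_row = []
        for j, (x, w) in enumerate(zip(row, weights)):
            if j in pn and x == 0:
                # Assumed rule: an absent PN feature contributes a scaled weight.
                new_row.append(w * pn_weight_factor)
            else:
                new_row.append(w * x)
        result.append(new_row)
    return result


samples = [[0.0, 2.0], [1.5, 4.0]]
weights = [0.5, -1.0]
print(weighted_with_pn(samples, weights, pn_features=[0], pn_weight_factor=10))
# [[5.0, -2.0], [0.75, -4.0]]
```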

_create_ground_truth_baseline(baseline: str) → None

Creates the ground truth baseline based on the specified baseline type (“zero” or “one”).

Parameters:: baseline (str) – Specifies the type of baseline to use. Must be either “zero” or “one”.
Raises:: KeyError – If the specified baseline is not “zero” or “one”.
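
A KeyError for unsupported values is what a dictionary lookup produces, so the baseline selection might look like the sketch below. It covers only the baseline part, assumes the baseline is a constant-filled array with the same shape as the samples, and uses a hypothetical function name make_baseline.

```python
from typing import List


def make_baseline(baseline: str, n_samples: int, n_features: int) -> List[List[float]]:
    """Hypothetical constant baseline with the same shape as the samples."""
    fill_values = {"zero": 0.0, "one": 1.0}
    # Any key other than "zero" or "one" raises KeyError here.
    value = fill_values[baseline]
    return [[value] * n_features for _ in range(n_samples)]


print(make_baseline("one", n_samples=2, n_features=3))
# [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
```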

__getitem__(idx: int, others: List[str] = ['ground_truth_attribute', 'baseline']) → Tuple[Any, Ellipsis]

Retrieve a sample and its associated label by index.

Parameters:

idx (int) – Index of the sample to retrieve.
others (list) – Additional items to retrieve. Defaults to ['ground_truth_attribute', 'baseline'].

Returns:

Tuple containing the sample, its label, and any additional items requested through others.

Return type:

tuple

generate_model() → torch.nn.Module

Generates and returns a neural network model tailored for analyzing the impact of pertinent negatives.

The model is configured to incorporate the weights, pertinent negatives, and the pertinent negative weight factor.

Returns:

A neural network model designed to work with the dataset’s specific configuration, including the pertinent negatives and their associated weight factor.

Return type:

model.PertinentNN

datagenerator.pertinent_negatives.data