datagenerator.conflicting

Attributes

data

Classes

ConflictingDataset

Generic synthetic dataset with feature cancellation capabilities.

Module Contents

class datagenerator.conflicting.ConflictingDataset(seed: int = 0, n_features: int = 2, n_samples: int = 10, distribution: str = 'normal', weight_range: Tuple[float, float] = (-1.0, 1.0), weights: torch.Tensor | None = None, cancellation_features: List[int] | None = None, cancellation_likelihood: float = 0.5)

Bases: xaiunits.datagenerator.WeightedFeaturesDataset

Generic synthetic dataset with feature cancellation capabilities.

Feature cancellation is probabilistic. If cancellation_features is not provided, every feature in every sample is a candidate for cancellation, and each candidate is canceled with probability cancellation_likelihood. The contribution of a canceled feature is negated, which makes it possible to analyze model behavior when a feature is effectively absent.

Inherits from: WeightedFeaturesDataset, a class extending BaseFeaturesDataset with support for weighted features.

cancellation_features

Indices of features subject to cancellation.

Type: list of int, optional

cancellation_likelihood

Likelihood of feature cancellation, between 0 and 1.

Type: float

cancellation_outcomes

Binary tensor indicating whether each feature in each sample is canceled.

Type: torch.Tensor

cancellation_samples

Concatenation of samples with their cancellation outcomes.

Type: torch.Tensor

cancellation_attributions

The attribution of each feature considering the cancellation.

Type: torch.Tensor

cat_features

Categorical features derived from the cancellation samples.

Type: list

ground_truth_attributions

Combined tensor of weighted samples and cancellation attributions for ground truth analysis.

Type: torch.Tensor

Initializes a ConflictingDataset object.

Parameters:

seed (int) – Seed for random number generation, ensuring reproducibility. Defaults to 0.
n_features (int) – Number of features in each sample. Defaults to 2.
n_samples (int) – Number of samples to generate. Defaults to 10.
distribution (str) – Type of distribution to use for generating samples. Defaults to 'normal'.
weight_range (tuple[float, float]) – Range (min, max) for generating random feature weights. Defaults to (-1.0, 1.0).
weights (torch.Tensor, optional) – Predefined weights for each feature. Defaults to None.
cancellation_features (list[int], optional) – Indices of the features that may be canceled. Defaults to None, in which case all features are candidates.
cancellation_likelihood (float) – Probability of each feature being canceled. Defaults to 0.5.

cancellation_features = None

cancellation_likelihood = 0.5

cancellation_outcomes

cancellation_samples

labels

cancellation_attributions

cat_features

ground_truth_attributions

features = 'cancellation_samples'

ground_truth_attribute = 'ground_truth_attributions'

subset_data = ['weighted_samples', 'cancellation_outcomes', 'cancellation_samples',...

_initialize_cancellation_features() → None

Validates and initializes the list of features subject to cancellation. If no specific features are provided, all features are considered candidates for cancellation.

Raises: AssertionError – If cancellation_features is not a list, is empty, contains an element that is not an integer, or has a maximum element greater than the number of features; or if cancellation_likelihood is not a float or lies outside the range [0, 1].
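The documented checks can be sketched in plain Python as follows. This is a minimal illustration, not the library's implementation: the helper name validate_cancellation_args and its signature are hypothetical, and the index bound assumes 0-based indexing (the description above only mentions values greater than the number of features).

```python
from typing import List, Optional


def validate_cancellation_args(
    cancellation_features: Optional[List[int]],
    cancellation_likelihood: float,
    n_features: int,
) -> List[int]:
    """Hypothetical helper mirroring the documented checks; not the library's code."""
    # Default: every feature is a cancellation candidate.
    if cancellation_features is None:
        cancellation_features = list(range(n_features))

    assert isinstance(cancellation_features, list), "must be a list"
    assert len(cancellation_features) > 0, "must not be empty"
    assert all(isinstance(i, int) for i in cancellation_features), "indices must be ints"
    # Assumption: indices are 0-based, so the largest valid index is n_features - 1.
    assert max(cancellation_features) < n_features, "index out of range"

    assert isinstance(cancellation_likelihood, float), "likelihood must be a float"
    assert 0.0 <= cancellation_likelihood <= 1.0, "likelihood must lie in [0, 1]"
    return cancellation_features
```

Note that an integer likelihood such as 1 fails the float check in this sketch, matching the documented requirement that cancellation_likelihood be a float.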

_get_cancellations() → torch.Tensor

Generates a binary mask indicating whether each feature in each sample is canceled based on the specified likelihood.

This method considers only the features specified in cancellation_features for possible cancellation.

Returns: An integer tensor of shape (n_samples, n_features), where 1 represents a canceled feature and 0 represents an active feature.
Return type: torch.Tensor
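As a rough illustration of the mask described here, the sketch below draws one Bernoulli outcome per candidate feature and leaves all other features at 0. It uses nested Python lists and the standard random module instead of torch tensors so that it runs without dependencies. The function name sample_cancellation_mask and its parameters are hypothetical, and the library's actual sampling procedure may differ.

```python
import random
from typing import List


def sample_cancellation_mask(
    n_samples: int,
    n_features: int,
    cancellation_features: List[int],
    cancellation_likelihood: float,
    seed: int = 0,
) -> List[List[int]]:
    """Hypothetical stand-in for the mask generation; returns nested lists, not a tensor."""
    rng = random.Random(seed)  # seeded generator for reproducibility
    candidates = set(cancellation_features)
    mask = []
    for _ in range(n_samples):
        row = [
            # Only candidate features get a Bernoulli draw; all others stay active (0).
            int(j in candidates and rng.random() < cancellation_likelihood)
            for j in range(n_features)
        ]
        mask.append(row)
    return mask


mask = sample_cancellation_mask(
    n_samples=4, n_features=3, cancellation_features=[0, 2], cancellation_likelihood=0.5
)
```

In this example feature 1 is not a candidate, so its column in mask is always 0, while columns 0 and 2 contain 0 or 1 depending on the draws.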

_get_cancellation_samples() → torch.Tensor

Concatenates the original samples with their cancellation outcomes to form a comprehensive dataset.

This allows for analyzing the impact of feature cancellations directly alongside the original features.

Returns: A tensor containing the original samples augmented with their corresponding cancellation outcomes.
Return type: torch.Tensor
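The concatenation can be pictured with a small dependency-free sketch. It assumes the outcomes are appended along the feature dimension, so each row grows from n_features to 2 * n_features values; that layout, the helper name concat_samples_and_outcomes, and the use of nested lists instead of tensors are all assumptions made for illustration.

```python
from typing import List


def concat_samples_and_outcomes(
    samples: List[List[float]], outcomes: List[List[int]]
) -> List[List[float]]:
    """Hypothetical helper: append each sample's cancellation flags to its feature values."""
    assert len(samples) == len(outcomes), "one outcome row per sample"
    return [row + [float(flag) for flag in flags] for row, flags in zip(samples, outcomes)]


samples = [[0.5, -1.2], [2.0, 0.3]]
outcomes = [[1, 0], [0, 0]]
augmented = concat_samples_and_outcomes(samples, outcomes)
# augmented == [[0.5, -1.2, 1.0, 0.0], [2.0, 0.3, 0.0, 0.0]]
```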

_get_cancellation_attributions() → torch.Tensor

Computes the attribution of each feature by negating the effect of canceled features.

This method helps understand the impact of each feature on the model output when certain features are systematically canceled.

Returns: A tensor of the same shape as the weighted samples, where the values of canceled features are negated to reflect their absence.
Return type: torch.Tensor
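Under one literal reading of this description, the result keeps the weighted value of every active feature and flips the sign of every canceled one. The sketch below shows that reading with nested lists; the helper name negate_canceled is hypothetical, and the library may combine the mask and the weighted samples differently.

```python
from typing import List


def negate_canceled(
    weighted_samples: List[List[float]], outcomes: List[List[int]]
) -> List[List[float]]:
    """Hypothetical helper: flip the sign of entries whose cancellation flag is 1."""
    return [
        [-value if flag else value for value, flag in zip(row, flags)]
        for row, flags in zip(weighted_samples, outcomes)
    ]


weighted = [[0.4, -0.6], [1.0, 0.2]]
outcomes = [[1, 0], [0, 1]]
attributions = negate_canceled(weighted, outcomes)
# attributions == [[-0.4, -0.6], [1.0, -0.2]]
```

The output has the same shape as the input, as the return description requires.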

generate_model() → torch.nn.Module

Instantiates and returns a neural network model for analyzing datasets with conflicting features.

The model is configured to use the specified features and weights, allowing for experimentation with feature cancellations.

Returns: A neural network model designed to work with the specified features and weights.
Return type: model.ConflictingFeaturesNN

datagenerator.conflicting.data