datagenerator.boolean

Attributes

data

Classes

`BooleanDataset`	Generic synthetic dataset based on a propositional formula.
`BooleanAndDataset`	Synthetic dataset based on the conjunction (AND) of its features.
`BooleanOrDataset`	Synthetic dataset based on the disjunction (OR) of its features.

Module Contents

class datagenerator.boolean.BooleanDataset(formula: sympy.core.function.FunctionClass, atoms: Iterable | None = None, seed: int = 0, n_samples: int = 10)

Bases: xaiunits.datagenerator.data_generation.BaseFeaturesDataset

Generic synthetic dataset based on a propositional formula.

The dataset corresponds to sampling rows from the truth table of the given propositional formula. If n_samples is no larger than the size of the truth table, then the generated dataset will always contain non-duplicate samples of the truth table. Otherwise, the dataset will still contain rows for the entire truth table but will also contain duplicates.
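The sampling rule described above can be sketched in plain Python. This is an illustration only, not the library's implementation: the helper name `sample_truth_table`, the use of a Python callable in place of a sympy formula, and the shuffling step are all assumptions.

```python
import itertools
import random


def sample_truth_table(formula, n_atoms, n_samples, seed=0):
    """Sample rows of a truth table (hypothetical helper, illustrative only)."""
    rng = random.Random(seed)
    # Full truth table: every assignment of False/True to the atoms.
    table = list(itertools.product([False, True], repeat=n_atoms))
    if n_samples <= len(table):
        # Few samples requested: draw distinct rows, so no duplicates occur.
        rows = rng.sample(table, n_samples)
    else:
        # More samples than rows: keep the whole table, then pad with repeats.
        extra = rng.choices(table, k=n_samples - len(table))
        rows = table + extra
        rng.shuffle(rows)
    labels = [formula(*row) for row in rows]
    return rows, labels


# Hypothetical formula (a AND b) OR (NOT c), written as a plain callable.
rows, labels = sample_truth_table(lambda a, b, c: (a and b) or (not c), 3, 10)
```

With three atoms the truth table has 8 rows, so requesting 10 samples returns every row at least once plus two repeats, while requesting 5 returns 5 distinct rows.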

If the input for atoms is None, the corresponding attribute is by default assigned as the atoms that are extracted from the given formula.

Inherits from: BaseFeaturesDataset, the base class for creating continuous feature datasets.

formula

A propositional formula for which the dataset is generated.

Type: sympy.core.function.FunctionClass

atoms

The ordered collection of propositional atoms that were used within the propositional formula.

Type: tuple

seed

Seed for random number generators to ensure reproducibility.

Type: int

n_samples

Number of samples in the dataset.

Type: int

Initializes a BooleanDataset object.

Parameters:

formula (sympy.core.function.FunctionClass) – A propositional formula for dataset generation.
atoms (Iterable, optional) – Ordered collection of propositional atoms used in the formula. Defaults to None.
seed (int) – Seed for random number generation, ensuring reproducibility. Defaults to 0.
n_samples (int) – Number of samples to generate for the dataset. Defaults to 10.

atoms

formula

subset_data = ['samples']

subset_attribute = ['perturb_function', 'default_metric', 'generate_model', 'name']

cat_features

name = 'BooleanDataset'

_initialize_samples_labels(n_samples: int) → Tuple[torch.Tensor, torch.Tensor]

Initializes the samples and labels of the dataset.

Parameters:

n_samples (int) – Number of samples (and labels) contained in the dataset.

Returns:

Tuple containing the generated samples and the corresponding labels of the dataset.

Return type:

tuple[Tensor, Tensor]

perturb_function(cat_resample_prob: float = 0.2, run_infidelity_decorator: bool = True, multipy_by_inputs: bool = False) → Callable

Generates a perturbation function to be used for XAI method evaluation. It applies Gaussian noise to continuous features and resamples categorical features.

Parameters:

cat_resample_prob (float) – Probability of resampling a categorical feature. Defaults to 0.2.
run_infidelity_decorator (bool) – Set to True if the returned function must be compatible with infidelity; set to False for sensitivity. Defaults to True.
multipy_by_inputs (bool) – Argument passed to the infidelity decorator (spelled as in the signature). Defaults to False.

Returns:

A perturbation function compatible with Captum.

Return type:

perturb_func (function)
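To illustrate what resampling a categorical feature with probability cat_resample_prob might look like for Boolean inputs, here is a minimal sketch. It is not the function returned by perturb_function: the helper name `resample_boolean_features`, the 0/1 encoding, and the uniform redraw are assumptions made for the example.

```python
import random


def resample_boolean_features(sample, cat_resample_prob=0.2, seed=0):
    """Redraw each 0/1 feature with probability `cat_resample_prob`.

    Hypothetical helper, illustrative only.
    """
    rng = random.Random(seed)
    perturbed = []
    for value in sample:
        if rng.random() < cat_resample_prob:
            # Redraw uniformly; the new value may equal the old one.
            perturbed.append(rng.choice([0, 1]))
        else:
            perturbed.append(value)
    return perturbed


original = [1, 0, 1, 1]
perturbed = resample_boolean_features(original, cat_resample_prob=0.5)
```

With cat_resample_prob set to 0.0 the sample is returned unchanged, and with 1.0 every feature is redrawn.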

generate_model() → torch.nn.Module

Generates a neural network model using the given propositional formula and atoms.

Returns: A neural network model tailored to the dataset’s propositional formula.
Return type: model.PropFormulaNN

property default_metric: Callable

The default metric for evaluating the performance of explanation methods applied to this dataset.

For this dataset, the default metric is the infidelity metric with the default perturb function.

Returns:

A class that wraps around the default metric, to be instantiated within the pipeline.

Return type:

type
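For orientation, the infidelity measure introduced by Yeh et al. (2019), which Captum's infidelity metric follows, can be written as below. Here \Phi is the explanation, f the model, x the input, and I a perturbation drawn from a distribution \mu_I. Whether the wrapper returned by this property adds anything on top of that definition is not stated in this reference.

```latex
\mathrm{INFD}(\Phi, f, x) =
  \mathbb{E}_{I \sim \mu_I}\left[
    \left( I^{\top} \Phi(f, x) - \bigl( f(x) - f(x - I) \bigr) \right)^{2}
  \right]
```

A lower value means the explanation better predicts how the model output changes under the perturbations produced by the perturb function.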

__getitem__(idx: int, others: List[str] = []) → Tuple[Any, Ellipsis]

Retrieve a sample and its associated label by index.

Parameters:

idx (int) – Index of the sample to retrieve.
others (list) – Additional items to retrieve. Defaults to [].

Returns:

Tuple containing the sample and its label.

Return type:

tuple
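The role of the others argument can be sketched with a small stand-in class. `TinyDataset` and its attribute handling are hypothetical and only mirror the documented behaviour: the sample and label come first, followed by one entry per requested attribute name.

```python
class TinyDataset:
    """Hypothetical stand-in for a features dataset; not the library class."""

    def __init__(self, samples, labels, **extras):
        self.samples = samples
        self.labels = labels
        # Extra per-sample attributes, e.g. a baseline for each row.
        for name, values in extras.items():
            setattr(self, name, values)

    def __getitem__(self, idx, others=()):
        # Always return (sample, label), then one entry per requested attribute.
        item = (self.samples[idx], self.labels[idx])
        return item + tuple(getattr(self, name)[idx] for name in others)

    def __len__(self):
        return len(self.samples)


data = TinyDataset([[0, 1], [1, 1]], [0, 1], baseline=[[0, 0], [0, 0]])
pair = data[1]                                     # (sample, label)
triple = data.__getitem__(1, others=["baseline"])  # (sample, label, baseline)
```

The sketch uses an immutable default for others; the documented signature uses a list default, which is safe only as long as the method never mutates it.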

class datagenerator.boolean.BooleanAndDataset(n_features: int = 2, n_samples: int = 10, seed: int = 0)

Bases: BooleanDataset

Synthetic dataset based on the conjunction (AND) of n_features propositional atoms; a specialization of BooleanDataset.

The dataset corresponds to sampling rows from the truth table of the given propositional formula. If n_samples is no larger than the size of the truth table, then the generated dataset will always contain non-duplicate samples of the truth table. Otherwise, the dataset will still contain rows for the entire truth table but will also contain duplicates.
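Assuming, as the class name suggests but this reference does not confirm, that the formula is the AND of n_features atoms, the underlying truth table can be sketched as follows. The helper `and_truth_table` and the 0/1 encoding are hypothetical.

```python
import itertools


def and_truth_table(n_features=2):
    """Rows and labels for x0 AND x1 AND ...; illustrative only."""
    rows = list(itertools.product([0, 1], repeat=n_features))
    # A row is labelled 1 only when every feature is 1.
    labels = [int(all(row)) for row in rows]
    return rows, labels


rows, labels = and_truth_table(2)
for row, label in zip(rows, labels):
    print(row, label)
```

With the default arguments (n_features = 2, n_samples = 10) the table has 4 rows, so by the sampling rule above the dataset would contain every row plus 6 duplicates.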

If the input for atoms is None, the corresponding attribute is by default assigned as the atoms that are extracted from the given formula.

Inherits from: BaseFeaturesDataset, the base class for creating continuous feature datasets.

formula

A propositional formula for which the dataset is generated.

Type: sympy.core.function.FunctionClass

atoms

The ordered collection of propositional atoms that were used within the propositional formula.

Type: tuple

seed

Seed for random number generators to ensure reproducibility.

Type: int

n_samples

Number of samples in the dataset.

Type: int

Initializes a BooleanAndDataset object.

Parameters:

n_features (int) – Number of features, i.e. propositional atoms combined by the formula. Defaults to 2.
n_samples (int) – Number of samples to generate for the dataset. Defaults to 10.
seed (int) – Seed for random number generation, ensuring reproducibility. Defaults to 0.

n_features = 2

ground_truth

ground_truth_attribute = 'ground_truth'

create_baselines() → None

__getitem__(idx: int, others: List[str] = ['baseline', 'ground_truth_attribute']) → Tuple[Any, Ellipsis]

Retrieve a sample and its associated label by index.

Parameters:

idx (int) – Index of the sample to retrieve.
others (list) – Additional items to retrieve. Defaults to ['baseline', 'ground_truth_attribute'].

Returns:

Tuple containing the sample and its label.

Return type:

tuple

generate_model() → torch.nn.Module

Generates a neural network model using the given propositional formula and atoms.

Returns: A neural network model tailored to the dataset’s propositional formula.
Return type: model.PropFormulaNN

create_ground_truth() → torch.Tensor

property default_metric: Callable

The default metric for evaluating the performance of explanation methods applied to this dataset.

For this dataset, the default metric is the infidelity metric with the default perturb function.

Returns:

A class that wraps around the default metric, to be instantiated within the pipeline.

Return type:

type

class datagenerator.boolean.BooleanOrDataset(n_features: int = 2, n_samples: int = 10, seed: int = 0)

Bases: BooleanDataset

Synthetic dataset based on the disjunction (OR) of n_features propositional atoms; a specialization of BooleanDataset.

The dataset corresponds to sampling rows from the truth table of the given propositional formula. If n_samples is no larger than the size of the truth table, then the generated dataset will always contain non-duplicate samples of the truth table. Otherwise, the dataset will still contain rows for the entire truth table but will also contain duplicates.
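Assuming, as the class name suggests but this reference does not confirm, that the formula is the OR of n_features atoms, the label distribution of the truth table can be sketched as follows. The helper `or_truth_table` and the 0/1 encoding are hypothetical.

```python
import itertools
from collections import Counter


def or_truth_table(n_features=2):
    """Rows and labels for x0 OR x1 OR ...; illustrative only."""
    rows = list(itertools.product([0, 1], repeat=n_features))
    # A row is labelled 0 only when every feature is 0.
    labels = [int(any(row)) for row in rows]
    return rows, labels


rows, labels = or_truth_table(3)
label_counts = Counter(labels)  # how many rows carry each label
```

Only the all-zero row is labelled 0, so the table becomes increasingly imbalanced as n_features grows.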

If the input for atoms is None, the corresponding attribute is by default assigned as the atoms that are extracted from the given formula.

Inherits from: BaseFeaturesDataset, the base class for creating continuous feature datasets.

formula

A propositional formula for which the dataset is generated.

Type: sympy.core.function.FunctionClass

atoms

The ordered collection of propositional atoms that were used within the propositional formula.

Type: tuple

seed

Seed for random number generators to ensure reproducibility.

Type: int

n_samples

Number of samples in the dataset.

Type: int

Initializes a BooleanOrDataset object.

Parameters:

n_features (int) – Number of features, i.e. propositional atoms combined by the formula. Defaults to 2.
n_samples (int) – Number of samples to generate for the dataset. Defaults to 10.
seed (int) – Seed for random number generation, ensuring reproducibility. Defaults to 0.

n_features = 2

ground_truth

ground_truth_attribute = 'ground_truth'

create_baselines() → None

__getitem__(idx: int, others: List[str] = ['baseline', 'ground_truth_attribute']) → Tuple[Any, Ellipsis]

Retrieve a sample and its associated label by index.

Parameters:

idx (int) – Index of the sample to retrieve.
others (list) – Additional items to retrieve. Defaults to ['baseline', 'ground_truth_attribute'].

Returns:

Tuple containing the sample and its label.

Return type:

tuple

generate_model() → torch.nn.Module

Generates a neural network model using the given propositional formula and atoms.

Returns: A neural network model tailored to the dataset’s propositional formula.
Return type: model.PropFormulaNN

create_ground_truth() → torch.Tensor

property default_metric: Callable

The default metric for evaluating the performance of explanation methods applied to this dataset.

For this dataset, the default metric is the infidelity metric with the default perturb function.

Returns:

A class that wraps around the default metric, to be instantiated within the pipeline.

Return type:

type

datagenerator.boolean.data