Interpretability Guarantees with Merlin-Arthur Classifiers (Q657)

From MaRDI portal

Language: English
Label: Interpretability Guarantees with Merlin-Arthur Classifiers
Description: Software published at Zenodo repository.
Also known as: (none)

    Statements

    27 February 2024
    MaRDI profile type: MaRDI software profile
    The repository provides the codebase for the Merlin-Arthur Classifiers, a novel multi-agent framework designed to enhance interpretability in machine learning models. Inspired by the Merlin-Arthur protocol from interactive proof systems, this project introduces a method to ensure interpretability guarantees, as detailed in our AISTATS 2024 paper, Interpretability Guarantees with Merlin-Arthur Classifiers. The approach is tested on the MNIST and UCI Census datasets, employing a verifier (Arthur) and two provers (Merlin and Morgana) in a setup that mimics a min-max game to refine classification outcomes. Our objective is to contribute to the development of interpretable AI systems, providing a toolkit for researchers and practitioners to replicate our experiments, engage with our methodology, and extend it to new contexts.

    The repository includes comprehensive guidance on setup, usage, and customization for various datasets and training modes. Getting Started involves cloning the repository, setting up the Conda environment with the necessary dependencies, and initializing wandb for experiment tracking.

    Basic Usage outlines steps for regular and Merlin-Arthur training on supported datasets, with examples for different configurations and advanced features. Regular training examples for MNIST and UCI Census datasets demonstrate how to customize training parameters, while Merlin-Arthur training provides a template for engaging in the strategic min-max game that characterizes our interpretability-enhancing methodology.

    Advanced Features detail customization options for loss functions, optimization techniques, and regularization, enabling researchers to fine-tune the training process according to their specific needs. This repository is intended as a collaborative platform for advancing interpretability in AI, and we welcome contributions, feedback, and partnerships from the broader community.
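
    A minimal sketch of how one training step of the min-max game described above might look in PyTorch, assuming hypothetical verifier and prover modules (arthur, merlin, morgana), masking by element-wise product, and a gamma weight for the adversarial term; these names and the wandb.log call are illustrative assumptions, not the repository's documented interface.

        # Hypothetical sketch of one Merlin-Arthur training step; names are
        # illustrative, not the repository's actual API.
        import torch.nn.functional as F
        import wandb

        def merlin_arthur_step(arthur, merlin, morgana, x, y, opt_arthur, gamma=1.0):
            # Merlin (cooperative prover): a sparse feature mask meant to reveal the true class.
            mask_coop = merlin(x, y).detach()
            # Morgana (adversarial prover): a mask chosen to mislead Arthur; both provers
            # are assumed to be updated in their own, separate optimisation steps.
            mask_adv = morgana(x, y).detach()

            # Arthur (verifier) is trained to classify correctly from Merlin's features
            # (completeness) and to resist Morgana's features (soundness).
            loss = F.cross_entropy(arthur(x * mask_coop), y) \
                 + gamma * F.cross_entropy(arthur(x * mask_adv), y)

            opt_arthur.zero_grad()
            loss.backward()
            opt_arthur.step()

            wandb.log({"train/arthur_loss": loss.item()})  # experiment tracking via wandb
            return loss.item()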
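    The distinction between the two training modes mentioned under Basic Usage can be pictured as a small configuration switch; the dictionaries below are a hypothetical illustration, and the keys and values are not the repository's actual command-line flags.

        # Hypothetical configuration sketch for the two training modes; keys and
        # values are illustrative only.
        regular_config = {
            "dataset": "mnist",          # or "uci_census"
            "mode": "regular",           # plain classifier training
            "epochs": 20,
            "learning_rate": 1e-3,
        }

        merlin_arthur_config = {
            "dataset": "mnist",
            "mode": "merlin_arthur",     # min-max game between Arthur and the provers
            "epochs": 20,
            "learning_rate": 1e-3,
            "mask_size": 64,             # number of features a prover may reveal (assumed knob)
            "gamma": 1.0,                # weight of the adversarial (Morgana) term (assumed knob)
        }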

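    For the Advanced Features, one common way to combine a classification loss with a regularizer on the prover's mask is sketched below; the merlin_objective helper and the l1_weight knob are assumptions for illustration only, not the repository's documented options.

        # Hypothetical sketch of a regularised prover objective of the kind the
        # Advanced Features describe: Merlin should convince Arthur of the true
        # class while keeping its mask (the explanation) small.
        import torch.nn.functional as F

        def merlin_objective(arthur, merlin, x, y, l1_weight=0.01):
            mask = merlin(x, y)                                # soft mask with entries in [0, 1]
            class_loss = F.cross_entropy(arthur(x * mask), y)  # convince the verifier of class y
            sparsity = mask.abs().mean()                       # L1 penalty keeps the certificate small
            return class_loss + l1_weight * sparsity
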
    Identifiers
