Ethics Module

ARTEMIS includes an ethics module that keeps debates within ethical boundaries and evaluates arguments for moral soundness.

Overview

The ethics module operates at three levels:

  1. Evaluation: Ethical criteria are weighted in argument scoring
  2. Monitoring: EthicsGuard detects ethical boundary violations
  3. Jury Perspective: A dedicated ethical jury perspective weighs the moral implications of arguments
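
These levels map onto components that appear throughout this page:

# Each level corresponds to a component covered in detail below
from artemis.core.types import EvaluationCriteria, JuryPerspective  # 1. evaluation, 3. jury perspective
from artemis.safety import EthicsGuard, EthicsConfig                # 2. monitoring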

Ethical Principles

ARTEMIS is built on core ethical principles:

  • Fairness: Arguments shouldn't discriminate or show bias
  • Transparency: Reasoning should be clear and honest
  • Non-harm: Arguments shouldn't advocate for harmful actions
  • Respect: Maintain respect for persons and values
  • Accuracy: Claims should be truthful and verifiable

Ethics Guard

The EthicsGuard monitors arguments for ethical violations:

from artemis.core.debate import Debate
from artemis.safety import EthicsGuard, MonitorMode, EthicsConfig

# Configure ethics guard
ethics_config = EthicsConfig(
    harmful_content_threshold=0.3,
    bias_threshold=0.4,
    fairness_threshold=0.3,
    enabled_checks=["harmful_content", "bias", "fairness"],
)

guard = EthicsGuard(
    mode=MonitorMode.PASSIVE,
    config=ethics_config,
)

# Use in a debate (agents defined as in the examples below)
debate = Debate(
    topic="Your topic",
    agents=agents,
    safety_monitors=[guard.process],
)

EthicsConfig Options

  • harmful_content_threshold (float, default 0.5): Threshold for harmful content detection
  • bias_threshold (float, default 0.4): Threshold for bias detection
  • fairness_threshold (float, default 0.3): Threshold for fairness violations
  • enabled_checks (list[str]): Which checks to enable; uses the default check set when not specified

Threshold Levels

from artemis.safety import EthicsConfig

# Low sensitivity - only severe violations
low_config = EthicsConfig(harmful_content_threshold=0.7)

# Medium sensitivity - most violations
medium_config = EthicsConfig(harmful_content_threshold=0.5)

# High sensitivity - strict enforcement
high_config = EthicsConfig(harmful_content_threshold=0.3)
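
Lower thresholds flag more content, so a smaller value means stricter enforcement. The guard's internals aren't shown here, but the implied threshold semantics can be sketched in plain Python (the detection score and helper below are illustrative, not part of the ARTEMIS API):

# Illustration only: not EthicsGuard's implementation, just the threshold
# semantics implied above (lower threshold => more content gets flagged)
def violates(detection_score: float, threshold: float) -> bool:
    return detection_score >= threshold

score = 0.45  # hypothetical harmful-content score for one argument
print(violates(score, 0.7))  # low sensitivity  -> False
print(violates(score, 0.3))  # high sensitivity -> True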

Ethical Evaluation in L-AE-CR

Arguments receive an ethics score as part of the adaptive evaluation:

from artemis.core.types import EvaluationCriteria

# Ethical alignment is one of the default criteria
criteria = EvaluationCriteria(
    logical_coherence=0.25,
    evidence_quality=0.25,
    causal_reasoning=0.20,
    ethical_alignment=0.15,  # Ethics weight
    persuasiveness=0.15,
)

What Ethical Evaluation Considers

  • Claim Fairness: Are claims fair to all parties?
  • Evidence Ethics: Is evidence used responsibly?
  • Conclusion Ethics: Are conclusions ethically sound?
  • Stakeholder Impact: Who is affected and how?
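
The criteria weights above imply a simple linear combination, so a weak ethics score pulls the overall score down in proportion to its weight. Purely as an illustration (not ARTEMIS's internal L-AE-CR scoring), the combination looks like this:

# Illustration only: weighted combination using the EvaluationCriteria weights above
weights = {
    "logical_coherence": 0.25,
    "evidence_quality": 0.25,
    "causal_reasoning": 0.20,
    "ethical_alignment": 0.15,
    "persuasiveness": 0.15,
}
scores = {  # hypothetical per-criterion scores for one argument
    "logical_coherence": 0.8,
    "evidence_quality": 0.7,
    "causal_reasoning": 0.6,
    "ethical_alignment": 0.4,  # weak ethical alignment drags the total down
    "persuasiveness": 0.9,
}
overall = sum(weights[k] * scores[k] for k in weights)
print(f"Overall score: {overall:.2f}")  # 0.69 with these numbers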

Ethical Jury Perspective

The jury includes an ethical perspective:

from artemis.core.types import JuryPerspective

# The ETHICAL perspective focuses on moral implications
JuryPerspective.ETHICAL

When creating a jury panel, the ethical perspective is automatically included:

from artemis.core.jury import JuryPanel

# Create panel - ethical perspective is assigned to one juror
panel = JuryPanel(evaluators=5, model="gpt-4o")

# The ethical juror focuses on:
# - Consideration of ethical principles
# - Attention to stakeholder welfare
# - Fairness and justice concerns
# - Long-term societal impact

Handling Ethical Debates

ARTEMIS can handle debates on complex ethical topics:

import asyncio
from artemis.core.agent import Agent
from artemis.core.debate import Debate
from artemis.core.jury import JuryPanel

async def ethical_debate():
    # Create agents for ethical debate
    agents = [
        Agent(
            name="utilitarian",
            role="Advocate arguing from consequentialist ethics",
            model="gpt-4o",
        ),
        Agent(
            name="deontologist",
            role="Advocate arguing from duty-based ethics",
            model="gpt-4o",
        ),
    ]

    # Create jury - will include ethical perspective
    jury = JuryPanel(evaluators=5, model="gpt-4o")

    debate = Debate(
        topic="Should autonomous vehicles prioritize passenger or pedestrian safety?",
        agents=agents,
        jury=jury,
        rounds=3,
    )

    debate.assign_positions({
        "utilitarian": "maximize overall welfare in collision scenarios",
        "deontologist": "respect individual rights regardless of outcomes",
    })

    result = await debate.run()
    return result

asyncio.run(ethical_debate())

Safety Integration

Ethics monitoring integrates with other safety monitors:

from artemis.core.debate import Debate
from artemis.safety import (
    SandbagDetector,
    DeceptionMonitor,
    EthicsGuard,
    MonitorMode,
    EthicsConfig,
)

# Configure all monitors
sandbag = SandbagDetector(mode=MonitorMode.PASSIVE, sensitivity=0.7)
deception = DeceptionMonitor(mode=MonitorMode.PASSIVE, sensitivity=0.6)
ethics = EthicsGuard(
    mode=MonitorMode.PASSIVE,
    config=EthicsConfig(harmful_content_threshold=0.3),
)

# Use all monitors together
debate = Debate(
    topic="Your topic",
    agents=agents,
    safety_monitors=[
        sandbag.process,
        deception.process,
        ethics.process,
    ],
)

Safety Alerts

When ethical violations are detected, they appear in the debate results:

result = await debate.run()

# Check for ethical safety alerts
for alert in result.safety_alerts:
    if "ethics" in alert.type.lower():
        print(f"Ethics alert: {alert.type}")
        print(f"Severity: {alert.severity:.0%}")
        print(f"Agent: {alert.agent}")

Best Practices

  1. Set Appropriate Sensitivity: Match sensitivity to the debate context (see the sketch after this list)
  2. Define Clear Principles: Be explicit about ethical boundaries
  3. Include Ethical Perspectives: Use jury to consider moral implications
  4. Enable Transparency: Make ethics decisions explainable
  5. Review Edge Cases: Some arguments may need human review
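
As a sketch of the first practice, sensitivity can be chosen per debate context. Only EthicsConfig and harmful_content_threshold (both shown earlier) are used here; the context names and mapping are hypothetical:

from artemis.safety import EthicsConfig

# Hypothetical context names; thresholds mirror the sensitivity levels shown earlier
CONTEXT_CONFIGS = {
    "internal_review": EthicsConfig(harmful_content_threshold=0.7),  # only severe violations
    "academic_debate": EthicsConfig(harmful_content_threshold=0.5),  # most violations
    "public_facing": EthicsConfig(harmful_content_threshold=0.3),    # strict enforcement
}

config = CONTEXT_CONFIGS["public_facing"]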

Ethical Frameworks

When debating ethical topics, different ethical frameworks provide different perspectives:

  • Utilitarian: Focuses on the greatest good; values outcomes and consequences
  • Deontological: Focuses on rules and duties; values principles and rights
  • Virtue Ethics: Focuses on character; values intentions and virtues
  • Care Ethics: Focuses on relationships; values context and relational ties

These frameworks can be represented by different agents in a debate:

agents = [
    Agent(
        name="consequentialist",
        role="Argues from utilitarian perspective focusing on outcomes",
        model="gpt-4o",
    ),
    Agent(
        name="deontologist",
        role="Argues from duty-based perspective focusing on principles",
        model="gpt-4o",
    ),
    Agent(
        name="virtue_ethicist",
        role="Argues from virtue ethics focusing on character",
        model="gpt-4o",
    ),
]

Next Steps