# Jury Mechanism

ARTEMIS uses a multi-perspective jury system instead of a single evaluator, producing more balanced verdicts and more transparent decision-making.
## Why a Jury?

Single-evaluator systems have limitations:

- **Bias**: one perspective dominates
- **Opacity**: decisions are hard to understand
- **Inconsistency**: results vary unpredictably

A jury addresses these by:

- **Multiple perspectives**: different viewpoints are considered
- **Deliberation**: jurors evaluate arguments independently, and their evaluations are then aggregated
- **Transparency**: each juror's evaluation is explained
## JuryPanel

The `JuryPanel` class manages multiple jury members:

```python
from artemis.core.jury import JuryPanel

panel = JuryPanel(
    evaluators=5,             # Number of jury members
    model="gpt-4o",           # Model for jurors
    consensus_threshold=0.7,  # Required agreement (0-1)
)
```
### JuryPanel Options

| Option | Type | Default | Description |
|---|---|---|---|
| `evaluators` | `int` | `3` | Number of jury members |
| `model` | `str` | `"gpt-4o"` | Default model for jurors |
| `models` | `list[str]` | `None` | Per-juror model list |
| `jurors` | `list[JurorConfig]` | `None` | Full juror configuration |
| `consensus_threshold` | `float` | `0.7` | Required agreement (0-1) |
| `criteria` | `list[str]` | default | Evaluation criteria |
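Since every option has a default, the simplest panel takes no arguments at all (a minimal sketch, assuming the defaults listed above):

```python
# Defaults per the table above: 3 jurors, "gpt-4o", threshold 0.7
panel = JuryPanel()
```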
### Per-Juror Model Configuration

You can assign a different model to each juror for more diverse evaluation perspectives.

**Option A: Model List**

A simple list of models distributed across jurors:

```python
jury = JuryPanel(
    evaluators=3,
    models=["gpt-4o", "claude-sonnet-4-20250514", "gemini-2.0-flash"],
)
```

If there are fewer models than evaluators, the models cycle:

```python
# 5 jurors with 2 models = gpt-4o, claude, gpt-4o, claude, gpt-4o
jury = JuryPanel(
    evaluators=5,
    models=["gpt-4o", "claude-sonnet-4-20250514"],
)
```
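The cycling is a simple round-robin; a sketch of the assignment rule (illustrative only, not the library's internals):

```python
models = ["gpt-4o", "claude-sonnet-4-20250514"]
evaluators = 5

# Round-robin: juror i gets models[i % len(models)]
assignment = [models[i % len(models)] for i in range(evaluators)]
print(assignment)
# ['gpt-4o', 'claude-sonnet-4-20250514', 'gpt-4o',
#  'claude-sonnet-4-20250514', 'gpt-4o']
```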
**Option B: JurorConfig Objects**

Full control over each juror's perspective, model, and criteria:

```python
from artemis.core.types import JurorConfig, JuryPerspective

jury = JuryPanel(
    jurors=[
        JurorConfig(
            perspective=JuryPerspective.ANALYTICAL,
            model="gpt-4o",
            criteria=["logical_consistency", "evidence_strength"],
        ),
        JurorConfig(
            perspective=JuryPerspective.ETHICAL,
            model="claude-sonnet-4-20250514",
            criteria=["ethical_alignment", "fairness"],
        ),
        JurorConfig(
            perspective=JuryPerspective.PRACTICAL,
            model="gemini-2.0-flash",
            criteria=["feasibility", "real_world_impact"],
        ),
    ],
    consensus_threshold=0.7,
)
```
### JurorConfig Fields

| Field | Type | Required | Description |
|---|---|---|---|
| `perspective` | `JuryPerspective` | Yes | Juror's evaluation perspective |
| `model` | `str` | Yes | Model identifier |
| `criteria` | `list[str]` | No | Custom criteria for this juror |
| `api_key` | `str` | No | API key override for this juror |
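The `api_key` override is useful when a panel mixes providers. A minimal sketch, assuming keys live in environment variables (the variable names here are illustrative):

```python
import os

from artemis.core.types import JurorConfig, JuryPerspective

# Each juror can carry its own credentials, e.g. when mixing
# providers on one panel. Env var names are illustrative.
jurors = [
    JurorConfig(
        perspective=JuryPerspective.ANALYTICAL,
        model="gpt-4o",
        api_key=os.environ["OPENAI_API_KEY"],
    ),
    JurorConfig(
        perspective=JuryPerspective.ETHICAL,
        model="claude-sonnet-4-20250514",
        api_key=os.environ["ANTHROPIC_API_KEY"],
    ),
]
```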
## Jury Perspectives

ARTEMIS includes five built-in perspectives via the `JuryPerspective` enum:

```python
from artemis.core.types import JuryPerspective

# Available perspectives
JuryPerspective.ANALYTICAL    # Focus on logic and evidence
JuryPerspective.ETHICAL       # Focus on moral implications
JuryPerspective.PRACTICAL     # Focus on feasibility and impact
JuryPerspective.ADVERSARIAL   # Challenge all arguments
JuryPerspective.SYNTHESIZING  # Find common ground
```
### Perspective Details

**Analytical**: prioritizes valid reasoning, evaluates logical consistency, values strong evidence.

**Ethical**: considers moral implications, weighs stakeholder impact, values fairness and justice.

**Practical**: focuses on feasibility, considers implementation, values real-world evidence.

**Adversarial**: questions all claims, requires strong evidence, identifies weak points.

**Synthesizing**: seeks common ground, recognizes valid points from all sides, values constructive framing.
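These perspectives can be combined deliberately. A sketch of a panel that pairs an adversarial juror with a synthesizing one, using the `JurorConfig` API shown above (the criteria strings are illustrative):

```python
from artemis.core.jury import JuryPanel
from artemis.core.types import JurorConfig, JuryPerspective

# A deliberately opposed pair: one juror attacks every claim,
# the other looks for points of agreement.
stress_test_panel = JuryPanel(
    jurors=[
        JurorConfig(
            perspective=JuryPerspective.ADVERSARIAL,
            model="gpt-4o",
            criteria=["evidence_strength", "weak_point_detection"],
        ),
        JurorConfig(
            perspective=JuryPerspective.SYNTHESIZING,
            model="gpt-4o",
            criteria=["common_ground", "constructive_framing"],
        ),
    ],
)
```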
### Automatic Perspective Assignment

When you create a `JuryPanel`, perspectives are automatically distributed among the jurors:

```python
panel = JuryPanel(evaluators=5, model="gpt-4o")

# Check assigned perspectives
for juror in panel.jurors:
    print(f"{juror.juror_id}: {juror.perspective.value}")
# juror_0: analytical
# juror_1: ethical
# juror_2: practical
# juror_3: adversarial
# juror_4: synthesizing
```
## Deliberation Process

### Flow

```mermaid
graph TD
    A[Arguments Presented] --> B[Individual Evaluation]
    B --> C[Score Computation]
    C --> D[Consensus Building]
    D --> E[Verdict Generation]
```
### Stages

1. **Individual Evaluation**: each juror evaluates the arguments independently from their own perspective
2. **Score Computation**: each juror computes scores for each agent against the evaluation criteria
3. **Consensus Building**: confidence-weighted voting determines the winner
4. **Verdict Generation**: a final verdict is produced with reasoning and any dissenting opinions

A simplified sketch of this flow follows the list below.
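The sketch illustrates stages 2-4 with a toy, confidence-weighted vote. It is not ARTEMIS's actual implementation; the `Evaluation` container and `deliberate` helper are hypothetical stand-ins:

```python
from dataclasses import dataclass

@dataclass
class Evaluation:
    """Hypothetical container for one juror's output (stages 1-2)."""
    winner: str        # the agent this juror favours
    confidence: float  # 0.0 to 1.0
    reasoning: str

def deliberate(evaluations: list[Evaluation], threshold: float = 0.7) -> str:
    """Toy consensus (stages 3-4): confidence-weighted voting."""
    # Stage 3: accumulate each juror's vote, weighted by confidence.
    weights: dict[str, float] = {}
    for ev in evaluations:
        weights[ev.winner] = weights.get(ev.winner, 0.0) + ev.confidence

    # Stage 4: the leading agent wins only if its weighted share
    # of the total meets the consensus threshold; otherwise a draw.
    total = sum(weights.values())
    winner, winner_weight = max(weights.items(), key=lambda kv: kv[1])
    return winner if winner_weight / total >= threshold else "draw"
```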
## Verdict Structure

The `Verdict` returned by deliberation includes:

```python
from artemis.core.debate import Debate

# After running a debate
result = await debate.run()
verdict = result.verdict

print(f"Decision: {verdict.decision}")      # Winner name or "draw"
print(f"Confidence: {verdict.confidence}")  # 0.0 to 1.0
print(f"Reasoning: {verdict.reasoning}")    # Explanation
print(f"Unanimous: {verdict.unanimous}")    # Whether all jurors agreed

# Score breakdown by agent
if verdict.score_breakdown:
    for agent, score in verdict.score_breakdown.items():
        print(f"  {agent}: {score:.2f}")

# Dissenting opinions
for dissent in verdict.dissenting_opinions:
    print(f"Dissent from {dissent.juror_id} ({dissent.perspective.value}):")
    print(f"  Position: {dissent.position}")
    print(f"  Reasoning: {dissent.reasoning}")
```
### Verdict Fields

| Field | Type | Description |
|---|---|---|
| `decision` | `str` | Winner name or `"draw"` |
| `confidence` | `float` | Confidence level (0-1) |
| `reasoning` | `str` | Explanation of verdict |
| `unanimous` | `bool` | Whether all jurors agreed |
| `score_breakdown` | `dict` | Scores by agent |
| `dissenting_opinions` | `list` | Dissenting juror opinions |
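Continuing from the example above, callers typically branch on the verdict; a minimal sketch using only the fields documented here:

```python
# Branch on the documented Verdict fields.
if verdict.decision == "draw":
    print("No consensus reached; consider another round or a lower threshold.")
elif verdict.unanimous:
    print(f"Unanimous verdict for {verdict.decision}.")
else:
    print(
        f"Majority verdict for {verdict.decision} "
        f"({verdict.confidence:.0%} confidence, "
        f"{len(verdict.dissenting_opinions)} dissent(s))."
    )
```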
## Using the Jury

### In Debates

```python
from artemis.core.debate import Debate
from artemis.core.jury import JuryPanel
from artemis.core.agent import Agent

# Create agents
agents = [
    Agent(
        name="pro",
        role="Advocate supporting the proposition",
        model="gpt-4o",
    ),
    Agent(
        name="con",
        role="Advocate opposing the proposition",
        model="gpt-4o",
    ),
]

# Create jury panel
jury = JuryPanel(
    evaluators=5,
    model="gpt-4o",
    consensus_threshold=0.7,
)

# Create debate with jury
debate = Debate(
    topic="Should we adopt this policy?",
    agents=agents,
    jury=jury,
)

debate.assign_positions({
    "pro": "supports the policy",
    "con": "opposes the policy",
})

result = await debate.run()
print(f"Verdict: {result.verdict.decision}")
print(f"Confidence: {result.verdict.confidence:.0%}")
```
### Custom Criteria

Panel-level criteria serve as the shared evaluation rubric; individual jurors can carry their own via `JurorConfig.criteria` (see above):

```python
# Create jury with custom evaluation criteria
jury = JuryPanel(
    evaluators=3,
    model="gpt-4o",
    criteria=[
        "argument_quality",
        "evidence_strength",
        "logical_consistency",
        "persuasiveness",
        "ethical_alignment",
    ],
)
```
## JuryMember

Individual jurors can be accessed and examined:

```python
from artemis.core.jury import JuryPanel, JuryMember
from artemis.core.types import JuryPerspective

panel = JuryPanel(evaluators=3, model="gpt-4o")

# Access individual jurors
for juror in panel.jurors:
    print(f"ID: {juror.juror_id}")
    print(f"Perspective: {juror.perspective.value}")
    print(f"Criteria: {juror.criteria}")

# Get a specific juror
juror = panel.get_juror("juror_0")
if juror:
    print(f"Found: {juror.juror_id}")
```
## Consensus Calculation

The jury uses weighted voting to reach consensus:

1. Each juror evaluates the arguments and picks a winner
2. Votes are weighted by each juror's confidence
3. An agreement score is calculated from the weighted votes
4. If the agreement score falls below the threshold, the verdict is a "draw"

```python
# Consensus threshold affects the verdict
panel = JuryPanel(
    evaluators=5,
    consensus_threshold=0.6,  # Lower threshold = easier consensus
)

# A higher threshold requires stronger agreement
strict_panel = JuryPanel(
    evaluators=5,
    consensus_threshold=0.9,  # Requires near-unanimous agreement
)
```
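Reusing the toy `deliberate` helper from the Stages section, the effect of the threshold is easy to demonstrate (the votes and numbers are illustrative):

```python
votes = [
    Evaluation(winner="pro", confidence=0.9, reasoning="..."),
    Evaluation(winner="pro", confidence=0.6, reasoning="..."),
    Evaluation(winner="con", confidence=0.8, reasoning="..."),
]

# pro's weighted share = (0.9 + 0.6) / (0.9 + 0.6 + 0.8) ≈ 0.65
print(deliberate(votes, threshold=0.6))  # "pro":  0.65 >= 0.6
print(deliberate(votes, threshold=0.9))  # "draw": 0.65 <  0.9
```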
## Benefits of the Jury System

### 1. Reduced Bias

Multiple perspectives prevent any single viewpoint from dominating.

### 2. Transparent Decisions

Each juror's evaluation and reasoning is recorded.

### 3. Robust Verdicts

The consensus-based approach improves decision quality.

### 4. Explainable Results

Dissenting opinions provide insight into alternative viewpoints.
## Next Steps

- Learn about L-AE-CR Evaluation, which feeds jury scores
- Understand H-L-DAG Arguments, which juries evaluate
- Explore the Ethics Module for ethical jury perspectives