Response Comparison Stage (Stage 6)
Overview
The Response Comparison stage implements ensemble diversification techniques to compare and combine multiple response candidates, producing an integrated response that balances quality and diversity. This stage follows the Response Scoring stage (Stage 5) and precedes the Threshold Verification stage (Stage 7).
Components
1. Response Comparison Service
Purpose: Orchestrates the entire response comparison process by coordinating the ensemble diversifier, diversity calculator, quality-diversity optimizer, and response combiner.
Key Functions:
- `process(prompt, context)`: Processes and compares multiple response candidates, returning an optimized combined response
- `refine(refinement_prompt, context, previous_output)`: Refines a combined response based on feedback, tracking changes
- `_process_single_response(primary_response, evaluation_metrics)`: Processes a single response when no alternatives are available
- `_generate_alternative_responses(primary_response, context, max_alternatives)`: Generates alternative responses when none are provided
- `_track_refinement_changes(previous_output, refined_output)`: Tracks changes between previous and refined output
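The orchestration flow of `process()` can be pictured as below. This is a minimal sketch: the attribute names (`self.diversity_calculator`, `self.ensemble_diversifier`, and so on) and the config keys are assumptions based on the components and configuration described in this document, not the exact implementation.

```python
def process(self, prompt, context):
    """Sketch: coordinate the four components into one comparison pass."""
    primary = context.get("stage_outputs", {}).get("solution_generation", {})
    metrics = context.get("stage_outputs", {}).get("response_scoring", {})
    alternatives = context.get("alternative_responses", [])

    # Fall back to generating alternatives, then to the single-response path.
    if not alternatives and self.config["enable_alternative_generation"]:
        alternatives = self._generate_alternative_responses(
            primary, context, self.config["max_alternatives"])
    if not alternatives:
        return self._process_single_response(primary, metrics)

    # Pairwise diversity scores feed both diversification and optimization.
    diversity_scores = self.diversity_calculator.calculate_diversity(
        primary, alternatives)
    ensemble = self.ensemble_diversifier.diversify(
        primary, alternatives, diversity_scores,
        self.config["alpha_parameter"])
    components = self.quality_diversity_optimizer.optimize(
        ensemble, metrics, diversity_scores)
    return self.response_combiner.combine(components, primary, metrics)
```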
2. Ensemble Diversifier
Purpose: Implements ensemble diversification to create an optimal ensemble of response candidates, balancing between quality and diversity.
Key Functions:
- `diversify(primary_response, alternative_responses, diversity_scores, alpha)`: Creates a diversified ensemble of responses
- `_extract_quality_scores(primary_response, alternative_responses)`: Extracts quality scores for all candidates
- `_apply_greedy_diversification(candidates, quality_scores, diversity_scores, alpha)`: Applies the greedy diversification algorithm
- `_apply_mmr_diversification(candidates, quality_scores, diversity_scores, alpha)`: Applies the Maximal Marginal Relevance (MMR) algorithm
3. Diversity Calculator
Purpose: Computes pairwise diversity scores between response candidates, measuring how different they are across multiple dimensions.
Key Functions:
- `calculate_diversity(primary_response, alternative_responses)`: Calculates diversity metrics between response candidates
- `_calculate_content_diversity(response1, response2)`: Measures how different the actual content is
- `_calculate_structure_diversity(response1, response2)`: Measures how differently the content is organized
- `_calculate_emphasis_diversity(response1, response2)`: Measures how differently the responses emphasize information
- `_tokenize(text)`: Tokenizes text into words for comparison
- `_create_ngrams(tokens, n)`: Creates n-grams from tokens for content comparison
- `_calculate_type_distribution(elements)`: Calculates the distribution of element types
- `_calculate_distribution_difference(dist1, dist2)`: Calculates the difference between distributions
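As a concrete illustration, content diversity can be computed as one minus the Jaccard overlap of n-gram sets, with the three dimensions then blended by the configured weights. The bigram choice and the blending formula below are assumptions for illustration, not the exact implementation:

```python
import re

def _tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def _create_ngrams(tokens: list[str], n: int = 2) -> set[tuple[str, ...]]:
    """Build the set of n-grams (bigrams by default) from a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def calculate_content_diversity(text1: str, text2: str) -> float:
    """Content diversity as 1 - Jaccard similarity of the n-gram sets."""
    a, b = _create_ngrams(_tokenize(text1)), _create_ngrams(_tokenize(text2))
    if not a and not b:
        return 0.0  # two empty texts are identical, not diverse
    return 1.0 - len(a & b) / len(a | b)

def combined_diversity(content: float, structure: float, emphasis: float,
                       weights: dict) -> float:
    """Blend the three dimensions with the configured weights (sum to 1)."""
    return (weights["content_weight"] * content
            + weights["structure_weight"] * structure
            + weights["emphasis_weight"] * emphasis)
```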
4. Quality-Diversity Optimizer
Purpose: Optimizes the balance between quality and diversity in the response ensemble using Pareto optimization.
Key Functions:
- `optimize(ensemble, evaluation_metrics, diversity_scores)`: Optimizes the quality-diversity trade-off
- `get_trade_off_metrics()`: Returns metrics from the last optimization
- `_extract_quality_scores(ensemble, evaluation_metrics)`: Extracts quality scores for candidates
- `_calculate_candidate_diversity(ensemble, pairwise_diversity)`: Calculates a diversity score for each candidate
- `_identify_pareto_optimal(ensemble, quality_scores, diversity_scores)`: Identifies Pareto-optimal components
- `_select_optimal_elements(ensemble, pareto_optimal, quality_scores, diversity_scores)`: Selects optimal elements
- `_calculate_component_weights(ensemble, quality_scores, diversity_scores, pareto_optimal)`: Calculates weights for components
- `_extract_elements(response)`: Extracts and copies elements from a response
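One plausible weighting scheme, consistent with the `quality_weight` and `diversity_weight` values in the Configuration section below, is to normalise a blended score over the Pareto-optimal candidates. The exact formula is an assumption:

```python
def calculate_component_weights(quality_scores, diversity_scores,
                                pareto_optimal, quality_weight=0.6,
                                diversity_weight=0.4):
    """Normalise blended quality/diversity scores over Pareto-optimal picks."""
    raw = {i: quality_weight * quality_scores[i]
              + diversity_weight * diversity_scores[i]
           for i in pareto_optimal}
    total = sum(raw.values())
    if total == 0:
        # Degenerate case: all scores zero; fall back to uniform weights.
        return {i: 1.0 / len(raw) for i in raw}
    return {i: v / total for i, v in raw.items()}
```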
5. Response Combiner
Purpose: Combines optimized components from multiple response candidates into a coherent, integrated response.
Key Functions:
- `combine(optimized_components, primary_response, evaluation_metrics)`: Combines optimized components
- `_create_base_response(primary_response)`: Creates a base response structure from the primary response
- `_restructure_sections(optimized_elements, primary_sections, pareto_indices)`: Restructures elements into sections
- `_add_to_existing_sections(elements, sections)`: Adds elements to existing sections based on relevance
- `_create_new_section(elements, sections, title_prefix)`: Creates a new section for a group of elements
- `_calculate_section_match(element, section_info, section)`: Calculates element-section match scores
- `_ensure_unique_section_titles(sections)`: Ensures all section titles are unique
- `_count_components(elements)`: Counts elements from each source response
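For instance, component counting and the `primary_contribution_ratio` reported in the output (see Example Usage) could be derived as follows, assuming each element carries a `source` key naming the response it came from; that schema is an assumption:

```python
from collections import Counter

def count_components(elements: list[dict]) -> Counter:
    """Tally how many elements each source response contributed."""
    return Counter(el.get("source", "primary") for el in elements)

def primary_contribution_ratio(elements: list[dict]) -> float:
    """Fraction of the combined elements taken from the primary response."""
    counts = count_components(elements)
    total = sum(counts.values())
    return counts["primary"] / total if total else 1.0
```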
Process Flow
1. Input Processing
   - Receive the primary response and its evaluation metrics
   - Check for alternative candidates
   - Generate alternatives if needed
   - Prepare the comparison context
2. Diversity Calculation
   - Compute pairwise diversity scores
   - Measure content diversity
   - Assess structural differences
   - Evaluate emphasis variations
3. Ensemble Diversification
   - Create an optimal candidate ensemble
   - Balance quality and diversity
   - Apply the configured diversification algorithm
   - Always include the primary response
4. Quality-Diversity Optimization
   - Find optimal trade-offs
   - Apply Pareto optimization
   - Select optimal components
   - Calculate component weights
5. Response Combination
   - Integrate optimized components
   - Preserve response structure
   - Organize content into sections
   - Ensure coherence
6. Output Generation
   - Combine all metrics
   - Calculate contribution ratios
   - Track component sources
   - Prepare the final response
Key Algorithms
Greedy Diversification
The greedy algorithm iteratively selects candidates that maximize marginal contribution to the ensemble:
1. Start with the primary response.
2. For each subsequent selection, choose the candidate that maximizes:

   `score = quality * alpha + average_diversity * (1 - alpha)`
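A minimal sketch of the selection loop, assuming a symmetric `diversity[i][j]` matrix with values in [0, 1] and candidate 0 as the primary response:

```python
def greedy_diversify(candidates, quality, diversity, alpha=0.7, k=3):
    """Greedily grow the ensemble by marginal quality-plus-diversity score."""
    selected = [0]                      # primary response is always first
    remaining = set(range(1, len(candidates)))
    while remaining and len(selected) < k:
        def marginal(i):
            # Average diversity of candidate i against everything picked so far.
            avg_div = sum(diversity[i][j] for j in selected) / len(selected)
            return quality[i] * alpha + avg_div * (1 - alpha)
        best = max(remaining, key=marginal)
        selected.append(best)
        remaining.remove(best)
    return [candidates[i] for i in selected]
```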
Maximal Marginal Relevance (MMR)
The MMR algorithm balances relevance and novelty in each selection:
1. Start with the primary response.
2. For each subsequent selection, choose the candidate that maximizes:

   `MMR = quality * alpha - max_similarity * (1 - alpha)`
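Taking similarity as one minus pairwise diversity, the same loop becomes (a sketch under the same assumptions as above):

```python
def mmr_diversify(candidates, quality, diversity, alpha=0.7, k=3):
    """Select by MMR: reward quality, penalise closeness to prior picks."""
    selected = [0]                      # primary response is always first
    remaining = set(range(1, len(candidates)))
    while remaining and len(selected) < k:
        def mmr(i):
            # Penalise by the closest already-selected candidate.
            max_sim = max(1.0 - diversity[i][j] for j in selected)
            return quality[i] * alpha - max_sim * (1 - alpha)
        best = max(remaining, key=mmr)
        selected.append(best)
        remaining.remove(best)
    return [candidates[i] for i in selected]
```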
Pareto Optimization
Pareto optimization identifies candidates that are not dominated in quality-diversity space:

- A candidate is Pareto-optimal if no other candidate is at least as good in both quality and diversity and strictly better in at least one
- The primary response is always included in the Pareto-optimal set
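A straightforward O(n²) dominance check captures this rule; index 0 (the primary response) is kept unconditionally, as stated above:

```python
def identify_pareto_optimal(quality, diversity):
    """Return indices of candidates not dominated in (quality, diversity)."""
    n = len(quality)
    pareto = []
    for i in range(n):
        # j dominates i if it is at least as good on both axes and
        # strictly better on at least one.
        dominated = any(
            quality[j] >= quality[i] and diversity[j] >= diversity[i]
            and (quality[j] > quality[i] or diversity[j] > diversity[i])
            for j in range(n) if j != i)
        if i == 0 or not dominated:
            pareto.append(i)
    return pareto
```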
Configuration
The stage can be configured through various parameters:
```json
{
  "alpha_parameter": 0.7,
  "enable_alternative_generation": true,
  "max_alternatives": 5,
  "ensemble_diversifier": {
    "min_ensemble_size": 3,
    "max_ensemble_size": 7,
    "diversity_threshold": 0.3,
    "algorithm": "mmr"
  },
  "diversity_calculator": {
    "content_weight": 0.4,
    "structure_weight": 0.3,
    "emphasis_weight": 0.3
  },
  "quality_diversity_optimizer": {
    "quality_weight": 0.6,
    "diversity_weight": 0.4,
    "pareto_threshold": 0.05
  },
  "response_combiner": {
    "max_elements": 50,
    "max_sections": 10,
    "preserve_section_order": true
  }
}
```
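A short example of loading this configuration and sanity-checking it; the file name is hypothetical, and the stage may equally receive its configuration in-process:

```python
import json
import math

# Load the stage configuration (file name is illustrative).
with open("response_comparison_config.json") as f:
    config = json.load(f)

# The diversity dimension weights should form a convex blend.
w = config["diversity_calculator"]
assert math.isclose(
    w["content_weight"] + w["structure_weight"] + w["emphasis_weight"], 1.0)

# alpha_parameter trades quality against diversity:
# values near 1 favour quality, values near 0 favour diversity.
alpha = config["alpha_parameter"]
```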
Best Practices
- Ensemble Management
  - Maintain diverse candidates
  - Balance quality and diversity
  - Track source contributions
  - Monitor ensemble size
- Quality Control
  - Preserve primary content
  - Validate combinations
  - Check coherence
  - Ensure relevance
- Performance Optimization
  - Cache diversity scores
  - Optimize algorithms
  - Monitor processing time
  - Track resource usage
- Integration
  - Maintain metadata
  - Support refinement
  - Track metrics
  - Document decisions
Integration with Orchestrator
The Response Comparison stage implements the `AbstractPipelineStage` interface to seamlessly integrate with the metacognitive orchestrator:
- It exposes a standard `process()` method for comparison
- It supports a `refine()` method for re-evaluation after refinement
- It maintains metrics about its processing for monitoring
- It has a unique `stage_id` of `"response_comparison"`
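The interface shape might look like the following; the exact signatures are assumptions inferred from the methods described above:

```python
from abc import ABC, abstractmethod

class AbstractPipelineStage(ABC):
    """Assumed shape of the interface each pipeline stage implements."""

    stage_id: str  # e.g. "response_comparison"

    @abstractmethod
    def process(self, prompt: str, context: dict) -> dict:
        """Run the stage once over the current pipeline context."""

    @abstractmethod
    def refine(self, refinement_prompt: str, context: dict,
               previous_output: dict) -> dict:
        """Re-run the stage against feedback on a previous output."""
```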
Example Usage
```python
# Extract the primary response and evaluation metrics from previous stages
primary_response = context.get("stage_outputs", {}).get("solution_generation", {})
evaluation_metrics = context.get("stage_outputs", {}).get("response_scoring", {})

# Perform response comparison
comparison_result = response_comparison_service.process(prompt, context)

# Access the combined response
combined_response = comparison_result.get("content", {})
primary_contribution = comparison_result.get("primary_contribution_ratio", 1.0)
diversity_metrics = comparison_result.get("diversity_metrics", {})

# Forward to threshold verification. Copy stage_outputs as well, since
# dict.copy() is shallow and mutating the nested dict would otherwise
# alter the original context.
verification_context = context.copy()
verification_context["stage_outputs"] = dict(context.get("stage_outputs", {}))
verification_context["stage_outputs"]["response_comparison"] = comparison_result
```