Q/A Generation
Question/Answer pair generation from statistical insights.
Generator
Main Q/A generation engine with template-based and LLM-powered approaches.
Q/A pair generation from statistical insights.
Converts facts into multiple question/answer pairs using:
1. Template-based generation
2. LLM paraphrasing and augmentation
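As a rough illustration of the template-based path (item 1), the sketch below fills question templates from an insight dictionary. It does not call statqa: the `TEMPLATES` table, the `variable` key, the `fill_templates` helper, and the sample answer are all hypothetical, and the library's real templates may look quite different.

```python
from typing import Any

# Hypothetical templates; statqa's actual wording is not shown in this reference.
TEMPLATES: dict[str, list[str]] = {
    "descriptive": [
        "What is the typical value of {variable}?",
        "How would you summarize {variable}?",
    ],
}


def fill_templates(
    insight: dict[str, Any], answer: str, question_type: str
) -> list[dict[str, str]]:
    """Return one Q/A pair per template registered for question_type."""
    if question_type not in TEMPLATES:
        raise ValueError(f"Unsupported question type: {question_type}")
    return [
        {
            "question": template.format(**insight),
            "answer": answer,
            "type": question_type,
        }
        for template in TEMPLATES[question_type]
    ]


# Made-up insight and answer, for illustration only.
pairs = fill_templates({"variable": "age"}, "The mean age is 41.2 years.", "descriptive")
```

Every generated pair shares the same answer, so a single fact yields several differently worded questions. An LLM paraphrasing step (item 2) would then add further variants of each question.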
-
class statqa.qa.generator.QAGenerator(use_llm=False, llm_provider='openai', llm_model=None, api_key=None, paraphrase_count=2)
Bases: object
Generates Q/A pairs from statistical insights.
- Parameters:
use_llm (bool) – Whether to use LLM for paraphrasing
llm_provider (Literal['openai', 'anthropic']) – LLM provider (‘openai’ or ‘anthropic’)
llm_model (str | None) – Model name
api_key (str | None) – API key for LLM
paraphrase_count (int) – Number of paraphrased versions per question
- Raises:
-
-
export_qa_dataset(qa_results, output_format='jsonl')
Export Q/A pairs in format suitable for LLM fine-tuning.
- Parameters:
qa_results (list[dict[str, Any]]) – Results from generate_batch
output_format (str) – ‘jsonl’, ‘openai’, or ‘anthropic’
- Return type:
list[str]
- Returns:
List of formatted strings (one per line for JSONL)
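To make the 'jsonl' output shape concrete, here is a minimal sketch of flattening results into one JSON string per Q/A pair. It is not statqa's implementation: the `to_jsonl_lines` helper and the choice of record fields are assumptions, and the real export may include other keys or use a different layout for the 'openai' and 'anthropic' formats.

```python
import json
from typing import Any


def to_jsonl_lines(qa_results: list[dict[str, Any]]) -> list[str]:
    """Flatten each insight's 'qa_pairs' into one JSON string per pair."""
    lines = []
    for result in qa_results:
        for pair in result.get("qa_pairs", []):
            # Assumed record layout: only the question and answer are kept.
            record = {"question": pair["question"], "answer": pair["answer"]}
            lines.append(json.dumps(record, ensure_ascii=False))
    return lines


# Made-up input in the shape described for generate_batch results.
results = [
    {"qa_pairs": [{"question": "What is the mean of age?", "answer": "41.2"}]},
]
lines = to_jsonl_lines(results)
```

Joining the returned list with newline characters and writing it to a file would produce a JSONL file with one record per line.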
-
generate_batch(insights, formatted_answers)
Generate Q/A pairs for multiple insights.
- Parameters:
insights (list[dict[str, Any]]) – List of statistical insights
formatted_answers (list[str]) – Corresponding natural language answers
- Return type:
list[dict[str, Any]]
- Returns:
List of insight dictionaries with added ‘qa_pairs’ field
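The pairing of insights with their answers can be pictured with the sketch below. It is a stand-in, not the library's code: `attach_qa_pairs`, `one_pair`, and the `variable` key are hypothetical, and whether statqa copies the insight dictionaries or rejects lists of unequal length is an assumption made here.

```python
from typing import Any, Callable

PairFn = Callable[[dict[str, Any], str], list[dict[str, Any]]]


def attach_qa_pairs(
    insights: list[dict[str, Any]],
    formatted_answers: list[str],
    make_pairs: PairFn,
) -> list[dict[str, Any]]:
    """Return copies of the insights, each with an added 'qa_pairs' field."""
    if len(insights) != len(formatted_answers):
        raise ValueError("insights and formatted_answers must have equal length")
    return [
        {**insight, "qa_pairs": make_pairs(insight, answer)}
        for insight, answer in zip(insights, formatted_answers)
    ]


def one_pair(insight: dict[str, Any], answer: str) -> list[dict[str, Any]]:
    # Stand-in for a real generator: a single fixed-form question per insight.
    question = f"What does the analysis of {insight['variable']} show?"
    return [{"question": question, "answer": answer}]


batch = attach_qa_pairs(
    [{"variable": "age"}, {"variable": "income"}],
    ["Answer about age.", "Answer about income."],
    one_pair,
)
```

The two input lists are matched by position, so the answer at index i must describe the insight at index i.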
-
generate_exploratory_questions(insight, context=None)
Generate exploratory follow-up questions using LLM.
- Parameters:
insight (dict[str, Any]) – Statistical insight
context (str | None) – Optional dataset/domain context
- Return type:
list[str]
- Returns:
List of exploratory questions
-
generate_qa_pairs(insight, formatted_answer, variables=None, visual_data=None)
Generate Q/A pairs from a statistical insight.
- Parameters:
insight (dict[str, Any]) – Statistical analysis result
formatted_answer (str) – Natural language answer
variables (list[str] | None) – List of variable names involved in the analysis
visual_data (dict[str, Any] | None) – Optional visual metadata to include with Q/A pairs
- Returns:
List of Q/A pair dictionaries with keys: question, answer, type, provenance, visual
-
generate_visual_metadata(insight, variables=None, plot_data=None)
Generate visual metadata for a statistical insight.
- Parameters:
insight (dict[str, Any]) – Statistical analysis result
variables (list[str] | None) – List of variable names involved in the analysis
plot_data (dict[str, Any] | None) – Optional plot data (data and variable objects)
- Return type:
dict[str, Any] | None
- Returns:
Visual metadata dictionary or None if no visualization appropriate
Templates
Template-based question generation for different analysis types.
Question templates for Q/A pair generation.
Defines templates for converting facts into question/answer pairs.
-
class statqa.qa.templates.QuestionTemplate(question_type)
Bases: object
Template for generating questions from statistical insights.
- Parameters:
question_type (QuestionType) – Type of question to generate
-
generate(insight, answer)
Generate question/answer pairs from an insight.
- Parameters:
insight (dict[str, Any]) – Statistical insight dictionary
answer (str) – Formatted natural language answer
- Return type:
list[dict[str, str]]
- Returns:
List of Q/A pair dictionaries
- Raises:
ValueError – If question type is not supported
-
class statqa.qa.templates.QuestionType(*values)
Bases: str, Enum
Types of questions that can be generated.
-
CAUSAL = 'causal'
-
COMPARATIVE = 'comparative'
-
CORRELATIONAL = 'correlational'
-
DESCRIPTIVE = 'descriptive'
-
DISTRIBUTIONAL = 'distributional'
-
TEMPORAL = 'temporal'
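Because the enum derives from both str and Enum, its members behave like plain strings in comparisons. The sketch below redeclares the documented members locally, so it runs without statqa installed; it mirrors the listing above and is not the library's source.

```python
from enum import Enum


class QuestionType(str, Enum):
    """Local mirror of the documented members, for illustration only."""

    CAUSAL = "causal"
    COMPARATIVE = "comparative"
    CORRELATIONAL = "correlational"
    DESCRIPTIVE = "descriptive"
    DISTRIBUTIONAL = "distributional"
    TEMPORAL = "temporal"


# Lookup by value returns the corresponding member.
assert QuestionType("temporal") is QuestionType.TEMPORAL

# The str mixin makes members compare equal to their plain string values.
assert QuestionType.CAUSAL == "causal"

# .value gives the plain string, e.g. for storing the type in a Q/A record.
values = [member.value for member in QuestionType]
```

An unknown value such as `QuestionType("unknown")` raises ValueError, which is standard Enum behavior.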
-
statqa.qa.templates.infer_question_type(insight)
Infer the appropriate question type from an insight.
- Parameters:
insight (dict[str, Any]) – Statistical insight dictionary
- Return type:
QuestionType
- Returns:
Inferred question type