Data Models

This document provides an overview of the data models used in the Four-Sided Triangle RAG (Retrieval-Augmented Generation) system.

Base Model

The system uses a common BaseModel class from which all other models inherit.

Documents are the primary source of information in the system.

DocumentStatus: Tracks document processing state (NEW, PROCESSING, PROCESSED, FAILED, ARCHIVED)
DocumentType: Categorizes documents (TEXT, PDF, WEBPAGE, CODE, SPREADSHEET, etc.)
DocumentChunk: Represents portions of documents for efficient processing
Document: Primary class representing a document with:
- Title, content, and source information
- Processing metadata
- Methods for state transitions
- Chunking functionality for breaking documents into processable pieces
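
The document lifecycle and chunking described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the field names, the `transition_to` and `chunk` methods, the character-based chunking strategy, and the `ALLOWED_TRANSITIONS` table are all assumptions made for illustration. Only the status names come from this document.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class DocumentStatus(Enum):
    NEW = "new"
    PROCESSING = "processing"
    PROCESSED = "processed"
    FAILED = "failed"
    ARCHIVED = "archived"


# Hypothetical transition rules; the real rules are not stated in this document.
ALLOWED_TRANSITIONS = {
    DocumentStatus.NEW: {DocumentStatus.PROCESSING, DocumentStatus.ARCHIVED},
    DocumentStatus.PROCESSING: {DocumentStatus.PROCESSED, DocumentStatus.FAILED},
    DocumentStatus.PROCESSED: {DocumentStatus.ARCHIVED},
    DocumentStatus.FAILED: {DocumentStatus.PROCESSING, DocumentStatus.ARCHIVED},
    DocumentStatus.ARCHIVED: set(),
}


@dataclass
class DocumentChunk:
    document_title: str
    index: int
    text: str


@dataclass
class Document:
    title: str
    content: str
    source: str = ""
    status: DocumentStatus = DocumentStatus.NEW

    def transition_to(self, new_status: DocumentStatus) -> None:
        """Move to new_status, rejecting transitions the table does not allow."""
        if new_status not in ALLOWED_TRANSITIONS[self.status]:
            raise ValueError(
                f"cannot move from {self.status.name} to {new_status.name}"
            )
        self.status = new_status

    def chunk(self, size: int = 200, overlap: int = 0) -> List[DocumentChunk]:
        """Split content into character windows of `size`, sharing `overlap` characters."""
        if size <= 0 or not 0 <= overlap < size:
            raise ValueError("size must be positive and overlap must be in [0, size)")
        step = size - overlap
        return [
            DocumentChunk(self.title, i, self.content[start:start + size])
            for i, start in enumerate(range(0, len(self.content), step))
        ]
```

Keeping the allowed transitions in one table makes invalid state changes fail loudly instead of silently corrupting the processing state.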

Queries represent user questions or instructions.

QueryIntentType: Classifies query purpose (INFORMATIONAL, COMPUTATIONAL, COMPARISON, etc.)
ParameterType: Defines data types for query parameters
QueryParameter: Represents parameters extracted from queries
QueryIntent: Captures the classified intent of a query
QueryParameters: Collection of parameters from a query
QueryConstraint: Represents constraints in a query
Query: Primary class representing a user query with:
- Raw text
- Extracted parameters
- Classified intent
- Constraints and context
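
To make the relationship between these query classes concrete, here is a minimal sketch of how they might fit together. The field names, the `ParameterType` members, the `confidence` field, and the `get_parameter` helper are hypothetical, and a plain list stands in for the QueryParameters collection. Only the class names and the three intent types shown are taken from this document.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, List, Optional


class QueryIntentType(Enum):
    INFORMATIONAL = "informational"
    COMPUTATIONAL = "computational"
    COMPARISON = "comparison"


class ParameterType(Enum):
    # Hypothetical members; the document does not list the actual types.
    STRING = "string"
    NUMBER = "number"


@dataclass
class QueryParameter:
    name: str
    value: Any
    type: ParameterType


@dataclass
class QueryIntent:
    type: QueryIntentType
    confidence: float = 1.0  # hypothetical field


@dataclass
class QueryConstraint:
    description: str


@dataclass
class Query:
    raw_text: str
    parameters: List[QueryParameter] = field(default_factory=list)
    intent: Optional[QueryIntent] = None
    constraints: List[QueryConstraint] = field(default_factory=list)

    def get_parameter(self, name: str, default: Any = None) -> Any:
        """Return the value of the first parameter called `name`, or `default`."""
        for parameter in self.parameters:
            if parameter.name == name:
                return parameter.value
        return default


query = Query(
    raw_text="Compare revenue for 2022 and 2023",
    parameters=[QueryParameter("start_year", 2022, ParameterType.NUMBER)],
    intent=QueryIntent(QueryIntentType.COMPARISON, confidence=0.9),
    constraints=[QueryConstraint("only audited figures")],
)
```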

Responses are generated answers to user queries.

ResponseType: Classifies response types (ANSWER, ERROR, CLARIFICATION)
ResponseFormat: Defines output formats (TEXT, HTML, MARKDOWN, etc.)
CitationType: Classifies citation types (DOCUMENT, COMPUTATION, INFERENCE)
Citation: Represents source references
ResponseContent: Contains the main answer content
Explanation: Provides reasoning behind responses
ComputationResult: Contains results of computational queries
ResponseFeedback: Tracks user feedback on responses
ResponseMetrics: Stores evaluation metrics
Response: Primary class representing a system response with:
- Content and format
- Citations and explanations
- Metrics and feedback
- Methods for response manipulation and formatting
ResponseComparison: Facilitates comparison between multiple responses
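
As an illustration of how a response might carry its citations and render them in different formats, consider the sketch below. The enum members come from this document, but the field names and the `add_citation` and `render` methods are assumptions, and the sketch omits explanations, metrics, and feedback for brevity.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ResponseType(Enum):
    ANSWER = "answer"
    ERROR = "error"
    CLARIFICATION = "clarification"


class ResponseFormat(Enum):
    TEXT = "text"
    HTML = "html"
    MARKDOWN = "markdown"


class CitationType(Enum):
    DOCUMENT = "document"
    COMPUTATION = "computation"
    INFERENCE = "inference"


@dataclass
class Citation:
    type: CitationType
    source: str


@dataclass
class Response:
    content: str
    type: ResponseType = ResponseType.ANSWER
    format: ResponseFormat = ResponseFormat.TEXT
    citations: List[Citation] = field(default_factory=list)

    def add_citation(self, citation: Citation) -> None:
        self.citations.append(citation)

    def render(self) -> str:
        """Return the content followed by its citations, styled per format.

        Only MARKDOWN gets special handling here; every other format
        falls back to an inline plain-text source list.
        """
        if not self.citations:
            return self.content
        numbered = list(enumerate(self.citations, start=1))
        if self.format is ResponseFormat.MARKDOWN:
            refs = "\n".join(f"{i}. {c.source} ({c.type.value})" for i, c in numbered)
            return f"{self.content}\n\n{refs}"
        refs = "; ".join(f"[{i}] {c.source}" for i, c in numbered)
        return f"{self.content} (sources: {refs})"
```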

Embeddings are vector representations of documents and queries.

EmbeddingType: Classifies embedding types
Embedding: Represents vector embeddings with:
- Vector data
- Dimensionality information
- Source tracking
- Similarity calculation methods
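
A similarity method on an embedding is commonly implemented as cosine similarity. The sketch below assumes that choice. The `EmbeddingType` members, the field names, and the `cosine_similarity` method are hypothetical, since the document does not specify which metric or attributes the system uses.

```python
import math
from dataclasses import dataclass
from enum import Enum
from typing import List


class EmbeddingType(Enum):
    # Hypothetical members, based on "documents and queries" above.
    DOCUMENT = "document"
    QUERY = "query"


@dataclass
class Embedding:
    vector: List[float]
    type: EmbeddingType = EmbeddingType.DOCUMENT
    source_id: str = ""  # identifier of the document or query that was embedded

    @property
    def dimension(self) -> int:
        return len(self.vector)

    def cosine_similarity(self, other: "Embedding") -> float:
        """Cosine of the angle between the two vectors, in [-1, 1]."""
        if self.dimension != other.dimension:
            raise ValueError("embeddings must have the same dimension")
        dot = sum(a * b for a, b in zip(self.vector, other.vector))
        norm_self = math.sqrt(sum(a * a for a in self.vector))
        norm_other = math.sqrt(sum(b * b for b in other.vector))
        if norm_self == 0.0 or norm_other == 0.0:
            return 0.0  # convention chosen here for zero vectors
        return dot / (norm_self * norm_other)
```

Cosine similarity ignores vector magnitude, so two embeddings that point in the same direction score close to 1 regardless of their length.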

Domain knowledge represents specialized information about specific domains.

Working memory represents the system’s short-term memory during query processing.