🏗️ Text Operations – Solution Architecture
Purpose: This document defines the system design for the Text Operations (TextOps) capability in Constellation.
This architecture document explains how Text Operations is structured, why design decisions were made, and how it integrates with the Constellation platform. For step-by-step implementation instructions with code examples, see the Implementation Guides.
Epic Reference: This architecture implements the Text Operations Epic.
Architecture Overview
What We're Building
A reusable Text Operations service that provides 4 canonical operations:
- Text Rewriting (Anchor Primitive) — Transform existing text (humanize, tone adjustment, summarize)
- Text Generation (Extension) — Create new content from prompts
- Structured Extraction — Convert text → JSON / fields
- Retrieval + Answering (RAG) — Answer questions from user-provided text (Phase 2 capability)
Anchor Principle: Text Rewriting is the anchor primitive. Everything else is additive. Generation should never ship alone — it ships as an extension of rewriting.
What Makes It Reusable
- Zero business logic - No domain-specific prompts, mappers, or DTOs
- Configuration-driven - Temperature, models, timeouts via application.yml
- Simple API - Direct ChatModel.chat() calls (no @AiService unless needed)
- Structured outputs - JSON mode + Jackson parsing (not regex string parsing)
- Extensible - Products can add domain-specific wrappers without modifying core
Design Principles
Business-Agnostic Core
The Text Operations module provides primitive capabilities, not domain-specific features. It must work for any business: content generation, customer support, data extraction, marketing automation, etc.
Products extend this with domain-specific wrappers, but the core stays generic.
Simple Over Complex
- Use direct ChatModel.chat() calls (not complex abstractions)
- Use string formatting for prompts (not custom prompt managers)
- Use JSON mode + Jackson parsing (not regex string parsing)
- Keep error handling simple (no custom exception hierarchies)
Configuration Over Code
- Models, temperatures, timeouts configured via application.yml
- Constants centralized in AiConstants
- Environment-driven (no hardcoded values)
Extensible Over Fixed
- Products can add domain-specific wrappers
- Core API remains stable
- No product-specific code in the module
System Boundaries
What's Included
Core Operations (4 canonical operations):
- Text Rewriting (anchor primitive) — Transform existing text (text + instruction → rewritten text)
- Text Generation (extension) — Create new text (prompt → text)
- Structured Extraction — Convert text → JSON / fields (text → structured data)
- Retrieval + Answering (RAG) — Answer questions from user-provided text (Phase 2 capability)
Note: Summarization is a variant of rewriting. Translation, sentiment analysis, and classification are specialized cases that may be added later if fork products require them. They are not separate primitives.
Infrastructure:
- ChatModel beans (creative, balanced, factual)
- Service interface and implementation
- Request/Response DTOs
- Controller endpoints
Patterns:
- Direct ChatModel.chat() for simple operations
- JSON mode + Jackson parsing for structured outputs
- Multiple ChatModel beans for different temperatures
What's NOT Included
Domain-Specific Logic:
- ❌ Domain-specific prompts (CBT, customer support, etc.) - Products add these
- ❌ Business-specific mappers (CBT technique extraction, etc.) - Products add these
- ❌ Product-specific workflows - Products add these
Complex Abstractions:
- ❌ PromptTemplateManager - LangChain4j handles prompts directly
- ❌ AssistantFactory - Not needed for simple operations
- ❌ Custom exception hierarchies - RuntimeException is fine for boilerplate
Product Features:
- ❌ Tone dropdown presets - Product-specific UI
- ❌ Saved generations - Product-specific feature
- ❌ Templates library - Product-specific feature
- ❌ Branding and marketing - Product-specific
Component Design
Service Layer
Interface: TextOperationsService (or TextOpsService)
- Business-agnostic API
- Generic method names (not domain-specific)
- Simple parameters (strings, not complex request objects)
- Generic return types (not domain-specific DTOs)
Implementation: TextOperationsServiceImpl
- Direct ChatModel.chat() calls
- Simple prompt templates (string formatting)
- JSON parsing with Jackson (for structured outputs)
- Model selection (creative vs factual)
- Simple error handling
Configuration Layer
AIConfiguration.java:
- Extended with creative/factual ChatModel beans
- Multiple beans with @Qualifier annotations
- Temperature-based differentiation
AiConstants.java:
- Extended with Text Operations constants
- Single source of truth for defaults
- Used in @Value defaults
API Layer
AIController.java:
- Endpoints for Text Operations
- Maps requests to service calls
- Returns response DTOs
- Input validation
Module Structure
The module follows a layered architecture:
- Configuration Layer: Extended with creative/factual ChatModel beans
- Constants Layer: Extended with Text Operations constants
- Service Layer: Text Operations service interface and implementation
- API Layer: Controller endpoints for Text Operations
- Model Layer: Request and response DTOs
For detailed package structure and file organization, see the Implementation Guides.
Package Structure
The Text Operations module follows a layered package structure that separates concerns:
com.saas.springular.common.ai/
├── config/ # Configuration beans (@Bean methods)
├── constants/ # Constants class
├── model/ # Request/Response DTOs
├── service/ # Service interfaces
│ └── impl/ # Service implementations
└── controller/ # REST API endpoints
Layer Responsibilities:
- config/: Spring configuration beans. Creates ChatModel beans, configures timeouts, temperatures. No business logic.
- constants/: Single source of truth for default values (timeouts, temperatures, limits). Used in @Value defaults.
- model/: Request/Response DTOs with Jakarta Validation annotations. Stateless records.
- service/: Business logic interfaces. Define capabilities (rewrite, generate, etc.). No Spring annotations.
- service/impl/: Service implementations. Inject ChatModel beans, handle model selection, error handling. Stateless.
- controller/: REST endpoints. Handle HTTP concerns, validation, DTO mapping. Delegates to service layer.
Why This Structure:
- Clear separation of concerns
- Easy to test (mock services, mock models)
- Follows Spring Boot conventions
- Scalable (easy to add new capabilities)
For detailed file structure and implementation, see the Implementation Guides.
Statelessness and Thread Safety
Core Principle
All Text Operations services are stateless and thread-safe. This is essential for scalability and correctness.
Statelessness Rules
- No Mutable Instance Variables: Service implementations must not store per-request state
- Singleton Beans: All service beans are Spring singletons (default scope)
- Thread-Safe Dependencies: Injected dependencies (ChatModel beans) are thread-safe
- No Conversation State: Text Operations are stateless. No session memory in services.
What "Stateless" Means
❌ Not Allowed: Service classes with mutable instance variables that store per-request state (e.g., lastPrompt, selectedModel).
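✅ Allowed: Service classes with only immutable dependencies injected via constructor (e.g., @RequiredArgsConstructor). Methods depend only on their arguments and the injected dependencies and never write to instance state. Model output can still vary between calls, especially at higher temperatures, so "same input → same output" applies to the service's own logic, not to the generated text.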
For code examples of stateless service implementations, see the Implementation Guides.
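The allowed shape can be sketched without any framework. This is an illustration under stated assumptions, not the module's actual code: ChatModel below is a hypothetical one-method stand-in for the LangChain4j interface, and the class name, the rewriteText method, and the prompt layout are invented for the example. In the real module the constructor would typically be generated by @RequiredArgsConstructor.

```java
import java.util.Objects;

/** Hypothetical stand-in for the LangChain4j ChatModel, reduced to one method. */
interface ChatModel {
    String chat(String prompt);
}

/** Stateless service sketch: the only field is a final, constructor-injected dependency. */
final class StatelessTextOperationsService {
    private final ChatModel creativeModel; // immutable reference, safe to share across threads

    StatelessTextOperationsService(ChatModel creativeModel) {
        this.creativeModel = Objects.requireNonNull(creativeModel);
    }

    /** Everything the call needs arrives as parameters; nothing is stored on the instance. */
    String rewriteText(String text, String instruction) {
        // The prompt lives in a local variable, so concurrent calls cannot interfere.
        String prompt = String.format("%s%n%nText:%n%s", instruction, text);
        return creativeModel.chat(prompt);
    }
}
```

A field such as lastPrompt would break this property, because two concurrent requests would overwrite each other's value on the shared singleton.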
Thread Safety
- Service Beans: Spring singletons are thread-safe when stateless
- ChatModel Beans: LangChain4j ChatModel implementations are thread-safe
- DTOs: Records (standard since Java 16, preview in Java 14) are shallowly immutable, so they are thread-safe as long as their components are immutable too
Verification Checklist
When reviewing code, verify:
- No mutable instance variables in service classes
- All dependencies are injected via constructor (@RequiredArgsConstructor)
- No per-request state stored in services
- Methods depend only on their arguments and injected dependencies (no reads or writes of mutable instance state)
Why Statelessness Matters
- Scalability: Stateless services can be scaled horizontally
- Correctness: No race conditions or thread-safety issues
- Testability: Stateless code is easier to test
- Simplicity: Stateless code is easier to reason about
Request Lifecycle
Every Text Operations request follows a consistent lifecycle:
1. HTTP Request
↓
2. Controller (Input Validation)
↓
3. DTO Mapping
↓
4. Service Layer (Model Selection)
↓
5. ChatModel Invocation
↓
6. Response Parsing (if structured output)
↓
7. DTO Mapping
↓
8. HTTP Response
Step-by-Step Process
1. HTTP Request
- REST endpoint receives HTTP request
- Spring maps JSON to DTO
2. Controller (Input Validation)
- Jakarta Validation (@Valid) validates DTO
- @Size, @NotBlank annotations enforce constraints
- Invalid requests return 400 Bad Request (handled by @ControllerAdvice)
3. DTO Mapping
- Controller extracts validated data from DTO
- Maps to service method parameters
4. Service Layer (Model Selection)
- Service selects appropriate ChatModel bean (creative/balanced/factual)
- Based on operation type or user preference
5. ChatModel Invocation
- Service calls ChatModel.chat()
- LangChain4j handles provider communication
- Timeout configured at bean level
6. Response Parsing (if structured output)
- For JSON responses: parse with Jackson
- For text responses: use directly
7. DTO Mapping
- Service maps result to response DTO
- Returns to controller
8. HTTP Response
- Controller returns ResponseEntity
- Spring serializes DTO to JSON
Error Handling in Lifecycle
- Validation Errors: Caught by @ControllerAdvice, return 400
- Service Errors: Caught by service, logged, thrown as RuntimeException
- Provider Errors: Caught by service, mapped to error category, logged, thrown
- Parsing Errors: Caught by service, logged, thrown as RuntimeException
- All Errors: Handled by ExceptionResponseHandler (existing Springular pattern)
Input Validation
All Text Operations inputs must be validated using Jakarta Validation annotations on DTOs.
Validation Patterns
Size Limits: Use @Size(max = AiConstants.MAX_PROMPT_LENGTH) for prompt/text length limits.
Required Fields: Use @NotBlank for required string fields with clear validation messages.
For detailed DTO examples with validation annotations, see the Implementation Guides (Rewriting and Generation).
Validation Rules
- Prompt/Text Length: Enforce maximum length (e.g., 4000 characters)
- Required Fields: Use @NotBlank for required string fields
- Optional Fields: Allow null for optional parameters
- Type Validation: Jakarta Validation handles type mismatches automatically
Sanitization
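Basic sanitization relies on validation rather than on rewriting the input:
- Trim: @NotBlank evaluates the trimmed length but does not modify the value; trim explicitly in the service if trimmed text is required
- Empty Rejection: @NotBlank rejects null, empty, and whitespace-only strings
- No HTML/JS Injection: Output is escaped only when it is rendered through a template engine that escapes by default; JSON responses are returned as-is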
Validation Error Handling
Validation errors are automatically handled by Spring:
- Controller receives request with @Valid annotation
- Jakarta Validation runs automatically
- If validation fails: MethodArgumentNotValidException is thrown
- ExceptionResponseHandler (existing) catches it and returns 400 Bad Request
- Response includes field-level error messages
Validation errors return JSON response with status: 400 and array of error messages.
What NOT to Validate (Yet)
- ❌ Prompt injection patterns (defer to Phase 2+)
- ❌ Content moderation (defer to Phase 2+)
- ❌ PII detection (defer to Phase 2+)
These are important but out of scope for Phase 1. Document as future considerations.
Patterns
Pattern 1: Two-Tier Service Architecture
Pattern: Simple operations use direct ChatModel.chat(); complex operations use @AiService with RAG/memory.
For Text Operations: Use direct ChatModel.chat() for all operations (no memory/RAG needed).
Rationale: Text Operations are stateless and don't need memory/RAG. Simple operations = simple code.
Pattern 2: Consolidated Configuration
Pattern: Single AIConfiguration.java with all beans.
- ChatModel bean (OllamaChatModel)
- Creative/Factual ChatModel beans
- All configuration via @Value with defaults from AiConstants
Why It Works:
- Single place to configure AI
- Environment-driven via application.yml
- Constants class prevents magic numbers
For configuration examples, see Foundation.
Pattern 3: Constants Class Pattern
Pattern: AiConstants.java with all defaults.
Why It Works:
- Single source of truth
- Used in @Value defaults
- Prevents configuration drift
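A constants class of this kind might look like the sketch below. DEFAULT_TIMEOUT_MS and MAX_PROMPT_LENGTH are named in this document and carry the values quoted in it (5 minutes, 4000 characters); the temperature constant names and the property key ai.ollama.timeout-ms are assumptions made for illustration.

```java
/** Sketch of a constants holder; values mirror the defaults quoted in this document. */
final class AiConstants {
    private AiConstants() {} // utility class, never instantiated

    /** 5 minutes, the documented default for local Ollama. */
    static final long DEFAULT_TIMEOUT_MS = 300_000L;

    /** Example prompt/text length limit used by the validation rules. */
    static final int MAX_PROMPT_LENGTH = 4000;

    // Hypothetical names for the three temperature profiles.
    static final double CREATIVE_TEMPERATURE = 0.9;
    static final double BALANCED_TEMPERATURE = 0.7;
    static final double FACTUAL_TEMPERATURE = 0.3;

    /**
     * Because the pieces are compile-time constants, the concatenation is one too,
     * so a string like this can be used as an annotation value such as a @Value default.
     * The property key is an assumption.
     */
    static final String TIMEOUT_PLACEHOLDER =
            "${ai.ollama.timeout-ms:" + DEFAULT_TIMEOUT_MS + "}";
}
```

Keeping the default inside the placeholder string means application.yml can override the value while the fallback still comes from a single place.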
Pattern 4: Multiple ChatModel Beans
Pattern: Define multiple OllamaChatModel beans with different @Qualifier annotations and temperature settings.
Use Cases:
- Creative (0.9): Content generation
- Balanced (0.7): Default
- Factual (0.3): Summarization, translation, classification
Rationale: Different temperatures for different use cases. Temperature is configured at bean creation time in OllamaChatModel.
For code examples and configuration details, see the Implementation Guides.
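The selection step can be sketched independently of Spring. The enum and class names below are hypothetical; the temperatures come from this document, as does the mapping of rewriting and generation to the creative profile and summarization to the factual one. Mapping extraction to the factual profile is an assumption. In the real module each profile would correspond to one @Qualifier-annotated ChatModel bean.

```java
/** Temperature profiles named in this document. */
enum ModelProfile {
    CREATIVE(0.9), BALANCED(0.7), FACTUAL(0.3);

    final double temperature;

    ModelProfile(double temperature) {
        this.temperature = temperature;
    }
}

/** Hypothetical operation identifiers; the real module may name these differently. */
enum Operation { REWRITE, GENERATE, SUMMARIZE, EXTRACT, OTHER }

final class ModelSelector {
    private ModelSelector() {}

    /** Picks a profile per operation type; anything unlisted falls back to BALANCED. */
    static ModelProfile profileFor(Operation operation) {
        switch (operation) {
            case REWRITE:
            case GENERATE:
                return ModelProfile.CREATIVE;
            case SUMMARIZE:
            case EXTRACT: // assumption: extraction benefits from low temperature
                return ModelProfile.FACTUAL;
            default:
                return ModelProfile.BALANCED;
        }
    }
}
```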
Pattern 5: JSON Mode + Jackson Parsing
Pattern: Request JSON format in prompt, parse response with Jackson.
Use Case: Structured outputs (sentiment analysis, classification, extraction).
Prompt Pattern: Request JSON format in prompt with explicit format specification. Use clear instructions: "Return only valid JSON", specify exact format, no markdown wrapper.
Parsing Pattern: Parse response with Jackson ObjectMapper. Defensively handle markdown code blocks if present (trim, remove code block markers), then parse JSON. Handle JsonProcessingException by logging truncated response and throwing RuntimeException.
Why This Pattern:
- Reliable structured output
- Type-safe parsing
- Avoids brittle regex/string splitting
- Works reliably with most LLMs
For detailed code examples, see the Implementation Guides.
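The defensive clean-up step of the parsing pattern can be sketched with the JDK alone. The class and method names are hypothetical. The cleaned string would then be handed to Jackson (for example ObjectMapper.readTree), and on JsonProcessingException the truncated text would be logged before throwing RuntimeException; that part is omitted here to keep the sketch dependency-free.

```java
/** Helpers applied to a model response before JSON parsing (names are illustrative). */
final class JsonResponseCleaner {
    // Three backticks, built with repeat() so this example does not contain a literal fence.
    private static final String FENCE = "`".repeat(3);

    private JsonResponseCleaner() {}

    /** Trims the response and removes a surrounding markdown code fence if one is present. */
    static String stripCodeFence(String raw) {
        String text = raw.trim();
        if (text.startsWith(FENCE)) {
            int firstNewline = text.indexOf('\n');
            // Drop the whole opening line so a language tag such as "json" goes with it.
            text = firstNewline >= 0 ? text.substring(firstNewline + 1)
                                     : text.substring(FENCE.length());
            if (text.endsWith(FENCE)) {
                text = text.substring(0, text.length() - FENCE.length());
            }
        }
        return text.trim();
    }

    /** Shortens a response for logging so full model output never lands in the logs. */
    static String truncateForLog(String raw, int maxChars) {
        return raw.length() <= maxChars ? raw : raw.substring(0, maxChars) + "...";
    }
}
```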
Pattern 6: Error Handling
Pattern: Catch provider-specific exceptions, categorize errors, log appropriately, throw RuntimeException.
Error Categories (not exception types):
- Timeout: Request exceeded configured timeout
  - Log with timeout duration
  - Return user-friendly message
- Provider Failure: LLM provider unavailable or returned error
  - Log provider error details
  - Return generic error message (don't expose provider internals)
- Invalid Response: Response couldn't be parsed (e.g., invalid JSON)
  - Log response snippet (truncated)
  - Return parsing error message
- Validation Error: Input validation failed
  - Handled by Jakarta Validation + @ControllerAdvice
  - Returns 400 Bad Request automatically
Implementation Pattern: Service methods catch provider-specific exceptions (ModelTimeoutException, ModelInvocationException), categorize them, log appropriately, and throw RuntimeException with user-friendly messages.
Integration with Existing Exception Handler:
Springular already has ExceptionResponseHandler (@ControllerAdvice). All RuntimeException instances are automatically handled and converted to appropriate HTTP responses.
Why RuntimeException:
- Simple and straightforward
- Integrates with existing
@ControllerAdvicepattern - No need for custom exception hierarchies (over-engineering)
- Error categories are documented, not encoded in types
For code examples, see the Implementation Guides.
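The categorize, log, rethrow flow can be sketched as follows. To stay self-contained, the sketch uses JDK exceptions as stand-ins: java.util.concurrent.TimeoutException for whatever timeout exception the provider library throws, and java.text.ParseException for a JSON parsing failure. The class name, method names, and user-facing messages are all hypothetical.

```java
import java.text.ParseException;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;
import java.util.logging.Level;
import java.util.logging.Logger;

final class ErrorCategorySketch {
    private static final Logger LOG = Logger.getLogger(ErrorCategorySketch.class.getName());

    /** Documented categories; these are labels for logging, not exception types. */
    enum Category { TIMEOUT, PROVIDER_FAILURE, INVALID_RESPONSE }

    private ErrorCategorySketch() {}

    static Category categorize(Throwable error) {
        if (error instanceof TimeoutException) return Category.TIMEOUT;        // stand-in: provider timeout
        if (error instanceof ParseException) return Category.INVALID_RESPONSE; // stand-in: JSON parse failure
        return Category.PROVIDER_FAILURE;                                      // everything else
    }

    static String userMessage(Category category) {
        switch (category) {
            case TIMEOUT: return "The request took too long. Please try again.";
            case INVALID_RESPONSE: return "The response could not be processed.";
            default: return "The text service is currently unavailable.";
        }
    }

    /** Runs the model call; on failure logs the category and rethrows a plain RuntimeException. */
    static String invoke(String operation, Callable<String> modelCall) {
        try {
            return modelCall.call();
        } catch (Exception e) {
            Category category = categorize(e);
            // Provider details go to the log only; the thrown message stays generic.
            LOG.log(Level.SEVERE,
                    "operation=" + operation + " success=false errorCategory=" + category, e);
            throw new RuntimeException(userMessage(category), e);
        }
    }
}
```

The category exists only as a log field and a message choice, which matches the decision to document categories rather than encode them in an exception hierarchy.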
Reliability
Timeout Configuration
All ChatModel beans must configure timeouts. Timeout is set at bean creation time in OllamaChatModel.builder().timeout(Duration.ofMillis(AiConstants.DEFAULT_TIMEOUT_MS)).
Timeout Best Practices:
- Default: 5 minutes (300,000ms) for local Ollama
- Remote Providers: Adjust based on network latency
- Document: Timeout values in AiConstants with comments explaining rationale
Timeout Error Handling:
When timeout occurs:
- ModelTimeoutException is thrown by LangChain4j
- Service catches and logs timeout
- Service throws RuntimeException with user-friendly message
- @ControllerAdvice handles and returns appropriate HTTP status
For configuration examples, see Foundation.
Error Recovery (Future Consideration)
Not Implemented in Phase 1:
- ❌ Retry logic (defer to Phase 2+)
- ❌ Fallback model routing (defer to Phase 2+)
- ❌ Circuit breakers (defer to Phase 2+)
These are valuable patterns but premature for Phase 1. Document as future considerations.
Observability
Structured Logging
All Text Operations must log structured information.
Required Log Fields:
- operation: Operation type (e.g., "text-generation", "text-rewriting")
- model: Model identifier (e.g., "ollama-llama2:7b")
- promptLength: Length of input prompt/text
- latency: Request duration in milliseconds
- success: Boolean indicating success/failure
- errorCategory: Error category if failed (timeout, provider-failure, invalid-response)
Logging Pattern: Services log structured information at INFO level for successful operations, ERROR level for failures. Include operation type, model, prompt length (not content), latency, and success status.
For detailed logging examples, see the Implementation Guides.
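One way to assemble the required fields is a small formatter like the sketch below. The class name and the key=value layout are assumptions; the field names are the ones listed above. The prompt is passed in only so its length can be measured, and its content never reaches the output.

```java
/** Builds one structured log line from the required fields (layout is illustrative). */
final class TextOpsLogLine {
    private TextOpsLogLine() {}

    static String format(String operation, String model, String prompt,
                         long latencyMs, boolean success, String errorCategory) {
        StringBuilder line = new StringBuilder()
                .append("operation=").append(operation)
                .append(" model=").append(model)
                .append(" promptLength=").append(prompt.length()) // length only, never content
                .append(" latency=").append(latencyMs)
                .append(" success=").append(success);
        // errorCategory is only meaningful for failed calls.
        if (!success && errorCategory != null) {
            line.append(" errorCategory=").append(errorCategory);
        }
        return line.toString();
    }
}
```

A service would pass the result to its logger at INFO for successes and at ERROR for failures.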
Logging Policy
Prompt Content:
- ❌ Never log prompt content at INFO level (may contain PII or sensitive data)
- ✅ Log prompt length, operation type, model, latency
- ✅ Log prompt content only at DEBUG level (for development/debugging)
Rationale: Prompts may contain user data, PII, or sensitive information. Logging at INFO would expose this in production logs.
Token Tracking (Preparation)
Interface Preparation: Define token tracking interface (full implementation in Usage Tracking guide). Service methods call token tracker if available when provider returns token usage information.
Token Availability:
- Ollama (Local): May not return token counts
- OpenAI/Cloud Providers: Usually return token counts
- Policy: "If provider returns token counts, record them. If not, estimate and label as estimated."
For implementation details, see Usage Tracking.
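The "record if available, otherwise estimate and label" policy might be expressed as below. All names are hypothetical, and the four-characters-per-token heuristic is an assumption chosen only to make the fallback concrete; the actual interface is defined in Usage Tracking.

```java
/** Hypothetical value type: a token count plus a flag saying whether it was estimated. */
record TokenUsage(int tokens, boolean estimated) {

    /** Rough heuristic (assumption): about four characters per token, rounded up. */
    static TokenUsage estimateFrom(String text) {
        return new TokenUsage((text.length() + 3) / 4, true);
    }

    /** Prefers the provider's count; falls back to a labelled estimate when it is absent. */
    static TokenUsage of(Integer providerCount, String text) {
        return providerCount != null
                ? new TokenUsage(providerCount, false)
                : estimateFrom(text);
    }
}

/** Hypothetical tracking hook that a service would call when a tracker is available. */
interface TokenTracker {
    void track(String operation, TokenUsage usage);
}
```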
Metrics (Future Consideration)
Not Implemented in Phase 1:
- ❌ Micrometer counters/timers (defer to Usage Tracking guide)
- ❌ Cost tracking (defer to Usage Tracking guide)
- ❌ Dashboards/alerts (defer to Phase 2+)
Structured logging provides sufficient observability for Phase 1. Metrics integration will be added in Usage Tracking guide.
Testing Strategy
Unit Tests
Pattern: Mock ChatModel bean, test service logic.
What to Test:
- ✅ Model selection logic (creative vs balanced vs factual)
- ✅ Error handling (timeout, provider errors)
- ✅ Response parsing (JSON parsing, error handling)
- ✅ Input validation edge cases (if service does additional validation)
For unit test examples, see the Implementation Guides.
Integration Tests
Pattern: Use @SpringBootTest with optional real provider.
When to Use Integration Tests:
- ✅ Verify end-to-end flow (DTO → Service → Model → Response)
- ✅ Test with real provider (optional, can be disabled)
- ✅ Verify timeout configuration
- ❌ Not for testing business logic (use unit tests)
Test Configuration
Use application-test.yml with shorter timeout values for tests. See Implementation Guides for configuration examples.
What NOT to Test (Yet)
- ❌ Contract tests with golden files (defer to Phase 2+)
- ❌ Performance/load tests (defer to Phase 2+)
- ❌ Provider-specific behavior (rely on LangChain4j)
Keep tests simple and focused on business logic.
Design Decisions
Decision 1: Direct ChatModel.chat() Instead of @AiService
Why: Text Operations are stateless and don't need memory/RAG.
When to Use @AiService:
- Conversation memory needed
- RAG needed
- Complex multi-step workflows
For Text Operations: Simple operations = simple code.
Decision 2: Multiple ChatModel Beans Instead of One
Why: Different temperatures for different use cases.
- Creative (0.9): Content generation
- Balanced (0.7): Default
- Factual (0.3): Summarization, translation, classification
Alternative Considered: Single bean with dynamic temperature. Rejected because temperature is configured at bean creation time in OllamaChatModel.
Decision 3: JSON Mode Instead of LangChain4j Structured Outputs
Why: JSON mode with prompt instructions is simpler and works reliably.
For Text Operations: Request JSON format in prompt, parse with Jackson. No need for complex structured output frameworks.
Decision 4: Simple Error Handling Instead of Custom Exceptions
Why: RuntimeException is fine for boilerplate. Spring @ControllerAdvice handles HTTP responses. Keep it simple.
Decision 5: Business-Agnostic API Instead of Domain-Specific
Why: Core must work for any business. Products add domain-specific wrappers.
Example: generateText(String prompt) not generateCBTQuestions(String scenario).
Extension Points
How Products Extend This Module
Products extend the Text Operations module by creating domain-specific wrapper services that:
- Add domain-specific prompt building logic
- Add business rules on top of classification/analysis results
- Compose multiple Text Operations for workflows
Pattern: Products inject TextOperationsService and add domain logic around it, without modifying the core module.
Examples:
- Marketing Content Service: Builds marketing-specific prompts, uses generation
- Content Moderation Service: Adds blocking rules based on classification results
- Customer Support Service: Composes generation + rewriting for response creation
For code examples and implementation patterns, see the Implementation Guides.
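A product-side wrapper could look like the sketch below. The generateText(String) signature follows the naming example given under Decision 5; the reduced interface, the MarketingContentService class body, the productBlurb method, and the prompt wording are assumptions for illustration.

```java
/** Hypothetical slice of the core interface, limited to what the wrapper needs. */
interface TextOperationsService {
    String generateText(String prompt);
}

/** Product-side wrapper: owns the marketing prompt, delegates the model call to the core. */
final class MarketingContentService {
    private final TextOperationsService textOperations;

    MarketingContentService(TextOperationsService textOperations) {
        this.textOperations = textOperations;
    }

    /** Domain-specific prompt building lives here, outside the core module. */
    String productBlurb(String productName, String audience) {
        String prompt = "Write a two-sentence product description for " + productName
                + " aimed at " + audience + ".";
        return textOperations.generateText(prompt);
    }
}
```

The core interface stays free of marketing vocabulary, and the wrapper can be unit-tested by passing a lambda in place of the real service.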
Constraints and Tradeoffs
Constraints
- Stateless Operations: Text Operations are stateless. No conversation memory or RAG needed.
- Temperature Configuration: Temperature is configured at bean creation time in OllamaChatModel, not dynamically.
- JSON Parsing: Structured outputs use JSON mode + Jackson parsing. Not using LangChain4j structured output framework.
- Error Handling: Simple error handling with RuntimeException. No custom exception hierarchies.
- Business-Agnostic: Core module contains no domain-specific logic. Products add wrappers.
Tradeoffs
Simplicity vs Flexibility:
- ✅ Chosen: Simple direct ChatModel.chat() calls
- ❌ Not Chosen: Complex abstractions for flexibility
Multiple Beans vs Single Bean:
- ✅ Chosen: Multiple ChatModel beans with different temperatures
- ❌ Not Chosen: Single bean with dynamic temperature (not supported by OllamaChatModel)
JSON Mode vs Structured Output Framework:
- ✅ Chosen: JSON mode with prompt instructions + Jackson parsing
- ❌ Not Chosen: LangChain4j structured output framework (adds complexity)
Simple Errors vs Custom Exceptions:
- ✅ Chosen: RuntimeException with Spring @ControllerAdvice
- ❌ Not Chosen: Custom exception hierarchies (over-engineering)
Integration Points
Phase 1 Foundation
Dependency: Requires Phase 1 Foundation (ChatModel infrastructure).
- ChatModel bean provided by Phase 1
- Ollama configuration provided by Phase 1
- Constants and configuration patterns from Phase 1
Extension: Text Operations extends Phase 1 with:
- Creative/factual ChatModel beans
- Text Operations constants
- Text Operations service layer
Planned Features Integration
Text Rewriting & Style Controls (Anchor Primitive):
- First feature (MVP) — anchor primitive
- Uses creative ChatModel bean
- Includes style parameter mapping
- Does not depend on Generation
Text Generation (Extension):
- Second feature (MVP) — extension of rewriting
- Uses creative ChatModel bean
- Direct ChatModel.chat() pattern
- Should not ship alone: ships as an extension of rewriting
AI Usage Limits & Cost Tracking:
- Operational feature
- Integrates with all Text Operations
- Tracks usage and costs
Implementation Guides
For step-by-step implementation instructions with code examples, see the Implementation Guides section (in priority order):
- Foundation - Port minimal infrastructure from POC (Phase 1)
- Rewriting - Build Text Rewriting capability (anchor primitive, first)
- Generation - Build Text Generation capability (extension, second)
- Usage Tracking - Build operational controls
Success Criteria
Architectural
- ✅ Business-agnostic design (works for any domain)
- ✅ Simple and straightforward (no over-engineering)
- ✅ Extensible (products can add wrappers)
- ✅ Follows Constellation patterns (two-tier architecture, constants class)
Functional
- ✅ All core capabilities work
- ✅ Appropriate temperature used per use case
- ✅ Structured outputs work (JSON parsing)
- ✅ Error handling works
- ✅ Integration with Phase 1 works
Reusability
- ✅ Works for marketing (content generation)
- ✅ Works for support (response generation)
- ✅ Works for moderation (classification)
- ✅ Works for translation (multi-language)
- ✅ No domain-specific code in core module