🏗️ Text Operations – Solution Architecture

Purpose: This document defines the system design for the Text Operations (TextOps) capability in Constellation.

This architecture document explains how Text Operations is structured, why design decisions were made, and how it integrates with the Constellation platform. For step-by-step implementation instructions with code examples, see the Implementation Guides.

Epic Reference: This architecture implements the Text Operations Epic.

Architecture Overview

What We're Building

A reusable Text Operations service that provides 4 canonical operations:

Text Rewriting (Anchor Primitive) — Transform existing text (humanize, tone adjustment, summarize)
Text Generation (Extension) — Create new content from prompts
Structured Extraction — Convert text → JSON / fields
Retrieval + Answering (RAG) — Answer questions from user-provided text (Phase 2 capability)

Anchor Principle: Text Rewriting is the anchor primitive. Everything else is additive. Generation should never ship alone — it ships as an extension of rewriting.

What Makes It Reusable

Zero business logic - No domain-specific prompts, mappers, or DTOs
Configuration-driven - Temperature, models, timeouts via application.yml
Simple API - Direct ChatModel.chat() calls (no @AiService unless needed)
Structured outputs - JSON mode + Jackson parsing (not regex string parsing)
Extensible - Products can add domain-specific wrappers without modifying core

Design Principles

Business-Agnostic Core

The Text Operations module provides primitive capabilities, not domain-specific features. It must work for any business: content generation, customer support, data extraction, marketing automation, etc.

Products extend this with domain-specific wrappers, but the core stays generic.

Simple Over Complex

Use direct ChatModel.chat() calls (not complex abstractions)
Use string formatting for prompts (not custom prompt managers)
Use JSON mode + Jackson parsing (not regex string parsing)
Keep error handling simple (no custom exception hierarchies)

Configuration Over Code

Models, temperatures, timeouts configured via application.yml
Constants centralized in AiConstants
Environment-driven (no hardcoded values)

Extensible Over Fixed

Products can add domain-specific wrappers
Core API remains stable
No product-specific code in the module

System Boundaries

What's Included

Core Operations (4 canonical operations):

Text Rewriting (anchor primitive) — Transform existing text (text + instruction → rewritten text)
Text Generation (extension) — Create new text (prompt → text)
Structured Extraction — Convert text → JSON / fields (text → structured data)
Retrieval + Answering (RAG) — Answer questions from user-provided text (Phase 2 capability)

Note: Summarization is a variant of rewriting. Translation, sentiment analysis, and classification are specialized cases that may be added later if fork products require them. They are not separate primitives.

Infrastructure:

ChatModel beans (creative, balanced, factual)
Service interface and implementation
Request/Response DTOs
Controller endpoints

Patterns:

Direct ChatModel.chat() for simple operations
JSON mode + Jackson parsing for structured outputs
Multiple ChatModel beans for different temperatures

What's NOT Included

Domain-Specific Logic:

❌ Domain-specific prompts (CBT, customer support, etc.) - Products add these
❌ Business-specific mappers (CBT technique extraction, etc.) - Products add these
❌ Product-specific workflows - Products add these

Complex Abstractions:

❌ PromptTemplateManager - LangChain4j handles prompts directly
❌ AssistantFactory - Not needed for simple operations
❌ Custom exception hierarchies - RuntimeException is fine for boilerplate

Product Features:

❌ Tone dropdown presets - Product-specific UI
❌ Saved generations - Product-specific feature
❌ Templates library - Product-specific feature
❌ Branding and marketing - Product-specific

Component Design

Service Layer

Interface: TextOperationsService (or TextOpsService)

Business-agnostic API
Generic method names (not domain-specific)
Simple parameters (strings, not complex request objects)
Generic return types (not domain-specific DTOs)

Implementation: TextOperationsServiceImpl

Direct ChatModel.chat() calls
Simple prompt templates (string formatting)
JSON parsing with Jackson (for structured outputs)
Model selection (creative vs factual)
Simple error handling
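
As a rough illustration of the service layer described above, the sketch below puts a direct chat() call behind a business-agnostic interface. The local ChatModel interface is only a stand-in for LangChain4j's ChatModel so the example compiles on its own; the method name rewriteText and the prompt wording are assumptions, not the module's actual API.

```java
/** Stand-in for LangChain4j's ChatModel; only the single-string chat call is modelled. */
@FunctionalInterface
interface ChatModel {
    String chat(String userMessage);
}

/** Business-agnostic API: plain strings in, plain strings out. */
interface TextOperationsService {
    String rewriteText(String text, String instruction);

    String generateText(String prompt);
}

final class TextOperationsServiceImpl implements TextOperationsService {
    // Hypothetical prompt template built with plain string formatting.
    private static final String REWRITE_TEMPLATE =
            "Rewrite the following text. Instruction: %s\n\nText:\n%s";

    // Immutable dependency injected once; no per-request state is kept.
    private final ChatModel creativeModel;

    TextOperationsServiceImpl(ChatModel creativeModel) {
        this.creativeModel = creativeModel;
    }

    @Override
    public String rewriteText(String text, String instruction) {
        String prompt = String.format(REWRITE_TEMPLATE, instruction, text);
        return creativeModel.chat(prompt);
    }

    @Override
    public String generateText(String prompt) {
        return creativeModel.chat(prompt);
    }
}
```

Because the model is passed in through the constructor, a test can supply a lambda in place of a real provider.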

Configuration Layer

AIConfiguration.java:

Extended with creative/factual ChatModel beans
Multiple beans with @Qualifier annotations
Temperature-based differentiation

AiConstants.java:

Extended with Text Operations constants
Single source of truth for defaults
Used in @Value defaults

API Layer

AIController.java:

Endpoints for Text Operations
Maps requests to service calls
Returns response DTOs
Input validation

Module Structure

The module follows a layered architecture:

Configuration Layer: Extended with creative/factual ChatModel beans
Constants Layer: Extended with Text Operations constants
Service Layer: Text Operations service interface and implementation
API Layer: Controller endpoints for Text Operations
Model Layer: Request and response DTOs

For detailed package structure and file organization, see the Implementation Guides.

Package Structure

The Text Operations module follows a layered package structure that separates concerns:

com.saas.springular.common.ai/
├── config/          # Configuration beans (@Bean methods)
├── constants/       # Constants class
├── model/           # Request/Response DTOs
├── service/         # Service interfaces
│   └── impl/        # Service implementations
└── controller/      # REST API endpoints

Layer Responsibilities:

config/: Spring configuration beans. Creates ChatModel beans, configures timeouts, temperatures. No business logic.
constants/: Single source of truth for default values (timeouts, temperatures, limits). Used in @Value defaults.
model/: Request/Response DTOs with Jakarta Validation annotations. Stateless records.
service/: Business logic interfaces. Define capabilities (rewrite, generate, etc.). No Spring annotations.
service/impl/: Service implementations. Inject ChatModel beans, handle model selection, error handling. Stateless.
controller/: REST endpoints. Handle HTTP concerns, validation, DTO mapping. Delegates to service layer.

Why This Structure:

Clear separation of concerns
Easy to test (mock services, mock models)
Follows Spring Boot conventions
Scalable (easy to add new capabilities)

For detailed file structure and implementation, see the Implementation Guides.

Statelessness and Thread Safety

Core Principle

All Text Operations services are stateless and thread-safe. This is essential for scalability and correctness.

Statelessness Rules

No Mutable Instance Variables: Service implementations must not store per-request state
Singleton Beans: All service beans are Spring singletons (default scope)
Thread-Safe Dependencies: Injected dependencies (ChatModel beans) are thread-safe
No Conversation State: Text Operations are stateless. No session memory in services.

What "Stateless" Means

❌ Not Allowed: Service classes with mutable instance variables that store per-request state (e.g., lastPrompt, selectedModel).

✅ Allowed: Service classes with only immutable dependencies injected via constructor (e.g., @RequiredArgsConstructor). Methods neither read nor write per-request instance state. The service's own logic is deterministic, although the LLM response may differ between identical calls.

For code examples of stateless service implementations, see the Implementation Guides.

Thread Safety

Service Beans: Spring singletons are thread-safe when stateless
ChatModel Beans: LangChain4j ChatModel implementations are thread-safe
DTOs: Records (Java 16+) are shallowly immutable and thread-safe as long as their components are themselves immutable

Verification Checklist

When reviewing code, verify:

No mutable instance variables in service classes
All dependencies are injected via constructor (@RequiredArgsConstructor)
No per-request state stored in services
Methods have no side effects on instance state (the service logic is deterministic, even though LLM output may vary between identical calls)
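
The first checklist item can be partly automated. The reflection helper below is a hypothetical sketch (StatelessnessCheck and the two example classes are not part of the module) that lists non-final instance fields. A final field holding a mutable collection would still pass, so this complements code review rather than replacing it.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

final class StatelessnessCheck {
    /** Returns the names of instance fields that are not declared final. */
    static List<String> mutableFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            int modifiers = field.getModifiers();
            boolean instanceField = !Modifier.isStatic(modifiers) && !field.isSynthetic();
            if (instanceField && !Modifier.isFinal(modifiers)) {
                names.add(field.getName());
            }
        }
        return names;
    }
}

/** Allowed shape: only a final dependency set in the constructor. */
final class StatelessExample {
    private final String dependency;

    StatelessExample(String dependency) {
        this.dependency = dependency;
    }

    String describe() {
        return dependency;
    }
}

/** Disallowed shape: a mutable field that stores per-request state. */
final class StatefulExample {
    private String lastPrompt;

    void remember(String prompt) {
        this.lastPrompt = prompt;
    }
}
```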

Why Statelessness Matters

Scalability: Stateless services can be scaled horizontally
Correctness: No race conditions or thread-safety issues
Testability: Stateless code is easier to test
Simplicity: Stateless code is easier to reason about

Request Lifecycle

Every Text Operations request follows a consistent lifecycle:

1. HTTP Request
   ↓
2. Controller (Input Validation)
   ↓
3. DTO Mapping
   ↓
4. Service Layer (Model Selection)
   ↓
5. ChatModel Invocation
   ↓
6. Response Parsing (if structured output)
   ↓
7. DTO Mapping
   ↓
8. HTTP Response

Step-by-Step Process

1. HTTP Request

REST endpoint receives HTTP request
Spring maps JSON to DTO

2. Controller (Input Validation)

Jakarta Validation (@Valid) validates DTO
@Size, @NotBlank annotations enforce constraints
Invalid requests return 400 Bad Request (handled by @ControllerAdvice)

3. DTO Mapping

Controller extracts validated data from DTO
Maps to service method parameters

4. Service Layer (Model Selection)

Service selects appropriate ChatModel bean (creative/balanced/factual)
Based on operation type or user preference

5. ChatModel Invocation

Service calls ChatModel.chat()
LangChain4j handles provider communication
Timeout configured at bean level

6. Response Parsing (if structured output)

For JSON responses: parse with Jackson
For text responses: use directly

7. DTO Mapping

Service maps result to response DTO
Returns to controller

8. HTTP Response

Controller returns ResponseEntity
Spring serializes DTO to JSON

Error Handling in Lifecycle

Validation Errors: Caught by @ControllerAdvice, return 400
Service Errors: Caught by service, logged, thrown as RuntimeException
Provider Errors: Caught by service, mapped to error category, logged, thrown
Parsing Errors: Caught by service, logged, thrown as RuntimeException
All Errors: Handled by ExceptionResponseHandler (existing Springular pattern)

Input Validation

All Text Operations inputs must be validated using Jakarta Validation annotations on DTOs.

Validation Patterns

Size Limits: Use @Size(max = AiConstants.MAX_PROMPT_LENGTH) for prompt/text length limits.

Required Fields: Use @NotBlank for required string fields with clear validation messages.

For detailed DTO examples with validation annotations, see the Implementation Guides (Rewriting and Generation).

Validation Rules

Prompt/Text Length: Enforce maximum length (e.g., 4000 characters)
Required Fields: Use @NotBlank for required string fields
Optional Fields: Allow null for optional parameters
Type Validation: Type mismatches are rejected during JSON deserialization (Jackson), before Jakarta Validation runs
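
To make the semantics of the two main constraints concrete, the plain-Java sketch below mirrors what @NotBlank and @Size(max = ...) check. In the module these would be annotations on the DTO record; the class name, the messages, and the 4000-character limit (taken from the example figure above) are illustrative assumptions. Both constraints only inspect the value and never modify it.

```java
import java.util.ArrayList;
import java.util.List;

final class PromptConstraints {
    // Example limit only; the real value would live in AiConstants.
    static final int MAX_PROMPT_LENGTH = 4000;

    /** Returns one message per violated constraint, or an empty list if the prompt is valid. */
    static List<String> violations(String prompt) {
        List<String> errors = new ArrayList<>();
        // Mirrors @NotBlank: the value must be non-null and contain a non-whitespace character.
        if (prompt == null || prompt.trim().isEmpty()) {
            errors.add("prompt must not be blank");
        }
        // Mirrors @Size(max = ...): measures the value as submitted; null is left to @NotBlank.
        if (prompt != null && prompt.length() > MAX_PROMPT_LENGTH) {
            errors.add("prompt must be at most " + MAX_PROMPT_LENGTH + " characters");
        }
        return errors;
    }
}
```

Note that surrounding whitespace counts toward the length limit, because validation does not trim the submitted value.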

Sanitization

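Phase 1 relies on validation rather than on rewriting the input:

Trim: @NotBlank does not modify the value. It only checks that the trimmed value is non-empty, so trim explicitly in the service if surrounding whitespace matters
Empty Rejection: @NotBlank rejects null, empty, and whitespace-only strings
No HTML/JS Injection: The API returns JSON, so HTML escaping happens wherever the text is rendered (for example, a template engine or the front end), not in this module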

Validation Error Handling

Validation errors are automatically handled by Spring:

Controller receives request with @Valid annotation
Jakarta Validation runs automatically
If validation fails: MethodArgumentNotValidException is thrown
ExceptionResponseHandler (existing) catches it and returns 400 Bad Request
Response includes field-level error messages

Validation errors return a JSON response with status 400 and an array of error messages.

What NOT to Validate (Yet)

❌ Prompt injection patterns (defer to Phase 2+)
❌ Content moderation (defer to Phase 2+)
❌ PII detection (defer to Phase 2+)

These are important but out of scope for Phase 1. Document as future considerations.

Patterns

Pattern 1: Two-Tier Service Architecture

Pattern: Simple operations use direct ChatModel.chat(), complex operations use @AiService with RAG/memory.

For Text Operations: Use direct ChatModel.chat() for all Phase 1 operations, none of which need memory or RAG.

Rationale: Text Operations are stateless and don't need memory/RAG. Simple operations = simple code.

Pattern 2: Consolidated Configuration

Pattern: Single AIConfiguration.java with all beans.

ChatModel bean (OllamaChatModel)
Creative/Factual ChatModel beans
All configuration via @Value with defaults from AiConstants

Why It Works:

Single place to configure AI
Environment-driven via application.yml
Constants class prevents magic numbers

For configuration examples, see Foundation.

Pattern 3: Constants Class Pattern

Pattern: AiConstants.java with all defaults.

Why It Works:

Single source of truth
Used in @Value defaults
Prevents configuration drift

Pattern 4: Multiple ChatModel Beans

Pattern: Define multiple OllamaChatModel beans with different @Qualifier annotations and temperature settings.

Use Cases:

Creative (0.9): Content generation
Balanced (0.7): Default
Factual (0.3): Summarization, translation, classification

Rationale: Different temperatures for different use cases. Temperature is configured at bean creation time in OllamaChatModel.
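
One way to express the temperature split in code is a small lookup from operation to profile. This is a sketch under assumptions: ModelProfile, ModelSelector, and the operation keys are hypothetical names, and in the module each profile would correspond to a @Qualifier-named ChatModel bean rather than an enum constant.

```java
import java.util.Map;

/** The three temperature profiles listed above. */
enum ModelProfile {
    CREATIVE(0.9),
    BALANCED(0.7),
    FACTUAL(0.3);

    final double temperature;

    ModelProfile(double temperature) {
        this.temperature = temperature;
    }
}

final class ModelSelector {
    // Hypothetical operation keys mapped to the profile each one would use.
    private static final Map<String, ModelProfile> BY_OPERATION = Map.of(
            "text-generation", ModelProfile.CREATIVE,
            "text-rewriting", ModelProfile.CREATIVE,
            "summarization", ModelProfile.FACTUAL,
            "translation", ModelProfile.FACTUAL,
            "classification", ModelProfile.FACTUAL);

    /** Falls back to the balanced profile for unknown or missing operations. */
    static ModelProfile select(String operation) {
        if (operation == null) {
            return ModelProfile.BALANCED;
        }
        return BY_OPERATION.getOrDefault(operation, ModelProfile.BALANCED);
    }
}
```

Keeping the mapping in one place means the service never computes a temperature per request; it only picks which pre-built model to call.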

For code examples and configuration details, see the Implementation Guides.

Pattern 5: JSON Mode + Jackson Parsing

Pattern: Request JSON format in prompt, parse response with Jackson.

Use Case: Structured outputs (sentiment analysis, classification, extraction).

Prompt Pattern: Request JSON format in prompt with explicit format specification. Use clear instructions: "Return only valid JSON", specify exact format, no markdown wrapper.

Parsing Pattern: Parse response with Jackson ObjectMapper. Defensively handle markdown code blocks if present (trim, remove code block markers), then parse JSON. Handle JsonProcessingException by logging truncated response and throwing RuntimeException.
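
The defensive clean-up step can be sketched with the standard library alone. The helper names below are hypothetical. The cleaned string would then be handed to Jackson (for example ObjectMapper.readTree), which is left out here so the example has no dependencies.

```java
final class JsonResponseCleaner {
    // Three backticks, built at runtime so the marker is defined in one place.
    private static final String FENCE = "`".repeat(3);

    /** Trims the response and removes a surrounding markdown code fence if one is present. */
    static String stripCodeFence(String raw) {
        String s = raw.trim();
        if (s.startsWith(FENCE)) {
            // Drop the whole opening line, which may carry a language tag such as "json".
            int firstNewline = s.indexOf('\n');
            s = firstNewline >= 0 ? s.substring(firstNewline + 1) : s.substring(FENCE.length());
            if (s.endsWith(FENCE)) {
                s = s.substring(0, s.length() - FENCE.length());
            }
        }
        return s.trim();
    }

    /** Shortens a response before logging so that full model output never reaches the logs. */
    static String truncateForLog(String s, int maxLength) {
        return s.length() <= maxLength ? s : s.substring(0, maxLength) + "...";
    }
}
```

If parsing the cleaned string still fails, the truncated form is what would be logged before the RuntimeException is thrown.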

Why This Pattern:

Reliable structured output
Type-safe parsing
Avoids brittle regex/string splitting
Works reliably with most LLMs

For detailed code examples, see the Implementation Guides.

Pattern 6: Error Handling

Pattern: Catch provider-specific exceptions, categorize errors, log appropriately, throw RuntimeException.

Error Categories (not exception types):

Timeout: Request exceeded configured timeout
- Log with timeout duration
- Return user-friendly message
Provider Failure: LLM provider unavailable or returned error
- Log provider error details
- Return generic error message (don't expose provider internals)
Invalid Response: Response couldn't be parsed (e.g., invalid JSON)
- Log response snippet (truncated)
- Return parsing error message
Validation Error: Input validation failed
- Handled by Jakarta Validation + @ControllerAdvice
- Returns 400 Bad Request automatically

Implementation Pattern: Service methods catch provider-specific exceptions (ModelTimeoutException, ModelInvocationException), categorize them, log appropriately, and throw RuntimeException with user-friendly messages.
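
A minimal sketch of this catch, categorize, log, and rethrow flow follows. The two StandIn exception classes are placeholders for whatever the provider library throws, and ProviderCalls, the logger name, and the user-facing messages are assumptions made for illustration. The invalid-response category would be raised where JSON parsing happens and is omitted here.

```java
import java.util.function.Supplier;

enum ErrorCategory { TIMEOUT, PROVIDER_FAILURE }

/** Placeholder for the provider's timeout exception. */
class StandInTimeoutException extends RuntimeException {
    StandInTimeoutException(String message) {
        super(message);
    }
}

/** Placeholder for the provider's general invocation failure. */
class StandInInvocationException extends RuntimeException {
    StandInInvocationException(String message) {
        super(message);
    }
}

final class ProviderCalls {
    private static final System.Logger LOG = System.getLogger("textops");

    /** Runs the model call and converts provider exceptions into a plain RuntimeException. */
    static String call(String operation, Supplier<String> modelCall) {
        try {
            return modelCall.get();
        } catch (StandInTimeoutException e) {
            throw failure(operation, ErrorCategory.TIMEOUT,
                    "The request timed out. Please try again.", e);
        } catch (StandInInvocationException e) {
            throw failure(operation, ErrorCategory.PROVIDER_FAILURE,
                    "Text processing is temporarily unavailable.", e);
        }
    }

    private static RuntimeException failure(String operation, ErrorCategory category,
                                            String userMessage, RuntimeException cause) {
        // Provider details go to the log; the caller only sees the generic message.
        LOG.log(System.Logger.Level.ERROR,
                "operation={0} success=false errorCategory={1} detail={2}",
                operation, category, cause.getMessage());
        return new RuntimeException(userMessage, cause);
    }
}
```

The category exists only as a log field, which matches the idea that categories are documented rather than encoded in exception types.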

Integration with Existing Exception Handler:

Springular already has ExceptionResponseHandler (@ControllerAdvice). All RuntimeException instances are automatically handled and converted to appropriate HTTP responses.

Why RuntimeException:

Simple and straightforward
Integrates with existing @ControllerAdvice pattern
No need for custom exception hierarchies (over-engineering)
Error categories are documented, not encoded in types

For code examples, see the Implementation Guides.

Reliability

Timeout Configuration

All ChatModel beans must configure timeouts. Timeout is set at bean creation time in OllamaChatModel.builder().timeout(Duration.ofMillis(AiConstants.DEFAULT_TIMEOUT_MS)).

Timeout Best Practices:

Default: 5 minutes (300,000ms) for local Ollama
Remote Providers: Adjust based on network latency
Document: Timeout values in AiConstants with comments explaining rationale
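
A configuration along these lines would satisfy the timeout rule. It is a sketch, not the module's actual configuration: the property keys (ai.ollama.* and ai.textops.*), the bean name, the FACTUAL_TEMPERATURE constant, and the default model name are assumptions. Only DEFAULT_TIMEOUT_MS and the builder's timeout call come from the text above. It needs LangChain4j's Ollama module and Spring on the classpath.

```java
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.ollama.OllamaChatModel;
import java.time.Duration;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

final class AiConstants {
    /** 5 minutes, the default stated above for local Ollama. */
    static final long DEFAULT_TIMEOUT_MS = 300_000L;

    /** Hypothetical constant for the factual profile. */
    static final double FACTUAL_TEMPERATURE = 0.3;

    private AiConstants() {
    }
}

@Configuration
class AIConfiguration {

    // Temperature and timeout are fixed when the bean is built; callers select the bean by name.
    @Bean("factualChatModel")
    ChatModel factualChatModel(
            @Value("${ai.ollama.base-url:http://localhost:11434}") String baseUrl,
            @Value("${ai.ollama.model:llama2}") String modelName,
            @Value("${ai.textops.factual-temperature:" + AiConstants.FACTUAL_TEMPERATURE + "}")
            double temperature,
            @Value("${ai.ollama.timeout-ms:" + AiConstants.DEFAULT_TIMEOUT_MS + "}")
            long timeoutMs) {
        return OllamaChatModel.builder()
                .baseUrl(baseUrl)
                .modelName(modelName)
                .temperature(temperature)
                .timeout(Duration.ofMillis(timeoutMs))
                .build();
    }
}
```

The @Value defaults are compile-time constants built from AiConstants, so application.yml only has to list values that differ from the defaults.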

Timeout Error Handling:

When timeout occurs:

ModelTimeoutException is thrown by LangChain4j
Service catches and logs timeout
Service throws RuntimeException with user-friendly message
@ControllerAdvice handles and returns appropriate HTTP status

For configuration examples, see Foundation.

Error Recovery (Future Consideration)

Not Implemented in Phase 1:

❌ Retry logic (defer to Phase 2+)
❌ Fallback model routing (defer to Phase 2+)
❌ Circuit breakers (defer to Phase 2+)

These are valuable patterns but premature for Phase 1. Document as future considerations.

Observability

Structured Logging

All Text Operations must log structured information.

Required Log Fields:

operation: Operation type (e.g., "text-generation", "text-rewriting")
model: Model identifier (e.g., "ollama-llama2:7b")
promptLength: Length of input prompt/text
latency: Request duration in milliseconds
success: Boolean indicating success/failure
errorCategory: Error category if failed (timeout, provider-failure, invalid-response)

Logging Pattern: Services log structured information at INFO level for successful operations, ERROR level for failures. Include operation type, model, prompt length (not content), latency, and success status.
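
The required fields could be assembled as a simple key=value line, as in the sketch below. OperationLog and its parameter order are hypothetical, and a real service would pass the result to its logger at INFO or ERROR level. Only the prompt length is recorded, never the prompt text.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class OperationLog {
    /**
     * Builds one log line from the required fields.
     * A null errorCategory means the operation succeeded.
     */
    static String line(String operation, String model, String prompt,
                       long latencyMs, String errorCategory) {
        Map<String, Object> fields = new LinkedHashMap<>(); // keeps insertion order
        fields.put("operation", operation);
        fields.put("model", model);
        fields.put("promptLength", prompt.length()); // length only, not content
        fields.put("latency", latencyMs);
        fields.put("success", errorCategory == null);
        if (errorCategory != null) {
            fields.put("errorCategory", errorCategory);
        }
        return fields.entrySet().stream()
                .map(entry -> entry.getKey() + "=" + entry.getValue())
                .collect(Collectors.joining(" "));
    }
}
```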

For detailed logging examples, see the Implementation Guides.

Logging Policy

Prompt Content:

❌ Never log prompt content at INFO level (may contain PII or sensitive data)
✅ Log prompt length, operation type, model, latency
✅ Log prompt content only at DEBUG level (for development/debugging)

Rationale: Prompts may contain user data, PII, or sensitive information. Logging at INFO would expose this in production logs.

Token Tracking (Preparation)

Interface Preparation: Define a token tracking interface (full implementation in the Usage Tracking guide). Service methods call the token tracker, if one is available, whenever the provider returns token usage information.

Token Availability:

Ollama (Local): May not return token counts
OpenAI/Cloud Providers: Usually return token counts
Policy: "If provider returns token counts, record them. If not, estimate and label as estimated."
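
The policy above can be sketched as a small value type with an estimated flag. The names are hypothetical, and the fallback of roughly four characters per token is a crude assumption used only for illustration; the Usage Tracking guide would define the real estimator.

```java
/** Token count plus a flag saying whether it was reported by the provider or estimated. */
record TokenUsage(int tokens, boolean estimated) {
}

final class TokenUsageResolver {
    // Assumption for illustration only: about four characters per token.
    private static final int CHARS_PER_TOKEN = 4;

    /**
     * Uses the provider's count when present, otherwise estimates from text length
     * and labels the result as estimated.
     */
    static TokenUsage resolve(Integer providerTokenCount, String text) {
        if (providerTokenCount != null) {
            return new TokenUsage(providerTokenCount, false);
        }
        // Integer ceiling division so short non-empty texts count as at least one token.
        int estimate = (text.length() + CHARS_PER_TOKEN - 1) / CHARS_PER_TOKEN;
        return new TokenUsage(estimate, true);
    }
}
```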

For implementation details, see Usage Tracking.

Metrics (Future Consideration)

Not Implemented in Phase 1:

❌ Micrometer counters/timers (defer to Usage Tracking guide)
❌ Cost tracking (defer to Usage Tracking guide)
❌ Dashboards/alerts (defer to Phase 2+)

Structured logging provides sufficient observability for Phase 1. Metrics integration will be added in the Usage Tracking guide.

Testing Strategy

Unit Tests

Pattern: Mock ChatModel bean, test service logic.

What to Test:

✅ Model selection logic (creative vs balanced vs factual)
✅ Error handling (timeout, provider errors)
✅ Response parsing (JSON parsing, error handling)
✅ Input validation edge cases (if service does additional validation)

For unit test examples, see the Implementation Guides.

Integration Tests

Pattern: Use @SpringBootTest with optional real provider.

When to Use Integration Tests:

✅ Verify end-to-end flow (DTO → Service → Model → Response)
✅ Test with real provider (optional, can be disabled)
✅ Verify timeout configuration
❌ Not for testing business logic (use unit tests)

Test Configuration

Use application-test.yml with shorter timeout values for tests. See Implementation Guides for configuration examples.

What NOT to Test (Yet)

❌ Contract tests with golden files (defer to Phase 2+)
❌ Performance/load tests (defer to Phase 2+)
❌ Provider-specific behavior (rely on LangChain4j)

Keep tests simple and focused on business logic.

Design Decisions

Decision 1: Direct `ChatModel.chat()` Instead of `@AiService`

Why: Text Operations are stateless and don't need memory/RAG.

When to Use @AiService:

Conversation memory needed
RAG needed
Complex multi-step workflows

For Text Operations: Simple operations = simple code.

Decision 2: Multiple `ChatModel` Beans Instead of One

Why: Different temperatures for different use cases.

Creative (0.9): Content generation
Balanced (0.7): Default
Factual (0.3): Summarization, translation, classification

Alternative Considered: Single bean with dynamic temperature. Rejected because temperature is configured at bean creation time in OllamaChatModel.

Decision 3: JSON Mode Instead of LangChain4j Structured Outputs

Why: JSON mode with prompt instructions is simpler and works reliably.

For Text Operations: Request JSON format in prompt, parse with Jackson. No need for complex structured output frameworks.

Decision 4: Simple Error Handling Instead of Custom Exceptions

Why: RuntimeException is fine for boilerplate. Spring @ControllerAdvice handles HTTP responses. Keep it simple.

Decision 5: Business-Agnostic API Instead of Domain-Specific

Why: Core must work for any business. Products add domain-specific wrappers.

Example: generateText(String prompt) not generateCBTQuestions(String scenario).

Extension Points

How Products Extend This Module

Products extend the Text Operations module by creating domain-specific wrapper services that:

Add domain-specific prompt building logic
Add business rules on top of classification/analysis results
Compose multiple Text Operations for workflows

Pattern: Products inject TextOperationsService and add domain logic around it, without modifying the core module.

Examples:

Marketing Content Service: Builds marketing-specific prompts, uses generation
Content Moderation Service: Adds blocking rules based on classification results
Customer Support Service: Composes generation + rewriting for response creation
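
As an illustration of the wrapper pattern, the sketch below shows a product-side service that builds a domain prompt and delegates to the core. The single-method TextOperationsService here is a trimmed stand-in for the real interface, and MarketingContentService.productBlurb and its prompt wording are hypothetical.

```java
/** Trimmed stand-in for the core interface; only generation is needed for this example. */
@FunctionalInterface
interface TextOperationsService {
    String generateText(String prompt);
}

/** Lives in the product, not in the core module. */
final class MarketingContentService {
    private final TextOperationsService textOps;

    MarketingContentService(TextOperationsService textOps) {
        this.textOps = textOps;
    }

    /** Domain-specific prompt building stays here; the core only sees a generic prompt. */
    String productBlurb(String productName, String audience) {
        String prompt = String.format(
                "Write a two-sentence marketing blurb for %s aimed at %s.",
                productName, audience);
        return textOps.generateText(prompt);
    }
}
```

The core interface is unchanged by this wrapper, which is the point of the extension model: domain wording can evolve without touching the shared module.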

For code examples and implementation patterns, see the Implementation Guides.

Constraints and Tradeoffs

Constraints

Stateless Operations: Phase 1 Text Operations are stateless. There is no conversation memory, and RAG is deferred to Phase 2.
Temperature Configuration: Temperature is configured at bean creation time in OllamaChatModel, not dynamically.
JSON Parsing: Structured outputs use JSON mode + Jackson parsing. Not using LangChain4j structured output framework.
Error Handling: Simple error handling with RuntimeException. No custom exception hierarchies.
Business-Agnostic: Core module contains no domain-specific logic. Products add wrappers.

Tradeoffs

Simplicity vs Flexibility:

✅ Chosen: Simple direct ChatModel.chat() calls
❌ Not Chosen: Complex abstractions for flexibility

Multiple Beans vs Single Bean:

✅ Chosen: Multiple ChatModel beans with different temperatures
❌ Not Chosen: Single bean with dynamic temperature (not supported by OllamaChatModel)

JSON Mode vs Structured Output Framework:

✅ Chosen: JSON mode with prompt instructions + Jackson parsing
❌ Not Chosen: LangChain4j structured output framework (adds complexity)

Simple Errors vs Custom Exceptions:

✅ Chosen: RuntimeException with Spring @ControllerAdvice
❌ Not Chosen: Custom exception hierarchies (over-engineering)

Integration Points

Phase 1 Foundation

Dependency: Requires Phase 1 Foundation (ChatModel infrastructure).

ChatModel bean provided by Phase 1
Ollama configuration provided by Phase 1
Constants and configuration patterns from Phase 1

Extension: Text Operations extends Phase 1 with:

Creative/factual ChatModel beans
Text Operations constants
Text Operations service layer

Planned Features Integration

Text Rewriting & Style Controls (Anchor Primitive):

First feature (MVP) — anchor primitive
Uses creative ChatModel bean
Includes style parameter mapping
Does not depend on Generation

Text Generation (Extension):

Second feature (MVP) — extension of rewriting
Uses creative ChatModel bean
Direct ChatModel.chat() pattern
Should not ship alone — ships as extension of rewriting

AI Usage Limits & Cost Tracking:

Operational feature
Integrates with all Text Operations
Tracks usage and costs

Implementation Guides

For step-by-step implementation instructions with code examples, see the Implementation Guides section (in priority order):

Foundation - Port minimal infrastructure from POC (Phase 1)
Rewriting - Build Text Rewriting capability (anchor primitive, first)
Generation - Build Text Generation capability (extension, second)
Usage Tracking - Build operational controls

Success Criteria

Architectural

✅ Business-agnostic design (works for any domain)
✅ Simple and straightforward (no over-engineering)
✅ Extensible (products can add wrappers)
✅ Follows Constellation patterns (two-tier architecture, constants class)

Functional

✅ All core capabilities work
✅ Appropriate temperature used per use case
✅ Structured outputs work (JSON parsing)
✅ Error handling works
✅ Integration with Phase 1 works

Reusability

✅ Works for marketing (content generation)
✅ Works for support (response generation)
✅ Works for moderation (classification)
✅ Works for translation (multi-language)
✅ No domain-specific code in core module
