
🏗️ Text Operations – Solution Architecture

Purpose: This document defines the system design for the Text Operations (TextOps) capability in Constellation.

This architecture document explains how Text Operations is structured, why design decisions were made, and how it integrates with the Constellation platform. For step-by-step implementation instructions with code examples, see the Implementation Guides.

Epic Reference: This architecture implements the Text Operations Epic.


Architecture Overview

What We're Building

A reusable Text Operations service that provides 4 canonical operations:

  1. Text Rewriting (Anchor Primitive) — Transform existing text (humanize, tone adjustment, summarize)
  2. Text Generation (Extension) — Create new content from prompts
  3. Structured Extraction — Convert text → JSON / fields
  4. Retrieval + Answering (RAG) — Answer questions from user-provided text (Phase 2 capability)

Anchor Principle: Text Rewriting is the anchor primitive. Everything else is additive. Generation should never ship alone — it ships as an extension of rewriting.

What Makes It Reusable

  • Zero business logic - No domain-specific prompts, mappers, or DTOs
  • Configuration-driven - Temperature, models, timeouts via application.yml
  • Simple API - Direct ChatModel.chat() calls (no @AiService unless needed)
  • Structured outputs - JSON mode + Jackson parsing (not regex string parsing)
  • Extensible - Products can add domain-specific wrappers without modifying core

Design Principles

Business-Agnostic Core

The Text Operations module provides primitive capabilities, not domain-specific features. It must work for any business: content generation, customer support, data extraction, marketing automation, etc.

Products extend this with domain-specific wrappers, but the core stays generic.

Simple Over Complex

  • Use direct ChatModel.chat() calls (not complex abstractions)
  • Use string formatting for prompts (not custom prompt managers)
  • Use JSON mode + Jackson parsing (not regex string parsing)
  • Keep error handling simple (no custom exception hierarchies)

Configuration Over Code

  • Models, temperatures, timeouts configured via application.yml
  • Constants centralized in AiConstants
  • Environment-driven (no hardcoded values)

Extensible Over Fixed

  • Products can add domain-specific wrappers
  • Core API remains stable
  • No product-specific code in the module

System Boundaries

What's Included

Core Operations (4 canonical operations):

  1. Text Rewriting (anchor primitive) — Transform existing text (text + instruction → rewritten text)
  2. Text Generation (extension) — Create new text (prompt → text)
  3. Structured Extraction — Convert text → JSON / fields (text → structured data)
  4. Retrieval + Answering (RAG) — Answer questions from user-provided text (Phase 2 capability)

Note: Summarization is a variant of rewriting. Translation, sentiment analysis, and classification are specialized cases that may be added later if fork products require them. They are not separate primitives.

Infrastructure:

  • ChatModel beans (creative, balanced, factual)
  • Service interface and implementation
  • Request/Response DTOs
  • Controller endpoints

Patterns:

  • Direct ChatModel.chat() for simple operations
  • JSON mode + Jackson parsing for structured outputs
  • Multiple ChatModel beans for different temperatures

What's NOT Included

Domain-Specific Logic:

  • ❌ Domain-specific prompts (CBT, customer support, etc.) - Products add these
  • ❌ Business-specific mappers (CBT technique extraction, etc.) - Products add these
  • ❌ Product-specific workflows - Products add these

Complex Abstractions:

  • ❌ PromptTemplateManager - LangChain4j handles prompts directly
  • ❌ AssistantFactory - Not needed for simple operations
  • ❌ Custom exception hierarchies - RuntimeException is fine for boilerplate

Product Features:

  • ❌ Tone dropdown presets - Product-specific UI
  • ❌ Saved generations - Product-specific feature
  • ❌ Templates library - Product-specific feature
  • ❌ Branding and marketing - Product-specific

Component Design

Service Layer

Interface: TextOperationsService (or TextOpsService)

  • Business-agnostic API
  • Generic method names (not domain-specific)
  • Simple parameters (strings, not complex request objects)
  • Generic return types (not domain-specific DTOs)

Implementation: TextOperationsServiceImpl

  • Direct ChatModel.chat() calls
  • Simple prompt templates (string formatting)
  • JSON parsing with Jackson (for structured outputs)
  • Model selection (creative vs factual)
  • Simple error handling
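
To make the service layer above concrete, here is a minimal sketch rather than the module's actual code. The `ChatModel` interface is a local stand-in that mirrors the single-string `chat()` call on LangChain4j's ChatModel so the example compiles on its own. The method names `rewrite` and `generate` and the prompt wording are assumptions.

```java
// Local stand-in for LangChain4j's ChatModel (assumption: only chat(String) is needed).
interface ChatModel {
    String chat(String userMessage);
}

// Business-agnostic API: plain strings in, plain strings out.
interface TextOperationsService {
    String rewrite(String text, String instruction);

    String generate(String prompt);
}

// Stateless implementation: the only field is a final, constructor-injected dependency.
final class TextOperationsServiceImpl implements TextOperationsService {
    private final ChatModel creativeModel;

    TextOperationsServiceImpl(ChatModel creativeModel) {
        this.creativeModel = creativeModel;
    }

    @Override
    public String rewrite(String text, String instruction) {
        // Plain string formatting instead of a prompt manager (hypothetical wording).
        String prompt = String.format(
                "Rewrite the text below.%nInstruction: %s%n%nText:%n%s", instruction, text);
        return creativeModel.chat(prompt);
    }

    @Override
    public String generate(String prompt) {
        // Generation passes the caller's prompt straight through to the model.
        return creativeModel.chat(prompt);
    }
}
```

In the real module the implementation would be a Spring bean and the model would be injected with a qualifier; both are left out here to keep the sketch dependency-free.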

Configuration Layer

AIConfiguration.java:

  • Extended with creative/factual ChatModel beans
  • Multiple beans with @Qualifier annotations
  • Temperature-based differentiation

AiConstants.java:

  • Extended with Text Operations constants
  • Single source of truth for defaults
  • Used in @Value defaults
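
A constants class of this kind might look like the sketch below. `MAX_PROMPT_LENGTH` and `DEFAULT_TIMEOUT_MS` are referenced elsewhere in this document; the temperature constant names are hypothetical, and the values are the examples this document gives (4000 characters, 5 minutes, 0.9 / 0.7 / 0.3).

```java
// Illustrative shape of AiConstants; the temperature constant names are assumptions.
final class AiConstants {
    /** Maximum prompt/text length in characters (example value from this document). */
    static final int MAX_PROMPT_LENGTH = 4000;

    /** Default model timeout: 5 minutes, sized for a local Ollama instance. */
    static final long DEFAULT_TIMEOUT_MS = 300_000L;

    /** Temperatures for the creative, balanced, and factual ChatModel beans. */
    static final double CREATIVE_TEMPERATURE = 0.9;
    static final double BALANCED_TEMPERATURE = 0.7;
    static final double FACTUAL_TEMPERATURE = 0.3;

    private AiConstants() {
        // Constants holder; not meant to be instantiated.
    }
}
```

Because these are compile-time constants, they can be concatenated into a `@Value` placeholder default, for example `@Value("${ai.timeout-ms:" + AiConstants.DEFAULT_TIMEOUT_MS + "}")`, where the property key `ai.timeout-ms` is a hypothetical name.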

API Layer

AIController.java:

  • Endpoints for Text Operations
  • Maps requests to service calls
  • Returns response DTOs
  • Input validation

Module Structure

The module follows a layered architecture:

  • Configuration Layer: Extended with creative/factual ChatModel beans
  • Constants Layer: Extended with Text Operations constants
  • Service Layer: Text Operations service interface and implementation
  • API Layer: Controller endpoints for Text Operations
  • Model Layer: Request and response DTOs

For detailed package structure and file organization, see the Implementation Guides.


Package Structure

The Text Operations module follows a layered package structure that separates concerns:

com.saas.springular.common.ai/
├── config/          # Configuration beans (@Bean methods)
├── constants/       # Constants class
├── model/           # Request/Response DTOs
├── service/         # Service interfaces
│   └── impl/        # Service implementations
└── controller/      # REST API endpoints

Layer Responsibilities:

  • config/: Spring configuration beans. Creates ChatModel beans, configures timeouts, temperatures. No business logic.
  • constants/: Single source of truth for default values (timeouts, temperatures, limits). Used in @Value defaults.
  • model/: Request/Response DTOs with Jakarta Validation annotations. Stateless records.
  • service/: Business logic interfaces. Define capabilities (rewrite, generate, etc.). No Spring annotations.
  • service/impl/: Service implementations. Inject ChatModel beans, handle model selection, error handling. Stateless.
  • controller/: REST endpoints. Handle HTTP concerns, validation, DTO mapping. Delegates to service layer.

Why This Structure:

  • Clear separation of concerns
  • Easy to test (mock services, mock models)
  • Follows Spring Boot conventions
  • Scalable (easy to add new capabilities)

For detailed file structure and implementation, see the Implementation Guides.


Statelessness and Thread Safety

Core Principle

All Text Operations services are stateless and thread-safe. This is essential for scalability and correctness.

Statelessness Rules

  1. No Mutable Instance Variables: Service implementations must not store per-request state
  2. Singleton Beans: All service beans are Spring singletons (default scope)
  3. Thread-Safe Dependencies: Injected dependencies (ChatModel beans) are thread-safe
  4. No Conversation State: Text Operations are stateless. No session memory in services.

What "Stateless" Means

❌ Not Allowed: Service classes with mutable instance variables that store per-request state (e.g., lastPrompt, selectedModel).

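✅ Allowed: Service classes with only immutable dependencies injected via constructor (e.g., @RequiredArgsConstructor). Methods keep all per-request data in parameters and local variables and have no side effects on instance state. Model output itself may vary between calls, so "stateless" here does not mean deterministic.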

For code examples of stateless service implementations, see the Implementation Guides.
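
The difference between the two cases can be sketched as follows. This is an illustration under assumptions, not the module's code: `ChatModel` is a local stand-in for the LangChain4j interface, and the class names `StatefulRewriter` and `StatelessRewriter` are hypothetical.

```java
// Local stand-in for LangChain4j's ChatModel (assumption: only chat(String) is needed).
interface ChatModel {
    String chat(String userMessage);
}

// Not allowed: a singleton that stores per-request state in a field.
// Two concurrent requests can overwrite each other's lastPrompt.
final class StatefulRewriter {
    private final ChatModel model;
    private String lastPrompt; // mutable and shared across threads

    StatefulRewriter(ChatModel model) {
        this.model = model;
    }

    String rewrite(String text) {
        lastPrompt = "Rewrite: " + text;
        return model.chat(lastPrompt);
    }
}

// Allowed: per-request data lives only in parameters and local variables.
final class StatelessRewriter {
    private final ChatModel model; // immutable dependency

    StatelessRewriter(ChatModel model) {
        this.model = model;
    }

    String rewrite(String text) {
        String prompt = "Rewrite: " + text;
        return model.chat(prompt);
    }
}
```

The stateless variant can be shared by any number of request threads without synchronization, provided the injected model is itself thread-safe.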

Thread Safety

  • Service Beans: Spring singletons are thread-safe when stateless
  • ChatModel Beans: LangChain4j ChatModel implementations are thread-safe
  • DTOs: Immutable records (Java 16+) are inherently thread-safe

Verification Checklist

When reviewing code, verify:

  • No mutable instance variables in service classes
  • All dependencies are injected via constructor (@RequiredArgsConstructor)
  • No per-request state stored in services
  • Methods keep per-request data in parameters and local variables (no side effects on instance state; model output may still vary between calls)

Why Statelessness Matters

  • Scalability: Stateless services can be scaled horizontally
  • Correctness: No race conditions or thread-safety issues
  • Testability: Stateless code is easier to test
  • Simplicity: Stateless code is easier to reason about

Request Lifecycle

Every Text Operations request follows a consistent lifecycle:

1. HTTP Request

2. Controller (Input Validation)

3. DTO Mapping

4. Service Layer (Model Selection)

5. ChatModel Invocation

6. Response Parsing (if structured output)

7. DTO Mapping

8. HTTP Response

Step-by-Step Process

1. HTTP Request

  • REST endpoint receives HTTP request
  • Spring maps JSON to DTO

2. Controller (Input Validation)

  • Jakarta Validation (@Valid) validates DTO
  • @Size, @NotBlank annotations enforce constraints
  • Invalid requests return 400 Bad Request (handled by @ControllerAdvice)

3. DTO Mapping

  • Controller extracts validated data from DTO
  • Maps to service method parameters

4. Service Layer (Model Selection)

  • Service selects appropriate ChatModel bean (creative/balanced/factual)
  • Based on operation type or user preference

5. ChatModel Invocation

  • Service calls ChatModel.chat()
  • LangChain4j handles provider communication
  • Timeout configured at bean level

6. Response Parsing (if structured output)

  • For JSON responses: parse with Jackson
  • For text responses: use directly

7. DTO Mapping

  • Service maps result to response DTO
  • Returns to controller

8. HTTP Response

  • Controller returns ResponseEntity
  • Spring serializes DTO to JSON

Error Handling in Lifecycle

  • Validation Errors: Caught by @ControllerAdvice, return 400
  • Service Errors: Caught by service, logged, thrown as RuntimeException
  • Provider Errors: Caught by service, mapped to error category, logged, thrown
  • Parsing Errors: Caught by service, logged, thrown as RuntimeException
  • All Errors: Handled by ExceptionResponseHandler (existing Springular pattern)

Input Validation

All Text Operations inputs must be validated using Jakarta Validation annotations on DTOs.

Validation Patterns

Size Limits: Use @Size(max = AiConstants.MAX_PROMPT_LENGTH) for prompt/text length limits.

Required Fields: Use @NotBlank for required string fields with clear validation messages.

For detailed DTO examples with validation annotations, see the Implementation Guides (Rewriting and Generation).

Validation Rules

  1. Prompt/Text Length: Enforce maximum length (e.g., 4000 characters)
  2. Required Fields: Use @NotBlank for required string fields
  3. Optional Fields: Allow null for optional parameters
  4. Type Validation: Type mismatches are rejected during JSON deserialization (Jackson), before Jakarta Validation runs
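
The rules above can be sketched in plain Java to show what the annotations check. In the real module these checks would be declared on the DTO with `@NotBlank` and `@Size(max = AiConstants.MAX_PROMPT_LENGTH)` and run by Jakarta Validation. The `RewriteRequest` fields, the helper class, and the 100-character style limit are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical request DTO: text and instruction are required, style is optional.
record RewriteRequest(String text, String instruction, String style) { }

final class RewriteRequestChecks {
    static final int MAX_PROMPT_LENGTH = 4000; // example limit from this document

    // Mirrors @NotBlank: null, empty, and whitespace-only values are rejected.
    // The value itself is only inspected, never changed.
    static boolean notBlank(String value) {
        return value != null && !value.trim().isEmpty();
    }

    // Mirrors @Size(max = ...): null passes, so optional fields may be omitted.
    static boolean withinSize(String value, int max) {
        return value == null || value.length() <= max;
    }

    // Collects field-level messages, similar to what a 400 response would carry.
    static List<String> validate(RewriteRequest request) {
        List<String> errors = new ArrayList<>();
        if (!notBlank(request.text())) {
            errors.add("text must not be blank");
        }
        if (!withinSize(request.text(), MAX_PROMPT_LENGTH)) {
            errors.add("text must be at most " + MAX_PROMPT_LENGTH + " characters");
        }
        if (!notBlank(request.instruction())) {
            errors.add("instruction must not be blank");
        }
        if (!withinSize(request.style(), 100)) {
            errors.add("style must be at most 100 characters");
        }
        return errors;
    }
}
```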

Sanitization

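Basic input hygiene relies on existing framework behavior:

  • Trim: @NotBlank does not modify the value; it only checks that the trimmed value is non-empty. Trim explicitly in the service if the trimmed text is what should be sent to the model
  • Empty Rejection: @NotBlank rejects null, empty, and whitespace-only strings
  • No HTML/JS Injection: Template engines such as Thymeleaf escape output by default; JSON responses are not HTML-escaped, so clients must escape model output before rendering it as HTML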

Validation Error Handling

Validation errors are automatically handled by Spring:

  1. Controller receives request with @Valid annotation
  2. Jakarta Validation runs automatically
  3. If validation fails: MethodArgumentNotValidException is thrown
  4. ExceptionResponseHandler (existing) catches it and returns 400 Bad Request
  5. Response includes field-level error messages

Validation errors return a JSON response with status 400 and an array of error messages.

What NOT to Validate (Yet)

  • ❌ Prompt injection patterns (defer to Phase 2+)
  • ❌ Content moderation (defer to Phase 2+)
  • ❌ PII detection (defer to Phase 2+)

These are important but out of scope for Phase 1. Document as future considerations.


Patterns

Pattern 1: Two-Tier Service Architecture

Pattern: Simple operations use direct ChatModel.chat(), complex operations use @AiService with RAG/memory.

For Text Operations: Use direct ChatModel.chat() for all Phase 1 operations (no memory/RAG needed). Retrieval + Answering is a Phase 2 capability and will be revisited when it is designed.

Rationale: Text Operations are stateless and don't need memory/RAG. Simple operations = simple code.

Pattern 2: Consolidated Configuration

Pattern: Single AIConfiguration.java with all beans.

  • ChatModel bean (OllamaChatModel)
  • Creative/Factual ChatModel beans
  • All configuration via @Value with defaults from AiConstants

Why It Works:

  • Single place to configure AI
  • Environment-driven via application.yml
  • Constants class prevents magic numbers

For configuration examples, see Foundation.

Pattern 3: Constants Class Pattern

Pattern: AiConstants.java with all defaults.

Why It Works:

  • Single source of truth
  • Used in @Value defaults
  • Prevents configuration drift

Pattern 4: Multiple ChatModel Beans

Pattern: Define multiple OllamaChatModel beans with different @Qualifier annotations and temperature settings.

Use Cases:

  • Creative (0.9): Content generation
  • Balanced (0.7): Default
  • Factual (0.3): Summarization, translation, classification

Rationale: Different temperatures for different use cases. Temperature is configured at bean creation time in OllamaChatModel.

For code examples and configuration details, see the Implementation Guides.
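
As a configuration fragment, the three beans might be declared along these lines. This is a sketch under assumptions: it targets LangChain4j 1.x, where the interface is named ChatModel, and the property keys (`ai.ollama.base-url`, `ai.ollama.model`, `ai.timeout-ms`), their defaults, and the bean names are hypothetical. In the real module the literals would come from AiConstants.

```java
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.ollama.OllamaChatModel;
import java.time.Duration;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class AIConfiguration {

    // Hypothetical property keys with inline defaults.
    @Value("${ai.ollama.base-url:http://localhost:11434}")
    private String baseUrl;

    @Value("${ai.ollama.model:llama2}")
    private String modelName;

    @Value("${ai.timeout-ms:300000}")
    private long timeoutMs;

    @Bean
    public ChatModel creativeChatModel() {
        return build(0.9); // content generation
    }

    @Bean
    public ChatModel balancedChatModel() {
        return build(0.7); // default
    }

    @Bean
    public ChatModel factualChatModel() {
        return build(0.3); // summarization, translation, classification
    }

    // Temperature and timeout are fixed when the bean is built.
    private ChatModel build(double temperature) {
        return OllamaChatModel.builder()
                .baseUrl(baseUrl)
                .modelName(modelName)
                .temperature(temperature)
                .timeout(Duration.ofMillis(timeoutMs))
                .build();
    }
}
```

A service would then pick a bean by name, for example with `@Qualifier("factualChatModel")` on a constructor parameter.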

Pattern 5: JSON Mode + Jackson Parsing

Pattern: Request JSON format in prompt, parse response with Jackson.

Use Case: Structured outputs (sentiment analysis, classification, extraction).

Prompt Pattern: Request JSON format in prompt with explicit format specification. Use clear instructions: "Return only valid JSON", specify exact format, no markdown wrapper.

Parsing Pattern: Parse response with Jackson ObjectMapper. Defensively handle markdown code blocks if present (trim, remove code block markers), then parse JSON. Handle JsonProcessingException by logging truncated response and throwing RuntimeException.

Why This Pattern:

  • Reliable structured output
  • Type-safe parsing
  • Avoids brittle regex/string splitting
  • Works reliably with most LLMs

For detailed code examples, see the Implementation Guides.
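
The prompt and the defensive cleanup step can be sketched without any dependencies. The class name, the prompt wording, and the example JSON shape are assumptions; the markdown code block marker is built with `repeat` only so the example stays readable.

```java
final class JsonResponseCleaner {
    // Markdown code block marker (three backticks).
    private static final String FENCE = "`".repeat(3);

    // Hypothetical prompt: ask for JSON only and spell out the exact shape.
    static String sentimentPrompt(String text) {
        return "Return only valid JSON, with no markdown wrapper, in exactly this format:\n"
                + "{\"sentiment\": \"positive|neutral|negative\"}\n\n"
                + "Text:\n" + text;
    }

    // Removes a surrounding code block marker if the model added one anyway.
    static String stripCodeFence(String response) {
        String cleaned = response.trim();
        if (cleaned.startsWith(FENCE)) {
            // Drop the whole opening line, which may carry a language tag such as "json".
            int firstNewline = cleaned.indexOf('\n');
            cleaned = firstNewline >= 0
                    ? cleaned.substring(firstNewline + 1)
                    : cleaned.substring(FENCE.length());
        }
        if (cleaned.endsWith(FENCE)) {
            cleaned = cleaned.substring(0, cleaned.length() - FENCE.length());
        }
        return cleaned.trim();
    }

    // Shortens a response before it is logged after a parse failure.
    static String truncateForLog(String response, int maxLength) {
        return response.length() <= maxLength
                ? response
                : response.substring(0, maxLength) + "...";
    }
}
```

The cleaned string would then be handed to Jackson, for example `objectMapper.readTree(cleaned)`, with `JsonProcessingException` caught, the truncated response logged, and a RuntimeException thrown.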

Pattern 6: Error Handling

Pattern: Catch provider-specific exceptions, categorize errors, log appropriately, throw RuntimeException.

Error Categories (not exception types):

  1. Timeout: Request exceeded configured timeout
    • Log with timeout duration
    • Return user-friendly message
  2. Provider Failure: LLM provider unavailable or returned error
    • Log provider error details
    • Return generic error message (don't expose provider internals)
  3. Invalid Response: Response couldn't be parsed (e.g., invalid JSON)
    • Log response snippet (truncated)
    • Return parsing error message
  4. Validation Error: Input validation failed
    • Handled by Jakarta Validation + @ControllerAdvice
    • Returns 400 Bad Request automatically

Implementation Pattern: Service methods catch provider-specific exceptions (for example ModelTimeoutException or ModelInvocationException; the exact class names depend on the LangChain4j version in use), categorize them, log appropriately, and throw RuntimeException with user-friendly messages.

Integration with Existing Exception Handler:

Springular already has ExceptionResponseHandler (@ControllerAdvice). All RuntimeException instances are automatically handled and converted to appropriate HTTP responses.

Why RuntimeException:

  • Simple and straightforward
  • Integrates with existing @ControllerAdvice pattern
  • No need for custom exception hierarchies (over-engineering)
  • Error categories are documented, not encoded in types

For code examples, see the Implementation Guides.
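
A minimal sketch of the catch, categorize, log, rethrow flow is shown below. `ChatModel` is a local stand-in, `ProviderTimeoutException` is a hypothetical placeholder for whatever timeout exception the provider library throws, and the user-facing messages are assumptions. The JDK logger is used only to keep the example dependency-free.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Local stand-in for LangChain4j's ChatModel (assumption: only chat(String) is needed).
interface ChatModel {
    String chat(String userMessage);
}

// Hypothetical placeholder for the provider library's timeout exception.
class ProviderTimeoutException extends RuntimeException {
    ProviderTimeoutException(String message) {
        super(message);
    }
}

// Categories are documented values, not an exception hierarchy.
enum ErrorCategory { TIMEOUT, PROVIDER_FAILURE }

final class SafeChat {
    private static final Logger LOG = Logger.getLogger(SafeChat.class.getName());

    static ErrorCategory categorize(RuntimeException e) {
        return e instanceof ProviderTimeoutException
                ? ErrorCategory.TIMEOUT
                : ErrorCategory.PROVIDER_FAILURE;
    }

    static String chat(ChatModel model, String prompt) {
        try {
            return model.chat(prompt);
        } catch (RuntimeException e) {
            ErrorCategory category = categorize(e);
            // Provider details go to the log; only the prompt length is recorded.
            LOG.log(Level.SEVERE,
                    "errorCategory=" + category + " promptLength=" + prompt.length(), e);
            String message = category == ErrorCategory.TIMEOUT
                    ? "The request took too long. Please try again."
                    : "The text operation failed. Please try again later.";
            // Plain RuntimeException with a generic, user-friendly message.
            throw new RuntimeException(message, e);
        }
    }
}
```

The rethrown exception keeps the original as its cause, so the existing exception handler can still log the full stack trace if it chooses to.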


Reliability

Timeout Configuration

All ChatModel beans must configure timeouts. Timeout is set at bean creation time in OllamaChatModel.builder().timeout(Duration.ofMillis(AiConstants.DEFAULT_TIMEOUT_MS)).

Timeout Best Practices:

  • Default: 5 minutes (300,000ms) for local Ollama
  • Remote Providers: Adjust based on network latency
  • Document: Timeout values in AiConstants with comments explaining rationale

Timeout Error Handling:

When timeout occurs:

  1. LangChain4j throws a timeout exception (referred to here as ModelTimeoutException; the exact class depends on the LangChain4j version)
  2. Service catches and logs timeout
  3. Service throws RuntimeException with user-friendly message
  4. @ControllerAdvice handles and returns appropriate HTTP status

For configuration examples, see Foundation.

Error Recovery (Future Consideration)

Not Implemented in Phase 1:

  • ❌ Retry logic (defer to Phase 2+)
  • ❌ Fallback model routing (defer to Phase 2+)
  • ❌ Circuit breakers (defer to Phase 2+)

These are valuable patterns but premature for Phase 1. Document as future considerations.


Observability

Structured Logging

All Text Operations must log structured information.

Required Log Fields:

  • operation: Operation type (e.g., "text-generation", "text-rewriting")
  • model: Model identifier (e.g., "ollama-llama2:7b")
  • promptLength: Length of input prompt/text
  • latency: Request duration in milliseconds
  • success: Boolean indicating success/failure
  • errorCategory: Error category if failed (timeout, provider-failure, invalid-response)

Logging Pattern: Services log structured information at INFO level for successful operations, ERROR level for failures. Include operation type, model, prompt length (not content), latency, and success status.

For detailed logging examples, see the Implementation Guides.
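
One possible shape for a log line carrying the required fields is sketched below. The `OperationLog` class and its key=value layout are assumptions, and the JDK logger stands in for whatever logging facade the project uses. The prompt is passed in only to measure its length.

```java
import java.util.function.UnaryOperator;
import java.util.logging.Logger;

final class OperationLog {
    private static final Logger LOG = Logger.getLogger(OperationLog.class.getName());

    // Builds one line with the required fields; prompt content is never included.
    static String format(String operation, String model, int promptLength,
                         long latencyMs, boolean success, String errorCategory) {
        String line = "operation=" + operation
                + " model=" + model
                + " promptLength=" + promptLength
                + " latency=" + latencyMs
                + " success=" + success;
        return success ? line : line + " errorCategory=" + errorCategory;
    }

    // Runs the model call, measures latency, and logs at INFO or ERROR (SEVERE) level.
    static String timed(String operation, String model, String prompt,
                        UnaryOperator<String> call) {
        long start = System.nanoTime();
        try {
            String result = call.apply(prompt);
            LOG.info(format(operation, model, prompt.length(), elapsedMs(start), true, null));
            return result;
        } catch (RuntimeException e) {
            // "provider-failure" is a placeholder; real code would categorize the exception.
            LOG.severe(format(operation, model, prompt.length(), elapsedMs(start),
                    false, "provider-failure"));
            throw e;
        }
    }

    private static long elapsedMs(long startNanos) {
        return (System.nanoTime() - startNanos) / 1_000_000;
    }
}
```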

Logging Policy

Prompt Content:

  • ❌ Never log prompt content at INFO level (may contain PII or sensitive data)
  • ✅ Log prompt length, operation type, model, latency
  • ✅ Log prompt content only at DEBUG level (for development/debugging)

Rationale: Prompts may contain user data, PII, or sensitive information. Logging at INFO would expose this in production logs.

Token Tracking (Preparation)

Interface Preparation: Define token tracking interface (full implementation in Usage Tracking guide). Service methods call token tracker if available when provider returns token usage information.

Token Availability:

  • Ollama (Local): May not return token counts
  • OpenAI/Cloud Providers: Usually return token counts
  • Policy: "If provider returns token counts, record them. If not, estimate and label as estimated."

For implementation details, see Usage Tracking.
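
The "record if returned, otherwise estimate and label" policy could be prepared with types like the following. All names here are hypothetical, and the four-characters-per-token estimate is a rough assumption, not a rule from any provider.

```java
// Token counts for one call, with a flag saying whether they were estimated.
record TokenUsage(int inputTokens, int outputTokens, boolean estimated) { }

// Hypothetical tracking interface; the full implementation belongs to Usage Tracking.
interface TokenTracker {
    void record(String operation, TokenUsage usage);
}

final class TokenUsageResolver {
    // Rough heuristic (assumption): about four characters per token, rounded up.
    static int estimateTokens(String text) {
        return (text.length() + 3) / 4;
    }

    // Uses provider counts when both are present; otherwise estimates and labels the result.
    static TokenUsage resolve(Integer providerInputTokens, Integer providerOutputTokens,
                              String prompt, String response) {
        if (providerInputTokens != null && providerOutputTokens != null) {
            return new TokenUsage(providerInputTokens, providerOutputTokens, false);
        }
        return new TokenUsage(estimateTokens(prompt), estimateTokens(response), true);
    }
}
```

A service method would call `resolve` after the model responds and pass the result to a `TokenTracker` if one is configured.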

Metrics (Future Consideration)

Not Implemented in Phase 1:

  • ❌ Micrometer counters/timers (defer to Usage Tracking guide)
  • ❌ Cost tracking (defer to Usage Tracking guide)
  • ❌ Dashboards/alerts (defer to Phase 2+)

Structured logging provides sufficient observability for Phase 1. Metrics integration will be added in the Usage Tracking guide.


Testing Strategy

Unit Tests

Pattern: Mock ChatModel bean, test service logic.

What to Test:

  • ✅ Model selection logic (creative vs balanced vs factual)
  • ✅ Error handling (timeout, provider errors)
  • ✅ Response parsing (JSON parsing, error handling)
  • ✅ Input validation edge cases (if service does additional validation)

For unit test examples, see the Implementation Guides.
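
As a sketch of the mocking approach, the model selection logic can be exercised with a hand-written fake instead of a real provider. Real tests would more likely use JUnit and Mockito. `ChatModel` is a local stand-in, and the `Operation` enum, `ModelSelector`, and the mapping of extraction to the factual model are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in for LangChain4j's ChatModel (assumption: only chat(String) is needed).
interface ChatModel {
    String chat(String userMessage);
}

enum Operation { GENERATE, REWRITE, SUMMARIZE, EXTRACT }

// Picks a model by operation type; falls back to balanced when none is given.
final class ModelSelector {
    private final ChatModel creative;
    private final ChatModel balanced;
    private final ChatModel factual;

    ModelSelector(ChatModel creative, ChatModel balanced, ChatModel factual) {
        this.creative = creative;
        this.balanced = balanced;
        this.factual = factual;
    }

    ChatModel select(Operation operation) {
        if (operation == null) {
            return balanced;
        }
        switch (operation) {
            case GENERATE:
            case REWRITE:
                return creative;
            case SUMMARIZE:
            case EXTRACT:
                return factual;
            default:
                return balanced;
        }
    }
}

// Fake model: returns a fixed reply and records every prompt it receives.
final class RecordingChatModel implements ChatModel {
    final List<String> prompts = new ArrayList<>();
    private final String reply;

    RecordingChatModel(String reply) {
        this.reply = reply;
    }

    @Override
    public String chat(String userMessage) {
        prompts.add(userMessage);
        return reply;
    }
}
```

A test builds three `RecordingChatModel` instances, calls the code under test, and then checks which fake received the prompt.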

Integration Tests

Pattern: Use @SpringBootTest with optional real provider.

When to Use Integration Tests:

  • ✅ Verify end-to-end flow (DTO → Service → Model → Response)
  • ✅ Test with real provider (optional, can be disabled)
  • ✅ Verify timeout configuration
  • ❌ Not for testing business logic (use unit tests)

Test Configuration

Use application-test.yml with shorter timeout values for tests. See Implementation Guides for configuration examples.

What NOT to Test (Yet)

  • ❌ Contract tests with golden files (defer to Phase 2+)
  • ❌ Performance/load tests (defer to Phase 2+)
  • ❌ Provider-specific behavior (rely on LangChain4j)

Keep tests simple and focused on business logic.


Design Decisions

Decision 1: Direct ChatModel.chat() Instead of @AiService

Why: Text Operations are stateless and don't need memory/RAG.

When to Use @AiService:

  • Conversation memory needed
  • RAG needed
  • Complex multi-step workflows

For Text Operations: Simple operations = simple code.

Decision 2: Multiple ChatModel Beans Instead of One

Why: Different temperatures for different use cases.

  • Creative (0.9): Content generation
  • Balanced (0.7): Default
  • Factual (0.3): Summarization, translation, classification

Alternative Considered: Single bean with dynamic temperature. Rejected because temperature is configured at bean creation time in OllamaChatModel.

Decision 3: JSON Mode Instead of LangChain4j Structured Outputs

Why: JSON mode with prompt instructions is simpler and works reliably.

For Text Operations: Request JSON format in prompt, parse with Jackson. No need for complex structured output frameworks.

Decision 4: Simple Error Handling Instead of Custom Exceptions

Why: RuntimeException is fine for boilerplate. Spring @ControllerAdvice handles HTTP responses. Keep it simple.

Decision 5: Business-Agnostic API Instead of Domain-Specific

Why: Core must work for any business. Products add domain-specific wrappers.

Example: generateText(String prompt) not generateCBTQuestions(String scenario).


Extension Points

How Products Extend This Module

Products extend the Text Operations module by creating domain-specific wrapper services that:

  • Add domain-specific prompt building logic
  • Add business rules on top of classification/analysis results
  • Compose multiple Text Operations for workflows

Pattern: Products inject TextOperationsService and add domain logic around it, without modifying the core module.

Examples:

  • Marketing Content Service: Builds marketing-specific prompts, uses generation
  • Content Moderation Service: Adds blocking rules based on classification results
  • Customer Support Service: Composes generation + rewriting for response creation

For code examples and implementation patterns, see the Implementation Guides.
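
A product-side wrapper following this pattern might look like the sketch below. `MarketingContentService`, its method, and the prompt wording are hypothetical, and `TextOperationsService` is reduced to two assumed methods so the example compiles on its own.

```java
// Assumed minimal shape of the core API, repeated here to keep the sketch self-contained.
interface TextOperationsService {
    String rewrite(String text, String instruction);

    String generate(String prompt);
}

// Lives in the product codebase; the core module is used as-is and not modified.
final class MarketingContentService {
    private final TextOperationsService textOps;

    MarketingContentService(TextOperationsService textOps) {
        this.textOps = textOps;
    }

    // Domain-specific prompt building plus composition of two core operations.
    String productBlurb(String productName, String audience) {
        String prompt = String.format(
                "Write a two-sentence product blurb for %s aimed at %s.",
                productName, audience);
        String draft = textOps.generate(prompt);
        return textOps.rewrite(draft, "Make the tone friendly and concise.");
    }
}
```

All marketing knowledge stays in the wrapper, so the same core service can back a support or moderation product without changes.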


Constraints and Tradeoffs

Constraints

  1. Stateless Operations: Text Operations are stateless. No conversation memory or RAG needed.

  2. Temperature Configuration: Temperature is configured at bean creation time in OllamaChatModel, not dynamically.

  3. JSON Parsing: Structured outputs use JSON mode + Jackson parsing. Not using LangChain4j structured output framework.

  4. Error Handling: Simple error handling with RuntimeException. No custom exception hierarchies.

  5. Business-Agnostic: Core module contains no domain-specific logic. Products add wrappers.

Tradeoffs

Simplicity vs Flexibility:

  • ✅ Chosen: Simple direct ChatModel.chat() calls
  • ❌ Not Chosen: Complex abstractions for flexibility

Multiple Beans vs Single Bean:

  • ✅ Chosen: Multiple ChatModel beans with different temperatures
  • ❌ Not Chosen: Single bean with dynamic temperature (not supported by OllamaChatModel)

JSON Mode vs Structured Output Framework:

  • ✅ Chosen: JSON mode with prompt instructions + Jackson parsing
  • ❌ Not Chosen: LangChain4j structured output framework (adds complexity)

Simple Errors vs Custom Exceptions:

  • ✅ Chosen: RuntimeException with Spring @ControllerAdvice
  • ❌ Not Chosen: Custom exception hierarchies (over-engineering)

Integration Points

Phase 1 Foundation

Dependency: Requires Phase 1 Foundation (ChatModel infrastructure).

  • ChatModel bean provided by Phase 1
  • Ollama configuration provided by Phase 1
  • Constants and configuration patterns from Phase 1

Extension: Text Operations extends Phase 1 with:

  • Creative/factual ChatModel beans
  • Text Operations constants
  • Text Operations service layer

Planned Features Integration

Text Rewriting & Style Controls (Anchor Primitive):

  • First feature (MVP) — anchor primitive
  • Uses creative ChatModel bean
  • Includes style parameter mapping
  • Does not depend on Generation

Text Generation (Extension):

  • Second feature (MVP) — extension of rewriting
  • Uses creative ChatModel bean
  • Direct ChatModel.chat() pattern
  • Should not ship alone — ships as extension of rewriting

AI Usage Limits & Cost Tracking:

  • Operational feature
  • Integrates with all Text Operations
  • Tracks usage and costs

Implementation Guides

For step-by-step implementation instructions with code examples, see the Implementation Guides section (in priority order):

  • Foundation - Port minimal infrastructure from POC (Phase 1)
  • Rewriting - Build Text Rewriting capability (anchor primitive, first)
  • Generation - Build Text Generation capability (extension, second)
  • Usage Tracking - Build operational controls

Success Criteria

Architectural

  • ✅ Business-agnostic design (works for any domain)
  • ✅ Simple and straightforward (no over-engineering)
  • ✅ Extensible (products can add wrappers)
  • ✅ Follows Constellation patterns (two-tier architecture, constants class)

Functional

  • ✅ All core capabilities work
  • ✅ Appropriate temperature used per use case
  • ✅ Structured outputs work (JSON parsing)
  • ✅ Error handling works
  • ✅ Integration with Phase 1 works

Reusability

  • ✅ Works for marketing (content generation)
  • ✅ Works for support (response generation)
  • ✅ Works for moderation (classification)
  • ✅ Works for translation (multi-language)
  • ✅ No domain-specific code in core module