Subsections of Architecture
System Overview
ContentGrid is a model- and API-first Enterprise Content Management (ECM) platform that generates complete,
standards-based REST APIs directly from your data model. The platform consists of two primary environments: a Management
Platform for application configuration and a Runtime Platform for execution.
What Makes ContentGrid Different
Traditional ECM systems often rely on folder hierarchies, generic database schemas, and monolithic architectures.
ContentGrid takes a fundamentally different approach:
Model-First Design: Define your data model with entities, attributes, and relations. ContentGrid automatically
generates intuitive REST APIs that mirror your domain model—URLs like /invoices and /invoices/{id} emerge naturally
from your schema.
Relational Database Foundation: Unlike ECMs that use generic key-value storage, ContentGrid leverages PostgreSQL’s
full capabilities by mapping entities directly to tables and columns. This enables efficient queries, proper indexes,
and standard database tooling.
Attribute-Based Access Control: Instead of folder-based permissions, ContentGrid uses ABAC policies that evaluate
entity and user attributes. This provides flexible, fine-grained access control independent of data organization.
Small Core, Extensible Ecosystem: The platform core provides only essential ECM functionality. Everything else is
implemented as external automations that integrate seamlessly through standard protocols.
Architecture at a Glance
graph TB
subgraph "Runtime Platform"
Gateway[Gateway<br/>Routing & Authorization]
Keycloak[Keycloak<br/>Authentication]
OPA[OPA<br/>Policy Evaluation]
Solon[Solon<br/>Policy Collection]
Navigator[Navigator<br/>Web UI]
subgraph "Application Instance"
AppServer[Application Server<br/>Dynamic API]
end
subgraph "Persistent Storage"
DB[(PostgreSQL<br/>Metadata & Structure)]
S3[(S3-Compatible<br/>Content Storage)]
end
end
Gateway --> OPA
Gateway --> AppServer
AppServer --> Solon
Solon --> OPA
AppServer --> DB
AppServer --> S3
Users[Clients] --> Keycloak
Users --> Gateway
Users --> Navigator
MP[Management Platform]
MP -->|Configure| Gateway
MP -->|Configure| AppServer
The Management Platform is where you define, configure, and deploy ContentGrid applications. It consists of three
primary components:
- Architect: Source of truth for application models (entities, permissions, configurations)
- Scribe: Transforms models into deployment artifacts (application model, database migrations, policies, OpenAPI specs)
- Captain: Orchestrates infrastructure provisioning and Kubernetes deployment
The platform runs as a SaaS service—you don’t need to host or manage it yourself. For complete details on how these
components work together to enable the full application lifecycle, see Management Platform.
The Runtime Platform provides the infrastructure and services to run ContentGrid applications securely and efficiently.
Key components include:
- Gateway: Entry point for all requests, handles routing, authentication, and policy evaluation
- Keycloak: Identity and access management via OpenID Connect
- OPA: Centralized policy engine for attribute-based access control
- Navigator: Shared React frontend that dynamically adapts to any application model
- Solon: Collects policies from all applications and bundles them for OPA.
- Pathfinder: Automatic Ingress management and TLS certificate provisioning
All components run in Kubernetes and use dynamic service discovery to automatically detect and integrate new
applications. For complete details on platform architecture, request flow, and operational characteristics,
see Runtime Platform.
Application Server
Each ContentGrid application runs as an instance of the Application Server—a single shared container image configured
with application-specific artifacts. The server uses a configuration-driven approach rather than code generation,
enabling rapid deployment of model changes.
For complete details on architecture, query construction, and performance characteristics,
see Application Server.
Data Storage
ContentGrid separates structured metadata from binary content for optimal performance:
PostgreSQL: Each application has its own database with schema generated automatically from the data model. Entities
map to tables, attributes to columns, and relations to foreign keys. Flyway manages migrations.
S3-Compatible Storage: Binary content (documents, images, videos) is stored in dedicated S3 buckets per application.
Encryption: Transparent application side encryption at rest.
For complete details on encryption architecture, key management, and range request support,
see Data Storage.
Deployment Pipeline
The deployment pipeline automates the complete path from application model to running service:
- Artifact Generation: Scribe generates a versioned ZIP artifact with model, migrations, and policies
- Infrastructure Provisioning: Captain provisions database, S3 bucket, and Keycloak realm
- Kubernetes Deployment: Captain creates all necessary Kubernetes resources
- Application Startup: Application Server fetches artifact, runs migrations, and begins serving
For complete details on the deployment see Deployment Pipeline.
Integration Points
The platform provides several mechanisms for external automations to extend functionality:
- Service accounts: Automations authenticate via OIDC client credentials flow
- Webhook Notifications: Automations subscribe to create/update/delete events
External automations integrate seamlessly, they can use the same HAL/HAL-FORMS patterns as core functionality, making
them familiar to API consumers.
Key Architectural Benefits
Developer Experience: Standards-based APIs (HAL, HAL-FORMS, RFC 9110) reduce learning curves and enable existing
tooling. Model-first approach means intuitive URLs that match your domain.
Scalability: Each application is independently scalable. Kubernetes-native architecture enables horizontal scaling
of application instances. Database and storage scale independently.
Security: ABAC policies evaluated at query time prevent unauthorized data from being loaded. Content encryption
ensures data confidentiality. Separate isolation per application.
Flexibility: Small core with automation extension points allows customization without forking. Replace or disable
automations independently. Multiple implementations of the same functionality can coexist.
Operational Simplicity: Single shared container image for all applications. Configuration-driven rather than code
changes. Standard Kubernetes operations for deployment and scaling.
Next Steps
For deeper understanding of specific subsystems:
Architecture Principles
This document explains the design philosophy driving ContentGrid’s architecture and the principles that guide technical
decisions. Understanding these principles helps evaluate how the platform aligns with your requirements and technical
values.
Foundational Principles
API-First
The application exposes a REST API as the only interaction point—for both end-users and external automations. There is
no separate “admin API” or backend interface; everything flows through the same well-defined API.
The REST API uses the HAL response format. HAL’s
hypermedia links allow referencing related endpoints and discovering capabilities dynamically. At the same time, the
core data is available as plain JSON objects—developers who don’t need HAL’s hypermedia features can simply use the
standard JSON payload.
For a single application with a known model, the API structure is regular and predictable. HAL becomes essential when
building generic tools that need to work across multiple different models (e.g., different projects or organizations).
These generic consumers use HAL to adapt automatically to any data model.
Model-First
The API is generated directly from your relational content model. Entities with attributes and relations map naturally
to URLs you’d expect from a hand-crafted application. For example, an invoice entity generates /invoices and
/invoices/{id} endpoints automatically.
The data APIs don’t expose model abstractions like “entity,” “attribute,” or “relation” in their payload structure.
These concepts are internal implementation details. Your API consumers work with invoices, suppliers, and
documents—domain concepts, not meta-model concepts.
For developers building generic integrations, a separate model API (/profile and /profile/{entity}) does expose the
entity and attribute metadata. This information is static for a particular application version and enables automations
to adapt to any model without hardcoded entity knowledge.
Small Core
The application core provides only essential ECM functionality that cannot be implemented externally. Everything else is
pushed to external automations.
Why keep the core small?
-
Scalability: Core functionality is critical for the functioning of ContentGrid. Keeping it small makes it easier
to scale this part horizontally, making sure it can handle the requests with good response times.
-
Reduced Security Surface: Fewer features in the core mean fewer places where permission checks can be
misconfigured. A smaller codebase is easier to audit and secure.
-
Easier Maintenance: A well-defined, focused core is easier to understand, test, and maintain. The dynamic nature
of model-driven applications already introduces complexity—keeping the core small manages this complexity.
Extensibility
Extensions add capabilities beyond the core platform while remaining loosely coupled. They can be developed in-house, by
customers, or by third parties.
Extensions are not implicitly trusted. Token-based authentication ensures users and extensions have appropriate
access without exposing primary credentials. Extensions can act on behalf of users (delegated access) or under their
own identity (system access). When extensions act on behalf of users, they receive at most the user’s privileges—never
more.
This security model enables safe integration of third-party services while maintaining ContentGrid’s permission model.
Extensions access data through the same REST API as any other client, ensuring consistent authorization enforcement.
Summary
ContentGrid’s architecture principles prioritize developer experience, operational flexibility, and data security. By
keeping the core small and designing for extensibility, the platform enables rapid
development while maintaining long-term maintainability.
The model-first, API-first approach generates intuitive, standards-compliant REST APIs from your domain model. The clear
separation between core and automations provides flexibility without sacrificing integration quality.
The Management Platform is where you define, configure, and deploy ContentGrid applications. It runs as a SaaS
service—you don’t need to host or manage it yourself. This page explains its internal architecture and how it transforms
application models into deployable artifacts.
The Management Platform consists of three primary components that work together to enable the full application
lifecycle: from model definition through artifact generation to deployment orchestration.
graph TB
subgraph "Management Platform"
Architect[Architect<br/>Model Definition]
Scribe[Scribe<br/>Artifact Generation]
Captain[Captain<br/>Deployment Orchestration]
end
subgraph "Runtime Platform"
K8sResources[Resources<br/>ConfigMaps, Secrets, Deployments]
end
subgraph "Persistent Storage"
DB[(PostgreSQL<br/>Metadata & Structure)]
S3[(S3-Compatible<br/>Content Storage)]
end
ArtifactStorage[(S3 Artifact Storage)]
Architect --> Scribe
Scribe --> Captain
Captain -->|Upload Artifact| ArtifactStorage
Captain -->|Deploy| K8sResources
Captain -->|Provision database| DB
Captain -->|Provision bucket| S3
Captain -->|Provision realm| Keycloak[Keycloak<br/>Authentication]
Core Components
Architect
Architect serves as the source of truth for application models. It stores and manages all configuration that defines how
your ContentGrid application behaves. The concepts Architect manages are described in
the concepts section.
What Architect Stores:
- Domain Model: Entity definitions with attributes and relations
- Permission Model: Access control policies and rules
- Automation Configurations: Settings for external automation integrations
Versioning and History: Architect maintains a complete history of model changes, enabling audit trails and the
ability to understand how the application evolved over time.
API Access: Architect provides an API for retrieving application models. When you make changes through the
management interface, Architect persists them and makes them available to other components.
Scribe
Scribe transforms application models from Architect into deployable artifacts. It’s the compilation step that bridges
the gap between high-level model definitions and concrete implementation artifacts.
Artifact Generation Process:
- Fetch Model: Scribe retrieves the complete application model from Architect
- Generate Migrations: Analyzes model changes and generates SQL migration scripts (Flyway format)
- For new applications, generates CREATE TABLE statements
- For updates, compares current model to previous version and generates ALTER TABLE statements
- Ensures database schema stays synchronized with the model
- Compile Policies: Converts permission definitions into Rego policies for OPA
- Translates user-friendly permission rules into efficient policy evaluation code
- Generates policy packages with proper namespacing per application
- Generate OpenAPI Specification: Creates OpenAPI documentation for the REST API based on the model
- Package Artifact: Creates a ZIP file containing:
application-model.json: Complete model definition
migrations/: SQL scripts for database schema changes
policies.rego: Access control policies
openapi.yaml: API specification
manifest.json: Metadata (organization, project, version, changeset, timestamps)
Artifact Metadata: The manifest includes traceability information—which changeset triggered the build, what version
of Scribe generated it, and when. This enables audit trails and troubleshooting.
Reproducibility: Given the same input model and version of Scribe, the output artifact is deterministic. This
ensures consistent behavior across environments and deployments.
Captain
Captain orchestrates the entire deployment process. It coordinates between Scribe, infrastructure provisioning, and
Kubernetes deployment, handling all the complexity of getting an application from model to running service.
Responsibilities:
- Application Configuration: Storing application settings
- IAM: Creating Realms, managing users, groups and attributes
- Artifact Management: Request artifact generation from Scribe and upload to shared artifact storage
- Infrastructure Provisioning: Provision databases, S3 buckets, and Keycloak realms for new applications
- Kubernetes Orchestration: Create and update Kubernetes resources (Deployments, Services, ConfigMaps, Secrets)
- Lifecycle Management: Handle application creation, updates, deletion
- Credential Management: Generate and distribute database credentials, S3 access keys, and authentication secrets
- Environment Configuration: Inject environment-specific configuration (database URLs, domain names, etc.)
Zero-Configuration Deployment: Captain abstracts away the complexity of Kubernetes and cloud infrastructure,
providing a simple “deploy this application” interface to the management platform. When you deploy an application,
Captain handles all the details automatically.
Deployment Flow
When you deploy an application, the Management Platform components work together to transform application models into
running services. The high-level flow involves:
- Deploy Request: User initiates deployment through the management interface
- Artifact Generation: Captain requests Scribe to generate an artifact from Architect’s model
- Infrastructure Deployment: Captain stores artifact, provisions infrastructure and coordinates Kubernetes
deployment
- Application Startup: Application Server fetches artifacts and begins serving
For detailed information about the deployment pipeline and lifecycle, see Deployment Pipeline.
Summary
The Management Platform provides a complete application lifecycle management system:
- Architect: Source of truth for application models
- Scribe: Transforms models into deployable artifacts with migrations and policies
- Captain: Orchestrates infrastructure provisioning and Kubernetes deployment
The platform abstracts away the complexity of cloud infrastructure and deployment orchestration, providing a simple
interface for defining and deploying applications. Its artifact-based approach ensures reproducible, auditable
deployments with full version history.
For details on how applications run after deployment, see Runtime Platform.
The Runtime Platform provides the infrastructure and services required to run, secure, and manage ContentGrid
applications. It integrates several components to handle authentication, authorization, policy enforcement, routing, and
frontend delivery.
The Runtime Platform is designed as a Kubernetes-native system. Components dynamically discover resources like
ConfigMaps, Secrets, and Services to route requests, enforce policies, and configure applications. This dynamic
discovery enables zero-configuration deployment of new applications—the platform automatically detects and integrates
them.
graph TB
subgraph "Runtime Platform"
Gateway[Gateway<br/>Entry Point & Routing]
Keycloak[Keycloak<br/>Authentication]
OPA[OPA<br/>Policy Evaluation]
Solon[Solon<br/>Policy Collection]
Navigator[Navigator<br/>Shared Frontend]
Liaison[Liaison<br/>Config Service]
Pathfinder[Pathfinder<br/>Ingress Management]
Ingress[Kubernetes Ingress]
subgraph "Application Instance"
AppServer[Application Server]
end
end
Client --> Gateway
Client --> Keycloak
Gateway --> OPA
Gateway --> AppServer
Client --> Navigator
Navigator --> |webbrowser request| Liaison
AppServer --> Solon
Solon --> OPA
Pathfinder -.->|Creates| Ingress
Ingress -.->|Routes to| Gateway
Core Components
Gateway
The Gateway serves as the entry point for all ContentGrid applications. It handles routing and
coordinates with authentication and authorization services.
Primary Responsibilities:
- Route requests to the appropriate application based on the request’s domain
- Enforce CORS policies configured per application
- Coordinate user authentication with Keycloak
- Communicate with Open Policy Agent (OPA) for policy evaluation
Dynamic Routing: The Gateway maintains a mapping from domains to application IDs by reading Kubernetes ConfigMaps.
When a request arrives for a specific domain name, the Gateway uses this mapping to determine the corresponding
application and routes to the appropriate Service for that application.
CORS Configuration: Each application’s CORS origins are configured in a ConfigMap. The Gateway reads these
configurations and merges CORS settings for both the API backend and the Navigator frontend, ensuring cross-origin
requests are properly handled.
Application Server
The Application Server serves dynamic REST APIs generated from application models. Each ContentGrid application runs as an instance of the same Application Server container, with behavior determined by the application artifact loaded at startup.
The Application Server follows a configuration-driven approach where a single container image serves all applications, enabling consistent operations and rapid iteration without code generation.
For complete details on the Application Server architecture and components, see Application Server.
Keycloak
Keycloak provides authentication and identity management for the platform. Each application has a corresponding realm in
Keycloak, though applications can share a realm (typically one realm per organization).
Key Functions:
- Authenticate users via OpenID Connect (OIDC)
- Store user attributes used in authorization policies
- Issue JWT tokens containing user identity and attributes
- Manage OAuth clients for both API access and frontend applications
User attributes stored in Keycloak (such as department, role, or clearance level) are included in JWT tokens and used by
applications when evaluating attribute-based access control policies.
Keycloak is an open source project. More information about Keycloak can be found on keycloak.org.
Open Policy Agent
OPA is a centralized policy engine that evaluates attribute-based access control (ABAC) policies for all applications in
the platform.
Key Functions:
- Evaluates Rego policies to determine authorization decisions
- Performs partial evaluation to return residual expressions when complete evaluation isn’t possible
- Receives policy bundles from Solon containing all application policies
- Queried by the Gateway before requests reach applications
When the Gateway receives a request, it queries OPA with user attributes and request context. OPA evaluates the relevant
policy and returns either a decision (allow/deny) or a residual expression that the Gateway encodes in a JWT for the
application to apply at the database level.
Open Policy Agent is an open source project. More information about Open Policy Agent can be found on openpolicyagent.org.
Solon
Solon collects Rego policy files from all applications and makes them available to OPA for policy evaluation.
Policy Collection:
- Discovers applications by querying Kubernetes Services with policy annotations
- Fetches policy files from application management endpoints via HTTP
- Bundles all policies together for OPA consumption
- Keeps OPA’s policy bundle up to date as applications are deployed or updated
Solon acts as the bridge between individual applications (which serve their own policy files) and the centralized OPA
instance (which needs all policies to evaluate authorization requests).
Navigator
Navigator is a shared React frontend application used by all ContentGrid applications. Rather than deploying separate
frontends per application, a single Navigator instance dynamically adapts to each application’s data model.
Adaptive Behavior:
- Discovers entities and available operations through HAL links
- Renders forms dynamically using HAL-FORMS templates
- Adapts to user permissions automatically (forms only show permitted actions)
- No application-specific code required—purely hypermedia-driven
Deployment Model: Pathfinder creates a separate Ingress resource for Navigator for each application, routing based
on domain. The Navigator instance then loads application-specific configuration from Liaison based on the request’s
Host header.
Liaison
Liaison serves configuration for Navigator on a per-application basis. It acts as a configuration service that provides
the necessary settings for Navigator to connect to the correct application and authentication realm.
Configuration Delivery:
- Serves Navigator configuration based on the domainname of the request
- Provides OIDC client ID and issuer URL for authentication
- Enables a single Navigator instance to serve multiple applications
Pathfinder
Pathfinder automatically creates and manages Kubernetes Ingress resources for applications. It watches ConfigMaps and
translates them into Ingress configurations, enabling external access to application services.
Two Deployment Variants:
- Pathfinder: Creates Ingress resources for application API backends
- Pathfinder for webapp: Creates Ingress resources for Navigator frontend
Resource Management:
- Reads ConfigMaps with domain routing configuration
- Creates Ingress resources with appropriate routing rules
- Coordinates with cert-manager for TLS certificate provisioning
Certificate Management: When Pathfinder creates an Ingress, cert-manager automatically provisions TLS certificates.
Pathfinder adds annotations to ConfigMaps indicating which cluster issuer to use, and cert-manager handles the
certificate lifecycle.
Application Deployment
When an application is deployed to the Runtime Platform, several Kubernetes resources are created to integrate it with
the platform services.
Application Service
A Kubernetes Service makes the application accessible to the Gateway and other platform components. The Service is
labeled with the application ID and service type, enabling dynamic discovery.
Key Labels:
app.contentgrid.com/application-id: Unique identifier for the application
app.contentgrid.com/deployment-id: Unique identifier for the deployment
app.contentgrid.com/service-type: Type of service (e.g., api, webapp)
Service Discovery: The Gateway uses these labels to discover Services. When routing a request, the Gateway queries
for Services matching the application ID determined from the domain mapping.
OPA Integration: The Service also includes an annotation (authz.contentgrid.com/policy-package) indicating the OPA
policy package location. This enables the platform to collect policies from applications.
Request Flow
Understanding how a request flows through the platform illustrates how these components work together.
sequenceDiagram
autonumber
participant Client
participant Keycloak
participant Gateway
participant OPA as Centralized OPA
participant App as Application Server
participant DB as PostgreSQL
Note over Client, Keycloak: Authentication Flow
Client ->> Keycloak: Login (if no valid token)
Keycloak -->> Client: JWT with user attributes
Note over Client, DB: API Request Flow
Client ->> Gateway: HTTPS Request + JWT
Gateway ->> Gateway: Validate JWT signature
Gateway ->> Gateway: Look up application by domain
Gateway ->> OPA: Authorization query (JWT claims)
OPA -->> Gateway: Allow/Deny + Residual expression
Gateway ->> Gateway: Encode residual in new JWT
Gateway ->> App: Forward request + JWT with residual
App ->> App: Decode residual from JWT
App ->> DB: Query with authorization filter
DB -->> App: Filtered results
App -->> Client: HAL JSON response
Step-by-Step:
- Authentication: Client authenticates directly with Keycloak and receives a JWT with user attributes
- Request with Token: Client makes HTTPS request to Ingress with JWT in Authorization header
- JWT Validation: Gateway validates JWT signature using Keycloak’s public keys
- Gateway Routing: Gateway maps domain to application ID
- Policy Evaluation: Gateway queries centralized OPA with user attributes from JWT
- Residual Encoding: OPA returns residual expression that Gateway encodes in a new JWT
- Application Processing: Gateway forwards request with JWT containing residual to Application Server
- Data Access: Application decodes residual, translates to SQL filter, and queries database
- Response: Application formats response as HAL JSON and returns through Gateway
Scaling and High Availability
The Runtime Platform is designed for horizontal scaling and high availability:
Application Servers: Scale horizontally by increasing replica count. Each replica is stateless (except for database
connections) and can handle requests independently. Kubernetes Services load balance across replicas.
Gateway: Runs as a highly available Deployment with multiple replicas. All replicas share the same configuration (
from ConfigMaps), and the Ingress load balances across them.
Keycloak: Can be deployed in clustered mode for high availability. Database-backed session storage enables failover
between instances.
Navigator and Liaison: Stateless services that scale horizontally. Liaison reads configuration from Kubernetes API
on each request (with caching), so all replicas have consistent configuration.
Database and Storage: PostgreSQL and S3 are external to the platform and have their own high-availability
mechanisms (e.g., PostgreSQL replication, S3 redundancy).
Operational Characteristics
Zero-Configuration Deployment: Adding a new application requires only creating the standard Kubernetes resources (
Deployment, Service, ConfigMaps, Secrets). The platform automatically discovers and integrates the application.
Independent Scaling: Each application scales independently. Heavy workloads on one application don’t affect others.
Resource Isolation: Each application has its own database, S3 bucket, and Keycloak realm (or shared by
organization). Resource limits prevent one application from affecting others.
Observability: Standard Kubernetes observability tools work out of the box. Platform components expose Prometheus
metrics, health check endpoints, and structured logs.
Summary
The ContentGrid Runtime Platform provides a Kubernetes-native infrastructure that:
- Automatically discovers and integrates applications through labels and dynamic service discovery
- Scales applications independently with horizontal scaling and load balancing
- Manages authentication and authorization through Keycloak and OPA
- Provides a shared Navigator frontend that adapts to any data model
- Handles TLS, routing, and CORS through Gateway and Ingress management
The platform’s design enables operational simplicity—deploying applications requires no platform configuration changes,
and standard Kubernetes operations handle scaling, updates, and failover.
Application Server
The ContentGrid Application Server is the core runtime component that serves dynamic REST APIs based on application
models. Each ContentGrid application runs as an instance of the same Application Server container image, configured with
application-specific artifacts.
The ContentGrid Application Server is developed in open source. The sources can be found
on https://github.com/xenit-eu/contentgrid-appserver.
Design Philosophy
The Application Server is built with a configuration-driven approach. A single shared
container image serves all applications. The only difference between different deployments is the application artifact (
model JSON, database
migrations, and policies) loaded at startup.
This approach provides several advantages:
- Deployment speed: No need to build new Docker images on model changes
- Consistent behavior: All applications run identical server code
- Simplified operations: One image to build, test, and deploy
Architecture Overview
graph TD
;
subgraph "Application Server"
REST[REST Layer] --> DOMAIN[Domain Layer];
DOMAIN --> QUERY[Query Engine];
DOMAIN --> CONTENT[Content Store];
DOMAIN --> APPLICATION[Application Resolver];
end
subgraph "External Persistence"
POSTGRES[PostgreSQL];
STORAGE[Content Storage];
ARTIFACT[Application Artifact];
end
QUERY --> POSTGRES;
CONTENT --> STORAGE;
APPLICATION --> ARTIFACT;
Components
The Application Server consists of several layered components, each with clear responsibilities and boundaries.
REST Layer
The REST Layer serves dynamic data endpoints based on the application model loaded from the artifact.
Responsibilities:
- Parse incoming HTTP requests and extract parameters
- Look up the application model and resolve entity definitions
- Follow relations to navigate between entities
- Parse authorization expressions from request context
- Format responses in HAL/HAL-FORMS format
- Serve HAL profile endpoints for entity metadata
- Handle upload and download of content
- Serve OpenAPI specifications
- Serve Rego policies for ABAC
Dynamic Endpoint Generation: When a request arrives for /invoices, the REST Layer looks up the “invoices” entity
definition. It then uses this definition to understand which attributes and relations exist,
generating the appropriate response structure on the fly.
Content Negotiation: The REST Layer supports multiple response formats based on the Accept header:
application/hal+json: Standard HAL responses with hypermedia links
application/prs.hal-forms+json: HAL-FORMS responses with action templates
application/schema+json: JSON Schema descriptions for entity profiles
Domain Layer
The Domain Layer provides business logic and abstracts data access patterns. It sits between the REST Layer and the
Query Engine/Content Store, enforcing business rules and model constraints.
Responsibilities:
- Expose logical operations: find, create, update, partial update, delete (for data); set/clear for to-one relations;
add/remove for to-many relations; find, update, delete for content
- Convert search filters and authorization rules into query expressions
- Enforce data constraints configured in the model (required fields, validation rules)
- Maintain audit information (created/modified timestamps and users)
- Coordinate transactions across multiple operations
Authorization Translation: When the REST Layer provides authorization rules from OPA, the Domain Layer translates
these into query expressions that the Query Engine can push down to the database. This ensures unauthorized data never
leaves the database—filtering happens at the SQL level.
Constraint Enforcement: The Domain Layer validates all data modifications against the model’s constraint definitions
before allowing changes to persist. This ensures data integrity independent of client validation.
Query Engine
The Query Engine translates the application model and query expressions into efficient database queries. It’s
responsible for all interactions with PostgreSQL.
Responsibilities:
- Dynamically construct SQL queries based on model definitions
- Apply pagination and sorting parameters
- Implement counting strategies (exact and estimated counts)
- Handle optimistic locking using row versions
- Push authorization filters down to SQL WHERE clauses
The query engine is implemented using JOOQ, a type-safe SQL query construction library for Java
that provides direct control over SQL while maintaining compile-time validation and type safety.
Dynamic Query Construction: Unlike traditional ORMs that use fixed entity classes, the Query Engine builds queries
at runtime based on the model. When the Domain Layer requests “all invoices where department=‘sales’”, the Query Engine
knows the invoices table structure from the model and constructs the appropriate SQL.
Counting Strategies: For large collections, exact counts can be expensive. The Query Engine implements fallback
strategies:
- Attempt an exact count with a timeout
- If timeout occurs, use PostgreSQL’s query planner statistics for an estimate
- Indicate in the response whether the count is exact or estimated
graph TD
subgraph "REST Layer"
REQUEST[Request Parameters]
ABAC[ABAC Rules]
end
subgraph "Domain Layer"
THUNX[Generate Query Expression]
APP[Application Model]
end
REQUEST --> THUNX
ABAC --> THUNX
APP --> THUNX
subgraph "Query Engine"
SQL["Translate to SQL"]
end
THUNX -- Expression --> SQL
THUNX -- Application --> SQL
THUNX -- Entity --> SQL
SQL --> POSTGRES[Execute Query on PostgreSQL]
Application Resolver
The Application Resolver provides model lookup for other components. It loads the application model from the artifact
and makes it available throughout the application lifecycle.
Responsibilities:
- Load application model from JSON artifact at startup
- Provide entity definitions to REST and Domain layers
- Validate model consistency and completeness
- Cache model in memory for fast access
Current Implementation: The Application Server currently uses a single-tenant model—one container runs one
application. The Application Resolver loads one model and always returns it. This design could support multi-tenancy in
the future by loading multiple models and routing between them, but the current focus is on simplicity and isolation.
Model Format: The application model is defined in JSON following a published schema. The model includes:
- Entity definitions with attributes and types
- Relation definitions (one-to-one, one-to-many, many-to-one)
- Constraint rules and validation
Content Store
The Content Store provides persistence and access for binary content objects (documents, images, PDFs, etc.). It
abstracts storage implementation, allowing different backends.
Capabilities:
- Read content by reference with support for HTTP Range requests
- Store new content objects and return references
- Remove content by reference
- Support transparent content encryption
Content References: The Content Store uses opaque references to identify content. These references are stored in the
database as part of entity attributes. The actual content bytes are stored in one of the implementations, referenced by
these
identifiers.
Range Request Support: For large files (videos, large PDFs), clients can request only specific byte ranges. The
Content Store translates these HTTP Range requests to the appropriate backend format, minimizing data transfer. Range
request support depends on the backend implementation - for example, S3-compatible storage supports native range
requests.
Implementations:
- S3ContentStore: Stores content in S3-compatible object storage (AWS S3, MinIO, etc.). This is the default option,
used in our own Runtime Platform.
- FileSystemContentStore: Stores content on local filesystem (useful for development)
- EncryptedContentStore: Wraps another ContentStore to add transparent encryption/decryption
Database Migrations
Database migrations are used to create the tables needed for a ContentGrid application, but also to migrate the database
schema from one version of the application to the next.
Database schema changes are managed using Flyway, a database migration tool. Migration scripts are included in the
application artifact and executed automatically during startup.
Migration Process:
- Application Server starts and loads the artifact
- Before serving requests, it runs Flyway migrations
- Flyway tracks which migrations have already been applied
- Only new migrations execute, enabling incremental schema changes
- Once migrations complete, the application begins serving traffic
Migration scripts are generated by Scribe based on model changes. When you add an entity or attribute, Scribe generates
a Flyway migration that creates or alters the corresponding table and columns.
ABAC Policy Integration
The Application Server integrates with Open Policy Agent (OPA) for attribute-based access control. Policies are written
in Rego and included in the application artifact.
Policy Lifecycle:
- Application Server loads Rego policies from artifact at startup
- Serves policies to Solon over HTTP
- For each request, the application extracts the residual expressions
- Application Server translates residual expressions to SQL filters
Residual Expressions: OPA’s partial evaluation feature is crucial for efficiency. When evaluating “user can see
invoices from their department,” OPA knows the user’s department but doesn’t know which invoices exist. It returns a
residual expression: invoice.department == "user's department". The Query Engine translates this to a SQL WHERE
clause, filtering at the database level.
This architecture means the application never loads unauthorized data—authorization filters are applied before data
leaves the database.
A more detailed description can be found in the section about access control.
Technology Stack
The Application Server leverages several proven technologies:
Query Construction: JOOQ provides type-safe SQL query construction in Java. Unlike
traditional ORMs, JOOQ gives
direct control over SQL while providing type safety and compile-time validation.
Web Framework: Spring Boot
and Spring MVC handle HTTP, dependency
injection, and application lifecycle. This
provides production-ready features like health checks, metrics, and configuration management.
Access Control: Custom query expression language (Thunx)
translates Open Policy Agent residual expressions into SQL. This enables
pushing authorization filters down to the database level.
JSON Processing: Jackson handles JSON serialization/deserialization,
including HAL and HAL-FORMS formatting.
Request Flow Example
Here’s how a request flows through the Application Server components:
Request: GET /invoices?department=sales&_sort=date,desc
-
REST Layer:
- Parses request: collection=invoices, filter={department:sales}, sort=[date,desc]
- Extracts JWT from Authorization header
- Queries for “invoice” entity definition (matching the
invoices path)
- Parses authorization expressions from JWT
-
Domain Layer:
- Receives: entity=invoice, filter={department:sales}, authorization={}
- Combines search filter with authorization expressions
- Requests paginated query from Query Engine
-
Query Engine:
- Generates SQL:
SELECT * FROM invoices WHERE department='sales' AND <auth filter> ORDER BY date DESC LIMIT 20
- Executes against PostgreSQL
- Returns result set
-
Domain Layer:
- Converts database rows to domain objects
- Returns to REST Layer
-
REST Layer:
- Formats as HAL JSON with _links and _embedded
- Adds HAL-FORMS templates for available actions
- Adds pagination links (next, prev, first)
- Returns HTTP response
Throughout this process, the only application-specific information is the model definition—the code executing these
steps is identical across all applications.
Query Performance: Database queries are the primary performance bottleneck. The Query Engine generates efficient SQL
that leverages indexes. Authorization filters are pushed down to SQL WHERE clauses, minimizing data transfer.
Memory Usage: The application model is cached in memory, but it’s typically small (kilobytes to low megabytes).
Content is streamed instead of entirely loaded in memory.
Horizontal Scaling: Application Servers are stateless (except database connections). Adding replicas linearly
increases throughput. Database connection pooling prevents connection exhaustion.
Summary
The ContentGrid Application Server provides a sophisticated runtime that generates complete REST APIs from application
models. Its layered architecture cleanly separates concerns:
- REST Layer handles HTTP and hypermedia formatting
- Domain Layer enforces constraints
- Query Engine translates models to efficient SQL
- Content Store abstracts binary content storage
- Application Resolver provides model definitions
The configuration-driven approach, combined with dynamic query construction and OPA integration, enables rapid
development and consistent operations while maintaining performance and security.
Access Control
ContentGrid uses Attribute-Based Access Control (ABAC) to enforce fine-grained permissions on data access. Instead of
folder-based permissions or role hierarchies, policies evaluate entity attributes and user attributes to make
authorization decisions.
Why ABAC?
Traditional ECM systems often tie permissions to folder structures. If a document is in a folder, you need permissions
on that folder. This approach has significant limitations:
- Artificial Organization: Data must be organized into folder hierarchies even when that doesn’t match the domain
model
- Rigid Permissions: Changing access requirements often requires reorganizing folders
- No Multi-Dimensional Access: Hard to express “users can see invoices from their own department AND invoices they
created”
- Weak Scalability: Permission checks traverse folder hierarchies, becoming expensive with deep nesting
ContentGrid replaces this with ABAC, where permissions are based on attributes:
- User Attributes: Department, role, clearance level, location, etc.
- Entity Attributes: Status, owner, creation date, sensitivity level, etc.
- Request Context: Operation (read/write/delete)
Example Policy: “Users can view invoices where invoice.department == user.department OR invoice.status == 'published'”
This policy doesn’t care about folder structures, as it evaluates attributes directly. Changing the invoice’s attribute
change the permissions, without having to explicitly change permissions.
Architecture Components
Open Policy Agent (OPA)
ContentGrid uses Open Policy Agent, an open-source policy engine, to evaluate access
control policies. Policies are written in Rego, OPA’s policy language.
Why OPA?
- Industry Standard: Widely adopted in cloud-native environments
- Declarative Policies: Rego is declarative—you specify what should be allowed, not how to check it
- Partial Evaluation: OPA can return residual expressions when it cannot fully evaluate a policy (crucial for
efficiency)
- Decoupled from Application: Policies live in Rego files, not scattered through application code
Rego Policies
Rego policies define authorization rules. Here’s a simplified example:
package contentgrid.invoices
import future.keywords
default can_read_invoice := false
# Allow reading invoices from the user's own department
can_read_invoice if {
input.entity.department == input.auth.principal.department
}
# Allow reading published invoices regardless of department
can_read_invoice if {
input.entity.status == "published"
}
allow if {
input.request.method == "GET"
# Path /invoices
count(input.request.path) == 1
input.request.path[0] == "invoices"
can_read_invoice == true
}
When the application queries OPA, it provides:
- input.request.method: The HTTP method (GET, POST, PUT, DELETE)
- input.request.path: The path being accessed
- input.auth.principal: User attributes from the JWT
- input.entity: Entity object with attributes and relations (when checking a specific record)
OPA evaluates all allow rules. If any rule evaluates to true, access is granted.
Partial Evaluation
The key to ABAC efficiency in ContentGrid is OPA’s partial evaluation feature.
The Problem: When a user requests /invoices, the application needs to return only invoices the user can access.
Naively, you might:
- Load all invoices from the database
- For each invoice, query OPA: “Can this user access this invoice?”
- Filter out invoices where OPA says “no”
This is catastrophically inefficient. For 10,000 invoices, you’d make 10,000 OPA queries and transfer 10,000 invoices
from the database just to filter most of them out.
The Solution: Partial evaluation allows OPA to evaluate policies with incomplete information.
The application queries OPA: “The user wants to see invoices. What filter should I apply?”
OPA evaluates the policy but doesn’t know which invoices exist. It returns a residual expression, a simplified
policy that only contains the parts that could not be fully evaluated:
input.entity.department == "sales" OR input.entity.status == "published"
The application translates this residual expression to a SQL WHERE clause:
SELECT *
FROM invoices
WHERE department = 'sales'
OR status = 'published'
Now only authorized invoices leave the database. One OPA query, efficient SQL filtering and minimal data transfer.
Architecture
Centralized OPA
ContentGrid currently uses a shared OPA instance for all applications deployed in the Runtime Platform.
Components
- Gateway: Entry point for all requests, responsible for policy enforcement
- Centralized OPA: Shared OPA instance that evaluates policies for all applications
- Solon: Service that collects Rego policy files from all applications and bundles them for OPA
- Application: Serves its policy file and receives residual expressions from the Gateway
Request Flow
sequenceDiagram
autonumber
participant Client
participant Gateway
participant OPA as Centralized OPA
participant Solon
participant App as Application
participant DB as Database
Note over Solon, App: Policy Distribution Phase
App ->> Solon: Serve policy file (HTTP endpoint)
Solon ->> Solon: Bundle all policies
OPA ->> Solon: Download policy bundle
Note over Client, DB: Request Processing Phase
Client ->> Gateway: HTTP Request
Gateway ->> OPA: Authorization query
OPA -->> Gateway: Allow/Deny/Residual expression
Gateway ->> Gateway: Encode residual in JWT
Gateway ->> App: Forward request + JWT with residual
App ->> App: Decode residual from JWT
App ->> DB: Query with residual as filter
DB -->> App: Filtered results
App -->> Client: Response
How It Works:
- Policy Distribution: Solon collects Rego policies from all applications via HTTP endpoints and bundles them for
the centralized OPA
- Request Processing: When a request arrives, the Gateway queries the centralized OPA for authorization
- Residual Encoding: OPA returns a residual expression that the Gateway encodes in a JWT
- Application Processing: The application decodes the residual from the JWT and applies it as a SQL filter
- Data Retrieval: Only authorized data is retrieved from the database
Policy Evaluation in Practice
The examples below illustrate key concepts for readers. While payloads may differ from real implementations, the
underlying principles remain the same.
For collection queries (e.g., “What invoices can this user see?”), the application provides partial information:
{
"input": {
"method": "GET",
"entity": "invoices",
"user": {
"id": "user-123",
"department": "sales",
"role": "employee"
}
}
}
Notice entity is missing—the application doesn’t know which invoices exist yet. OPA performs partial evaluation
and returns a residual expression:
{
"result": {
"or": [
{
"eq": [
{
"ref": [
"entity",
"department"
]
},
"sales"
]
},
{
"eq": [
{
"ref": [
"entity",
"status"
]
},
"published"
]
}
]
}
}
The application translates this to SQL. The exact translation mechanism uses an internal query expression language
called Thunx that
maps to SQL WHERE clauses.
Policy Development Workflow
Policies are defined in the Management Platform alongside the data model:
- Define Permissions: In the Management Platform, you define permission rules as part of the application model
- Generate Rego: Scribe generates Rego policies from the permission definitions
- Bundle Policies: Rego policies are included in the application artifact
- Deploy: When the application starts, it makes them available for OPA (via Solon)
- Enforce: The Gateway queries OPA for every request, enforcing the policies
Policy changes follow the same deployment pipeline as model changes, and are deployed in tandem. Update permissions,
regenerate artifact, redeploy application.
Query-Level Filtering: Pushing authorization filters to SQL ensures only authorized data leaves the database. This
is dramatically more efficient than application-level filtering.
OPA Response Time: OPA policy evaluation is fast (microseconds to low milliseconds).
Database Indexes: Authorization filters often involve specific columns (e.g., department, owner). Proper indexes
on these columns ensure efficient query execution.
Security Benefits
Defense in Depth: Authorization happens at multiple layers:
- OPA evaluates policies before queries execute
- Database enforces SQL filters (data never loaded without authorization)
Principle of Least Privilege: ABAC naturally supports least-privilege models. Users only see data matching their
attributes, and policies can be as granular as needed.
Dynamic Permissions: Attribute changes immediately affect permissions—no cache invalidation needed. If a user
changes departments, their access automatically reflects the new department.
Summary
ContentGrid’s Attribute-Based Access Control provides fine-grained, efficient authorization:
- Flexible Policies: Based on entity and user attributes, not artificial folder hierarchies
- Efficient Enforcement: Partial evaluation pushes authorization filters to SQL
- Centralized Architecture: Shared OPA instance with policy collection via Solon
- Declarative Rego: Policies are readable, maintainable, and separate from application code
ABAC enables expressing complex authorization requirements naturally while maintaining query performance through
intelligent filter pushdown.
Data Storage & Encryption
ContentGrid separates structured metadata from binary content, storing each optimally. Metadata lives in PostgreSQL,
while content (documents, images, videos) is stored in S3-compatible object storage.
Storage Architecture
PostgreSQL for Metadata
Each ContentGrid application has its own PostgreSQL database. The schema is generated automatically from the application
model:
- Entities → Tables: Each entity in your model becomes a table
- Attributes → Columns: Entity attributes map to columns with appropriate types
- One-to-x Relations → Foreign Keys: Relations between entities use foreign key constraints
- Many-to-many Relations → Join Tables: Many-to-many relations between entities use join tables
This direct mapping enables leveraging PostgreSQL’s full capabilities:
- Indexes: Standard B-tree and other indexes for efficient queries
- Constraints: Check constraints, unique constraints, and foreign keys enforce data integrity
- Transactions: Full ACID guarantees for all operations
Migration Management: Flyway manages schema migrations. When the model changes, Scribe generates SQL migration
scripts that execute automatically on deployment.
S3-Compatible Storage for Content
Binary content is stored in S3-compatible object storage (AWS S3, MinIO, Ceph, etc.). Each application has dedicated
buckets. This ensures:
- Isolation: Applications cannot access each other’s buckets
- Scalability: Object storage scales independently of compute and database
Content References: The database stores only references (unique identifiers) to content, not the content itself.
When the application needs content, it retrieves it from S3 using the reference.
Immutability: Content objects are never overwritten. Updating content creates a new object with a new reference. The
old content remains until explicitly deleted, enabling:
- Safe Backups: Backup S3 buckets without worrying about in-flight modifications
- Recoverability: Old content versions can be retained for recovery or audit
- Atomic Updates: Database transactions can commit content reference changes without coordinating with S3
Content Encryption
ContentGrid provides transparent encryption at rest for content stored in S3. Encryption and decryption happen
automatically—applications and users don’t need to manage keys or modify their workflows.
Content is encrypted with AES-128 in CTR mode.
Encryption Goals
The encryption architecture is designed to meet several requirements:
- Strong Security: Content encrypted using standard cryptographic primitives
- Key Protection: Encryption keys managed securely with database access controls
- Enable After Deployment: Encryption can be enabled for applications with existing unencrypted content
- Key Rotation: Encryption keys can be rotated on an individual basis without re-encrypting all content
- Application Isolation: Each application uses different encryption keys
- Range Request Support: Clients can request parts of files (HTTP Range) without decrypting the entire file
Data Encryption Keys
ContentGrid encrypts content at rest using Data Encryption Keys (DEKs). Each content object gets its own unique
symmetric key, ensuring strong isolation between content objects.
flowchart LR
DEK[Data Encryption Key<br/>Per Content Object]
Content[Content<br/>Binary Data]
DEK -->|Encrypts| Content
How It Works:
- Unique Keys: Each content object has its own 128-bit AES symmetric key (DEK)
- Strong Isolation: Compromising one DEK does not affect other content objects
- Local Encryption: Content encryption and decryption happen in the application using the DEK (no external service
calls)
- Database Storage: DEKs are stored in the database alongside content metadata
Note: Future enhancements will add Key Encryption Keys (KEKs) stored in Hardware Security Modules (HSMs) or cloud
Key Management Services (KMS) to provide an additional layer of protection for DEKs. See the Future Enhancements section
below.
Encryption Process
When content is uploaded:
flowchart TD
Content[Content Upload]
GenDEK[Generate DEK]
EncryptContent[Encrypt Content with DEK]
StoreContent[Store Encrypted Content in S3]
StoreKey[Store DEK in Database]
Content --> GenDEK
GenDEK --> EncryptContent
EncryptContent --> StoreContent
GenDEK --> StoreKey
- Generate DEK: Application generates a random symmetric key (AES-128)
- Encrypt Content: Application encrypts content using the DEK
- Store Content: Encrypted content is stored in S3
- Store DEK: DEK is stored in the database alongside content metadata
Decryption Process
When content is downloaded:
flowchart TD
FetchKey[Fetch DEK from Database]
FetchContent[Fetch Encrypted Content from S3]
DecryptContent[Decrypt Content using DEK]
Return[Return Content to Client]
FetchKey --> DecryptContent
FetchContent --> DecryptContent
DecryptContent --> Return
- Fetch DEK: Application retrieves DEK from database
- Fetch Encrypted Content: Application retrieves encrypted content from S3
- Decrypt Content: Application decrypts content locally using the DEK
- Return to Client: Decrypted content is sent to the client
Key Storage and Management
Data Encryption Keys (DEKs):
- Stored in the database in a dedicated table
- Each DEK is a 128-bit AES symmetric key
- Each content object has its own unique DEK
- Access controlled through database permissions and connection authentication
- DEKs are associated with their corresponding content references
Key Rotation
To rotate a DEK (e.g., upgrading encryption algorithm or if a DEK is compromised):
- Fetch and decrypt the old content using the old DEK
- Generate a new DEK
- Encrypt the content with the new DEK
- Store the new encrypted content in S3
- Update the database with the new DEK and content reference
DEK rotation is performed on a per-object basis and is typically only needed when upgrading cryptographic algorithms or
responding to a security incident.
Range Request Support
HTTP Range requests allow clients to request specific byte ranges of a file (e.g., “bytes 1000-2000”). This is essential
for:
- Video Seeking: Jump to a timestamp without downloading the entire video
- Large PDFs: Load only visible pages
- Parallel Downloads: Split large files across multiple connections
Encryption Challenge: Not all encryption modes support decrypting arbitrary byte ranges—some require decrypting from
the beginning.
Solution: ContentGrid uses block cipher modes that support random access (e.g., AES-CTR). The encryption
implementation:
- Calculates which encrypted blocks contain the requested byte range, and adjust the counter for that
- Fetch only the exact amount of requested data from S3 (using S3 range requests)
- Pads the downloaded data to align to the correct block size
- Decrypts the blocks
- Trims to the exact requested range
- Returns to the client
No additional data needs to be fetched. There is no need to decrypt the entire file, only a small amount of extra
decryption is performed to align with block boundaries.
Security Considerations
Data Confidentiality:
- Content is encrypted using strong symmetric algorithms (AES-128)
- Each content object has a unique encryption key (DEK)
- DEKs are stored in the database with access controlled through database permissions
- Encrypted content in S3 is protected from unauthorized access at the storage layer
- Database connections are authenticated and encrypted
Data Integrity:
- Immutability prevents accidental overwrites
Access Control:
- Database access controls restrict which services and users can access DEKs
- Application servers authenticate to the database using service credentials
Defense in Depth:
- Content encrypted at rest (this architecture)
- Data encrypted in transit (TLS)
- Access control enforced at query level (ABAC)
- Database connections authenticated and encrypted
- Database encryption at rest can provide an additional protection layer
Encryption Overhead:
Modern CPUs have AES hardware acceleration (AES-NI), making symmetric encryption very fast. The overhead for
encrypting/decrypting content is minimal—typically less than 100 MB/s of throughput impact.
Range Requests:
Range requests with encryption decrypt slightly more data (to align with block boundaries), but the overhead is small.
For a 1 KB range request, you might decrypt 1-2 KB. This is negligible compared to fetching and decrypting the entire
file.
Future Enhancements
Envelope Encryption with Key Encryption Keys
ContentGrid’s encryption architecture is designed to support envelope encryption (also called two-level encryption),
a standard technique used by AWS KMS, Google Cloud KMS, HashiCorp Vault, and other enterprise systems.
In envelope encryption, a Key Encryption Key (KEK) stored in a Hardware Security Module (HSM) or cloud Key
Management Service (KMS) is used to encrypt the DEKs before storing them in the database. This provides additional
security benefits:
- Enhanced Key Protection: DEKs are encrypted before storage, with KEKs never leaving the HSM/KMS
- Audit Logging: All key operations logged in the KMS for compliance and security monitoring
- Efficient Key Rotation: Rotating the KEK only requires re-encrypting small DEKs, not the entire content
The implementation will be backward compatible with existing encrypted content. When enabled, new content will use
envelope encryption, and existing DEKs can be migrated in the background without service interruption.
Summary
ContentGrid’s storage architecture separates structured metadata (PostgreSQL) from binary content (S3), optimizing each
for its purpose:
- PostgreSQL: Provides ACID transactions, relational integrity, and efficient querying for metadata
- S3: Provides scalable, durable storage for large binary content
- Content Encryption: Protects content at rest using unique Data Encryption Keys (DEKs) for each object
- Range Request Support: Enables efficient access to large files without sacrificing encryption
Encryption is transparent to applications and users—no code changes or workflow modifications required. The architecture
balances strong security with operational simplicity and performance.
Deployment Pipeline
ContentGrid applications are deployed through a fully automated pipeline that transforms high-level application models
into running services. The deployment process is managed by the Management Platform and executed in the Runtime
Platform.
Deployment Overview
The deployment pipeline bridges two environments:
Management Platform: Where you define application models, configure settings, and
initiate deployments. This is the control plane.
Runtime Platform: Where applications actually run, serving APIs to end-users. This is the
data plane.
The pipeline ensures consistent, repeatable deployments with zero manual intervention required for the deployment
mechanics.
The deployment pipeline is orchestrated by three Management Platform components:
- Architect: Source of truth for application models (entities, permissions, configurations)
- Scribe: Transforms models into deployable artifacts (migrations, policies, OpenAPI specs)
- Captain: Orchestrates infrastructure provisioning and Kubernetes deployment
For complete details on how these components work together, their responsibilities, and operational characteristics,
see Management Platform.
Deployment Architecture
flowchart TD
S3Artifact[S3 Artifact Bucket]
subgraph Management Platform
Arch[Architect<br/>Model Definition]
Scr[Scribe<br/>Artifact Generator]
Cap[Captain<br/>Orchestrator]
Arch --> Scr
Cap -->|Request Artifact| Scr
Scr -->|Generated ZIP| Cap
end
subgraph Runtime Platform
K8s[Kubernetes API]
AS[Application Server]
end
CB[(S3 Content Bucket)]
DB[(PostgreSQL Database)]
Cap -->|Upload Artifact| S3Artifact
Cap -->|Create/Update Resources| K8s
K8s -->|Deploy| AS
AS -->|Fetch Artifact| S3Artifact
AS -->|Metadata| DB
AS -->|Content| CB
AS -->|Serve API| Users[Clients]
Deployment Lifecycle
The end-to-end deployment process follows these steps:
sequenceDiagram
autonumber
participant Architect
participant Scribe
participant Captain
participant S3 as S3 Artifact Storage
participant CB as S3 Content Bucket
participant K8s as Kubernetes
participant App as Application Server
Captain ->> Scribe: Request artifact for application X
Scribe ->> Architect: Fetch application model
Architect -->> Scribe: Model JSON
Scribe -->> Captain: ZIP (model.json, migrations/, policies.rego)
Captain ->> S3: Upload artifact ZIP
Captain ->> K8s: Create/Update Deployment/Service/ConfigMaps/Secrets
K8s -->> App: Start application server pod
App ->> S3: Fetch artifact ZIP
App ->> App: Unpack artifact (model, migrations, policies)
App ->> App: Run DB migrations, Serve Rego files
App -->> Users: Serve API for application X
Step-by-Step Breakdown
1. Artifact Request
Captain initiates deployment by requesting Scribe to generate an artifact for a specific application. This happens when
the user triggers a deployment via the Console.
2. Model Retrieval
Scribe fetches the current application model from Architect. The model includes all entity definitions, permissions,
constraints, and configuration.
3. Artifact Generation
Scribe processes the model:
- Migration Generation: Compares the current model to the previous version (if it exists) and generates SQL DDL
statements to migrate the schema.
- Policy Compilation: Converts permission rules (defined in a user-friendly format in Architect) into Rego policies
that OPA understands.
- Model Serialization: Serializes the model to JSON in the format expected by the Application Server.
- Manifest File: Containing information about the artifact.
- Packaging: Bundles everything into a ZIP with a consistent structure.
4. Artifact Upload
Captain uploads the artifact to a shared artifact storage bucket (S3). The artifact is stored with a path including the
application ID and version, enabling:
- Rollback: Previous artifacts remain available for reverting deployments
- Audit: Complete history of what was deployed when
- Distribution: All Application Server replicas fetch from the same location
5. Infrastructure Provisioning
For new applications, Captain provisions:
- PostgreSQL Database: Creates a new database with credentials stored in a Kubernetes Secret
- S3 Content Bucket: Creates a bucket with appropriate access policies
- Keycloak Realm: Creates or configures a realm for authentication, sets up OIDC clients
6. Kubernetes Resource Creation
Captain creates or updates Kubernetes resources:
Deployment:
- Specifies the Application Server container image (same for all applications)
- Configures environment variables pointing to the artifact location
- Sets resource limits (CPU, memory)
- Configures health check endpoints
Service:
- Exposes the Application Server pods to the Gateway
- Labels with application ID and service type for discovery
ConfigMaps:
- Gateway configuration (domains, CORS settings)
- Webapp configuration (OIDC settings)
Secrets:
- Database credentials
- S3 bucket access keys
- Gateway authentication credentials for Keycloak
7. Application Server Startup
Kubernetes starts the Application Server pod(s):
Init Phase:
- Application Server downloads the artifact from S3
- Unpacks the artifact to the local filesystem
- Runs Flyway migrations against the database
Runtime Phase:
- Loads the application model into memory
- Starts the HTTP server
- Registers health check endpoints
- Begins serving API requests
The init phase ensures the database schema matches the model before serving traffic. If migrations fail, the pod won’t
become ready, and Kubernetes won’t route traffic to it.
8. Service Readiness
Once health checks pass, Kubernetes marks the pod as ready and begins routing traffic.
Artifact Structure
A typical artifact has this structure:
application-artifact.zip
├── manifest.json # Metadata
├── application-model.json # Model definition
├── policies.rego # Access control policies
└── migrations/ # Flyway migrations
├── V1__initial_schema.sql
├── V2__add_invoices.sql
└── V3__add_status_column.sql
manifest.json:
{
"organizationId": "org-123",
"organizationName": "Acme Corp",
"projectId": "proj-456",
"projectName": "Invoice Management",
"version": "1.2.3",
"changeset": "abc123def456",
"timestamp": "2026-01-28T10:30:00Z",
"scribeVersion": "2.1.0"
}
application-model.json:
The model JSON follows a published schema and includes complete entity definitions, attributes, relations, and
constraints. This is the single source of truth for the Application Server’s runtime behavior.
policies.rego:
Rego policies define authorization rules. These are loaded by OPA and queried by the Application Server for every
request.
migrations:
Flyway migration scripts are numbered sequentially. Flyway tracks which migrations have executed in a special database
table (flyway_schema_history), ensuring each migration runs exactly once.
Update Strategy
When updating an existing application:
Update:
- Captain deletes the old Deployment
- Captain creates a new Deployment with the updated artifact
- Kubernetes starts new pods with the updated artifact
- New pods run migrations (only new migrations execute)
- Health checks pass, new pods become ready
Database Migrations:
- Flyway migrations are forward-only—there are no automatic rollbacks
- Migrations should be backward-compatible when possible (e.g., adding nullable columns)
- Breaking changes require coordination between schema and code deployments
Rollback:
- Captain can redeploy a previous artifact version
- Even if the database migrations ran, the old schema is still available, and the old version of the model will still run
Deployment Observability
Health Checks:
Application Server exposes health endpoints:
/actuator/health/liveness: Is the process alive?
/actuator/health/readiness: Is the application ready to serve traffic?
Kubernetes uses these for liveness probes (restart if unhealthy) and readiness probes (route traffic only when ready).
Logs:
All components log to stdout/stderr, collected by Kubernetes:
- Captain logs deployment activities and decisions
- Application Server logs startup, migrations, and request handling
- OPA logs policy evaluation (if enabled)
Centralized logging (e.g., Elasticsearch, Loki) aggregates logs for analysis.
Metrics:
Application Server exposes Prometheus metrics:
- Request counts and latencies
- Database query performance
- OPA policy evaluation times
- Content storage access patterns
Monitoring dashboards provide visibility into application behavior.
Deployment Security
Principle of Least Privilege:
- Captain has credentials to provision infrastructure but not to access application data
- Application Server has credentials to access its own database and bucket, but not others
- Keycloak realms isolate authentication between organizations
Secret Management:
- Kubernetes Secrets store sensitive credentials
- Secrets are injected as mounted files
- Secrets are not included in artifacts or logged
Artifact Integrity:
- Artifacts are versioned and immutable once created
- Only Captain can upload artifacts to the shared bucket
Summary
The ContentGrid deployment pipeline provides a fully automated path from application model to running service:
- Artifact Generation: Scribe transforms models into deployable artifacts with migrations and policies
- Orchestration: Captain provisions infrastructure and coordinates Kubernetes deployments
- Observability: Health checks, logs, and metrics provide visibility into deployment status and application health
The pipeline’s design enables rapid iteration—model changes deploy quickly and consistently. The use of a shared
container image simplifies operations while maintaining application isolation through configuration and infrastructure
separation.