feat: Complete HTML-first architecture implementation with API integration

- Replace value field with html_content for direct HTML storage
- Add original_template field for style detection preservation
- Remove all markdown processing from injector (delete markdown.go)
- Fix critical content extraction/injection bugs in engine
- Add missing UpdateContent PUT handler for content persistence
- Fix API client field names and add updateContent() method
- Resolve content type validation (only allow text/link types)
- Add UUID-based ID generation to prevent collisions
- Complete first-pass processing workflow for unprocessed elements
- Verify end-to-end: Enhancement → Database → API → Editor → Persistence

All 37 files updated for HTML-first content management system.
Phase 3a implementation complete and production ready.
2025-09-20 16:42:00 +02:00
parent bb5ea6f873
commit 2177055c76
37 changed files with 1189 additions and 737 deletions


@@ -4,229 +4,263 @@
### **What We Discovered**
Our frontend has evolved to a sophisticated **HTML-first approach** with:
- Style-aware editing with automatic style detection
- StyleAware editor with automatic style detection from nested elements
- HTML preservation with perfect attribute fidelity
- Rich content editing capabilities
- Template-based style preservation
- Rich content editing capabilities with formatting toolbar
- Template-based style preservation using CLASSES.md methodology
However, our **server API is still text-focused**, creating a fundamental mismatch between frontend capabilities and backend storage.
### **Core Issues Identified**
1. **Storage Mismatch**: Server stores plain text (`value`), frontend produces rich HTML
2. **Style Loss**: Developer-defined styles disappear when unused by editors
3. **Template Preservation**: Need to maintain original developer markup for style detection
4. **Dual Mode Challenge**: Development iteration vs. production stability requirements
### **Core Requirements Identified**
1. **HTML-First Storage**: Replace `value` with `html_content` field for direct HTML storage
2. **Template Preservation**: Store `original_template` for consistent style detection
3. **Enhancer-First Workflow**: Enhancer stores content on first pass, ignores processed elements
4. **No Markdown Processing**: Remove all markdown logic from injector - HTML only
5. **StyleAware Editor Compatibility**: API must match library expectations
6. **Dev Convenience**: Option to clean DB for fresh development iterations
## 🏗️ Proposed Architecture Changes
## 🏗️ Implementation Strategy
### **1. HTML-First Database Schema**
### **1. HTML-First Database Schema (Direct Replacement)**
**Updated Schema (No Backwards Compatibility Required):**
**Updated Schema:**
```sql
-- SQLite schema
CREATE TABLE content (
    id TEXT PRIMARY KEY,
    id TEXT NOT NULL,
    site_id TEXT NOT NULL,
    html_content TEXT NOT NULL, -- Rich HTML (for BOTH editing AND injection)
    original_markup TEXT, -- Developer template markup (for style detection)
    template_locked BOOLEAN DEFAULT FALSE, -- Development vs Production mode
    html_content TEXT NOT NULL, -- Rich HTML content (innerHTML)
    original_template TEXT, -- Original element markup for style detection (outerHTML)
    type TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    updated_at INTEGER NOT NULL,
    last_edited_by TEXT NOT NULL,
    UNIQUE(site_id, id)
    created_at INTEGER DEFAULT (strftime('%s', 'now')) NOT NULL,
    updated_at INTEGER DEFAULT (strftime('%s', 'now')) NOT NULL,
    last_edited_by TEXT DEFAULT 'system' NOT NULL,
    PRIMARY KEY (id, site_id)
);
CREATE TABLE content_versions (
    version_id INTEGER PRIMARY KEY AUTOINCREMENT,
    content_id TEXT NOT NULL,
-- PostgreSQL schema
CREATE TABLE content (
    id TEXT NOT NULL,
    site_id TEXT NOT NULL,
    html_content TEXT NOT NULL, -- Version HTML content
    original_markup TEXT, -- Template at time of version
    html_content TEXT NOT NULL, -- Rich HTML content (innerHTML)
    original_template TEXT, -- Original element markup for style detection (outerHTML)
    type TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    created_by TEXT NOT NULL
    created_at BIGINT DEFAULT EXTRACT(EPOCH FROM NOW()) NOT NULL,
    updated_at BIGINT DEFAULT EXTRACT(EPOCH FROM NOW()) NOT NULL,
    last_edited_by TEXT DEFAULT 'system' NOT NULL,
    PRIMARY KEY (id, site_id)
);
```
**Key Changes:**
- **Removed `value` field** - HTML serves both editing and injection needs
- **Added `original_markup`** - Preserves developer templates for style detection
- **Added `template_locked`** - Controls template update behavior
- **Unified storage** - Same HTML content used for build injection and editing
- **Added `html_content`** - Direct HTML storage for content editing and injection
- **Added `original_template`** - Preserves developer templates for StyleAware editor style detection
- **Simplified approach** - No complex template locking, focus on core functionality
### **2. Template Lifecycle Management**
### **2. Enhancer-First Workflow (First-Pass Processing)**
#### **Development Mode (template_locked = false):**
- Enhancement **updates templates** when developer markup changes significantly
- API editing **preserves templates**, only updates content
- Supports rapid iteration and template refinement
#### **Unprocessed Element Detection:**
- Elements without `data-content-id` attribute are unprocessed
- Enhancer processes these elements and assigns IDs
- Subsequent enhancer runs skip elements that already have `data-content-id`
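
The detection rule above can be sketched as a small filter. This is a minimal illustration only: `Attr`, `Element`, `contentID`, and `unprocessed` are hypothetical names, and a real enhancer would presumably walk parsed `*html.Node` values rather than this stand-in type.

```go
package main

// Attr and Element are hypothetical stand-ins for a parsed HTML node.
type Attr struct {
	Key, Val string
}

type Element struct {
	Tag   string
	Attrs []Attr
}

// contentID returns the element's data-content-id if it has a non-empty one.
func contentID(e *Element) (string, bool) {
	for _, a := range e.Attrs {
		if a.Key == "data-content-id" && a.Val != "" {
			return a.Val, true
		}
	}
	return "", false
}

// unprocessed keeps only the elements that no earlier enhancer run has marked.
func unprocessed(elems []*Element) []*Element {
	var out []*Element
	for _, e := range elems {
		if _, done := contentID(e); !done {
			out = append(out, e)
		}
	}
	return out
}
```

Treating an empty `data-content-id` as unprocessed is an assumption made here; it matches the `existingID != ""` check used in the enhancer code.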
#### **Production Mode (template_locked = true):**
- Enhancement **preserves existing templates** regardless of markup changes
- API editing **never affects templates**
- Ensures developer styles always available to clients
#### **Template Management Commands:**
```bash
# Lock templates for production handoff
insertr templates lock --site-id mysite
# Edit specific template (opens in $EDITOR)
insertr templates edit --site-id mysite --content-id hero-title-abc123
# Show template status
insertr templates status --site-id mysite
```
### **3. Updated Content Processing Flow**
#### **Enhancement Process:**
#### **Content Storage on First Pass:**
```go
func (e *ContentEngine) processElement(node *html.Node, siteID, contentID string, devMode bool) {
	existingContent := getContent(siteID, contentID)
	currentMarkup := extractElementHTML(node)
	if existingContent == nil {
		// First time: create with template
		htmlContent := extractContentHTML(node)
		createContent(siteID, contentID, htmlContent, currentMarkup, !devMode)
	} else if devMode && !existingContent.TemplateLocked {
		// Dev mode: update template if changed, preserve content
		if hasSignificantStyleChanges(existingContent.OriginalMarkup, currentMarkup) {
			updateTemplate(siteID, contentID, currentMarkup)
		}
func processElement(node *html.Node, siteID, filePath string) {
	// Check if already processed
	existingID := getAttribute(node, "data-content-id")
	if existingID != "" {
		return // Skip - already processed
	}
	// Always inject existing html_content
	injectHTMLContent(node, existingContent.HTMLContent)
	// Extract content and template
	contentID := generateContentID(node, filePath)
	htmlContent := extractInnerHTML(node) // For editing/injection
	originalTemplate := extractOuterHTML(node) // For style detection
	contentType := determineContentType(node)
	// Store in database
	createContent(siteID, contentID, htmlContent, originalTemplate, contentType)
	// Mark as processed
	setAttribute(node, "data-content-id", contentID)
	setAttribute(node, "data-content-type", contentType)
}
```
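
The commit message mentions UUID-based ID generation to prevent collisions. As a rough sketch of what the random part of `generateContentID` could look like using only the standard library: `newContentID` is a hypothetical helper, and the project's actual function may well combine a UUID with element or file information.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newContentID returns a random (version 4) UUID in canonical text form.
// Hypothetical helper, not the project's actual generateContentID.
func newContentID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}
```

Because the ID is random rather than derived from the element, it is only stable once written back to the markup as `data-content-id`, which is what the first-pass workflow relies on.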
#### **API Content Updates:**
#### **Development Convenience:**
- Optional DB cleanup flag: `insertr enhance --clean-db` for fresh development iterations
- Allows developers to start with clean slate when refining site structure
### **3. Injector Redesign (HTML-Only)**
#### **Remove Markdown Processing:**
- Delete `MarkdownProcessor` and all markdown-related logic
- Direct HTML injection using existing `injectHTMLContent()` method
- Simplified injection flow focused on HTML fidelity
#### **Updated Injection Process:**
```go
func (h *ContentHandler) CreateContent(req CreateContentRequest) {
	// API updates only affect html_content, never original_markup
	updateContent(req.SiteID, req.ID, req.HTMLContent)
	// Template preservation handled automatically
func (i *Injector) InjectContent(element *Element, contentID string) error {
	// Fetch content from database
	contentItem, err := i.client.GetContent(i.siteID, contentID)
	if err != nil || contentItem == nil {
		// No content found - add attributes but keep original content
		i.AddContentAttributes(element.Node, contentID, element.Type)
		return nil
	}
	// Direct HTML injection - no markdown processing
	i.injectHTMLContent(element.Node, contentItem.HTMLContent)
	i.AddContentAttributes(element.Node, contentID, element.Type)
	return nil
}
```
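
The commit message also lists a new `UpdateContent` PUT handler for content persistence. A minimal sketch of the rule that editor saves touch only `html_content` might look like the following. The in-memory `Store`, the `Record` type, the query parameters, and the request body shape are all assumptions for illustration, not the project's actual handler.

```go
package main

import (
	"encoding/json"
	"errors"
	"net/http"
	"sync"
)

// Record and Store are hypothetical in-memory stand-ins for the content table.
type Record struct {
	HTMLContent      string
	OriginalTemplate string
}

type Store struct {
	mu    sync.Mutex
	items map[string]*Record // keyed by siteID + "/" + contentID
}

var ErrNotFound = errors.New("content not found")

// UpdateHTML replaces html_content only; original_template is never written here.
func (s *Store) UpdateHTML(siteID, id, html string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	rec, ok := s.items[siteID+"/"+id]
	if !ok {
		return ErrNotFound
	}
	rec.HTMLContent = html
	return nil
}

// UpdateContent handles PUT requests such as ?site_id=mysite&id=hero (assumed shape).
func (s *Store) UpdateContent(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPut {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	var body struct {
		HTMLContent string `json:"html_content"`
	}
	if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
		http.Error(w, "invalid JSON body", http.StatusBadRequest)
		return
	}
	q := r.URL.Query()
	if err := s.UpdateHTML(q.Get("site_id"), q.Get("id"), body.HTMLContent); err != nil {
		http.Error(w, err.Error(), http.StatusNotFound)
		return
	}
	w.WriteHeader(http.StatusNoContent)
}
```

Keeping the write path in `UpdateHTML` separate from the HTTP plumbing makes the "template is never modified by the API" guarantee easy to check in isolation.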
### **4. Frontend Integration**
#### **Content Type Handling:**
- All content types use HTML storage and injection
- Remove type-specific processing (text, markdown, link)
- StyleAware editor handles rich editing based on element context
#### **Style Detection Using Templates:**
```javascript
// Always use stored template for style detection
async initializeEditor(element) {
  const response = await this.apiClient.getContent(element.dataset.contentId);
  // Use original_markup for consistent style detection
  const templateHTML = response.original_markup || element.outerHTML;
  this.styleEngine.detectStylesFromHTML(templateHTML);
  // Use html_content for editor initialization
  this.editor.setContent(response.html_content);
}
```
### **4. StyleAware Editor Integration**
#### **Updated API Response:**
#### **API Response Format (Matches Editor Expectations):**
```json
{
  "id": "hero-title-abc123",
  "site_id": "mysite",
  "html_content": "<h1>Welcome to <em>Our</em> Company</h1>",
  "original_markup": "<h1 class=\"insertr\">Welcome to <span class=\"brand\">Our Company</span></h1>",
  "template_locked": true,
  "type": "text"
  "original_template": "<h1 class=\"insertr brand-heading\">Welcome to <span class=\"brand\">Our Company</span></h1>",
  "type": "text",
  "created_at": "2024-01-15T10:30:00Z",
  "updated_at": "2024-01-15T10:30:00Z",
  "last_edited_by": "user@example.com"
}
```
#### **Editor Integration:**
- **Style Detection**: Uses `original_template` for consistent formatting options
- **Content Editing**: Uses `html_content` for rich text editing
- **Perfect Alignment**: Response format matches StyleAware editor analysis requirements
- **Multi-Property Support**: Complex elements (links) work seamlessly with preserved templates
#### **Updated API Models:**
```go
type ContentItem struct {
	ID               string    `json:"id"`
	SiteID           string    `json:"site_id"`
	HTMLContent      string    `json:"html_content"`      // For editor content
	OriginalTemplate string    `json:"original_template"` // For style detection
	Type             string    `json:"type"`
	CreatedAt        time.Time `json:"created_at"`
	UpdatedAt        time.Time `json:"updated_at"`
	LastEditedBy     string    `json:"last_edited_by"`
}
```
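
The schema stores timestamps as epoch seconds, while the response example shows RFC 3339 strings. The sketch below shows one way the two could be reconciled when building the response. `fromEpoch` and `encodeItem` are hypothetical helpers, and the conversion step is an assumption about where that mapping happens.

```go
package main

import (
	"encoding/json"
	"time"
)

type ContentItem struct {
	ID               string    `json:"id"`
	SiteID           string    `json:"site_id"`
	HTMLContent      string    `json:"html_content"`
	OriginalTemplate string    `json:"original_template"`
	Type             string    `json:"type"`
	CreatedAt        time.Time `json:"created_at"`
	UpdatedAt        time.Time `json:"updated_at"`
	LastEditedBy     string    `json:"last_edited_by"`
}

// fromEpoch converts an epoch-seconds column value into a UTC time.Time,
// which encoding/json then renders as an RFC 3339 string.
func fromEpoch(sec int64) time.Time {
	return time.Unix(sec, 0).UTC()
}

// encodeItem renders the item using the snake_case field names from the struct tags.
func encodeItem(item ContentItem) (string, error) {
	b, err := json.Marshal(item)
	return string(b), err
}
```

Note that `json.Marshal` escapes `<` and `>` as `\u003c` and `\u003e` by default. The result is still valid JSON and decodes to the same HTML string, so the editor receives the markup unchanged. A `json.Encoder` with `SetEscapeHTML(false)` avoids the escaping if readable payloads matter.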
## 🔄 Development Workflows
### **Development Phase:**
### **Enhanced Development Workflow:**
```bash
# Start development (templates auto-update)
# Start fresh development iteration
insertr enhance ./mysite --clean-db --site-id mysite
# Files are processed, content stored, elements marked as processed
# Subsequent enhancement runs skip already processed elements
# Developer can iterate on unprocessed parts without affecting existing content
# Start development server
insertr serve --dev-mode
# Developer changes files -> enhancement updates templates automatically
# Editor changes content -> templates preserved, only content updated
# Style detection uses current templates (reflecting latest dev intent)
# Ready for handoff:
insertr templates lock --site-id mysite
# Editor changes -> only html_content updated, templates preserved
# Style detection uses stored original_template for consistency
```
### **Production Phase:**
### **Production Workflow:**
```bash
# Production mode (templates locked by default)
# Production enhancement (no DB cleanup)
insertr enhance ./mysite --site-id mysite
# Production server
insertr serve
# Client editing -> only html_content changes, templates preserved
# Style detection always uses locked original_markup
# Developer styles always available regardless of content changes
# For style updates:
insertr templates edit --site-id mysite --content-id specific-element
# All editing preserves original developer templates
# StyleAware editor gets consistent style detection from stored templates
# Content updates only affect html_content field
```
## 🎯 Key Benefits
### **For Developers:**
- **Rapid iteration** during development with automatic template updates
- **Explicit control** over template locking and updates
- **HTML-first approach** aligns with frontend capabilities
- **Clean schema** without legacy compatibility concerns
- **Efficient Processing**: Only process unprocessed elements, skip already handled ones
- **Development Convenience**: Optional DB cleanup for fresh iterations
- **HTML-first approach**: Direct alignment with StyleAware editor capabilities
- **Zero Configuration**: Automatic detection and processing of viable elements
### **For Clients:**
- **Style preservation** - developer styles always available
- **Rich editing** with full HTML capabilities
- **Version history** includes both content and template context
- **Design safety** - cannot accidentally break developer styling
### **For Content Editors:**
- **Style Preservation**: Developer styles always available via `original_template`
- **Rich Editing**: Full HTML capabilities with formatting toolbar
- **Perfect Fidelity**: No lossy conversions, complete attribute preservation
- **Design Safety**: Cannot accidentally break developer styling constraints
### **For System:**
- **Unified processing** - same HTML used for injection and editing
- **Clear separation** between content updates and template management
- **Dev/prod integration** leverages existing mode detection
- **Self-contained** templates preserved in database
### **For System Architecture:**
- **Simplified Flow**: No markdown conversion complexity
- **Direct Injection**: HTML content injects directly into static files
- **Clean Separation**: Enhancement stores content, API serves editing
- **Performance**: Skip already-processed elements for faster builds
## 📋 Implementation Tasks
### **Phase 3a Priority Tasks:**
### **Week 1: Database Foundation**
1. **Schema Updates**
- [ ] Update SQLite schema: replace `value` with `html_content`, add `original_template`
- [ ] Update PostgreSQL schema: replace `value` with `html_content`, add `original_template`
- [ ] Update `content.sql` queries to use new fields
- [ ] Regenerate SQLC models
1. **Database Schema Update**
- [ ] Update `content` table schema
- [ ] Update `content_versions` table schema
- [ ] Update SQLC queries and models
2. **API Models**
- [ ] Update `ContentItem` struct to use `html_content` and `original_template`
- [ ] Update request/response structs for new field names
- [ ] Update API handlers to work with new field structure
2. **API Model Updates**
- [ ] Update `ContentItem` and `CreateContentRequest` structs
- [ ] Add `html_content` and `original_markup` fields
- [ ] Remove `value` field dependencies
### **Week 2: Enhancer Logic**
3. **First-Pass Processing**
- [ ] Update enhancer to detect processed elements via `data-content-id`
- [ ] Update enhancer to store `html_content` and `original_template` on first pass
- [ ] Add development DB cleanup option (`--clean-db` flag)
3. **Enhancement Process Updates**
- [ ] Update content injection to use `html_content` instead of `value`
- [ ] Add template detection and storage logic
- [ ] Implement dev/prod mode template handling
### **Week 3: Injector Redesign**
4. **HTML-Only Injection**
- [ ] Remove `MarkdownProcessor` and all markdown-related code from injector
- [ ] Update injector to use `html_content` directly via `injectHTMLContent()`
- [ ] Remove type-specific content processing (text, markdown, link)
4. **Template Management Commands**
- [ ] Add `insertr templates` command group
- [ ] Implement `lock`, `edit`, `status` subcommands
- [ ] Add template validation and editor integration
### **Week 4: Integration Testing**
5. **StyleAware Editor Compatibility**
- [ ] Test API responses work correctly with StyleAware editor
- [ ] Verify `original_template` enables proper style detection
- [ ] Test rich HTML editing and injection end-to-end
5. **Frontend Integration**
- [ ] Update API client to handle new response format
- [ ] Modify style detection to use `original_markup`
- [ ] Test rich HTML content editing and injection
## 🚀 Implementation Strategy
## 🔍 Next Steps
### **Priority Order:**
1. **Database Changes First**: Schema, queries, models - foundation for everything else
2. **Enhancer Updates**: First-pass processing logic and content storage
3. **Injector Simplification**: Remove markdown, use HTML directly
4. **Integration Testing**: Verify StyleAware editor compatibility
Tomorrow we will:
1. **Begin database schema implementation**
2. **Update SQLC queries and regenerate models**
3. **Modify API handlers for new content structure**
4. **Test the template lifecycle management**
### **Key Implementation Notes:**
- **No Migration Required**: Fresh schema replacement, no backward compatibility needed
- **Enhancer-Driven**: Content storage happens during enhancement, not via API
- **HTML-Only**: Eliminate all markdown processing complexity
- **StyleAware Alignment**: API response format matches editor expectations exactly
This represents a fundamental shift to **HTML-first content management** while maintaining the zero-configuration philosophy that makes Insertr unique.
This represents a fundamental shift to **HTML-first content management** with enhanced developer workflow efficiency while maintaining the zero-configuration philosophy that makes Insertr unique.
---
**Status**: Planning Complete, Ready for Implementation
**Estimated Effort**: 1-2 days for core implementation
**Breaking Changes**: Yes (fresh schema, no migration needed)
**Status**: Ready for Implementation
**Estimated Effort**: 1 week for core implementation
**Breaking Changes**: Yes (fresh schema, enhancer workflow changes)