feat: add Anthropic format conversion for chat runs and improvements (#347)

* fix: improve chat compression and tool display

Context Compression Fixes:
- Remove duplicate token calculation in compress()
- Simplify compress() to only execute compression, not judge
- Add buildConversationHistory() to preserve tool calls in LLM context
- Remove unused estimateMessagesTokens() and contextLength parameter
- Move all judgment logic to chat-run-socket.ts (uses accurate DB tokens)

Tool Call Display Improvements:
- Add tool execution duration display (format: 1.272s)
- Add success/error status icons with circular backgrounds
- Replace text error with SVG icon (X in red circle)
- Replace old checkmark with polished green checkmark icon
- Add i18n key 'chat.executionDuration' for all locales

Bug Fixes:
- Fix streaming-indicator stuck by adding try-finally in handleEvent
- Add debug logging for compression flow diagnosis
- Fix template syntax error in MessageList.vue

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(chat): convert conversation history to Anthropic format before sending to Gateway

- Add convertToAnthropicFormat() to transform OpenAI format to Anthropic format
- Handle DeepSeek reasoning_content in thinking blocks
- Properly convert tool_use and tool_result blocks
- Add convertFromAnthropicFormat() for parsing SSE responses
- Handle stringified Python arrays in resume messages
- Record debug history files for troubleshooting (original vs converted)
- Fix tool_call_id validation to prevent empty ID errors
- Clean internal Hermes fields (call_id, response_item_id) from tool_calls

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(chat): optimize message parsing and add debug logging

- Only check for stringified arrays in assistant messages (performance)
- Improve parsing error handling: keep original content on parse failure
- Add debug logging for upstream events (reasoning/thinking tracking)
- Log run.completed event keys for troubleshooting

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(chat): add message pagination and reasoning sync improvements

**Message Pagination:**
- Add getSessionDetailPaginated() for paginated message loading
- Query with DESC order then reverse in code for optimal performance
- Remove listSessionsPaginated() (not needed)

**Reasoning Sync:**
- Add bidirectional reasoning merge in syncFromHermes
  - Memory → DB: preserve streamed reasoning from SSE events
  - DB → Memory: restore reasoning if Hermes Gateway fixes storage
- Send resumed event after sync completes with complete messages
- Fix reasoning field inconsistency: use unified 'reasoning' field

**Message Parsing:**
- Only parse stringified arrays for assistant messages (performance)
- Improve parse error handling: keep original content on failure
- Add debug logging for upstream reasoning/thinking events

**Bug Fixes:**
- Fix reasoning content display: now works on both SSE and resume
- Ensure reasoning is preserved across page refreshes via sync + resumed event

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: increase default pagination limit for messages to 500

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove auto-resumed event trigger and clean up debug code

- Remove automatic resumed event trigger in syncFromHermes to avoid timing issues
- Clean up unused imports (fs, join)
- Remove debug history file logging code
- Fix socket parameter passing in handleAbort, markCompleted, and syncFromHermes
- Change usage emit from room broadcast to socket-only emit
- Remove console.log debug statement

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use reasoning field in convertToAnthropicFormat

Change convertToAnthropicFormat to read from reasoning field instead of
reasoning_content for consistency with database schema and frontend.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: parse stringified array content and improve logs

- Parse stringified array format in run.completed to extract thinking/text/tool_use
- Send parsed content to frontend via parsed_content/parsed_reasoning/parsed_tool_calls
- Frontend updates last assistant message with parsed content
- Remove ellipsis from log messages, show full content
- Add detailed logging for conversion and parsing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: move finalOutputTrimmed outside else block

* fix(chat): handle double-serialized content in resumeSession

- Remove outer quotes before parsing stringified array format
- Updated changelog for v0.5.2 and v0.5.3 with multilingual support
- Fixed message pagination with DESC query + array reverse

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(chat): improve error logging for resume parsing

- Add detailed logging for double-serialized content parsing
- Log content preview when parsing fails to diagnose issues

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* revert(chat): use simple Python-to-JSON replacement

- Revert to simple .replace(/'/g, '"') approach
- Parsing failures will keep original content as-is

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
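
For readers unfamiliar with the two message formats, the OpenAI-to-Anthropic conversion described in the commit message can be sketched roughly as below. This is not the code from this commit: `sketchConvertToAnthropic`, `parseArgs`, and all type names are hypothetical, the thinking block shape is simplified, and dropping system messages is an assumption made only to keep the sketch short. It shows assistant `tool_calls` becoming `tool_use` blocks, `tool` messages becoming `tool_result` blocks inside a user message, the `reasoning` field becoming a thinking block, and entries with an empty ID being skipped.

```typescript
// Hypothetical sketch of an OpenAI-style -> Anthropic-style history conversion.
type OpenAIToolCall = { id: string; type: string; function: { name: string; arguments: string } }

interface OpenAIMessage {
  role: 'system' | 'user' | 'assistant' | 'tool'
  content: string | null
  reasoning?: string | null
  tool_calls?: OpenAIToolCall[]
  tool_call_id?: string
}

type AnthropicBlock =
  | { type: 'text'; text: string }
  | { type: 'thinking'; thinking: string }
  | { type: 'tool_use'; id: string; name: string; input: unknown }
  | { type: 'tool_result'; tool_use_id: string; content: string }

interface AnthropicMessage {
  role: 'user' | 'assistant'
  content: AnthropicBlock[]
}

// Tool arguments arrive as a JSON string; fall back to a wrapper object if they do not parse.
function parseArgs(raw: string): unknown {
  try {
    return JSON.parse(raw)
  } catch {
    return { raw }
  }
}

function sketchConvertToAnthropic(messages: OpenAIMessage[]): AnthropicMessage[] {
  const out: AnthropicMessage[] = []
  for (const msg of messages) {
    // Assumption: the system prompt is passed separately, so it is dropped here.
    if (msg.role === 'system') continue

    if (msg.role === 'tool') {
      // Skip tool results without an ID; they cannot be matched to a tool_use block.
      if (!msg.tool_call_id) continue
      const block: AnthropicBlock = {
        type: 'tool_result',
        tool_use_id: msg.tool_call_id,
        content: msg.content ?? '',
      }
      // Consecutive tool results are grouped into a single user message.
      const last = out[out.length - 1]
      if (last && last.role === 'user' && last.content.every(b => b.type === 'tool_result')) {
        last.content.push(block)
      } else {
        out.push({ role: 'user', content: [block] })
      }
      continue
    }

    if (msg.role === 'user') {
      out.push({ role: 'user', content: [{ type: 'text', text: msg.content ?? '' }] })
      continue
    }

    // Assistant message: thinking first, then text, then tool calls.
    const blocks: AnthropicBlock[] = []
    if (msg.reasoning) blocks.push({ type: 'thinking', thinking: msg.reasoning })
    if (msg.content) blocks.push({ type: 'text', text: msg.content })
    for (const tc of msg.tool_calls ?? []) {
      if (!tc.id) continue
      blocks.push({ type: 'tool_use', id: tc.id, name: tc.function.name, input: parseArgs(tc.function.arguments) })
    }
    if (blocks.length > 0) out.push({ role: 'assistant', content: blocks })
  }
  return out
}
```

Grouping consecutive tool results into one user message is a design choice of this sketch, made so that user and assistant turns alternate; the actual convertToAnthropicFormat() may handle this differently.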
2026-04-30 16:40:37 +08:00
parent 2e87cb910c
commit cd14bb1963
25 changed files with 1097 additions and 437 deletions
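
The commit message mentions parsing stringified Python-style arrays with a simple `.replace(/'/g, '"')`, stripping outer quotes from double-serialized content, and keeping the original content when parsing fails. A minimal sketch of that idea follows; `sketchParseStringifiedBlocks` and `ParsedBlock` are hypothetical names, and the real code (which, per the message, applies this only to assistant messages) may differ in detail.

```typescript
// Hypothetical sketch: parse content like "[{'type': 'text', 'text': 'hi'}]".
type ParsedBlock = { type: string; [key: string]: unknown }

function sketchParseStringifiedBlocks(content: string): ParsedBlock[] | string {
  let text = content.trim()

  // Double-serialized content: strip one layer of outer double quotes.
  if (text.length >= 2 && text.startsWith('"') && text.endsWith('"')) {
    text = text.slice(1, -1)
  }

  // Only strings that look like an array are candidates for parsing.
  if (!text.startsWith('[')) return content

  try {
    // Naive Python-to-JSON step: swap single quotes for double quotes.
    // This breaks on apostrophes inside values, which is why the fallback matters.
    const parsed: unknown = JSON.parse(text.replace(/'/g, '"'))
    return Array.isArray(parsed) ? (parsed as ParsedBlock[]) : content
  } catch {
    // Parsing failed: keep the original content as-is.
    return content
  }
}
```

The trade-off is simplicity over coverage: text containing an apostrophe fails to parse and is returned unchanged rather than being partially or incorrectly converted.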
@@ -31,6 +31,7 @@ export interface ChatMessage {
  tool_calls?: Array<{ id: string; type: string; function: { name: string; arguments: string } }>
  tool_call_id?: string
  name?: string
+  reasoning_content?: string | null
 }

 export interface CompressionConfig {
@@ -94,10 +95,6 @@ export function countTokensForModel(text: string, model: string): number {
  }
 }

-function estimateMessagesTokens(messages: ChatMessage[]): number {
-  return messages.reduce((sum, m) => sum + countTokens(m.content), 0)
-}
-
 // ─── Prompts ────────────────────────────────────────────

 export const SUMMARY_PREFIX = `[CONTEXT COMPACTION — REFERENCE ONLY] Earlier turns were compacted
@@ -250,6 +247,43 @@ function serializeForSummary(messages: ChatMessage[]): string {
  return parts.join('\n\n')
 }

+/**
+ * Flatten messages into plain { role, content } history for the summarizer LLM.
+ * Tool calls and tool results are inlined as truncated text in assistant messages.
+ */
+function buildConversationHistory(messages: ChatMessage[]): Array<{ role: string; content: string }> {
+  const result: Array<{ role: string; content: string }> = []
+
+  for (const msg of messages) {
+    if (msg.role === 'tool') {
+      // Convert tool result to text and append to previous assistant message
+      const toolText = `[Tool result: ${msg.name || 'unknown'}]\n${(msg.content || '').slice(0, 500)}${msg.content && msg.content.length > 500 ? '...' : ''}`
+      // Find the last assistant message and append to it
+      const lastAssistant = result.findLast(m => m.role === 'assistant')
+      if (lastAssistant) {
+        lastAssistant.content += `\n\n${toolText}`
+      } else {
+        // Fallback: create an assistant message
+        result.push({ role: 'assistant', content: toolText })
+      }
+    } else if (msg.role === 'assistant' && msg.tool_calls?.length) {
+      // Include tool calls in assistant message
+      const toolsInfo = msg.tool_calls.map(tc => {
+        let args = tc.function.arguments
+        if (args.length > 1000) args = args.slice(0, 1000) + '...'
+        return `[Calling tool: ${tc.function.name} with arguments: ${args}]`
+      }).join('\n')
+      const content = msg.content ? `${msg.content}\n\n${toolsInfo}` : toolsInfo
+      result.push({ role: msg.role, content })
+    } else if (msg.role === 'user' || msg.role === 'assistant' || msg.role === 'system') {
+      result.push({ role: msg.role, content: msg.content || '' })
+    }
+    // Skip other roles
+  }
+
+  return result
+}
+
 function pruneOldToolResults(messages: ChatMessage[], keepRecentCount: number): ChatMessage[] {
  if (messages.length <= keepRecentCount) return messages

@@ -337,7 +371,7 @@ async function callSummarizer(
        if (parsed.event === 'run.completed') {
          clearTimeout(timer)
          source.close()
-          deleteCompressSession(sessionId, profile).catch(() => {})
+          deleteCompressSession(sessionId, profile).catch(() => { })
          const output = parsed.output
          if (!output || typeof output !== 'string' || output.trim() === '') {
            reject(new Error('Empty summarization response'))
@@ -347,7 +381,7 @@ async function callSummarizer(
        } else if (parsed.event === 'run.failed') {
          clearTimeout(timer)
          source.close()
-          deleteCompressSession(sessionId, profile).catch(() => {})
+          deleteCompressSession(sessionId, profile).catch(() => { })
          reject(new Error(parsed.error || 'Summarization run failed'))
        }
      } catch { /* ignore parse errors */ }
@@ -356,7 +390,7 @@ async function callSummarizer(
    source.onerror = () => {
      clearTimeout(timer)
      source.close()
-      deleteCompressSession(sessionId, profile).catch(() => {})
+      deleteCompressSession(sessionId, profile).catch(() => { })
      reject(new Error('Summarization SSE connection error'))
    }
  })
@@ -402,11 +436,8 @@ export class ChatContextCompressor {
    upstream: string,
    apiKey: string | undefined,
    sessionId?: string,
-    contextLength?: number,
    profile?: string,
  ): Promise<CompressedResult> {
-    const cl = contextLength || 200_000
-    const triggerTokens = Math.floor(cl / 2)
    const total = messages.length

    const makeMeta = (opts: Partial<CompressedResult['meta']> = {}): CompressedResult['meta'] => ({
@@ -419,59 +450,26 @@ export class ChatContextCompressor {
      ...opts,
    })

-    // ── Step 1: Check snapshot first ─────────────────────
+    // Check if we have a previous compression snapshot
    const snapshot = sessionId ? getCompressionSnapshot(sessionId) : null

    if (snapshot) {
-      const { summary: previousSummary, lastMessageIndex } = snapshot
-      const newMessages = messages.slice(lastMessageIndex + 1)
-      const summaryTokens = countTokens(SUMMARY_PREFIX + previousSummary)
-      const newTokens = estimateMessagesTokens(newMessages)
-      const assembledTokens = summaryTokens + newTokens
-
+      // Has snapshot → incremental compress (merge old summary with new messages)
      logger.info(
-        '[context-compressor] session=%s: snapshot at %d, %d new messages, assembled ~%d tokens (threshold %d)',
-        sessionId, lastMessageIndex, newMessages.length, assembledTokens, triggerTokens,
+        '[context-compressor] session=%s: incremental compress with snapshot at index %d',
+        sessionId, snapshot.lastMessageIndex,
      )
-
-      // Under threshold → return summary + new messages, no LLM call
-      if (assembledTokens <= triggerTokens) {
-        const result: ChatMessage[] = [
-          { role: 'system', content: SUMMARY_PREFIX + '\n\n' + previousSummary },
-          ...newMessages,
-        ]
-        return {
-          messages: result,
-          meta: makeMeta({
-            compressed: true,
-            llmCompressed: false,
-            summaryTokenEstimate: summaryTokens,
-            verbatimCount: newMessages.length,
-            compressedStartIndex: lastMessageIndex,
-          }),
-        }
-      }
-
-      // Over threshold → incremental LLM compress
      return this.incrementalCompress(
        messages, snapshot, upstream, apiKey, sessionId!, makeMeta(), profile,
      )
+    } else {
+      // No snapshot → full compress (compress all messages)
+      logger.info(
+        '[context-compressor] session=%s: full compress %d messages',
+        sessionId, total,
+      )
+      return this.fullCompress(messages, upstream, apiKey, sessionId!, makeMeta(), profile)
    }
-
-    // ── Step 2: No snapshot — check all messages ──────────
-    const totalTokens = estimateMessagesTokens(messages)
-
-    logger.info(
-      '[context-compressor] session=%s: no snapshot, %d messages, ~%d tokens (threshold %d)',
-      sessionId, total, totalTokens, triggerTokens,
-    )
-
-    if (totalTokens <= triggerTokens) {
-      return { messages, meta: makeMeta() }
-    }
-
-    // Over threshold → full LLM compress
-    return this.fullCompress(messages, upstream, apiKey, sessionId!, makeMeta(), profile)
  }

  private async incrementalCompress(
@@ -503,9 +501,7 @@ export class ChatContextCompressor {
    try {
      const contentToSummarize = serializeForSummary(toCompress)
      const prompt = buildIncrementalPrompt(previousSummary, contentToSummarize, this.config.summaryBudget)
-      const history = toCompress
-        .filter(m => m.role === 'user' || m.role === 'assistant')
-        .map(m => ({ role: m.role, content: m.content }))
+      const history = buildConversationHistory(toCompress)

      const t0 = Date.now()
      summary = await callSummarizer(upstream, apiKey, prompt, history, this.config.summarizationTimeoutMs, previousSummary, profile)
@@ -565,9 +561,7 @@ export class ChatContextCompressor {

    const contentToSummarize = serializeForSummary(toCompress)
    const prompt = buildFullPrompt(contentToSummarize, this.config.summaryBudget)
-    const history = toCompress
-      .filter(m => m.role === 'user' || m.role === 'assistant')
-      .map(m => ({ role: m.role, content: m.content }))
+    const history = buildConversationHistory(toCompress)

    let summary: string | null = null
    try {