feat: add Anthropic format conversion for chat runs and improvements (#347)

* fix: improve chat compression and tool display

Context Compression Fixes:
- Remove duplicate token calculation in compress()
- Simplify compress() to only execute compression, not judge
- Add buildConversationHistory() to preserve tool calls in LLM context
- Remove unused estimateMessagesTokens() and contextLength parameter
- Move all judgment logic to chat-run-socket.ts (uses accurate DB tokens)
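
A minimal sketch of what the caller-side judgment might look like now that compress() only executes. The half-of-context threshold and the 200_000 default mirror the check this change removes from compress(); the helper name `shouldCompress` and its parameters are hypothetical, and the actual logic in chat-run-socket.ts may differ.

```typescript
// Hypothetical helper: decide whether to compress BEFORE calling compress().
// The threshold (half the context window, default 200_000 tokens) mirrors
// the check that was removed from compress() in this change.
function shouldCompress(dbTokenCount: number, contextLength?: number): boolean {
  const windowSize = contextLength || 200_000
  const triggerTokens = Math.floor(windowSize / 2)
  return dbTokenCount > triggerTokens
}
```

With this split, compress() can assume the decision was already made and go straight to incremental or full compression.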

Tool Call Display Improvements:
- Add tool execution duration display (format: 1.272s)
- Add success/error status icons with circular backgrounds
- Replace text error with SVG icon (X in red circle)
- Replace old checkmark with polished green checkmark icon
- Add i18n key 'chat.executionDuration' for all locales
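
For illustration, one way to produce the `1.272s` duration format, assuming the duration is tracked in milliseconds. The function name `formatDuration` is hypothetical and not taken from the change itself.

```typescript
// Hypothetical formatter: milliseconds in, seconds with three decimals out.
function formatDuration(ms: number): string {
  return `${(ms / 1000).toFixed(3)}s`
}

console.log(formatDuration(1272)) // prints "1.272s"
```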

Bug Fixes:
- Fix streaming-indicator stuck by adding try-finally in handleEvent
- Add debug logging for compression flow diagnosis
- Fix template syntax error in MessageList.vue
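
A sketch of the try-finally pattern behind the streaming-indicator fix, assuming the indicator is a boolean flag that a terminal event should clear. The `StreamState` shape and the `apply` callback are hypothetical; only the event names `run.completed` and `run.failed` come from this change.

```typescript
interface StreamState { streaming: boolean }
interface UpstreamEvent { type: string }

// Hypothetical handler: whatever happens while applying the event,
// a terminal event always clears the streaming flag.
function handleEvent(
  state: StreamState,
  event: UpstreamEvent,
  apply: (e: UpstreamEvent) => void,
): void {
  const isTerminal = event.type === 'run.completed' || event.type === 'run.failed'
  try {
    apply(event)
  } finally {
    // Runs even if apply() throws, so the indicator cannot get stuck.
    if (isTerminal) state.streaming = false
  }
}
```

The exception still propagates to the caller; the finally block only guarantees the UI state is reset first.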

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(chat): convert conversation history to Anthropic format before sending to Gateway

- Add convertToAnthropicFormat() to transform OpenAI format to Anthropic format
- Handle DeepSeek reasoning_content in thinking blocks
- Properly convert tool_use and tool_result blocks
- Add convertFromAnthropicFormat() for parsing SSE responses
- Handle stringified Python arrays in resume messages
- Record debug history files for troubleshooting (original vs converted)
- Fix tool_call_id validation to prevent empty ID errors
- Clean internal Hermes fields (call_id, response_item_id) from tool_calls
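
The conversion described above can be sketched roughly as follows. This is not the code from this change: the function name `convertToAnthropicSketch`, the types, and the choice to drop system messages are assumptions made for illustration. It reads `reasoning_content`, as this commit describes; a later commit in this squash switches to the `reasoning` field. The block shapes (`thinking`, `text`, `tool_use`, `tool_result`) follow the Anthropic Messages format.

```typescript
type ToolCall = { id: string; type: string; function: { name: string; arguments: string } }
interface OpenAIMessage {
  role: 'system' | 'user' | 'assistant' | 'tool'
  content: string | null
  tool_calls?: ToolCall[]
  tool_call_id?: string
  reasoning_content?: string | null
}
type Block =
  | { type: 'text'; text: string }
  | { type: 'thinking'; thinking: string }
  | { type: 'tool_use'; id: string; name: string; input: unknown }
  | { type: 'tool_result'; tool_use_id: string; content: string }
interface AnthropicMessage { role: 'user' | 'assistant'; content: Block[] }

// Tool arguments arrive as a JSON string; fall back to {} if they do not parse.
function safeParse(json: string): unknown {
  try { return JSON.parse(json) } catch { return {} }
}

function convertToAnthropicSketch(messages: OpenAIMessage[]): AnthropicMessage[] {
  const out: AnthropicMessage[] = []
  for (const msg of messages) {
    if (msg.role === 'assistant') {
      const blocks: Block[] = []
      if (msg.reasoning_content) blocks.push({ type: 'thinking', thinking: msg.reasoning_content })
      if (msg.content) blocks.push({ type: 'text', text: msg.content })
      for (const tc of msg.tool_calls ?? []) {
        if (!tc.id) continue // never emit a tool_use block with an empty ID
        // Building a fresh object drops any extra fields on the tool call.
        blocks.push({ type: 'tool_use', id: tc.id, name: tc.function.name, input: safeParse(tc.function.arguments) })
      }
      if (blocks.length) out.push({ role: 'assistant', content: blocks })
    } else if (msg.role === 'tool') {
      if (!msg.tool_call_id) continue // a result without an ID cannot be matched to a call
      const block: Block = { type: 'tool_result', tool_use_id: msg.tool_call_id, content: msg.content ?? '' }
      const last = out[out.length - 1]
      // Consecutive tool results share a single user message.
      if (last && last.role === 'user' && last.content.every(b => b.type === 'tool_result')) {
        last.content.push(block)
      } else {
        out.push({ role: 'user', content: [block] })
      }
    } else if (msg.role === 'user') {
      out.push({ role: 'user', content: [{ type: 'text', text: msg.content ?? '' }] })
    }
    // Assumption: system messages are handled elsewhere and skipped here.
  }
  return out
}
```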

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(chat): optimize message parsing and add debug logging

- Only check for stringified arrays in assistant messages (performance)
- Improve parsing error handling: keep original content on parse failure
- Add debug logging for upstream events (reasoning/thinking tracking)
- Log run.completed event keys for troubleshooting

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(chat): add message pagination and reasoning sync improvements

**Message Pagination:**
- Add getSessionDetailPaginated() for paginated message loading
- Query with DESC order then reverse in code for optimal performance
- Remove listSessionsPaginated() (not needed)
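
A minimal sketch of the "DESC then reverse" idea, using an in-memory array as a stand-in for the database. The `Row` shape, the `queryDesc` helper, and the SQL in the comment are hypothetical; the real getSessionDetailPaginated() signature is not shown in this message.

```typescript
interface Row { id: number; content: string }

// Stand-in for a query such as:
//   SELECT ... ORDER BY id DESC LIMIT :limit OFFSET :offset
function queryDesc(all: Row[], limit: number, offset: number): Row[] {
  return [...all].sort((a, b) => b.id - a.id).slice(offset, offset + limit)
}

// Fetch the newest rows first, then flip them so the page reads oldest to newest.
function getPage(all: Row[], limit: number, offset = 0): Row[] {
  return queryDesc(all, limit, offset).reverse()
}
```

Querying in descending order lets the limit select the most recent messages, and the reverse restores chronological order for display.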

**Reasoning Sync:**
- Add bidirectional reasoning merge in syncFromHermes
  - Memory → DB: preserve streamed reasoning from SSE events
  - DB → Memory: restore reasoning if Hermes Gateway fixes storage
- Send resumed event after sync completes with complete messages
- Fix reasoning field inconsistency: use unified 'reasoning' field
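
The bidirectional merge could be sketched as below. The `Msg` shape and the `mergeReasoning` helper are hypothetical, and the rule for the case where both sides hold different reasoning (keep the in-memory copy) is an assumption, since the message does not state it.

```typescript
interface Msg { id: string; reasoning?: string | null }

interface MergeResult {
  reasoning: string | null
  updateDb: boolean      // Memory → DB: persist streamed reasoning
  updateMemory: boolean  // DB → Memory: restore stored reasoning
}

// Hypothetical merge for one message that exists both in memory and in the DB.
function mergeReasoning(memory: Msg, db: Msg): MergeResult {
  const mem = memory.reasoning || null
  const stored = db.reasoning || null
  if (mem && !stored) return { reasoning: mem, updateDb: true, updateMemory: false }
  if (stored && !mem) return { reasoning: stored, updateDb: false, updateMemory: true }
  // Both present or both missing: nothing to copy. Assumption: memory wins on conflict.
  return { reasoning: mem ?? stored, updateDb: false, updateMemory: false }
}
```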

**Message Parsing:**
- Only parse stringified arrays for assistant messages (performance)
- Improve parse error handling: keep original content on failure
- Add debug logging for upstream reasoning/thinking events

**Bug Fixes:**
- Fix reasoning content display: now works on both SSE and resume
- Ensure reasoning is preserved across page refreshes via sync + resumed event

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: increase default pagination limit for messages to 500

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove auto-resumed event trigger and clean up debug code

- Remove automatic resumed event trigger in syncFromHermes to avoid timing issues
- Clean up unused imports (fs, join)
- Remove debug history file logging code
- Fix socket parameter passing in handleAbort, markCompleted, and syncFromHermes
- Change usage emit from room broadcast to socket-only emit
- Remove console.log debug statement

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use reasoning field in convertToAnthropicFormat

Change convertToAnthropicFormat to read from reasoning field instead
of reasoning_content for consistency with database schema and frontend.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: parse stringified array content and improve logs

- Parse stringified array format in run.completed to extract thinking/text/tool_use
- Send parsed content to frontend via parsed_content/parsed_reasoning/parsed_tool_calls
- Frontend updates last assistant message with parsed content
- Remove ellipsis from log messages, show full content
- Add detailed logging for conversion and parsing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: move finalOutputTrimmed outside else block

* fix(chat): handle double-serialized content in resumeSession

- Remove outer quotes before parsing stringified array format
- Update changelog for v0.5.2 and v0.5.3 with multilingual support
- Fix message pagination with DESC query + array reverse

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(chat): improve error logging for resume parsing

- Add detailed logging for double-serialized content parsing
- Log content preview when parsing fails to diagnose issues

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* revert(chat): use simple Python-to-JSON replacement

- Revert to simple .replace(/'/g, '"') approach
- Parsing failures will keep original content as-is
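
A sketch of the simple replacement approach with the keep-original fallback. The function name `parseStringifiedArray` and the leading-bracket check are assumptions for illustration; only the `.replace(/'/g, '"')` step and the fallback behavior come from this message.

```typescript
// Try to read a Python-style stringified list such as
//   [{'type': 'text', 'text': 'hi'}]
// by swapping single quotes for double quotes and parsing as JSON.
// On any failure the original string is returned unchanged.
function parseStringifiedArray(content: string): unknown[] | string {
  const trimmed = content.trim()
  if (!trimmed.startsWith('[')) return content
  try {
    const parsed: unknown = JSON.parse(trimmed.replace(/'/g, '"'))
    return Array.isArray(parsed) ? parsed : content
  } catch {
    return content
  }
}
```

The replacement is deliberately naive: text containing an apostrophe (for example "it's") or Python literals such as `True` and `None` will not parse, and in those cases the content is kept as-is.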

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
ekko
2026-04-30 16:40:37 +08:00
committed by GitHub
parent 2e87cb910c
commit cd14bb1963
25 changed files with 1097 additions and 437 deletions
@@ -31,6 +31,7 @@ export interface ChatMessage {
tool_calls?: Array<{ id: string; type: string; function: { name: string; arguments: string } }>
tool_call_id?: string
name?: string
+ reasoning_content?: string | null
}
export interface CompressionConfig {
@@ -94,10 +95,6 @@ export function countTokensForModel(text: string, model: string): number {
}
}
- function estimateMessagesTokens(messages: ChatMessage[]): number {
- return messages.reduce((sum, m) => sum + countTokens(m.content), 0)
- }
// ─── Prompts ────────────────────────────────────────────
export const SUMMARY_PREFIX = `[CONTEXT COMPACTION — REFERENCE ONLY] Earlier turns were compacted
@@ -250,6 +247,43 @@ function serializeForSummary(messages: ChatMessage[]): string {
return parts.join('\n\n')
}
+ /**
+ * Convert messages to conversation history format for LLM API.
+ * Tool calls are converted to text format within assistant messages.
+ */
+ function buildConversationHistory(messages: ChatMessage[]): Array<{ role: string; content: string }> {
+ const result: Array<{ role: string; content: string }> = []
+ for (const msg of messages) {
+ if (msg.role === 'tool') {
+ // Convert tool result to text and append to previous assistant message
+ const toolText = `[Tool result: ${msg.name || 'unknown'}]\n${(msg.content || '').slice(0, 500)}${msg.content && msg.content.length > 500 ? '...' : ''}`
+ // Find the last assistant message and append to it
+ const lastAssistant = result.findLast(m => m.role === 'assistant')
+ if (lastAssistant) {
+ lastAssistant.content += `\n\n${toolText}`
+ } else {
+ // Fallback: create an assistant message
+ result.push({ role: 'assistant', content: toolText })
+ }
+ } else if (msg.role === 'assistant' && msg.tool_calls?.length) {
+ // Include tool calls in assistant message
+ const toolsInfo = msg.tool_calls.map(tc => {
+ let args = tc.function.arguments
+ if (args.length > 1000) args = args.slice(0, 1000) + '...'
+ return `[Calling tool: ${tc.function.name} with arguments: ${args}]`
+ }).join('\n')
+ const content = msg.content ? `${msg.content}\n\n${toolsInfo}` : toolsInfo
+ result.push({ role: msg.role, content })
+ } else if (msg.role === 'user' || msg.role === 'assistant' || msg.role === 'system') {
+ result.push({ role: msg.role, content: msg.content || '' })
+ }
+ // Skip other roles
+ }
+ return result
+ }
function pruneOldToolResults(messages: ChatMessage[], keepRecentCount: number): ChatMessage[] {
if (messages.length <= keepRecentCount) return messages
@@ -337,7 +371,7 @@ async function callSummarizer(
if (parsed.event === 'run.completed') {
clearTimeout(timer)
source.close()
- deleteCompressSession(sessionId, profile).catch(() => {})
+ deleteCompressSession(sessionId, profile).catch(() => { })
const output = parsed.output
if (!output || typeof output !== 'string' || output.trim() === '') {
reject(new Error('Empty summarization response'))
@@ -347,7 +381,7 @@ async function callSummarizer(
} else if (parsed.event === 'run.failed') {
clearTimeout(timer)
source.close()
- deleteCompressSession(sessionId, profile).catch(() => {})
+ deleteCompressSession(sessionId, profile).catch(() => { })
reject(new Error(parsed.error || 'Summarization run failed'))
}
} catch { /* ignore parse errors */ }
@@ -356,7 +390,7 @@ async function callSummarizer(
source.onerror = () => {
clearTimeout(timer)
source.close()
- deleteCompressSession(sessionId, profile).catch(() => {})
+ deleteCompressSession(sessionId, profile).catch(() => { })
reject(new Error('Summarization SSE connection error'))
}
})
@@ -402,11 +436,8 @@ export class ChatContextCompressor {
upstream: string,
apiKey: string | undefined,
sessionId?: string,
- contextLength?: number,
profile?: string,
): Promise<CompressedResult> {
- const cl = contextLength || 200_000
- const triggerTokens = Math.floor(cl / 2)
const total = messages.length
const makeMeta = (opts: Partial<CompressedResult['meta']> = {}): CompressedResult['meta'] => ({
@@ -419,59 +450,26 @@ export class ChatContextCompressor {
...opts,
})
- // ── Step 1: Check snapshot first ─────────────────────
+ // Check if we have a previous compression snapshot
const snapshot = sessionId ? getCompressionSnapshot(sessionId) : null
if (snapshot) {
- const { summary: previousSummary, lastMessageIndex } = snapshot
- const newMessages = messages.slice(lastMessageIndex + 1)
- const summaryTokens = countTokens(SUMMARY_PREFIX + previousSummary)
- const newTokens = estimateMessagesTokens(newMessages)
- const assembledTokens = summaryTokens + newTokens
+ // Has snapshot → incremental compress (merge old summary with new messages)
logger.info(
- '[context-compressor] session=%s: snapshot at %d, %d new messages, assembled ~%d tokens (threshold %d)',
- sessionId, lastMessageIndex, newMessages.length, assembledTokens, triggerTokens,
+ '[context-compressor] session=%s: incremental compress with snapshot at index %d',
+ sessionId, snapshot.lastMessageIndex,
)
- // Under threshold → return summary + new messages, no LLM call
- if (assembledTokens <= triggerTokens) {
- const result: ChatMessage[] = [
- { role: 'system', content: SUMMARY_PREFIX + '\n\n' + previousSummary },
- ...newMessages,
- ]
- return {
- messages: result,
- meta: makeMeta({
- compressed: true,
- llmCompressed: false,
- summaryTokenEstimate: summaryTokens,
- verbatimCount: newMessages.length,
- compressedStartIndex: lastMessageIndex,
- }),
- }
- }
- // Over threshold → incremental LLM compress
return this.incrementalCompress(
messages, snapshot, upstream, apiKey, sessionId!, makeMeta(), profile,
)
+ } else {
+ // No snapshot → full compress (compress all messages)
+ logger.info(
+ '[context-compressor] session=%s: full compress %d messages',
+ sessionId, total,
+ )
+ return this.fullCompress(messages, upstream, apiKey, sessionId!, makeMeta(), profile)
}
- // ── Step 2: No snapshot — check all messages ──────────
- const totalTokens = estimateMessagesTokens(messages)
- logger.info(
- '[context-compressor] session=%s: no snapshot, %d messages, ~%d tokens (threshold %d)',
- sessionId, total, totalTokens, triggerTokens,
- )
- if (totalTokens <= triggerTokens) {
- return { messages, meta: makeMeta() }
- }
- // Over threshold → full LLM compress
- return this.fullCompress(messages, upstream, apiKey, sessionId!, makeMeta(), profile)
}
private async incrementalCompress(
@@ -503,9 +501,7 @@ export class ChatContextCompressor {
try {
const contentToSummarize = serializeForSummary(toCompress)
const prompt = buildIncrementalPrompt(previousSummary, contentToSummarize, this.config.summaryBudget)
- const history = toCompress
- .filter(m => m.role === 'user' || m.role === 'assistant')
- .map(m => ({ role: m.role, content: m.content }))
+ const history = buildConversationHistory(toCompress)
const t0 = Date.now()
summary = await callSummarizer(upstream, apiKey, prompt, history, this.config.summarizationTimeoutMs, previousSummary, profile)
@@ -565,9 +561,7 @@ export class ChatContextCompressor {
const contentToSummarize = serializeForSummary(toCompress)
const prompt = buildFullPrompt(contentToSummarize, this.config.summaryBudget)
- const history = toCompress
- .filter(m => m.role === 'user' || m.role === 'assistant')
- .map(m => ({ role: m.role, content: m.content }))
+ const history = buildConversationHistory(toCompress)
let summary: string | null = null
try {