feat: add Anthropic format conversion for chat runs and improvements (#347)

* fix: improve chat compression and tool display

Context Compression Fixes:
- Remove duplicate token calculation in compress()
- Simplify compress() to only execute compression, not judge
- Add buildConversationHistory() to preserve tool calls in LLM context
- Remove unused estimateMessagesTokens() and contextLength parameter
- Move all judgment logic to chat-run-socket.ts (uses accurate DB tokens)

Tool Call Display Improvements:
- Add tool execution duration display (format: 1.272s)
- Add success/error status icons with circular backgrounds
- Replace text error with SVG icon (X in red circle)
- Replace old checkmark with polished green checkmark icon
- Add i18n key 'chat.executionDuration' for all locales

Bug Fixes:
- Fix streaming-indicator stuck by adding try-finally in handleEvent
- Add debug logging for compression flow diagnosis
- Fix template syntax error in MessageList.vue

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(chat): convert conversation history to Anthropic format before sending to Gateway

- Add convertToAnthropicFormat() to transform OpenAI format to Anthropic format
- Handle DeepSeek reasoning_content in thinking blocks
- Properly convert tool_use and tool_result blocks
- Add convertFromAnthropicFormat() for parsing SSE responses
- Handle stringified Python arrays in resume messages
- Record debug history files for troubleshooting (original vs converted)
- Fix tool_call_id validation to prevent empty ID errors
- Clean internal Hermes fields (call_id, response_item_id) from tool_calls

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(chat): optimize message parsing and add debug logging

- Only check for stringified arrays in assistant messages (performance)
- Improve parsing error handling: keep original content on parse failure
- Add debug logging for upstream events (reasoning/thinking tracking)
- Log run.completed event keys for troubleshooting

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(chat): add message pagination and reasoning sync improvements

**Message Pagination:**
- Add getSessionDetailPaginated() for paginated message loading
- Query with DESC order then reverse in code for optimal performance
- Remove listSessionsPaginated() (not needed)

**Reasoning Sync:**
- Add bidirectional reasoning merge in syncFromHermes
- Memory → DB: preserve streamed reasoning from SSE events
- DB → Memory: restore reasoning if Hermes Gateway fixes storage
- Send resumed event after sync completes with complete messages
- Fix reasoning field inconsistency: use unified 'reasoning' field

**Message Parsing:**
- Only parse stringified arrays for assistant messages (performance)
- Improve parse error handling: keep original content on failure
- Add debug logging for upstream reasoning/thinking events

**Bug Fixes:**
- Fix reasoning content display: now works on both SSE and resume
- Ensure reasoning is preserved across page refreshes via sync + resumed event

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: increase default pagination limit for messages to 500

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove auto-resumed event trigger and clean up debug code

- Remove automatic resumed event trigger in syncFromHermes to avoid timing issues
- Clean up unused imports (fs, join)
- Remove debug history file logging code
- Fix socket parameter passing in handleAbort, markCompleted, and syncFromHermes
- Change usage emit from room broadcast to socket-only emit
- Remove console.log debug statement

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use reasoning field in convertToAnthropicFormat

Change convertToAnthropicFormat to read from reasoning field instead of
reasoning_content for consistency with database schema and frontend.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: parse stringified array content and improve logs

- Parse stringified array format in run.completed to extract thinking/text/tool_use
- Send parsed content to frontend via parsed_content/parsed_reasoning/parsed_tool_calls
- Frontend updates last assistant message with parsed content
- Remove ellipsis from log messages, show full content
- Add detailed logging for conversion and parsing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: move finalOutputTrimmed outside else block

* fix(chat): handle double-serialized content in resumeSession

- Remove outer quotes before parsing stringified array format
- Updated changelog for v0.5.2 and v0.5.3 with multilingual support
- Fixed message pagination with DESC query + array reverse

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(chat): improve error logging for resume parsing

- Add detailed logging for double-serialized content parsing
- Log content preview when parsing fails to diagnose issues

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* revert(chat): use simple Python-to-JSON replacement

- Revert to simple .replace(/'/g, '"') approach
- Parsing failures will keep original content as-is

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
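
The diff shown on this page does not include convertToAnthropicFormat() itself, so the following is only a rough sketch of what an OpenAI-to-Anthropic history conversion of the kind the message describes might look like. The function name `toAnthropicSketch`, the helper `safeParse`, and all type names are hypothetical. The block shapes (`text`, `thinking`, `tool_use`, `tool_result`) follow the Anthropic Messages API content-block vocabulary. Whether the Gateway expects anything beyond these fields (for example a signature on thinking blocks) is not stated in the commit.

```typescript
// Hypothetical input type: only the fields this sketch reads.
interface OpenAIMessage {
  role: 'system' | 'user' | 'assistant' | 'tool'
  content: string | null
  reasoning?: string | null
  tool_calls?: Array<{ id: string; type: string; function: { name: string; arguments: string } }>
  tool_call_id?: string
}

type AnthropicBlock =
  | { type: 'text'; text: string }
  | { type: 'thinking'; thinking: string }
  | { type: 'tool_use'; id: string; name: string; input: unknown }
  | { type: 'tool_result'; tool_use_id: string; content: string }

interface AnthropicMessage {
  role: 'user' | 'assistant'
  content: AnthropicBlock[]
}

// Tool arguments arrive as a JSON string; keep the raw text if it does not parse.
function safeParse(json: string): unknown {
  try {
    return JSON.parse(json)
  } catch {
    return { raw: json }
  }
}

function toAnthropicSketch(messages: OpenAIMessage[]): AnthropicMessage[] {
  const out: AnthropicMessage[] = []
  for (const msg of messages) {
    // Assumption: system text is sent separately, outside the messages array.
    if (msg.role === 'system') continue

    if (msg.role === 'tool') {
      // Skip results without an ID (the commit mentions guarding against empty IDs).
      if (!msg.tool_call_id) continue
      const block: AnthropicBlock = {
        type: 'tool_result',
        tool_use_id: msg.tool_call_id,
        content: msg.content ?? '',
      }
      // Consecutive tool results are grouped into a single user message.
      const last = out[out.length - 1]
      if (last && last.role === 'user' && last.content.every(b => b.type === 'tool_result')) {
        last.content.push(block)
      } else {
        out.push({ role: 'user', content: [block] })
      }
      continue
    }

    const role: 'user' | 'assistant' = msg.role === 'assistant' ? 'assistant' : 'user'
    const blocks: AnthropicBlock[] = []
    if (role === 'assistant' && msg.reasoning) {
      blocks.push({ type: 'thinking', thinking: msg.reasoning })
    }
    if (msg.content) blocks.push({ type: 'text', text: msg.content })
    for (const tc of msg.tool_calls ?? []) {
      if (!tc.id) continue
      blocks.push({ type: 'tool_use', id: tc.id, name: tc.function.name, input: safeParse(tc.function.arguments) })
    }
    if (blocks.length > 0) out.push({ role, content: blocks })
  }
  return out
}
```

The sketch reads the `reasoning` field rather than `reasoning_content`, matching the later fix in this message.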
@@ -31,6 +31,7 @@ export interface ChatMessage {
   tool_calls?: Array<{ id: string; type: string; function: { name: string; arguments: string } }>
   tool_call_id?: string
   name?: string
+  reasoning_content?: string | null
 }

 export interface CompressionConfig {
@@ -94,10 +95,6 @@ export function countTokensForModel(text: string, model: string): number {
   }
 }

-function estimateMessagesTokens(messages: ChatMessage[]): number {
-  return messages.reduce((sum, m) => sum + countTokens(m.content), 0)
-}
-
 // ─── Prompts ────────────────────────────────────────────

 export const SUMMARY_PREFIX = `[CONTEXT COMPACTION — REFERENCE ONLY] Earlier turns were compacted
@@ -250,6 +247,43 @@ function serializeForSummary(messages: ChatMessage[]): string {
   return parts.join('\n\n')
 }

+/**
+ * Convert messages to conversation history format for LLM API.
+ * Tool calls are converted to text format within assistant messages.
+ */
+function buildConversationHistory(messages: ChatMessage[]): Array<{ role: string; content: string }> {
+  const result: Array<{ role: string; content: string }> = []
+
+  for (const msg of messages) {
+    if (msg.role === 'tool') {
+      // Convert tool result to text and append to previous assistant message
+      const toolText = `[Tool result: ${msg.name || 'unknown'}]\n${(msg.content || '').slice(0, 500)}${msg.content && msg.content.length > 500 ? '...' : ''}`
+      // Find the last assistant message and append to it
+      const lastAssistant = result.findLast(m => m.role === 'assistant')
+      if (lastAssistant) {
+        lastAssistant.content += `\n\n${toolText}`
+      } else {
+        // Fallback: create an assistant message
+        result.push({ role: 'assistant', content: toolText })
+      }
+    } else if (msg.role === 'assistant' && msg.tool_calls?.length) {
+      // Include tool calls in assistant message
+      const toolsInfo = msg.tool_calls.map(tc => {
+        let args = tc.function.arguments
+        if (args.length > 1000) args = args.slice(0, 1000) + '...'
+        return `[Calling tool: ${tc.function.name} with arguments: ${args}]`
+      }).join('\n')
+      const content = msg.content ? `${msg.content}\n\n${toolsInfo}` : toolsInfo
+      result.push({ role: msg.role, content })
+    } else if (msg.role === 'user' || msg.role === 'assistant' || msg.role === 'system') {
+      result.push({ role: msg.role, content: msg.content || '' })
+    }
+    // Skip other roles
+  }
+
+  return result
+}
+
 function pruneOldToolResults(messages: ChatMessage[], keepRecentCount: number): ChatMessage[] {
   if (messages.length <= keepRecentCount) return messages

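
To make the effect of the hunk above concrete, here is a condensed, standalone restatement of its three branches, run on a tiny tool exchange. The names `Msg`, `clip`, and `flattenHistorySketch` are hypothetical and exist only for this illustration. The 500 and 1000 character limits are taken from the hunk. One deliberate difference: the hunk uses `Array.prototype.findLast` (ES2023), while this sketch walks the array backwards so it also compiles against older `lib` targets.

```typescript
// Hypothetical, trimmed-down message shape for the illustration.
interface Msg {
  role: string
  content: string
  name?: string
  tool_calls?: Array<{ function: { name: string; arguments: string } }>
}

// Cut text to `max` characters and mark the cut with '...'.
const clip = (text: string, max: number): string =>
  text.length > max ? text.slice(0, max) + '...' : text

function flattenHistorySketch(messages: Msg[]): Array<{ role: string; content: string }> {
  const result: Array<{ role: string; content: string }> = []
  for (const msg of messages) {
    if (msg.role === 'tool') {
      const toolText = `[Tool result: ${msg.name || 'unknown'}]\n${clip(msg.content || '', 500)}`
      // Walk backwards to find the most recent assistant entry.
      let target: { role: string; content: string } | undefined
      for (let i = result.length - 1; i >= 0 && !target; i--) {
        if (result[i].role === 'assistant') target = result[i]
      }
      if (target) target.content += `\n\n${toolText}`
      else result.push({ role: 'assistant', content: toolText })
    } else if (msg.role === 'assistant' && msg.tool_calls?.length) {
      const calls = msg.tool_calls
        .map(tc => `[Calling tool: ${tc.function.name} with arguments: ${clip(tc.function.arguments, 1000)}]`)
        .join('\n')
      result.push({ role: msg.role, content: msg.content ? `${msg.content}\n\n${calls}` : calls })
    } else if (['user', 'assistant', 'system'].includes(msg.role)) {
      result.push({ role: msg.role, content: msg.content || '' })
    }
    // Any other role is dropped.
  }
  return result
}

// A user turn, an assistant tool call, and the tool's result.
const flat = flattenHistorySketch([
  { role: 'user', content: 'List the files' },
  { role: 'assistant', content: '', tool_calls: [{ function: { name: 'ls', arguments: '{"path":"."}' } }] },
  { role: 'tool', name: 'ls', content: 'a.txt\nb.txt' },
])
console.log(flat.length)
console.log(flat[1].content)
```

The three input messages collapse into two entries: the tool result is folded into the assistant message that issued the call, so the summarizer sees the call and its result as plain text rather than losing them.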
@@ -337,7 +371,7 @@ async function callSummarizer(
         if (parsed.event === 'run.completed') {
           clearTimeout(timer)
           source.close()
-          deleteCompressSession(sessionId, profile).catch(() => {})
+          deleteCompressSession(sessionId, profile).catch(() => { })
           const output = parsed.output
           if (!output || typeof output !== 'string' || output.trim() === '') {
             reject(new Error('Empty summarization response'))
@@ -347,7 +381,7 @@ async function callSummarizer(
         } else if (parsed.event === 'run.failed') {
           clearTimeout(timer)
           source.close()
-          deleteCompressSession(sessionId, profile).catch(() => {})
+          deleteCompressSession(sessionId, profile).catch(() => { })
           reject(new Error(parsed.error || 'Summarization run failed'))
         }
       } catch { /* ignore parse errors */ }
@@ -356,7 +390,7 @@ async function callSummarizer(
     source.onerror = () => {
       clearTimeout(timer)
       source.close()
-      deleteCompressSession(sessionId, profile).catch(() => {})
+      deleteCompressSession(sessionId, profile).catch(() => { })
       reject(new Error('Summarization SSE connection error'))
     }
   })
@@ -402,11 +436,8 @@ export class ChatContextCompressor {
     upstream: string,
     apiKey: string | undefined,
     sessionId?: string,
-    contextLength?: number,
     profile?: string,
   ): Promise<CompressedResult> {
-    const cl = contextLength || 200_000
-    const triggerTokens = Math.floor(cl / 2)
     const total = messages.length

     const makeMeta = (opts: Partial<CompressedResult['meta']> = {}): CompressedResult['meta'] => ({
@@ -419,59 +450,26 @@ export class ChatContextCompressor {
       ...opts,
     })

-    // ── Step 1: Check snapshot first ─────────────────────
+    // Check if we have a previous compression snapshot
     const snapshot = sessionId ? getCompressionSnapshot(sessionId) : null

     if (snapshot) {
-      const { summary: previousSummary, lastMessageIndex } = snapshot
-      const newMessages = messages.slice(lastMessageIndex + 1)
-      const summaryTokens = countTokens(SUMMARY_PREFIX + previousSummary)
-      const newTokens = estimateMessagesTokens(newMessages)
-      const assembledTokens = summaryTokens + newTokens
-
+      // Has snapshot → incremental compress (merge old summary with new messages)
       logger.info(
-        '[context-compressor] session=%s: snapshot at %d, %d new messages, assembled ~%d tokens (threshold %d)',
-        sessionId, lastMessageIndex, newMessages.length, assembledTokens, triggerTokens,
+        '[context-compressor] session=%s: incremental compress with snapshot at index %d',
+        sessionId, snapshot.lastMessageIndex,
       )
-
-      // Under threshold → return summary + new messages, no LLM call
-      if (assembledTokens <= triggerTokens) {
-        const result: ChatMessage[] = [
-          { role: 'system', content: SUMMARY_PREFIX + '\n\n' + previousSummary },
-          ...newMessages,
-        ]
-        return {
-          messages: result,
-          meta: makeMeta({
-            compressed: true,
-            llmCompressed: false,
-            summaryTokenEstimate: summaryTokens,
-            verbatimCount: newMessages.length,
-            compressedStartIndex: lastMessageIndex,
-          }),
-        }
-      }
-
-      // Over threshold → incremental LLM compress
       return this.incrementalCompress(
         messages, snapshot, upstream, apiKey, sessionId!, makeMeta(), profile,
       )
+    } else {
+      // No snapshot → full compress (compress all messages)
+      logger.info(
+        '[context-compressor] session=%s: full compress %d messages',
+        sessionId, total,
+      )
+      return this.fullCompress(messages, upstream, apiKey, sessionId!, makeMeta(), profile)
     }
-
-    // ── Step 2: No snapshot — check all messages ──────────
-    const totalTokens = estimateMessagesTokens(messages)
-
-    logger.info(
-      '[context-compressor] session=%s: no snapshot, %d messages, ~%d tokens (threshold %d)',
-      sessionId, total, totalTokens, triggerTokens,
-    )
-
-    if (totalTokens <= triggerTokens) {
-      return { messages, meta: makeMeta() }
-    }
-
-    // Over threshold → full LLM compress
-    return this.fullCompress(messages, upstream, apiKey, sessionId!, makeMeta(), profile)
   }

   private async incrementalCompress(
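
After this hunk, compress() always compresses, so the decision of whether to compress has to be made by the caller. The commit message says that logic now lives in chat-run-socket.ts, which is not part of the diff shown here. As a rough sketch of what such a caller-side check could look like, the hypothetical helper below reuses the threshold rule from the removed lines (half the context length, defaulting to 200_000). The name `shouldCompress` and the idea of passing in a token count read from the database are assumptions.

```typescript
// Default taken from the removed line `const cl = contextLength || 200_000`.
const DEFAULT_CONTEXT_LENGTH = 200_000

// Hypothetical helper: `usedTokens` stands in for whatever accurate
// token count the caller has available (the commit mentions DB tokens).
function shouldCompress(usedTokens: number, contextLength?: number): boolean {
  const limit = contextLength || DEFAULT_CONTEXT_LENGTH
  // Same rule as the removed code: trigger above half the context window.
  const triggerTokens = Math.floor(limit / 2)
  return usedTokens > triggerTokens
}

// Usage: only call the compressor when the threshold is exceeded.
console.log(shouldCompress(80_000))
console.log(shouldCompress(120_000))
```

Keeping the check outside the compressor means the threshold is evaluated once, against a single token count, instead of being re-estimated inside compress().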
@@ -503,9 +501,7 @@ export class ChatContextCompressor {
     try {
       const contentToSummarize = serializeForSummary(toCompress)
       const prompt = buildIncrementalPrompt(previousSummary, contentToSummarize, this.config.summaryBudget)
-      const history = toCompress
-        .filter(m => m.role === 'user' || m.role === 'assistant')
-        .map(m => ({ role: m.role, content: m.content }))
+      const history = buildConversationHistory(toCompress)

       const t0 = Date.now()
       summary = await callSummarizer(upstream, apiKey, prompt, history, this.config.summarizationTimeoutMs, previousSummary, profile)
@@ -565,9 +561,7 @@ export class ChatContextCompressor {

     const contentToSummarize = serializeForSummary(toCompress)
     const prompt = buildFullPrompt(contentToSummarize, this.config.summaryBudget)
-    const history = toCompress
-      .filter(m => m.role === 'user' || m.role === 'assistant')
-      .map(m => ({ role: m.role, content: m.content }))
+    const history = buildConversationHistory(toCompress)

     let summary: string | null = null
     try {