feat: add Anthropic format conversion for chat runs and improvements (#347)

* fix: improve chat compression and tool display

  Context Compression Fixes:
  - Remove duplicate token calculation in compress()
  - Simplify compress() to only execute compression, not judge
  - Add buildConversationHistory() to preserve tool calls in LLM context
  - Remove unused estimateMessagesTokens() and contextLength parameter
  - Move all judgment logic to chat-run-socket.ts (uses accurate DB tokens)

  Tool Call Display Improvements:
  - Add tool execution duration display (format: 1.272s)
  - Add success/error status icons with circular backgrounds
  - Replace text error with SVG icon (X in red circle)
  - Replace old checkmark with polished green checkmark icon
  - Add i18n key 'chat.executionDuration' for all locales

  Bug Fixes:
  - Fix streaming-indicator stuck by adding try-finally in handleEvent
  - Add debug logging for compression flow diagnosis
  - Fix template syntax error in MessageList.vue

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(chat): convert conversation history to Anthropic format before sending to Gateway

  - Add convertToAnthropicFormat() to transform OpenAI format to Anthropic format
  - Handle DeepSeek reasoning_content in thinking blocks
  - Properly convert tool_use and tool_result blocks
  - Add convertFromAnthropicFormat() for parsing SSE responses
  - Handle stringified Python arrays in resume messages
  - Record debug history files for troubleshooting (original vs converted)
  - Fix tool_call_id validation to prevent empty ID errors
  - Clean internal Hermes fields (call_id, response_item_id) from tool_calls

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(chat): optimize message parsing and add debug logging

  - Only check for stringified arrays in assistant messages (performance)
  - Improve parsing error handling: keep original content on parse failure
  - Add debug logging for upstream events (reasoning/thinking tracking)
  - Log run.completed event keys for troubleshooting

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(chat): add message pagination and reasoning sync improvements

  **Message Pagination:**
  - Add getSessionDetailPaginated() for paginated message loading
  - Query with DESC order then reverse in code for optimal performance
  - Remove listSessionsPaginated() (not needed)

  **Reasoning Sync:**
  - Add bidirectional reasoning merge in syncFromHermes
  - Memory → DB: preserve streamed reasoning from SSE events
  - DB → Memory: restore reasoning if Hermes Gateway fixes storage
  - Send resumed event after sync completes with complete messages
  - Fix reasoning field inconsistency: use unified 'reasoning' field

  **Message Parsing:**
  - Only parse stringified arrays for assistant messages (performance)
  - Improve parse error handling: keep original content on failure
  - Add debug logging for upstream reasoning/thinking events

  **Bug Fixes:**
  - Fix reasoning content display: now works on both SSE and resume
  - Ensure reasoning is preserved across page refreshes via sync + resumed event

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: increase default pagination limit for messages to 500

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove auto-resumed event trigger and clean up debug code

  - Remove automatic resumed event trigger in syncFromHermes to avoid timing issues
  - Clean up unused imports (fs, join)
  - Remove debug history file logging code
  - Fix socket parameter passing in handleAbort, markCompleted, and syncFromHermes
  - Change usage emit from room broadcast to socket-only emit
  - Remove console.log debug statement

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use reasoning field in convertToAnthropicFormat

  Change convertToAnthropicFormat to read from reasoning field instead of reasoning_content for consistency with database schema and frontend.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: parse stringified array content and improve logs

  - Parse stringified array format in run.completed to extract thinking/text/tool_use
  - Send parsed content to frontend via parsed_content/parsed_reasoning/parsed_tool_calls
  - Frontend updates last assistant message with parsed content
  - Remove ellipsis from log messages, show full content
  - Add detailed logging for conversion and parsing

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: move finalOutputTrimmed outside else block

* fix(chat): handle double-serialized content in resumeSession

  - Remove outer quotes before parsing stringified array format
  - Updated changelog for v0.5.2 and v0.5.3 with multilingual support
  - Fixed message pagination with DESC query + array reverse

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(chat): improve error logging for resume parsing

  - Add detailed logging for double-serialized content parsing
  - Log content preview when parsing fails to diagnose issues

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* revert(chat): use simple Python-to-JSON replacement

  - Revert to simple .replace(/'/g, '"') approach
  - Parsing failures will keep original content as-is

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 16:40:37 +08:00
parent 2e87cb910c
commit cd14bb1963
25 changed files with 1097 additions and 437 deletions
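The commit message mentions convertToAnthropicFormat(), but its body is not part of the hunks shown here. The following is only a rough sketch of what an OpenAI-to-Anthropic history conversion can look like. The message shapes, the `Sketch` suffix, and the `safeParseJson` helper are assumptions made for illustration, not the repository's code, and real Anthropic thinking blocks carry extra fields that are omitted here.

```typescript
// Hypothetical, simplified shapes; the repository's real types may differ.
interface OpenAIToolCall {
  id: string
  type: 'function'
  function: { name: string; arguments: string }
}

interface OpenAIMessage {
  role: 'system' | 'user' | 'assistant' | 'tool'
  content: string
  reasoning?: string // unified reasoning field, as named in the commit message
  tool_calls?: OpenAIToolCall[]
  tool_call_id?: string
}

type AnthropicBlock =
  | { type: 'thinking'; thinking: string }
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string; input: unknown }
  | { type: 'tool_result'; tool_use_id: string; content: string }

interface AnthropicMessage {
  role: 'user' | 'assistant'
  content: AnthropicBlock[]
}

// Assumed helper: tool arguments arrive as a JSON string; fall back to {} on bad input.
function safeParseJson(raw: string): unknown {
  try {
    return JSON.parse(raw)
  } catch {
    return {}
  }
}

function convertToAnthropicFormatSketch(messages: OpenAIMessage[]): AnthropicMessage[] {
  const out: AnthropicMessage[] = []
  for (const msg of messages) {
    if (msg.role === 'system') continue // assumed to be passed separately
    if (msg.role === 'tool') {
      // Skip results without an ID so no empty tool_use_id is ever sent.
      if (!msg.tool_call_id) continue
      out.push({
        role: 'user',
        content: [{ type: 'tool_result', tool_use_id: msg.tool_call_id, content: msg.content }],
      })
      continue
    }
    if (msg.role === 'user') {
      out.push({ role: 'user', content: [{ type: 'text', text: msg.content }] })
      continue
    }
    // Assistant message: thinking first, then text, then tool calls.
    const blocks: AnthropicBlock[] = []
    if (msg.reasoning) blocks.push({ type: 'thinking', thinking: msg.reasoning })
    if (msg.content) blocks.push({ type: 'text', text: msg.content })
    for (const tc of msg.tool_calls ?? []) {
      if (!tc.id) continue
      blocks.push({
        type: 'tool_use',
        id: tc.id,
        name: tc.function.name,
        input: safeParseJson(tc.function.arguments),
      })
    }
    if (blocks.length > 0) out.push({ role: 'assistant', content: blocks })
  }
  return out
}
```

Only `id` and `function` are read from each tool call, so internal fields such as call_id and response_item_id never reach the output, which mirrors the cleanup the commit message describes.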
@@ -1,6 +1,6 @@
 {
  "name": "hermes-web-ui",
-  "version": "0.5.2",
+  "version": "0.5.3",
  "description": "Self-hosted AI chat dashboard for Hermes Agent — multi-model (Claude, GPT, Gemini, DeepSeek) web UI with Telegram, Discord, Slack, WhatsApp integration",
  "repository": {
    "type": "git",
@@ -134,12 +134,16 @@ export function startRunViaSocket(
  // All event handlers share the same cleanup logic
  const handleEvent = (event: RunEvent) => {
    if (closed) return
    try {
      onEvent(event)
    } finally {
      if (event.event === 'run.completed' || event.event === 'run.failed') {
        console.log('[startRunViaSocket] Run completed/failed, calling cleanup and onDone', event.event)
        cleanup()
        onDone()
      }
    }
  }
  function onRunStarted(data: RunEvent) {
    handleEvent(data)
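Reviewer note: the try/finally above is what keeps the streaming indicator from getting stuck when onEvent throws. A standalone sketch of the same pattern follows. `makeHandlerSketch`, `RunEventSketch`, and the local `closed` flag are illustrative assumptions; in the real file `closed` is presumably managed by the surrounding cleanup logic.

```typescript
// Hypothetical minimal event shape for illustration.
type RunEventSketch = { event: string }

function makeHandlerSketch(
  onEvent: (e: RunEventSketch) => void,
  cleanup: () => void,
  onDone: () => void,
): (e: RunEventSketch) => void {
  let closed = false // assumption: the sketch tracks closure itself
  return (event) => {
    if (closed) return
    try {
      onEvent(event)
    } finally {
      // Runs even when onEvent throws, so terminal events always release state.
      if (event.event === 'run.completed' || event.event === 'run.failed') {
        closed = true
        cleanup()
        onDone()
      }
    }
  }
}
```

The exception from onEvent still propagates to the caller; the finally block only guarantees that cleanup and onDone run first.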
@@ -119,8 +119,9 @@ onBeforeUnmount(() => {
 const thinkingDurationMs = computed<number | null>(() => {
  const ob = chatStore.getThinkingObservation(props.message.id);
  if (!ob?.startedAt) return null;
-  const end = ob.endedAt ?? (props.message.isStreaming ? nowTick.value : ob.startedAt);
-  return Math.max(0, end - ob.startedAt);
+  const startedAt = ob.startedAt!; // Non-null assertion after check
+  const end = ob?.endedAt ?? (props.message.isStreaming ? nowTick.value : startedAt);
+  return Math.max(0, end - startedAt);
 });
 function formatDuration(ms: number): string {
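Reviewer note: the duration logic in this hunk can be exercised outside the component. The sketch below restates it as a pure function; `ThinkingObservationSketch` and the explicit `now` parameter are assumptions standing in for the store observation and `nowTick.value`.

```typescript
// Hypothetical stand-in for the store's thinking observation.
interface ThinkingObservationSketch {
  startedAt?: number
  endedAt?: number
}

function thinkingDurationMsSketch(
  ob: ThinkingObservationSketch | undefined,
  isStreaming: boolean,
  now: number,
): number | null {
  if (!ob?.startedAt) return null
  const startedAt = ob.startedAt // narrowed to number by the check above
  // Finished: use endedAt. Still streaming: use the live clock. Otherwise: zero length.
  const end = ob.endedAt ?? (isStreaming ? now : startedAt)
  return Math.max(0, end - startedAt)
}
```

Binding `startedAt` to a local constant after the guard lets TypeScript narrow it without the non-null assertion used in the diff.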
@@ -762,6 +763,7 @@ const renderedToolResult = computed(() => {
  padding: 0 4px;
  border-radius: 3px;
  line-height: 14px;
  margin-left: 4px;
 }
 .tool-details {
@@ -18,6 +18,14 @@ function formatTokens(n: number): string {
  return String(n)
 }
 function formatToolDuration(seconds: number): string {
  if (seconds < 1) return `${Math.round(seconds * 1000)}ms`
  if (seconds < 60) return `${Math.round(seconds * 10) / 10}s`
  const mins = Math.floor(seconds / 60)
  const secs = Math.round(seconds % 60)
  return `${mins}m ${secs}s`
 }
 const displayMessages = computed(() =>
  chatStore.messages.filter((m) => m.role !== "tool"),
 );
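Reviewer note: formatToolDuration has three output bands: milliseconds, tenths of a second, and minutes with seconds. The block below copies the function from this hunk unchanged so the bands can be tried in isolation; the sample inputs are illustrative.

```typescript
// Copied from the hunk above so the rounding behavior can be checked standalone.
function formatToolDuration(seconds: number): string {
  if (seconds < 1) return `${Math.round(seconds * 1000)}ms`
  if (seconds < 60) return `${Math.round(seconds * 10) / 10}s`
  const mins = Math.floor(seconds / 60)
  const secs = Math.round(seconds % 60)
  return `${mins}m ${secs}s`
}

// Illustrative inputs, one per band.
for (const sample of [0.25, 1.272, 75]) {
  console.log(sample, '->', formatToolDuration(sample))
}
```

Two things may be worth a follow-up: 1.272 renders as `1.3s` rather than the `1.272s` quoted in the commit message, and values just below a minute (for example 59.96) round up to `60s` instead of rolling over to `1m 0s`.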
@@ -198,13 +206,52 @@ watch(currentToolCalls, () => {
            <span v-if="tc.toolPreview" class="tool-call-preview">{{
              tc.toolPreview
            }}</span>
            <span
              v-if="tc.toolDuration && tc.toolStatus !== 'running'"
              class="tool-call-duration"
              :title="$t('chat.executionDuration')"
            >{{ formatToolDuration(tc.toolDuration) }}</span
            >
            <svg
              v-if="tc.toolStatus === 'done'"
              width="16"
              height="16"
              viewBox="0 0 24 24"
              fill="none"
              class="tool-call-success-icon"
            >
              <circle cx="12" cy="12" r="10" fill="currentColor" fill-opacity="0.15"/>
              <path
                d="M8 12L11 15L16 9"
                stroke="currentColor"
                stroke-width="2"
                stroke-linecap="round"
                stroke-linejoin="round"
                fill="none"
              />
            </svg>
            <span
              v-if="tc.toolStatus === 'running'"
              class="tool-call-spinner"
            ></span>
-            <span v-if="tc.toolStatus === 'error'" class="tool-call-error">{{
-              t("chat.error")
-            }}</span>
+            <svg
+              v-if="tc.toolStatus === 'error'"
+              width="16"
+              height="16"
+              viewBox="0 0 24 24"
+              fill="none"
+              class="tool-call-error-icon"
+            >
+              <circle cx="12" cy="12" r="10" fill="currentColor" fill-opacity="0.15"/>
+              <path
+                d="M15 9L9 15M9 9L15 15"
+                stroke="currentColor"
+                stroke-width="2"
+                stroke-linecap="round"
+                stroke-linejoin="round"
+                fill="none"
+              />
+            </svg>
          </div>
        </div>
      </div>
@@ -334,13 +381,30 @@ watch(currentToolCalls, () => {
  flex-shrink: 0;
 }
-.tool-call-error {
-  font-size: 9px;
-  color: $error;
-  background: rgba($error, 0.08);
-  padding: 0 4px;
-  border-radius: 3px;
-  line-height: 14px;
+.tool-call-error-icon {
+  color: #ff4d4f;
+  flex-shrink: 0;
+  margin-left: 6px;
+  display: flex;
+  align-items: center;
+  justify-content: center;
 }
 .tool-call-duration {
  font-size: 10px;
  color: $text-muted;
  font-family: $font-code;
  margin-left: 4px;
  flex-shrink: 0;
 }
 .tool-call-success-icon {
  color: #52c41a;
  flex-shrink: 0;
  margin-left: 6px;
  display: flex;
  align-items: center;
  justify-content: center;
 }
@keyframes spin {
@@ -5,6 +5,29 @@ export interface ChangelogEntry {
 }
 export const changelog: ChangelogEntry[] = [
  {
    version: '0.5.3',
    date: '2026-04-30',
    changes: [
      'changelog.new_0_5_3_1',
      'changelog.new_0_5_3_2',
      'changelog.new_0_5_3_3',
      'changelog.new_0_5_3_4',
      'changelog.new_0_5_3_5',
    ],
  },
  {
    version: '0.5.2',
    date: '2026-04-29',
    changes: [
      'changelog.new_0_5_2_1',
      'changelog.new_0_5_2_2',
      'changelog.new_0_5_2_3',
      'changelog.new_0_5_2_4',
      'changelog.new_0_5_2_5',
      'changelog.new_0_5_2_6',
    ],
  },
  {
    version: '0.5.1',
    date: '2026-04-29',
@@ -131,7 +131,7 @@ export default {
    arguments: 'Argumente',
    result: 'Ergebnis',
    truncated: '... (abgeschnitten)',
-    thinkingLabel: 'Denkprozess',
+    executionDuration: 'Execution time',    thinkingLabel: 'Denkprozess',
    thinkingInProgress: 'Denkt…',
    thinkingShow: 'Denkprozess anzeigen',
    thinkingHide: 'Denkprozess ausblenden',
@@ -563,6 +563,17 @@ jobTriggered: 'Job ausgelost',
  // Anderungsprotokoll
  changelog: {
    new_0_5_3_1: 'Improve reasoning process display with persistence across page refreshes',
    new_0_5_3_2: 'Optimize stringified array format parsing to extract thinking/text/tool_calls',
    new_0_5_3_3: 'Improve log display by removing ellipsis and showing full content',
    new_0_5_3_4: 'Add detailed logging for format conversion and parsing',
    new_0_5_3_5: 'Optimize token calculation to accurately include tool results',
    new_0_5_2_1: 'Convert conversation history to Anthropic format before sending to Gateway',
    new_0_5_2_2: 'Add bidirectional reasoning sync between memory and database',
    new_0_5_2_3: 'Add message pagination with DESC query + array reverse for performance',
    new_0_5_2_4: 'Clean up debug code and unused imports',
    new_0_5_2_5: 'Remove auto-resumed event trigger to avoid timing issues',
    new_0_5_2_6: 'Use reasoning field consistently across codebase',
    new_0_5_1_1: 'Auto-sync Hermes history sessions on first startup',
    new_0_5_1_2: 'Fix session sync failure with old Hermes versions (backward compatible)',
    new_0_5_1_3: 'Smart cleanup of exclusive platform credentials on profile clone (Telegram, Discord, Slack, etc.)',
@@ -154,6 +154,7 @@ export default {
    arguments: 'Arguments',
    result: 'Result',
    truncated: '... (truncated)',
    executionDuration: 'Execution time',
    thinkingLabel: 'Thinking',
    thinkingInProgress: 'Thinking…',
    thinkingShow: 'Show thinking',
@@ -727,6 +728,17 @@ export default {
  // Changelog
  changelog: {
    new_0_5_3_1: 'Improve reasoning process display with persistence across page refreshes',
    new_0_5_3_2: 'Optimize stringified array format parsing to extract thinking/text/tool_calls',
    new_0_5_3_3: 'Improve log display by removing ellipsis and showing full content',
    new_0_5_3_4: 'Add detailed logging for format conversion and parsing',
    new_0_5_3_5: 'Optimize token calculation to accurately include tool results',
    new_0_5_2_1: 'Convert conversation history to Anthropic format before sending to Gateway',
    new_0_5_2_2: 'Add bidirectional reasoning sync between memory and database',
    new_0_5_2_3: 'Add message pagination with DESC query + array reverse for performance',
    new_0_5_2_4: 'Clean up debug code and unused imports',
    new_0_5_2_5: 'Remove auto-resumed event trigger to avoid timing issues',
    new_0_5_2_6: 'Use reasoning field consistently across codebase',
    new_0_5_1_1: 'Auto-sync Hermes history sessions on first startup',
    new_0_5_1_2: 'Fix session sync failure with old Hermes versions (backward compatible)',
    new_0_5_1_3: 'Smart cleanup of exclusive platform credentials on profile clone (Telegram, Discord, Slack, etc.)',
@@ -131,7 +131,7 @@ export default {
    arguments: 'Argumentos',
    result: 'Resultado',
    truncated: '... (truncado)',
-    thinkingLabel: 'Pensamiento',
+    executionDuration: 'Execution time',    thinkingLabel: 'Pensamiento',
    thinkingInProgress: 'Pensando…',
    thinkingShow: 'Mostrar pensamiento',
    thinkingHide: 'Ocultar pensamiento',
@@ -563,6 +563,17 @@ jobTriggered: 'Job ejecutado',
  // Registro de cambios
  changelog: {
    new_0_5_3_1: 'Improve reasoning process display with persistence across page refreshes',
    new_0_5_3_2: 'Optimize stringified array format parsing to extract thinking/text/tool_calls',
    new_0_5_3_3: 'Improve log display by removing ellipsis and showing full content',
    new_0_5_3_4: 'Add detailed logging for format conversion and parsing',
    new_0_5_3_5: 'Optimize token calculation to accurately include tool results',
    new_0_5_2_1: 'Convert conversation history to Anthropic format before sending to Gateway',
    new_0_5_2_2: 'Add bidirectional reasoning sync between memory and database',
    new_0_5_2_3: 'Add message pagination with DESC query + array reverse for performance',
    new_0_5_2_4: 'Clean up debug code and unused imports',
    new_0_5_2_5: 'Remove auto-resumed event trigger to avoid timing issues',
    new_0_5_2_6: 'Use reasoning field consistently across codebase',
    new_0_5_1_1: 'Auto-sync Hermes history sessions on first startup',
    new_0_5_1_2: 'Fix session sync failure with old Hermes versions (backward compatible)',
    new_0_5_1_3: 'Smart cleanup of exclusive platform credentials on profile clone (Telegram, Discord, Slack, etc.)',
@@ -131,7 +131,7 @@ export default {
    arguments: 'Arguments',
    result: 'Resultat',
    truncated: '... (tronque)',
-    thinkingLabel: 'Raisonnement',
+    executionDuration: 'Execution time',    thinkingLabel: 'Raisonnement',
    thinkingInProgress: 'En réflexion…',
    thinkingShow: 'Afficher le raisonnement',
    thinkingHide: 'Masquer le raisonnement',
@@ -563,6 +563,17 @@ jobTriggered: 'Job declenche',
  // Journal des modifications
  changelog: {
    new_0_5_3_1: 'Improve reasoning process display with persistence across page refreshes',
    new_0_5_3_2: 'Optimize stringified array format parsing to extract thinking/text/tool_calls',
    new_0_5_3_3: 'Improve log display by removing ellipsis and showing full content',
    new_0_5_3_4: 'Add detailed logging for format conversion and parsing',
    new_0_5_3_5: 'Optimize token calculation to accurately include tool results',
    new_0_5_2_1: 'Convert conversation history to Anthropic format before sending to Gateway',
    new_0_5_2_2: 'Add bidirectional reasoning sync between memory and database',
    new_0_5_2_3: 'Add message pagination with DESC query + array reverse for performance',
    new_0_5_2_4: 'Clean up debug code and unused imports',
    new_0_5_2_5: 'Remove auto-resumed event trigger to avoid timing issues',
    new_0_5_2_6: 'Use reasoning field consistently across codebase',
    new_0_5_1_1: 'Auto-sync Hermes history sessions on first startup',
    new_0_5_1_2: 'Fix session sync failure with old Hermes versions (backward compatible)',
    new_0_5_1_3: 'Smart cleanup of exclusive platform credentials on profile clone (Telegram, Discord, Slack, etc.)',
@@ -131,7 +131,7 @@ export default {
    arguments: '引数',
    result: '結果',
    truncated: '... (省略)',
-    thinkingLabel: '思考過程',
+    executionDuration: 'Execution time',    thinkingLabel: '思考過程',
    thinkingInProgress: '思考中…',
    thinkingShow: '思考過程を表示',
    thinkingHide: '思考過程を隠す',
@@ -563,6 +563,17 @@ export default {
  // 更新履歴
  changelog: {
    new_0_5_3_1: 'Improve reasoning process display with persistence across page refreshes',
    new_0_5_3_2: 'Optimize stringified array format parsing to extract thinking/text/tool_calls',
    new_0_5_3_3: 'Improve log display by removing ellipsis and showing full content',
    new_0_5_3_4: 'Add detailed logging for format conversion and parsing',
    new_0_5_3_5: 'Optimize token calculation to accurately include tool results',
    new_0_5_2_1: 'Convert conversation history to Anthropic format before sending to Gateway',
    new_0_5_2_2: 'Add bidirectional reasoning sync between memory and database',
    new_0_5_2_3: 'Add message pagination with DESC query + array reverse for performance',
    new_0_5_2_4: 'Clean up debug code and unused imports',
    new_0_5_2_5: 'Remove auto-resumed event trigger to avoid timing issues',
    new_0_5_2_6: 'Use reasoning field consistently across codebase',
    new_0_5_1_1: 'Auto-sync Hermes history sessions on first startup',
    new_0_5_1_2: 'Fix session sync failure with old Hermes versions (backward compatible)',
    new_0_5_1_3: 'Smart cleanup of exclusive platform credentials on profile clone (Telegram, Discord, Slack, etc.)',
@@ -131,7 +131,7 @@ export default {
    arguments: '인수',
    result: '결과',
    truncated: '... (잘림)',
-    thinkingLabel: '사고 과정',
+    executionDuration: 'Execution time',    thinkingLabel: '사고 과정',
    thinkingInProgress: '사고 중…',
    thinkingShow: '사고 과정 펼치기',
    thinkingHide: '사고 과정 접기',
@@ -563,6 +563,17 @@ export default {
  // 변경 이력
  changelog: {
    new_0_5_3_1: 'Improve reasoning process display with persistence across page refreshes',
    new_0_5_3_2: 'Optimize stringified array format parsing to extract thinking/text/tool_calls',
    new_0_5_3_3: 'Improve log display by removing ellipsis and showing full content',
    new_0_5_3_4: 'Add detailed logging for format conversion and parsing',
    new_0_5_3_5: 'Optimize token calculation to accurately include tool results',
    new_0_5_2_1: 'Convert conversation history to Anthropic format before sending to Gateway',
    new_0_5_2_2: 'Add bidirectional reasoning sync between memory and database',
    new_0_5_2_3: 'Add message pagination with DESC query + array reverse for performance',
    new_0_5_2_4: 'Clean up debug code and unused imports',
    new_0_5_2_5: 'Remove auto-resumed event trigger to avoid timing issues',
    new_0_5_2_6: 'Use reasoning field consistently across codebase',
    new_0_5_1_1: 'Auto-sync Hermes history sessions on first startup',
    new_0_5_1_2: 'Fix session sync failure with old Hermes versions (backward compatible)',
    new_0_5_1_3: 'Smart cleanup of exclusive platform credentials on profile clone (Telegram, Discord, Slack, etc.)',
@@ -131,7 +131,7 @@ export default {
    arguments: 'Argumentos',
    result: 'Resultado',
    truncated: '... (truncado)',
-    thinkingLabel: 'Raciocínio',
+    executionDuration: 'Execution time',    thinkingLabel: 'Raciocínio',
    thinkingInProgress: 'Pensando…',
    thinkingShow: 'Mostrar raciocínio',
    thinkingHide: 'Ocultar raciocínio',
@@ -563,6 +563,17 @@ jobTriggered: 'Job acionado',
  // Registro de alteracoes
  changelog: {
    new_0_5_3_1: 'Improve reasoning process display with persistence across page refreshes',
    new_0_5_3_2: 'Optimize stringified array format parsing to extract thinking/text/tool_calls',
    new_0_5_3_3: 'Improve log display by removing ellipsis and showing full content',
    new_0_5_3_4: 'Add detailed logging for format conversion and parsing',
    new_0_5_3_5: 'Optimize token calculation to accurately include tool results',
    new_0_5_2_1: 'Convert conversation history to Anthropic format before sending to Gateway',
    new_0_5_2_2: 'Add bidirectional reasoning sync between memory and database',
    new_0_5_2_3: 'Add message pagination with DESC query + array reverse for performance',
    new_0_5_2_4: 'Clean up debug code and unused imports',
    new_0_5_2_5: 'Remove auto-resumed event trigger to avoid timing issues',
    new_0_5_2_6: 'Use reasoning field consistently across codebase',
    new_0_5_1_1: 'Auto-sync Hermes history sessions on first startup',
    new_0_5_1_2: 'Fix session sync failure with old Hermes versions (backward compatible)',
    new_0_5_1_3: 'Smart cleanup of exclusive platform credentials on profile clone (Telegram, Discord, Slack, etc.)',
@@ -154,6 +154,7 @@ export default {
    arguments: '参数',
    result: '结果',
    truncated: '... (已截断)',
    executionDuration: '执行时长',
    thinkingLabel: '思考过程',
    thinkingInProgress: '思考中…',
    thinkingShow: '展开思考过程',
@@ -729,6 +730,17 @@ export default {
  // 更新日志
  changelog: {
    new_0_5_3_1: '改进思考过程显示，支持页面刷新后持久化',
    new_0_5_3_2: '优化字符串化数组格式解析，自动提取思考/文本/工具调用',
    new_0_5_3_3: '改进日志显示，移除省略号完整展示日志内容',
    new_0_5_3_4: '添加详细的格式转换和解析日志记录',
    new_0_5_3_5: '优化 token 计算，准确包含 tool 结果',
    new_0_5_2_1: '将对话历史转换为 Anthropic 格式后发送给 Gateway',
    new_0_5_2_2: '添加内存和数据库之间的双向思考过程同步',
    new_0_5_2_3: '添加消息分页功能（DESC 查询 + 数组反转，性能优化）',
    new_0_5_2_4: '清理调试代码和未使用的导入',
    new_0_5_2_5: '移除自动 resumed 事件触发，避免时序问题',
    new_0_5_2_6: '统一使用 reasoning 字段',
    new_0_5_1_1: '首次启动时自动同步 Hermes 历史会话',
    new_0_5_1_2: '修复旧版本 Hermes 会话同步失败问题（向后兼容）',
    new_0_5_1_3: 'Profile 克隆时智能清理独占平台凭据（Telegram、Discord、Slack 等）',
@@ -26,6 +26,7 @@ export interface Message {
  toolArgs?: string
  toolResult?: string
  toolStatus?: 'running' | 'done' | 'error'
  toolDuration?: number  // 工具执行时长（秒）
  isStreaming?: boolean
  attachments?: Attachment[]
  // 思考/推理文本。两条来源：
@@ -615,8 +616,10 @@ export const useChatStore = defineStore('chat', () => {
      // Helper to clean up this session's stream state
      const cleanup = () => {
        console.log('[sendMessage] cleanup called, deleting stream state for sid:', sid)
        streamStates.value.delete(sid)
        serverWorking.value.delete(sid)
        console.log('[sendMessage] cleanup done, isStreaming now:', isStreaming.value)
      }
      // Per-run flags used to detect silently-swallowed errors at run.completed.
@@ -765,7 +768,13 @@ export const useChatStore = defineStore('chat', () => {
              )
              if (toolMsgs.length > 0) {
                const last = toolMsgs[toolMsgs.length - 1]
-                updateMessage(sid, last.id, { toolStatus: 'done' })
+                // Check if tool errored
+                const hasError = (evt as any).error === true
+                const duration = (evt as any).duration
+                updateMessage(sid, last.id, {
+                  toolStatus: hasError ? 'error' : 'done',
+                  toolDuration: duration,
+                })
              }
              break
@@ -790,9 +799,29 @@ export const useChatStore = defineStore('chat', () => {
              // stream). If we never produced assistant text but the gateway
              // reports a non-empty output, fall back to rendering it as a
              // single assistant message so the user actually sees the reply.
              // Check if backend provided parsed content (from stringified array format)
              let finalOutputTrimmed = ''
              if ((evt as any).parsed_content !== undefined) {
                // Backend has parsed stringified array format, update last assistant message
                const msgs = getSessionMsgs(sid)
                const lastAssistant = [...msgs].reverse().find(m => m.role === 'assistant')
                if (lastAssistant) {
                  updateMessage(sid, lastAssistant.id, {
                    content: (evt as any).parsed_content || '',
                  })
                  if ((evt as any).parsed_reasoning) {
                    updateMessage(sid, lastAssistant.id, {
                      reasoning: (evt as any).parsed_reasoning,
                    })
                  }
                  finalOutputTrimmed = ((evt as any).parsed_content || '').trim()
                }
              } else {
                // Fallback to output field (legacy behavior)
                const finalOutput =
                  typeof evt.output === 'string' ? evt.output : ''
-              const finalOutputTrimmed = finalOutput.trim()
+                finalOutputTrimmed = finalOutput.trim()
                if (!runProducedAssistantText && finalOutputTrimmed !== '') {
                  addMessage(sid, {
                    id: uid(),
@@ -802,6 +831,7 @@ export const useChatStore = defineStore('chat', () => {
                  })
                  runProducedAssistantText = true
                }
              }
              // Workaround for upstream hermes-agent bug: when the agent
              // layer silently swallows an error (e.g. invalid API key,
              // unsupported model), the gateway still emits run.completed
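Reviewer note: the parsed_content branch above relies on the backend turning a stringified Python-style array into text, reasoning, and tool calls. The backend parser is not part of this hunk, so the following is only a sketch of the approach the commit message describes: strip one layer of outer quotes, swap single quotes for double quotes, and keep the original content when parsing fails. `parseStringifiedArraySketch` and `ParsedOutputSketch` are hypothetical names.

```typescript
// Hypothetical result shape, loosely mirroring parsed_content / parsed_reasoning / parsed_tool_calls.
interface ParsedOutputSketch {
  content: string
  reasoning: string
  toolCalls: { id: string; name: string; input: unknown }[]
}

function parseStringifiedArraySketch(raw: string): ParsedOutputSketch {
  // On any failure the original content is kept as-is.
  const fallback: ParsedOutputSketch = { content: raw, reasoning: '', toolCalls: [] }
  let text = raw.trim()
  // Double-serialized input: remove one layer of outer quotes.
  if (text.length >= 2 && text.startsWith('"') && text.endsWith('"')) {
    text = text.slice(1, -1)
  }
  if (!text.startsWith('[')) return fallback
  let blocks: unknown
  try {
    // Simple Python-to-JSON replacement; breaks on apostrophes inside values.
    blocks = JSON.parse(text.replace(/'/g, '"'))
  } catch {
    return fallback
  }
  if (!Array.isArray(blocks)) return fallback
  const result: ParsedOutputSketch = { content: '', reasoning: '', toolCalls: [] }
  for (const item of blocks) {
    if (typeof item !== 'object' || item === null) continue
    const block = item as Record<string, unknown>
    if (block.type === 'thinking' && typeof block.thinking === 'string') {
      result.reasoning += block.thinking
    } else if (block.type === 'text' && typeof block.text === 'string') {
      result.content += block.text
    } else if (block.type === 'tool_use' && typeof block.id === 'string' && typeof block.name === 'string') {
      result.toolCalls.push({ id: block.id, name: block.name, input: block.input })
    }
  }
  return result
}
```

The quote replacement is deliberately naive: a value containing an apostrophe makes JSON.parse throw, and the function then returns the raw string unchanged, which matches the "keep original content" behavior named in the final revert commit.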
@@ -875,6 +905,7 @@ export const useChatStore = defineStore('chat', () => {
        },
        // onDone
        () => {
          console.log('[sendMessage] onDone callback called, cleaning up stream state')
          const msgs = getSessionMsgs(sid)
          const last = msgs[msgs.length - 1]
          if (last?.isStreaming) {
@@ -1076,7 +1107,11 @@ export const useChatStore = defineStore('chat', () => {
          const msgs = getSessionMsgs(sid)
          const toolMsgs = msgs.filter(m => m.role === 'tool' && m.toolStatus === 'running')
          if (toolMsgs.length > 0) {
-            updateMessage(sid, toolMsgs[toolMsgs.length - 1].id, { toolStatus: 'done' })
+            const hasError = (evt as any).error === true
+            updateMessage(sid, toolMsgs[toolMsgs.length - 1].id, {
+              toolStatus: hasError ? 'error' : 'done',
+              toolDuration: (evt as any).duration,
+            })
          }
          break
@@ -1096,8 +1131,27 @@ export const useChatStore = defineStore('chat', () => {
              target.outputTokens = (evt as any).outputTokens
            }
          }
          // Check if backend provided parsed content (from stringified array format)
          let finalOutputTrimmed = ''
          if ((evt as any).parsed_content !== undefined) {
            // Backend has parsed stringified array format, update last assistant message
            const msgs = getSessionMsgs(sid)
            const lastAssistant = [...msgs].reverse().find(m => m.role === 'assistant')
            if (lastAssistant) {
              updateMessage(sid, lastAssistant.id, {
                content: (evt as any).parsed_content || '',
              })
              if ((evt as any).parsed_reasoning) {
                updateMessage(sid, lastAssistant.id, {
                  reasoning: (evt as any).parsed_reasoning,
                })
              }
              finalOutputTrimmed = ((evt as any).parsed_content || '').trim()
            }
          } else {
            // Fallback to output field (legacy behavior)
            const finalOutput = typeof evt.output === 'string' ? evt.output : ''
-          const finalOutputTrimmed = finalOutput.trim()
+            finalOutputTrimmed = finalOutput.trim()
            if (!runProducedAssistantText && finalOutputTrimmed !== '') {
              addMessage(sid, {
                id: uid(),
@@ -1106,6 +1160,7 @@ export const useChatStore = defineStore('chat', () => {
                timestamp: Date.now(),
              })
            }
          }
          const swallowedError = !runProducedAssistantText && !runHadToolActivity && finalOutputTrimmed === ''
          if (swallowedError) {
            addMessage(sid, {
@@ -261,9 +261,9 @@ onMounted(async () => {
 .log-message {
  color: $text-secondary;
-  overflow: hidden;
-  text-overflow: ellipsis;
-  white-space: nowrap;
+  overflow: visible;
+  white-space: normal;
+  word-break: break-word;
  min-width: 0;
 }
@@ -17,7 +17,6 @@ export async function create(ctx: any) {
  const { name, base_url, api_key, model, context_length, providerKey } = ctx.request.body as {
    name: string; base_url: string; api_key: string; model: string; context_length?: number; providerKey?: string | null
  }
-  console.log(name, base_url, api_key, model, providerKey)
  if (!name || !base_url || !model) {
    ctx.status = 400; ctx.body = { error: 'Missing name, base_url, or model' }; return
  }
@@ -402,3 +402,43 @@ export async function usageStats(ctx: any) {
    daily_usage: [...dayMap.values()],
  }
 }
 export async function getConversationMessagesPaginated(ctx: any) {
  const offset = ctx.query.offset ? parseInt(ctx.query.offset as string, 10) : 0
  const limit = ctx.query.limit ? parseInt(ctx.query.limit as string, 10) : 50
  if (useLocalSessionStore()) {
    const { getSessionDetailPaginated } = await import('../../db/hermes/session-store')
    const result = getSessionDetailPaginated(ctx.params.id, offset, limit)
    if (!result) {
      ctx.status = 404
      ctx.body = { error: 'Conversation not found' }
      return
    }
    ctx.body = {
      session: {
        id: result.session.id,
        source: result.session.source,
        model: result.session.model,
        title: result.session.title,
        started_at: result.session.started_at,
        ended_at: result.session.ended_at,
        last_active: result.session.last_active,
        message_count: result.session.message_count,
        input_tokens: result.session.input_tokens,
        output_tokens: result.session.output_tokens,
      },
      messages: result.messages,
      total: result.total,
      offset: result.offset,
      limit: result.limit,
      hasMore: result.hasMore,
    }
    return
  }
  ctx.status = 404
  ctx.body = { error: 'Conversation not found' }
 }
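Reviewer note: the handler above passes the parseInt results straight through, so a non-numeric or negative `offset`/`limit` would reach the store as NaN or a negative number. One possible guard is sketched below. `parsePaginationSketch` and its `maxLimit` cap are hypothetical and not part of this commit.

```typescript
// Hypothetical helper: normalizes query-string pagination parameters.
function parsePaginationSketch(
  query: { offset?: string; limit?: string },
  defaultLimit = 50, // matches the endpoint default in this hunk
  maxLimit = 500, // assumed cap, chosen to match the store's default limit
): { offset: number; limit: number } {
  const rawOffset = Number.parseInt(query.offset ?? '', 10)
  const rawLimit = Number.parseInt(query.limit ?? '', 10)
  // NaN and negative values fall back to safe defaults.
  const offset = Number.isFinite(rawOffset) && rawOffset > 0 ? rawOffset : 0
  const limit = Number.isFinite(rawLimit) && rawLimit > 0 ? Math.min(rawLimit, maxLimit) : defaultLimit
  return { offset, limit }
}
```

Note that the endpoint defaults to 50 messages while getSessionDetailPaginated defaults to 500; whether that difference is intended is not clear from the diff.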
@@ -300,6 +300,15 @@ export function searchSessions(profile: string, query: string, limit = 20): Herm
  })
 }
 export interface PaginatedSessionDetailResult {
  session: HermesSessionRow
  messages: HermesMessageRow[]
  total: number
  offset: number
  limit: number
  hasMore: boolean
 }
 export function getSessionDetail(id: string): HermesSessionDetailRow | null {
  if (!isSqliteAvailable()) return null
  const db = getDb()!
@@ -411,6 +420,45 @@ export function updateSessionStats(id: string): void {
  ).run(id, id, id)
 }
 export function getSessionDetailPaginated(
  id: string,
  offset = 0,
  limit = 500,
 ): PaginatedSessionDetailResult | null {
  if (!isSqliteAvailable()) {
    return null
  }
  const db = getDb()!
  // Get session info
  const sessionRow = db.prepare(`SELECT * FROM ${SESSIONS_TABLE} WHERE id = ?`).get(id) as Record<string, unknown> | undefined
  if (!sessionRow) return null
  // Get total message count
  const countResult = db.prepare(
    `SELECT COUNT(*) as total FROM ${MESSAGES_TABLE} WHERE session_id = ?`,
  ).get(id) as { total: number } | undefined
  const total = countResult?.total || 0
  // Get paginated messages (newest first from DB, then reverse)
  const msgRows = db.prepare(
    `SELECT * FROM ${MESSAGES_TABLE} WHERE session_id = ? ORDER BY timestamp DESC, id DESC LIMIT ? OFFSET ?`,
  ).all(id, limit, offset) as Record<string, unknown>[]
  const session = mapSessionRow(sessionRow)
  const messages = msgRows.map(mapMessageRow).reverse()  // Reverse to show oldest first
  return {
    session,
    messages,
    total,
    offset,
    limit,
    hasMore: offset + messages.length < total,
  }
 }
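Note on the pagination contract above: the query reads newest-first so that `offset = 0` always means "the most recent page", then the page is reversed so callers receive it oldest-first. The in-memory sketch below illustrates that behaviour without SQLite; `paginateNewestFirst` and `Page` are hypothetical names used only for illustration and are not part of this change.

```typescript
interface Page<T> {
  items: T[]
  total: number
  offset: number
  limit: number
  hasMore: boolean
}

// Input is assumed to be sorted oldest-first, like a chat transcript.
function paginateNewestFirst<T>(oldestFirst: T[], offset = 0, limit = 500): Page<T> {
  const total = oldestFirst.length
  // Emulates: ORDER BY timestamp DESC, id DESC LIMIT ? OFFSET ?
  const newestFirst = [...oldestFirst].reverse()
  const pageRows = newestFirst.slice(offset, offset + limit)
  // Flip the page back so it renders oldest-first.
  const items = pageRows.reverse()
  return { items, total, offset, limit, hasMore: offset + items.length < total }
}

const page0 = paginateNewestFirst([1, 2, 3, 4, 5], 0, 2)    // items: [4, 5], hasMore: true
const lastPage = paginateNewestFirst([1, 2, 3, 4, 5], 4, 2) // items: [1], hasMore: false
```

A client that wants older messages increases `offset` by the number of items it already holds and prepends each new page to its list.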
 // --- Session store mode ---
 import { config } from '../../config'
@@ -696,13 +696,14 @@ export async function getSessionDetailFromDbWithProfile(sessionId: string, profi
  }
 }
-export async function listSessionSummaries(source?: string, limit = 2000): Promise<HermesSessionRow[]> {
+export async function listSessionSummaries(source?: string, limit = 2000, profile?: string): Promise<HermesSessionRow[]> {
  if (!SQLITE_AVAILABLE) {
    throw new Error(`node:sqlite requires Node >= 22.5, current: ${process.versions.node}`)
  }
  const { DatabaseSync } = await import('node:sqlite')
-  const db = new DatabaseSync(sessionDbPath(), { open: true, readOnly: true })
+  const dbPath = profile ? `${getProfileDir(profile)}/state.db` : sessionDbPath()
  const db = new DatabaseSync(dbPath, { open: true, readOnly: true })
  try {
    const clauses = ["s.parent_session_id IS NULL", "s.source != 'tool'", "s.id NOT LIKE 'compress_%'"]
@@ -33,11 +33,15 @@ export function getDb(): DatabaseSync | null {
    mkdirSync(DB_DIR, { recursive: true })
    _db = new DatabaseSync(DB_PATH)
    // Journal mode: DELETE in dev, WAL otherwise (better concurrency; chosen with WSL compatibility in mind)
    if (isDev) {
      _db.exec('PRAGMA journal_mode=DELETE')
    } else {
      _db.exec('PRAGMA journal_mode=WAL')
      _db.exec('PRAGMA synchronous=NORMAL')
      _db.exec('PRAGMA busy_timeout=5000')
      _db.exec('PRAGMA foreign_keys=ON')
    }
  }
  return _db
 }
@@ -31,6 +31,7 @@ export interface ChatMessage {
  tool_calls?: Array<{ id: string; type: string; function: { name: string; arguments: string } }>
  tool_call_id?: string
  name?: string
  reasoning_content?: string | null
 }
 export interface CompressionConfig {
@@ -94,10 +95,6 @@ export function countTokensForModel(text: string, model: string): number {
  }
 }
 function estimateMessagesTokens(messages: ChatMessage[]): number {
  return messages.reduce((sum, m) => sum + countTokens(m.content), 0)
 }
 // ─── Prompts ────────────────────────────────────────────
 export const SUMMARY_PREFIX = `[CONTEXT COMPACTION — REFERENCE ONLY] Earlier turns were compacted
@@ -250,6 +247,43 @@ function serializeForSummary(messages: ChatMessage[]): string {
  return parts.join('\n\n')
 }
 /**
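 * Convert messages to a plain { role, content } conversation history for the LLM API.
 * Tool calls and tool results are flattened into text inside assistant messages:
 * tool results are truncated to 500 characters, tool arguments to 1000.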
 */
 function buildConversationHistory(messages: ChatMessage[]): Array<{ role: string; content: string }> {
  const result: Array<{ role: string; content: string }> = []
  for (const msg of messages) {
    if (msg.role === 'tool') {
      // Convert tool result to text and append to previous assistant message
      const toolText = `[Tool result: ${msg.name || 'unknown'}]\n${(msg.content || '').slice(0, 500)}${msg.content && msg.content.length > 500 ? '...' : ''}`
      // Find the last assistant message and append to it
      const lastAssistant = result.findLast(m => m.role === 'assistant')
      if (lastAssistant) {
        lastAssistant.content += `\n\n${toolText}`
      } else {
        // Fallback: create an assistant message
        result.push({ role: 'assistant', content: toolText })
      }
    } else if (msg.role === 'assistant' && msg.tool_calls?.length) {
      // Include tool calls in assistant message
      const toolsInfo = msg.tool_calls.map(tc => {
        let args = tc.function.arguments
        if (args.length > 1000) args = args.slice(0, 1000) + '...'
        return `[Calling tool: ${tc.function.name} with arguments: ${args}]`
      }).join('\n')
      const content = msg.content ? `${msg.content}\n\n${toolsInfo}` : toolsInfo
      result.push({ role: msg.role, content })
    } else if (msg.role === 'user' || msg.role === 'assistant' || msg.role === 'system') {
      result.push({ role: msg.role, content: msg.content || '' })
    }
    // Skip other roles
  }
  return result
 }
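Note on the flattening above: a condensed sketch of the same idea, showing what a short tool round-trip looks like once it is folded into text. The names `flattenToolTurns`, `truncate`, `Msg`, and `Flat` are hypothetical and exist only for this illustration. It walks backwards with a plain loop instead of `Array.prototype.findLast`, which is an assumption made here to avoid depending on an ES2023 lib target.

```typescript
type Role = 'system' | 'user' | 'assistant' | 'tool'
interface ToolCall { function: { name: string; arguments: string } }
interface Msg { role: Role; content: string; name?: string; tool_calls?: ToolCall[] }
interface Flat { role: string; content: string }

function truncate(text: string, max: number): string {
  return text.length > max ? text.slice(0, max) + '...' : text
}

function flattenToolTurns(messages: Msg[]): Flat[] {
  const out: Flat[] = []
  for (const msg of messages) {
    if (msg.role === 'tool') {
      const toolText = `[Tool result: ${msg.name || 'unknown'}]\n${truncate(msg.content, 500)}`
      // Find the most recent assistant entry and append the result to it.
      let target: Flat | undefined
      for (let i = out.length - 1; i >= 0; i--) {
        const candidate = out[i]
        if (candidate && candidate.role === 'assistant') { target = candidate; break }
      }
      if (target) target.content += `\n\n${toolText}`
      else out.push({ role: 'assistant', content: toolText })
    } else if (msg.role === 'assistant' && msg.tool_calls?.length) {
      const calls = msg.tool_calls
        .map(tc => `[Calling tool: ${tc.function.name} with arguments: ${truncate(tc.function.arguments, 1000)}]`)
        .join('\n')
      out.push({ role: 'assistant', content: msg.content ? `${msg.content}\n\n${calls}` : calls })
    } else {
      out.push({ role: msg.role, content: msg.content })
    }
  }
  return out
}

const flat = flattenToolTurns([
  { role: 'user', content: 'List files' },
  { role: 'assistant', content: '', tool_calls: [{ function: { name: 'ls', arguments: '{"path":"."}' } }] },
  { role: 'tool', name: 'ls', content: 'a.txt' },
])
// flat has two entries: the user turn, and one assistant turn holding both
// the "[Calling tool: ...]" line and the "[Tool result: ls]" text.
```

Folding tool output into the assistant turn keeps the history strictly alternating between text-only roles, which is presumably why the summarizer call can later filter down to `user` and `assistant` entries without losing tool context.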
 function pruneOldToolResults(messages: ChatMessage[], keepRecentCount: number): ChatMessage[] {
  if (messages.length <= keepRecentCount) return messages
@@ -337,7 +371,7 @@ async function callSummarizer(
        if (parsed.event === 'run.completed') {
          clearTimeout(timer)
          source.close()
-          deleteCompressSession(sessionId, profile).catch(() => {})
+          deleteCompressSession(sessionId, profile).catch(() => { })
          const output = parsed.output
          if (!output || typeof output !== 'string' || output.trim() === '') {
            reject(new Error('Empty summarization response'))
@@ -347,7 +381,7 @@ async function callSummarizer(
        } else if (parsed.event === 'run.failed') {
          clearTimeout(timer)
          source.close()
-          deleteCompressSession(sessionId, profile).catch(() => {})
+          deleteCompressSession(sessionId, profile).catch(() => { })
          reject(new Error(parsed.error || 'Summarization run failed'))
        }
      } catch { /* ignore parse errors */ }
@@ -356,7 +390,7 @@ async function callSummarizer(
    source.onerror = () => {
      clearTimeout(timer)
      source.close()
-      deleteCompressSession(sessionId, profile).catch(() => {})
+      deleteCompressSession(sessionId, profile).catch(() => { })
      reject(new Error('Summarization SSE connection error'))
    }
  })
@@ -402,11 +436,8 @@ export class ChatContextCompressor {
    upstream: string,
    apiKey: string | undefined,
    sessionId?: string,
    contextLength?: number,
    profile?: string,
  ): Promise<CompressedResult> {
    const cl = contextLength || 200_000
    const triggerTokens = Math.floor(cl / 2)
    const total = messages.length
    const makeMeta = (opts: Partial<CompressedResult['meta']> = {}): CompressedResult['meta'] => ({
@@ -419,60 +450,27 @@ export class ChatContextCompressor {
      ...opts,
    })
-    // ── Step 1: Check snapshot first ─────────────────────
+    // Check if we have a previous compression snapshot
    const snapshot = sessionId ? getCompressionSnapshot(sessionId) : null
    if (snapshot) {
-      const { summary: previousSummary, lastMessageIndex } = snapshot
+      // Has snapshot → incremental compress (merge old summary with new messages)
      const newMessages = messages.slice(lastMessageIndex + 1)
      const summaryTokens = countTokens(SUMMARY_PREFIX + previousSummary)
      const newTokens = estimateMessagesTokens(newMessages)
      const assembledTokens = summaryTokens + newTokens
      logger.info(
-        '[context-compressor] session=%s: snapshot at %d, %d new messages, assembled ~%d tokens (threshold %d)',
+        '[context-compressor] session=%s: incremental compress with snapshot at index %d',
-        sessionId, lastMessageIndex, newMessages.length, assembledTokens, triggerTokens,
+        sessionId, snapshot.lastMessageIndex,
      )
      // Under threshold → return summary + new messages, no LLM call
      if (assembledTokens <= triggerTokens) {
        const result: ChatMessage[] = [
          { role: 'system', content: SUMMARY_PREFIX + '\n\n' + previousSummary },
          ...newMessages,
        ]
        return {
          messages: result,
          meta: makeMeta({
            compressed: true,
            llmCompressed: false,
            summaryTokenEstimate: summaryTokens,
            verbatimCount: newMessages.length,
            compressedStartIndex: lastMessageIndex,
          }),
        }
      }
      // Over threshold → incremental LLM compress
      return this.incrementalCompress(
        messages, snapshot, upstream, apiKey, sessionId!, makeMeta(), profile,
      )
-    }
+    } else {
-
+      // No snapshot → full compress (compress all messages)
    // ── Step 2: No snapshot — check all messages ──────────
    const totalTokens = estimateMessagesTokens(messages)
      logger.info(
-      '[context-compressor] session=%s: no snapshot, %d messages, ~%d tokens (threshold %d)',
+        '[context-compressor] session=%s: full compress %d messages',
-      sessionId, total, totalTokens, triggerTokens,
+        sessionId, total,
      )
    if (totalTokens <= triggerTokens) {
      return { messages, meta: makeMeta() }
    }
    // Over threshold → full LLM compress
      return this.fullCompress(messages, upstream, apiKey, sessionId!, makeMeta(), profile)
    }
  }
  private async incrementalCompress(
    messages: ChatMessage[],
@@ -503,9 +501,7 @@ export class ChatContextCompressor {
    try {
      const contentToSummarize = serializeForSummary(toCompress)
      const prompt = buildIncrementalPrompt(previousSummary, contentToSummarize, this.config.summaryBudget)
-      const history = toCompress
+      const history = buildConversationHistory(toCompress)
        .filter(m => m.role === 'user' || m.role === 'assistant')
        .map(m => ({ role: m.role, content: m.content }))
      const t0 = Date.now()
      summary = await callSummarizer(upstream, apiKey, prompt, history, this.config.summarizationTimeoutMs, previousSummary, profile)
@@ -565,9 +561,7 @@ export class ChatContextCompressor {
    const contentToSummarize = serializeForSummary(toCompress)
    const prompt = buildFullPrompt(contentToSummarize, this.config.summaryBudget)
-    const history = toCompress
+    const history = buildConversationHistory(toCompress)
      .filter(m => m.role === 'user' || m.role === 'assistant')
      .map(m => ({ role: m.role, content: m.content }))
    let summary: string | null = null
    try {
@@ -5,6 +5,7 @@ export const sessionRoutes = new Router()
 sessionRoutes.get('/api/hermes/sessions/conversations', ctrl.listConversations)
 sessionRoutes.get('/api/hermes/sessions/conversations/:id/messages', ctrl.getConversationMessages)
 sessionRoutes.get('/api/hermes/sessions/conversations/:id/messages/paginated', ctrl.getConversationMessagesPaginated)
 sessionRoutes.get('/api/hermes/sessions', ctrl.list)
 sessionRoutes.get('/api/hermes/search/sessions', ctrl.search)
 sessionRoutes.get('/api/hermes/sessions/search', ctrl.search)
@@ -15,6 +15,7 @@ import { updateUsage } from '../../db/hermes/usage-store'
 import {
  getSession,
  getSessionDetail,
  getSessionDetailPaginated,
  createSession,
  addMessage,
  updateSessionStats,
@@ -29,6 +30,98 @@ import { logger } from '../logger'
 const compressor = new ChatContextCompressor()
 // --- Helper: Convert OpenAI format to Anthropic format ---
 function convertToAnthropicFormat(messages: any[]): any[] {
  const result: any[] = []
  for (const m of messages) {
    const role = m.role
    const content = m.content || ''
    if (role === 'assistant') {
      const blocks: any[] = []
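      // Add thinking block if reasoning is present. History messages carry it as
      // reasoning_content; resumed session messages use reasoning.
      const reasoning = m.reasoning_content || m.reasoning
      if (reasoning) {
        blocks.push({ type: 'thinking', thinking: reasoning })
      }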
      // Add text content
      if (content) {
        if (typeof content === 'string') {
          blocks.push({ type: 'text', text: content })
        } else if (Array.isArray(content)) {
          blocks.push(...content)
        }
      }
      // Add tool_use blocks
      if (m.tool_calls && Array.isArray(m.tool_calls)) {
        for (const tc of m.tool_calls) {
          if (tc.id && tc.function) {
            let args = tc.function.arguments || '{}'
            try {
              args = typeof args === 'string' ? JSON.parse(args) : args
            } catch {
              args = {}
            }
            blocks.push({
              type: 'tool_use',
              id: tc.id,
              name: tc.function.name,
              input: args
            })
          }
        }
      }
      // Handle empty content
      if (blocks.length === 0) {
        blocks.push({ type: 'text', text: '' })
      }
      result.push({ role: 'assistant', content: blocks })
      continue
    }
    if (role === 'tool') {
      // Convert tool message to tool_result in user message
      const toolContent = content || '(no output)'
      const toolResult = {
        type: 'tool_result',
        tool_use_id: m.tool_call_id || '',
        content: typeof toolContent === 'string' ? toolContent : JSON.stringify(toolContent)
      }
      // Merge with previous user message if it ends with tool_result
      if (
        result.length > 0 &&
        result[result.length - 1].role === 'user' &&
        Array.isArray(result[result.length - 1].content) &&
        result[result.length - 1].content.length > 0 &&
        result[result.length - 1].content[result[result.length - 1].content.length - 1].type === 'tool_result'
      ) {
        result[result.length - 1].content.push(toolResult)
      } else {
        result.push({ role: 'user', content: [toolResult] })
      }
      continue
    }
    // Regular user message
    if (role === 'user') {
      if (typeof content === 'string') {
        result.push({ role: 'user', content: content || '(empty message)' })
      } else if (Array.isArray(content)) {
        result.push({ role: 'user', content })
      }
      continue
    }
  }
  return result
 }
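Note on the tool_result merge above: consecutive OpenAI-style `tool` messages are grouped into a single `user` message holding several `tool_result` blocks, so the converted history does not contain back-to-back `user` turns. The sketch below isolates that rule; `appendToolResult`, `Block`, and `AnthropicMessage` are hypothetical names for illustration, and the block shapes are simplified assumptions rather than a full description of the Anthropic message schema.

```typescript
type Block =
  | { type: 'text'; text: string }
  | { type: 'tool_result'; tool_use_id: string; content: string }

type AnthropicMessage = { role: 'user' | 'assistant'; content: string | Block[] }

function appendToolResult(
  out: AnthropicMessage[],
  block: Extract<Block, { type: 'tool_result' }>,
): void {
  const last = out[out.length - 1]
  if (last && last.role === 'user' && Array.isArray(last.content)) {
    const tail = last.content[last.content.length - 1]
    // Only merge when the previous user message already ends with a tool_result.
    if (tail && tail.type === 'tool_result') {
      last.content.push(block)
      return
    }
  }
  out.push({ role: 'user', content: [block] })
}

const converted: AnthropicMessage[] = [
  { role: 'assistant', content: [{ type: 'text', text: 'Calling two tools' }] },
]
appendToolResult(converted, { type: 'tool_result', tool_use_id: 'call_1', content: 'ok' })
appendToolResult(converted, { type: 'tool_result', tool_use_id: 'call_2', content: 'ok' })
// converted now has two messages; the second is a user message with two tool_result blocks.
```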
 // --- Session state tracking ---
 interface SessionMessage {
@@ -113,14 +206,21 @@ export class ChatRunSocket {
      const sid = data.session_id
      const room = `session:${sid}`
      socket.join(room)
      this.resumeSession(socket, sid)
    })
    socket.on('abort', (data: { session_id?: string }) => {
      if (data.session_id) {
        this.handleAbort(socket, data.session_id)
      }
    })
  }
  private async resumeSession(socket: Socket, sid: string) {
    let state = this.sessionMap.get(sid)
      // Not in memory — load from DB
      if (!state) {
    try {
      const detail = useLocalSessionStore()
-            ? getSessionDetail(sid)
+        ? getSessionDetailPaginated(sid)
        : await getSessionDetailFromDb(sid)
      const messages = detail?.messages?.length
        ? detail.messages
@@ -131,31 +231,129 @@ export class ChatRunSocket {
              session_id: sid,
              role: m.role,
              content: m.content || '',
              reasoning: m.reasoning || '',
              timestamp: m.timestamp,
            }
-                if (m.tool_calls?.length) msg.tool_calls = m.tool_calls
+            // Convert Anthropic format content to OpenAI format
            // Check if content is a stringified array (Hermes Gateway behavior) - only for assistant messages
            if (m.role === 'assistant' && typeof m.content === 'string') {
              // Handle double-serialized content by stripping the outer quotes: '"[{...}]"' -> '[{...}]'
              let contentToParse = m.content
              const trimmed = m.content.trim()
              if (trimmed.startsWith('"') && trimmed.endsWith('"') && trimmed.length >= 2) {
                contentToParse = trimmed.slice(1, -1)
                logger.info('[chat-run-socket] resume message %s: double-serialized, removed outer quotes', m.id)
              }
              if (contentToParse.startsWith('[') && contentToParse.endsWith(']')) {
                try {
                  // Parse stringified Python-like array to JSON
                  const parsedContent = JSON.parse(
                    contentToParse
                      .replace(/'/g, '"')  // Python single quotes to JSON double quotes
                      .replace(/True/g, 'true')
                      .replace(/False/g, 'false')
                      .replace(/None/g, 'null')
                  )
                  if (Array.isArray(parsedContent)) {
                    const textBlocks: string[] = []
                    const toolCalls: any[] = []
                    let reasoningContent: string | null = null
                    for (const block of parsedContent) {
                      if (block.type === 'thinking') {
                        reasoningContent = block.thinking
                      } else if (block.type === 'text') {
                        textBlocks.push(block.text)
                      } else if (block.type === 'tool_use') {
                        toolCalls.push({
                          id: block.id,
                          type: 'function',
                          function: {
                            name: block.name,
                            arguments: JSON.stringify(block.input)
                          }
                        })
                      }
                    }
                    msg.content = textBlocks.join('') || ''
                    if (toolCalls.length > 0) {
                      msg.tool_calls = toolCalls
                    }
                    if (reasoningContent) {
                      msg.reasoning = reasoningContent
                    }
                  }
                } catch (e) {
                  // Parsing failed, keep original content
                  msg.content = m.content
                }
              }
            } else if (Array.isArray(m.content)) {
              const textBlocks: string[] = []
              const toolCalls: any[] = []
              let reasoningContent: string | null = null
              for (const block of m.content) {
                if (block.type === 'thinking') {
                  reasoningContent = block.thinking
                } else if (block.type === 'text') {
                  textBlocks.push(block.text)
                } else if (block.type === 'tool_use') {
                  toolCalls.push({
                    id: block.id,
                    type: 'function',
                    function: {
                      name: block.name,
                      arguments: JSON.stringify(block.input)
                    }
                  })
                }
              }
              msg.content = textBlocks.join('') || ''
              if (toolCalls.length > 0) {
                msg.tool_calls = toolCalls
              }
              if (reasoningContent) {
                msg.reasoning = reasoningContent
              }
            }
            if (m.tool_calls?.length) {
              // Filter out tool_calls with empty/invalid id and remove internal fields
              const cleanedToolCalls = m.tool_calls
                .filter((tc: any) => tc.id && tc.id.length > 0)
                .map((tc: any) => ({
                  id: tc.id,
                  type: tc.type,
                  function: tc.function
                }))
              if (cleanedToolCalls.length > 0) {
                msg.tool_calls = cleanedToolCalls
              }
            }
            // For tool messages, ensure tool_call_id exists
            if (m.role === 'tool') {
-                  if (m.tool_call_id) {
+              let callId = m.tool_call_id
-                    msg.tool_call_id = m.tool_call_id
+              if (!callId || callId.length === 0) {
                  } else {
                // Try to reconstruct tool_call_id from previous assistant message
                const prevMsg = arr[idx - 1]
                if (prevMsg?.role === 'assistant' && prevMsg.tool_calls?.length) {
                  // Find matching tool_call by tool_name
                  const tc = prevMsg.tool_calls.find((t: any) => t.function?.name === m.tool_name)
                  if (tc?.id) {
-                        msg.tool_call_id = tc.id
+                    callId = tc.id
-                      } else {
+                  }
-                        // Cannot reconstruct - skip this tool message
+                }
              }
              // Skip tool message if no valid tool_call_id
              if (!callId || callId.length === 0) {
                return null
              }
-                    } else {
+              msg.tool_call_id = callId
                      // No previous assistant message with tool_calls - skip
                      return null
                    }
                  }
            }
            if (m.tool_name) msg.tool_name = m.tool_name
@@ -164,7 +362,6 @@ export class ChatRunSocket {
          })
          .filter(m => m !== null)
        : []
      // Calculate context tokens — aware of compression snapshot
      let inputTokens: number
      const snapshot = getCompressionSnapshot(sid)
@@ -192,12 +389,105 @@ export class ChatRunSocket {
      state = { messages: [], isWorking: false, events: [] }
      this.sessionMap.set(sid, state)
    }
      }
    // Reply with messages, working status + events (if working)
    // Convert messages from internal storage format to OpenAI format for client
    const clientMessages = state.messages.map((m: any) => {
      const msg: any = { ...m }
      // Check if content is a stringified array (Hermes Gateway behavior) - only for assistant messages
      if (m.role === 'assistant' && typeof m.content === 'string') {
        // Handle double-serialized content: "[{'type': 'text', ...}]"
        let contentToParse = m.content
        const trimmed = m.content.trim()
        if (trimmed.startsWith('"') && trimmed.endsWith('"') && trimmed.length >= 2) {
          contentToParse = trimmed.slice(1, -1)
          logger.info('[chat-run-socket] resume message %s: double-serialized, removed outer quotes', m.id)
        }
        if (contentToParse.trim().startsWith('[') && contentToParse.trim().endsWith(']')) {
          try {
            // Parse stringified Python-like array to JSON
            const parsedContent = JSON.parse(
              contentToParse
                .replace(/'/g, '"')
                .replace(/True/g, 'true')
                .replace(/False/g, 'false')
                .replace(/None/g, 'null')
            )
            if (Array.isArray(parsedContent)) {
              const textBlocks: string[] = []
              const toolCalls: any[] = []
              let reasoningContent: string | null = null
              for (const block of parsedContent) {
                if (block.type === 'thinking') {
                  reasoningContent = block.thinking
                } else if (block.type === 'text') {
                  textBlocks.push(block.text)
                } else if (block.type === 'tool_use') {
                  toolCalls.push({
                    id: block.id,
                    type: 'function',
                    function: {
                      name: block.name,
                      arguments: JSON.stringify(block.input)
                    }
                  })
                }
              }
              msg.content = textBlocks.join('') || ''
              if (toolCalls.length > 0) {
                msg.tool_calls = toolCalls
              }
              if (reasoningContent) {
                msg.reasoning = reasoningContent
              }
            }
          } catch (e) {
            logger.error('[chat-run-socket] resume message %s: failed to parse content, error=%s, content=%s', m.id, (e as Error).message, contentToParse.substring(0, 200))
            // Parsing failed, keep original content
            msg.content = m.content
          }
        }
      } else if (Array.isArray(m.content)) {
        // If content is an array (Anthropic format), convert to OpenAI format
        const textBlocks: string[] = []
        const toolCalls: any[] = []
        let reasoningContent: string | null = null
        for (const block of m.content) {
          if (block.type === 'thinking') {
            reasoningContent = block.thinking
          } else if (block.type === 'text') {
            textBlocks.push(block.text)
          } else if (block.type === 'tool_use') {
            toolCalls.push({
              id: block.id,
              type: 'function',
              function: {
                name: block.name,
                arguments: JSON.stringify(block.input)
              }
            })
          }
        }
        msg.content = textBlocks.join('') || ''
        if (toolCalls.length > 0) {
          msg.tool_calls = toolCalls
        }
        if (reasoningContent) {
          msg.reasoning = reasoningContent
        }
      }
      return msg
    })
    socket.emit('resumed', {
      session_id: sid,
-        messages: state.messages,
+      messages: clientMessages,
      isWorking: state.isWorking,
      events: state.isWorking ? state.events : [],
      inputTokens: state.inputTokens,
@@ -206,15 +496,7 @@ export class ChatRunSocket {
    logger.info('[chat-run-socket] socket %s resumed session %s (working: %s, messages: %d)',
      socket.id, sid, state.isWorking, state.messages.length)
    })
    socket.on('abort', (data: { session_id?: string }) => {
      if (data.session_id) {
        this.handleAbort(data.session_id)
  }
    })
  }
  // --- Run handler ---
  private async handleRun(
@@ -301,32 +583,46 @@ export class ChatRunSocket {
              tool_calls?: any[]
              tool_call_id?: string
              name?: string
              reasoning_content?: string | null
            }> = (lastUserMsgIndex >= 0
              ? validMessages.slice(0, validMessages.length - lastUserMsgIndex - 1)
              : validMessages
            ).map((m, idx, arr) => {
-              const msg: any = { role: m.role, content: m.content || '' }
+              const msg: any = { role: m.role, content: m.content || 'empty message' }
-              if (m.tool_calls?.length) msg.tool_calls = m.tool_calls
+              if (m.reasoning_content) msg.reasoning_content = m.reasoning_content
              if (m.tool_calls?.length) {
                // Filter out tool_calls with empty/invalid id and remove internal fields
                const cleanedToolCalls = m.tool_calls
                  .filter((tc: any) => tc.id && tc.id.length > 0)
                  .map((tc: any) => ({
                    id: tc.id,
                    type: tc.type,
                    function: tc.function
                  }))
                if (cleanedToolCalls.length > 0) {
                  msg.tool_calls = cleanedToolCalls
                }
              }
              // For tool messages, ensure tool_call_id exists
              if (m.role === 'tool') {
-                if (m.tool_call_id) {
+                let callId = m.tool_call_id
-                  msg.tool_call_id = m.tool_call_id
+                if (!callId || callId.length === 0) {
                } else {
                  // Try to reconstruct tool_call_id from previous assistant message
                  const prevMsg = arr[idx - 1]
                  if (prevMsg?.role === 'assistant' && prevMsg.tool_calls?.length) {
                    const tc = prevMsg.tool_calls.find((t: any) => t.function?.name === m.tool_name)
                    if (tc?.id) {
-                      msg.tool_call_id = tc.id
+                      callId = tc.id
                    } else {
                      return null // Cannot reconstruct
                    }
                  } else {
                    return null // No assistant message to reconstruct from
                    }
                  }
                }
                // Skip tool message if no valid tool_call_id
                if (!callId || callId.length === 0) {
                  return null
                }
                msg.tool_call_id = callId
              }
              if (m.tool_name) msg.name = m.tool_name
              return msg
@@ -367,7 +663,7 @@ export class ChatRunSocket {
                try {
                  const result = await compressor.compress(
-                    history, upstream, apiKey, session_id, contextLength,
+                    history, upstream, apiKey, session_id,
                  )
                  const afterTokens = await this.calcAndUpdateUsage(session_id, cState, emit)
                  this.replaceState(session_id, 'compression.completed', {
@@ -397,13 +693,29 @@ export class ChatRunSocket {
                    compressedStartIndex: result.meta.compressedStartIndex,
                  })
-                  history = result.messages.map(m => ({
+                  history = result.messages.map(m => {
                    const msg: any = {
                      role: m.role,
                      content: m.content,
                    tool_calls: m.tool_calls,
                      tool_call_id: m.tool_call_id,
                      name: m.name,
                    }
                    if (m.reasoning_content) msg.reasoning_content = m.reasoning_content
                    // Filter tool_calls if present, remove internal fields
                    if (m.tool_calls?.length) {
                      const cleanedToolCalls = m.tool_calls
                        .filter((tc: any) => tc.id && tc.id.length > 0)
                        .map((tc: any) => ({
                          id: tc.id,
                          type: tc.type,
                          function: tc.function
                        }))
                      if (cleanedToolCalls.length > 0) {
                        msg.tool_calls = cleanedToolCalls
                      }
                    }
                    return msg
                  })
                  // Update usage from DB (snapshot now updated by compressor)
                  await this.calcAndUpdateUsage(session_id, cState, emit)
                } catch (err: any) {
@@ -457,7 +769,7 @@ export class ChatRunSocket {
                try {
                  const result = await compressor.compress(
-                    history, upstream, apiKey, session_id, contextLength,
+                    history, upstream, apiKey, session_id,
                  )
                  const cState = this.getOrCreateSession(session_id)
                  const afterTokens = await this.calcAndUpdateUsage(session_id, cState, emit)
@@ -488,13 +800,29 @@ export class ChatRunSocket {
                    compressedStartIndex: result.meta.compressedStartIndex,
                  })
-                  history = result.messages.map(m => ({
+                  history = result.messages.map(m => {
                    const msg: any = {
                      role: m.role,
                      content: m.content,
                    tool_calls: m.tool_calls,
                      tool_call_id: m.tool_call_id,
                      name: m.name,
                    }
                    if (m.reasoning_content) msg.reasoning_content = m.reasoning_content
                    // Filter tool_calls if present, remove internal fields
                    if (m.tool_calls?.length) {
                      const cleanedToolCalls = m.tool_calls
                        .filter((tc: any) => tc.id && tc.id.length > 0)
                        .map((tc: any) => ({
                          id: tc.id,
                          type: tc.type,
                          function: tc.function
                        }))
                      if (cleanedToolCalls.length > 0) {
                        msg.tool_calls = cleanedToolCalls
                      }
                    }
                    return msg
                  })
                  await this.calcAndUpdateUsage(session_id, cState, emit)
                } catch (err: any) {
                  this.replaceState(session_id, 'compression.completed', {
@@ -535,6 +863,16 @@ export class ChatRunSocket {
      const headers: Record<string, string> = { 'Content-Type': 'application/json' }
      if (apiKey) headers['Authorization'] = `Bearer ${apiKey}`
      // Debug: write history to JSON file for analysis (before conversion)
      // Convert conversation_history from OpenAI format to Anthropic format
      if (body.conversation_history && Array.isArray(body.conversation_history)) {
        body.conversation_history = convertToAnthropicFormat(body.conversation_history)
        logger.info('[chat-run-socket] converted conversation_history to Anthropic format for session %s: %d messages, content: %s',
          session_id || '(new)', body.conversation_history.length, JSON.stringify(body.conversation_history, null, 2))
      }
      const res = await fetch(`${upstream}/v1/runs`, {
        method: 'POST',
        headers,
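The body of `convertToAnthropicFormat` is not part of this hunk. As a rough sketch of what such a conversion might look like, based only on the commit message (thinking blocks from `reasoning_content`, `tool_use` and `tool_result` blocks, skipping empty tool call IDs): all type names and `toAnthropicSketch` are hypothetical, and skipping system messages is an assumption of this sketch, not confirmed behavior.

```typescript
// Hypothetical, simplified message shapes for illustration.
type OpenAIMessage = {
  role: 'system' | 'user' | 'assistant' | 'tool'
  content: string | null
  reasoning_content?: string
  tool_calls?: { id: string; type: string; function: { name: string; arguments: string } }[]
  tool_call_id?: string
}

type AnthropicBlock =
  | { type: 'text'; text: string }
  | { type: 'thinking'; thinking: string }
  | { type: 'tool_use'; id: string; name: string; input: unknown }
  | { type: 'tool_result'; tool_use_id: string; content: string }

type AnthropicMessage = { role: 'user' | 'assistant'; content: AnthropicBlock[] }

// Tool arguments arrive as a JSON string; fall back to {} if they do not parse.
function safeParseArgs(raw: string): unknown {
  try { return JSON.parse(raw) } catch { return {} }
}

function toAnthropicSketch(history: OpenAIMessage[]): AnthropicMessage[] {
  const out: AnthropicMessage[] = []
  for (const m of history) {
    if (m.role === 'assistant') {
      const blocks: AnthropicBlock[] = []
      if (m.reasoning_content) blocks.push({ type: 'thinking', thinking: m.reasoning_content })
      if (m.content) blocks.push({ type: 'text', text: m.content })
      for (const tc of m.tool_calls ?? []) {
        if (!tc.id) continue // skip calls without an id
        blocks.push({
          type: 'tool_use',
          id: tc.id,
          name: tc.function.name,
          input: safeParseArgs(tc.function.arguments),
        })
      }
      out.push({ role: 'assistant', content: blocks })
    } else if (m.role === 'tool') {
      if (!m.tool_call_id) continue // a result must reference a tool_use id
      out.push({
        role: 'user',
        content: [{ type: 'tool_result', tool_use_id: m.tool_call_id, content: m.content ?? '' }],
      })
    } else if (m.role === 'user') {
      out.push({ role: 'user', content: [{ type: 'text', text: m.content ?? '' }] })
    }
    // Assumption: system messages are handled elsewhere and are skipped here.
  }
  return out
}
```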
@@ -589,6 +927,12 @@ export class ChatRunSocket {
      source.onmessage = (event: MessageEvent) => {
        try {
          const parsed = JSON.parse(event.data as string)
          // Debug: log all events from upstream
          if (parsed.event?.includes('reasoning') || parsed.event?.includes('thinking')) {
            logger.info('[chat-run-socket] upstream event: %s, data: %j', parsed.event, parsed)
          } else {
            logger.info('[chat-run-socket] upstream event: %s', parsed.event)
          }
          // Track messages into sessionMap
          if (session_id) {
@@ -653,9 +997,15 @@ export class ChatRunSocket {
                  break
                }
                case 'run.completed': {
                  logger.info('[chat-run-socket] ENTER run.completed case, session_id: %s, messages: %d',
                    session_id, msgs.length)
                  if (last?.role === 'assistant' && last.finish_reason == null) {
                    last.finish_reason = parsed.finish_reason || 'stop'
                  }
                  // Debug: log run.completed to check if reasoning is included
                  logger.info('[chat-run-socket] run.completed keys: %s', Object.keys(parsed))
                  // Finalize assistant message — if no content was streamed, use output
                  if (parsed.output && !runProducedAssistantText(msgs)) {
                    if (last?.role === 'assistant') {
@@ -670,6 +1020,70 @@ export class ChatRunSocket {
                      })
                    }
                  }
                  // Parse stringified array content for all assistant messages
                  let parsedCount = 0
                  for (const msg of msgs) {
                    if (msg.role === 'assistant' && typeof msg.content === 'string' &&
                      msg.content.trim().startsWith('[') && msg.content.trim().endsWith(']')) {
                      try {
                        logger.info('[chat-run-socket] parsing array content for message %s, content preview: %s',
                          msg.id, msg.content.slice(0, 100))
                        const parsedContent = JSON.parse(
                          msg.content
                            // Python repr -> JSON; \b keeps words such as "TrueType" intact
                            .replace(/'/g, '"')
                            .replace(/\bTrue\b/g, 'true')
                            .replace(/\bFalse\b/g, 'false')
                            .replace(/\bNone\b/g, 'null')
                        )
                        if (Array.isArray(parsedContent)) {
                          const textBlocks: string[] = []
                          const toolCalls: any[] = []
                          let reasoningContent: string | null = null
                          for (const block of parsedContent) {
                            if (block.type === 'thinking') {
                              reasoningContent = block.thinking
                            } else if (block.type === 'text') {
                              textBlocks.push(block.text)
                            } else if (block.type === 'tool_use') {
                              toolCalls.push({
                                id: block.id,
                                type: 'function',
                                function: {
                                  name: block.name,
                                  arguments: JSON.stringify(block.input)
                                }
                              })
                            }
                          }
                          msg.content = textBlocks.join('')
                          if (toolCalls.length > 0) {
                            msg.tool_calls = toolCalls
                          }
                          if (reasoningContent) {
                            msg.reasoning = reasoningContent
                          }
                          parsedCount++
                        }
                      } catch (e) {
                        logger.error(e, '[chat-run-socket] failed to parse array content for message %s', msg.id)
                      }
                    }
                  }
                  logger.info('[chat-run-socket] EXIT run.completed case, parsed %d messages', parsedCount)
                  // Attach the last assistant message's parsed content to fix stringified array format
                  const lastAssistantMsg = msgs.filter((m: any) => m.role === 'assistant').pop()
                  if (lastAssistantMsg && parsedCount > 0) {
                    parsed.parsed_content = lastAssistantMsg.content || ''
                    parsed.parsed_tool_calls = lastAssistantMsg.tool_calls || null
                    parsed.parsed_reasoning = lastAssistantMsg.reasoning || null
                    logger.info('[chat-run-socket] attached parsed content to run.completed event for message %s', lastAssistantMsg.id)
                  }
                  break
                }
              }
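The regex-based rewrite in the hunk above also touches quotes and the words `True`, `False`, and `None` when they occur inside string values. One possible alternative, sketched here under the assumption that the upstream content is a Python `repr` of lists, dicts, strings, numbers, booleans, and `None`, is a single pass that tracks whether the cursor is inside a quoted string. `pythonLiteralToJson` is a hypothetical helper, not part of this PR, and it does not handle Python-only escapes such as `\x41`.

```typescript
/** Convert a simple Python literal (repr of list/dict/str/bool/None) to JSON text. */
function pythonLiteralToJson(src: string): string {
  let out = ''
  let i = 0
  while (i < src.length) {
    const ch = src[i]
    if (ch === "'" || ch === '"') {
      // Copy a quoted string and re-emit it with JSON double quotes.
      const quote = ch
      let body = ''
      i++
      while (i < src.length && src[i] !== quote) {
        if (src[i] === '\\' && i + 1 < src.length) {
          // \' is not a valid JSON escape, so unescape it; keep other escapes as-is.
          body += src[i + 1] === "'" ? "'" : src[i] + src[i + 1]
          i += 2
        } else {
          // A bare double quote inside a single-quoted string must be escaped.
          body += src[i] === '"' ? '\\"' : src[i]
          i++
        }
      }
      i++ // skip the closing quote
      out += '"' + body + '"'
    } else if (/[A-Za-z_]/.test(ch)) {
      // Read a bare word outside any string and map Python constants to JSON.
      let j = i
      while (j < src.length && /[A-Za-z0-9_]/.test(src[j])) j++
      const word = src.slice(i, j)
      out += word === 'True' ? 'true' : word === 'False' ? 'false' : word === 'None' ? 'null' : word
      i = j
    } else {
      out += ch
      i++
    }
  }
  return out
}
```

The result would still go through `JSON.parse` inside the existing `try`/`catch`, so malformed input keeps the original content as before.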
@@ -682,7 +1096,7 @@ export class ChatRunSocket {
          if (parsed.event === 'run.completed' || parsed.event === 'run.failed') {
            source.close()
-            if (session_id) this.markCompleted(session_id, { event: parsed.event, run_id: parsed.run_id })
+            if (session_id) this.markCompleted(socket, session_id, { event: parsed.event, run_id: parsed.run_id })
          }
        } catch { /* not JSON, skip */ }
      }
@@ -690,26 +1104,26 @@ export class ChatRunSocket {
      source.onerror = () => {
        source.close()
        emit('run.failed', { event: 'run.failed', error: 'EventSource connection lost' })
-        if (session_id) this.markCompleted(session_id, { event: 'run.failed' })
+        if (session_id) this.markCompleted(socket, session_id, { event: 'run.failed' })
      }
    } catch (err: any) {
      emit('run.failed', { event: 'run.failed', error: err.message })
-      if (session_id) this.markCompleted(session_id, { event: 'run.failed' })
+      if (session_id) this.markCompleted(socket, session_id, { event: 'run.failed' })
    }
  }
  // --- Abort handler ---
-  private handleAbort(sessionId: string) {
+  private handleAbort(socket: Socket, sessionId: string) {
    const state = this.sessionMap.get(sessionId)
    if (state?.isWorking && state.abortController) {
      state.abortController.abort()
-      this.markCompleted(sessionId, { event: 'run.failed', run_id: state.runId })
+      this.markCompleted(socket, sessionId, { event: 'run.failed', run_id: state.runId })
    }
  }
  /** Mark a session run as completed/failed so reconnecting clients get notified */
-  private markCompleted(sessionId: string, _info: { event: string; run_id?: string }) {
+  private markCompleted(socket: Socket, sessionId: string, _info: { event: string; run_id?: string }) {
    const state = this.sessionMap.get(sessionId)
    if (state) {
      state.isWorking = false
@@ -723,7 +1137,7 @@ export class ChatRunSocket {
        const prof = state.profile
        state.hermesSessionId = undefined
        state.profile = undefined
-        this.syncFromHermes(sessionId, hermesId, prof)
+        this.syncFromHermes(socket, sessionId, hermesId, prof)
      }
    }
  }
@@ -775,7 +1189,7 @@ export class ChatRunSocket {
   * and write to local DB. This gives us tool results that SSE events don't include.
   * After sync, enqueues the ephemeral session for deletion.
   */
-  private syncFromHermes(localSessionId: string, hermesSessionId: string, profile?: string) {
+  private syncFromHermes(socket: Socket, localSessionId: string, hermesSessionId: string, profile?: string) {
    getSessionDetailFromDb(hermesSessionId)
      .then((detail) => {
        if (!detail || !detail.messages?.length) {
@@ -800,6 +1214,40 @@ export class ChatRunSocket {
        }
        if (toInsert.length > 0) {
          // Get in-memory messages to preserve reasoning that was streamed via SSE
          const state = this.sessionMap.get(localSessionId)
          const memoryMessages = state?.messages || []
          logger.info('[chat-run-socket] syncFromHermes: memory has %d messages, DB has %d messages',
            memoryMessages.length, toInsert.length)
          // Match messages by order since Hermes DB and memory should have same sequence
          let memoryIdx = 0
          let mergedCount = 0
          for (let i = 0; i < toInsert.length && memoryIdx < memoryMessages.length; i++) {
            const dbMsg = toInsert[i]
            // Skip user messages in memory when matching
            while (memoryIdx < memoryMessages.length && memoryMessages[memoryIdx].role === 'user') {
              memoryIdx++
            }
            if (memoryIdx >= memoryMessages.length) break
            const memoryMsg = memoryMessages[memoryIdx]
            // Only merge if roles match
            if (dbMsg.role === memoryMsg.role) {
              // Merge reasoning from memory if DB doesn't have it
              if (!dbMsg.reasoning && memoryMsg.reasoning) {
                dbMsg.reasoning = memoryMsg.reasoning
                mergedCount++
                logger.info('[chat-run-socket] syncFromHermes: merged reasoning from memory to DB for %s message at index %d',
                  dbMsg.role, i)
              }
            }
            memoryIdx++
          }
          if (mergedCount > 0) {
            logger.info('[chat-run-socket] syncFromHermes: merged reasoning for %d messages', mergedCount)
          }
          for (const msg of toInsert) {
            // Resolve tool_name from assistant's tool_calls if missing
            let toolName = msg.tool_name || null
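The order-based reasoning merge in this hunk can be expressed as a standalone function, which makes the matching rule easier to test in isolation. This is a sketch that mirrors the loop above; `Msg` and `mergeReasoningByOrder` are hypothetical names. As in the hunk, the memory cursor advances on every DB message even when roles differ, so alignment can drift if the DB list itself contains user messages.

```typescript
// Hypothetical minimal message shape for illustration.
interface Msg {
  role: string
  reasoning?: string | null
}

/**
 * Copy reasoning from in-memory messages onto DB messages, matching by order.
 * User messages in memory are skipped. Returns the number of merged messages.
 * Mutates dbMessages in place.
 */
function mergeReasoningByOrder(dbMessages: Msg[], memoryMessages: Msg[]): number {
  let memoryIdx = 0
  let merged = 0
  for (const dbMsg of dbMessages) {
    // Skip user messages on the memory side.
    while (memoryIdx < memoryMessages.length && memoryMessages[memoryIdx].role === 'user') {
      memoryIdx++
    }
    if (memoryIdx >= memoryMessages.length) break
    const memoryMsg = memoryMessages[memoryIdx]
    // Only fill in reasoning when roles match and the DB copy has none.
    if (dbMsg.role === memoryMsg.role && !dbMsg.reasoning && memoryMsg.reasoning) {
      dbMsg.reasoning = memoryMsg.reasoning
      merged++
    }
    memoryIdx++
  }
  return merged
}
```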
@@ -816,7 +1264,7 @@ export class ChatRunSocket {
              timestamp: msg.timestamp || Math.floor(Date.now() / 1000),
              token_count: msg.token_count || null,
              finish_reason: msg.finish_reason || null,
-              reasoning: msg.reasoning || null,
+              reasoning: msg.reasoning || null,  // Now includes merged reasoning
              reasoning_details: msg.reasoning_details || null,
              reasoning_content: msg.reasoning_content || null,
              codex_reasoning_items: msg.codex_reasoning_items || null,
@@ -843,7 +1291,7 @@ export class ChatRunSocket {
        const state = this.sessionMap.get(localSessionId)
        if (state) {
          const emit = (event: string, payload: any) => {
-            this.nsp.to(`session:${localSessionId}`).emit(event, { ...payload, session_id: localSessionId })
+            socket.emit(event, { ...payload, session_id: localSessionId })
          }
          this.calcAndUpdateUsage(localSessionId, state, emit)
        }
@@ -871,6 +1319,7 @@ export class ChatRunSocket {
    } catch { /* best-effort */ }
  }
  /** Get or create session state in sessionMap */
  private getOrCreateSession(sessionId: string): SessionState {
    let state = this.sessionMap.get(sessionId)
@@ -2,16 +2,21 @@
 * Sync Hermes sessions from all profiles on startup.
 * Reads api_server sessions from Hermes state.db and imports into local DB.
 * Only runs when local DB is empty (first startup).
 *
 * Uses sessions-db.ts query logic to properly aggregate session chains.
 */
 import { readdirSync, existsSync } from 'fs'
 import { resolve, join } from 'path'
 import { homedir } from 'os'
 import { DatabaseSync } from 'node:sqlite'
 import { randomBytes } from 'crypto'
 import { getProfileDir } from './hermes-profile'
-import { createSession, addMessage, updateSession, getSession } from '../../db/hermes/session-store'
+import { createSession, addMessage, updateSession } from '../../db/hermes/session-store'
 import { getDb } from '../../db/index'
 import { logger } from '../logger'
 import { listSessionSummaries as listHermesSessionSummaries } from '../../db/hermes/sessions-db'
 const HERMES_BASE = resolve(homedir(), '.hermes')
 const PROFILES_DIR = join(HERMES_BASE, 'profiles')
 /**
 * Generate a UUID v4 without external dependencies
@@ -29,45 +34,6 @@ function generateUuid(): string {
  ].join('-')
 }
-const HERMES_BASE = resolve(homedir(), '.hermes')
-const PROFILES_DIR = join(HERMES_BASE, 'profiles')
-interface HermesSessionRow {
-  id: string
-  source: string
-  model: string
-  title: string | null
-  started_at: number
-  ended_at: number | null
-  end_reason: string | null
-  message_count: number
-  tool_call_count: number
-  input_tokens: number
-  output_tokens: number
-  cache_read_tokens: number
-  cache_write_tokens: number
-  reasoning_tokens: number
-  estimated_cost_usd: number
-  last_active: number
-}
-interface HermesMessageRow {
-  id: number | string
-  session_id: string
-  role: string
-  content: string
-  tool_call_id: string | null
-  tool_calls: any[] | null
-  tool_name: string | null
-  timestamp: number
-  token_count: number | null
-  finish_reason: string | null
-  reasoning: string | null
-  reasoning_details: string | null
-  reasoning_content: string | null
-  codex_reasoning_items: string | null
-}
 /**
 * Get all available profile names including 'default'
 */
@@ -85,73 +51,25 @@ function getAllProfiles(): string[] {
 }
 /**
- * Open Hermes state.db for a specific profile
+ * Sync api_server sessions from a single profile.
 * Uses sessions-db.ts query logic to properly aggregate session chains.
 */
-function openHermesStateDb(profile: string): DatabaseSync {
+async function syncProfileSessions(profile: string): Promise<{
  const profileDir = getProfileDir(profile)
  const dbPath = join(profileDir, 'state.db')
  if (!existsSync(dbPath)) {
    throw new Error(`Hermes state.db not found for profile '${profile}' at ${dbPath}`)
  }
  return new DatabaseSync(dbPath, { readOnly: true })
 }
 /**
 * Sync api_server sessions from a single profile
 */
 function syncProfileSessions(profile: string): {
  synced: number
  skipped: number
  errors: string[]
-} {
+}> {
-  const result = { synced: 0, skipped: 0, errors: [] as string[] }
+  const result = { synced: 0, errors: [] as string[] }
  try {
-    const db = openHermesStateDb(profile)
+    // Use listSessionSummaries to get aggregated session chains
    // This returns only root sessions with aggregated stats from the entire chain
    const summaries = await listHermesSessionSummaries('api_server', 10000, profile)
    logger.info(`[session-sync] profile '${profile}': found ${summaries.length} aggregated session chains`)
    for (const hermesSession of summaries) {
      try {
-      // Check if sessions table has estimated_cost_usd column
+        // Generate new session ID for local DB
      const tableInfo = db.prepare('PRAGMA table_info(sessions)').all() as Array<{ name: string }>
      const hasEstimatedCost = tableInfo.some(col => col.name === 'estimated_cost_usd')
      // Build SELECT query - only include estimated_cost_usd if column exists
      const estimatedCostCol = hasEstimatedCost ? ', COALESCE(estimated_cost_usd, 0) AS estimated_cost_usd' : ', 0 AS estimated_cost_usd'
      // Get all api_server sessions
      const sessions = db.prepare(`
        SELECT
          id,
          source,
          COALESCE(model, '') AS model,
          title,
          started_at,
          ended_at,
          end_reason,
          message_count,
          tool_call_count,
          input_tokens,
          output_tokens,
          cache_read_tokens,
          cache_write_tokens,
          reasoning_tokens${estimatedCostCol}
        FROM sessions
        WHERE source = 'api_server'
        ORDER BY started_at ASC
      `).all() as unknown as Omit<HermesSessionRow, 'preview' | 'last_active'>[]
      logger.info(`[session-sync] profile '${profile}': found ${sessions.length} api_server sessions`)
      for (const hermesSession of sessions) {
        try {
          // Check if this Hermes session ID already exists in local DB
          const existing = getSession(hermesSession.id)
          if (existing) {
            result.skipped++
            continue
          }
          // Generate new session ID
        const newSessionId = generateUuid()
        // Create session in local DB
@@ -162,30 +80,18 @@ function syncProfileSessions(profile: string): {
          title: hermesSession.title || undefined,
        })
-          // Get all messages for this session
+        // Get full detail including all messages from the session chain
-          const messages = db.prepare(`
+        const { getSessionDetailFromDbWithProfile } = await import('../../db/hermes/sessions-db')
-            SELECT
+        const detail = await getSessionDetailFromDbWithProfile(hermesSession.id, profile)
              id,
              session_id,
              role,
              content,
              tool_call_id,
              tool_calls,
              tool_name,
              timestamp,
              token_count,
              finish_reason,
              reasoning,
              reasoning_details,
              reasoning_content,
              codex_reasoning_items
            FROM messages
            WHERE session_id = ?
            ORDER BY timestamp, id
          `).all(hermesSession.id) as unknown as HermesMessageRow[]
-          // Insert all messages
+        if (!detail || !detail.messages) {
-          for (const msg of messages) {
+          result.errors.push(`session ${hermesSession.id}: failed to load messages`)
          logger.warn(`[session-sync] failed to load messages for session ${hermesSession.id}`)
          continue
        }
        // Insert all messages from the entire chain
        for (const msg of detail.messages) {
          addMessage({
            session_id: newSessionId,
            role: msg.role,
@@ -203,22 +109,7 @@ function syncProfileSessions(profile: string): {
          })
        }
-          // Generate preview from first user message
+        // Update session with aggregated stats from Hermes
          const firstUserMessage = messages.find(m => m.role === 'user' && m.content)
          let preview = ''
          if (firstUserMessage && firstUserMessage.content) {
            // Remove newlines, truncate to 63 chars
            preview = firstUserMessage.content
              .replace(/[\n\r]/g, ' ')
              .trim()
              .slice(0, 63)
          }
          // Update session with Hermes data
          const estimatedCost = typeof hermesSession.estimated_cost_usd === 'number'
            ? hermesSession.estimated_cost_usd
            : 0
        updateSession(newSessionId, {
          started_at: hermesSession.started_at,
          ended_at: hermesSession.ended_at,
@@ -228,21 +119,18 @@ function syncProfileSessions(profile: string): {
          cache_read_tokens: hermesSession.cache_read_tokens,
          cache_write_tokens: hermesSession.cache_write_tokens,
          reasoning_tokens: hermesSession.reasoning_tokens,
-            estimated_cost_usd: estimatedCost,
+          estimated_cost_usd: hermesSession.estimated_cost_usd,
-            last_active: hermesSession.started_at, // Use started_at as fallback since last_active doesn't exist in Hermes state.db
+          last_active: hermesSession.last_active,
-            preview,
+          preview: hermesSession.preview,
        })
        result.synced++
-          logger.info(`[session-sync] synced Hermes session ${hermesSession.id} -> ${newSessionId} (${messages.length} messages)`)
+        logger.info(`[session-sync] synced Hermes session ${hermesSession.id} -> ${newSessionId} (${detail.messages.length} messages, thread_session_count=${detail.thread_session_count})`)
      } catch (err: any) {
        result.errors.push(`session ${hermesSession.id}: ${err.message}`)
        logger.warn(err, `[session-sync] failed to sync session ${hermesSession.id}`)
      }
    }
    } finally {
      db.close()
    }
  } catch (err: any) {
    if (!err.message.includes('state.db not found')) {
      result.errors.push(err.message)
@@ -257,7 +145,7 @@ function syncProfileSessions(profile: string): {
 * Main entry point: sync all profiles on startup
 * Only runs if local DB is empty (first startup or after DB reset)
 */
-export function syncAllHermesSessionsOnStartup(): void {
+export async function syncAllHermesSessionsOnStartup(): Promise<void> {
  // Check if local DB has any sessions - only sync if completely empty
  const db = getDb()
  if (!db) {
@@ -279,13 +167,11 @@ export function syncAllHermesSessionsOnStartup(): void {
  logger.info(`[session-sync] found ${profiles.length} profiles: ${profiles.join(', ')}`)
  let totalSynced = 0
-  let totalSkipped = 0
  let totalErrors = 0
  for (const profile of profiles) {
-    const result = syncProfileSessions(profile)
+    const result = await syncProfileSessions(profile)
    totalSynced += result.synced
-    totalSkipped += result.skipped
    totalErrors += result.errors.length
    if (result.errors.length > 0) {
@@ -299,5 +185,5 @@ export function syncAllHermesSessionsOnStartup(): void {
    }
  }
-  logger.info(`[session-sync] sync complete: synced=${totalSynced}, skipped=${totalSkipped}, errors=${totalErrors}`)
+  logger.info(`[session-sync] sync complete: synced=${totalSynced}, errors=${totalErrors}`)
 }
@@ -49,7 +49,7 @@ export const PROVIDER_PRESETS: ProviderPreset[] = [
    value: 'deepseek',
    builtin: true,
    base_url: 'https://api.deepseek.com',
-    models: ['deepseek-chat', 'deepseek-reasoner'],
+    models: ['deepseek-v4-flash', 'deepseek-v4-pro'],
  },
  {
    label: 'Z.AI / GLM',