feat(chat): support real-time streaming and history display of thinking blocks (#191)

* feat: add file download with support for multiple terminal backends

  Implement file download on top of a FileProvider abstraction, supporting four backends: local, Docker, SSH, and Singularity. Main changes:
  - Add the FileProvider interface and four backend implementations (including SSH command-injection protection)
  - Add the GET /api/hermes/download route (with MIME type detection)
  - Frontend: intercept Markdown file links for download, plus an attachment download button
  - Chinese and English i18n translations
  - Update README, CLAUDE.md, and the design docs

* feat: add file browser and download, supporting directory browsing, file editing, preview, and upload

* build: add prepare script so 'npm install git+url' auto-builds dist/

  Allows installing this package directly from git without a pre-built dist/. When cloned via npm, prepare runs 'npm run build' if dist/ is missing, producing the artifacts declared in the files[] field before packing.

* fix: use clipboard fallback for non-secure HTTP contexts

  navigator.clipboard is undefined on HTTP intranet deployments (only available in secure contexts). The previous synchronous calls threw silently and the success toast still fired, making 'copy' actions appear broken.
  - Add packages/client/src/utils/clipboard.ts with execCommand fallback via a hidden textarea
  - Use the helper in FileContextMenu (copy file path), CodexLoginModal (copy user code), NousLoginModal (copy user code), ChatPanel (copy session id)
  - Each call now awaits the result and shows success/failure based on the actual outcome

* i18n: backfill files/download translations for de, es, fr, ja, ko, pt

  Add nav.files, files.* (39 keys), and download.* (9 keys) so the file browser UI is fully localized in these six locales instead of falling back to English.

* fix(files): close preview when navigating or affected file changes

  Opening a preview and then navigating directories, deleting the previewed file, or renaming it left the preview pane stuck on stale content because previewFile was never cleared.
  - stores/hermes/files.ts:
    - fetchEntries clears previewFile on path change (in-place refresh keeps the preview).
    - deleteEntry / renameEntry clear preview/editor state when the affected entry matches the previewed/edited file or its parent.
    - Add isAffected(target, changed, isDir) helper.
  - components/hermes/files/FilePreview.vue: replace the misleading common.cancel close button with a dedicated files.closePreview key plus an X icon and quaternary style.
  - i18n: add files.closePreview to all 8 locales.

* chore: remove plan and design docs for completed features

  The file browser and file download features have both been merged upstream, so their development plans and design drafts no longer need to be kept in the fork:
  - plans/2025-07-20-file-browser.md
  - plans/2026-04-20-file-download.md
  - specs/2025-07-20-file-browser-design.md
  - specs/2026-04-20-file-download-design.md

  After the cleanup, this fork is fully aligned with upstream/main at the code level.

* docs: add design draft for separating and collapsing thinking blocks (#164)

  For upstream issue #164, design how <think>/<thinking>/<reasoning> tags in assistant messages are detected, separated, and shown in a collapsible block. Key decisions (revised after rubber-duck review):
  - Do not modify Message.content or persisted fields, so localStorage stays forward compatible
  - Derive the duration summary purely at runtime (a Map inside the store), avoiding loss on refresh/reconnect
  - Implement code-block protection in the first version to avoid false detection
  - When the stream ends, downgrade an unclosed tag to body text so the answer is not swallowed
  - Keep the parse computed separate from the duration interval to avoid performance risk
  - Place the parser in packages/client/src/utils/ to avoid a reverse dependency
  - Explicitly do not support same-name nesting (rare case, documented)

* docs: add implementation plan for separating and collapsing thinking blocks (#164)

  12-task TDD plan:
  - Tasks 1-7: utils/thinking-parser.ts pure-function module + unit tests
  - Tasks 8-9: wire the chat store thinkingObservation Map into SSE
  - Task 10: add 6 i18n keys across 8 languages
  - Task 11: MessageItem.vue renders the collapsible UI + SCSS
  - Task 12: build / test / manual verification / push

* feat(thinking-parser): split out the first closed <think> tag

* test(thinking-parser): cover multiple segments, tag variants, letter case, and empty input

* test(thinking-parser): streaming pending state and terminal-state downgrade

* feat(thinking-parser): code-block protection to avoid misdetecting pseudo-tags

* test(thinking-parser): same-name nesting and chunk-boundary behavior

* feat(thinking-parser): countThinkingChars helper

* feat(thinking-parser): detectThinkingBoundary boundary detection

* feat(chat-store): add the thinkingObservation runtime Map

* feat(chat-store): message.delta records thinking boundaries + switchSession cleanup

* i18n: add 6 thinking-block keys (8 languages)

* feat(chat): MessageItem renders the collapsible thinking area

  - Reuse the tool-line style chevron
  - Two reactive chains: parse computed + duration interval
  - Force-expand while streaming with pending content
  - show_reasoning controls the default state

* feat(chat): support real-time streaming and history display of thinking blocks

  - Extend the Message interface with a reasoning field; mapHermesMessages passes through the thinking content of historical sessions from HermesMessage.reasoning.
  - Add a text field to the RunEvent type; the chat store handles three new SSE events: reasoning.delta / thinking.delta / reasoning.available.
  - Thinking-duration observation: record the start timestamp only while reasoning.delta accumulates, and the end timestamp on reasoning.available; show no duration when there were no live deltas.
  - MessageItem renders from two sources (the reasoning field takes priority, <think> tags are the fallback) and shows the elapsed time only when duration > 0.
  - Add 3 unit tests covering the three SSE events; tests pass 32/32.

* fix(chat): reasoning block no longer briefly shows the body text

  Root cause: upstream hermes-agent run_agent.py:11275 uses assistant content[:500] as the preview payload of reasoning.available at the end of every model response. The Web UI therefore wrote the body text into last.reasoning, and the thinking block briefly showed the body until session polling/refresh read the correct reasoning field back from the session DB.

  Fix:
  - The reasoning.available event no longer writes to last.reasoning; it only marks the end of timing (noteReasoningEnd). Real reasoning comes from reasoning.delta or the session DB
  - Add scrubBuggyReasoningInCache: during hydration, heal assistant messages in localStorage that were already polluted (drop reasoning when it equals content or is a prefix of it)
  - Both cache-loading entry points (loadSessions / switchSession) now go through the scrubber

  Tests: add 4 unit tests; the full suite passes 280/280.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-25 08:46:50 +08:00
parent c1e72942ad
commit 369001824e
18 changed files with 2369 additions and 8 deletions
@@ -23,6 +23,8 @@ export interface RunEvent {
  event: string
  run_id?: string
  delta?: string
+  /** Payload text for `reasoning.delta` / `thinking.delta` / `reasoning.available` events. */
+  text?: string
  tool?: string
  name?: string
  preview?: string
@@ -1,10 +1,13 @@
 <script setup lang="ts">
 import type { Message } from "@/stores/hermes/chat";
-import { computed, ref } from "vue";
+import { computed, onBeforeUnmount, ref, watchEffect } from "vue";
 import { useI18n } from "vue-i18n";
 import { useMessage } from "naive-ui";
 import { downloadFile } from "@/api/hermes/download";
 import MarkdownRenderer from "./MarkdownRenderer.vue";
+import { parseThinking, countThinkingChars } from "@/utils/thinking-parser";
+import { useChatStore } from "@/stores/hermes/chat";
+import { useSettingsStore } from "@/stores/hermes/settings";
 import {
  copyTextToClipboard,
  handleCodeBlockCopyClick,
@@ -20,6 +23,94 @@ const toast = useMessage();
 const isSystem = computed(() => props.message.role === "system");
 const toolExpanded = ref(false);

+const chatStore = useChatStore();
+const settingsStore = useSettingsStore();
+
+const parsedThinking = computed(() =>
+  parseThinking(props.message.content || "", { streaming: !!props.message.isStreaming }),
+);
+
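+// Prefer thinking text from the reasoning field/events; otherwise fall back to <think> tags parsed from content.
+// If both are present, concatenate them for display (rare, but no information is lost).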
+const hasReasoningField = computed(() => !!(props.message.reasoning && props.message.reasoning.length > 0));
+
+const hasThinking = computed(() => hasReasoningField.value || parsedThinking.value.hasThinking);
+
+const thinkingFullText = computed(() => {
+  const parts: string[] = [];
+  if (props.message.reasoning) parts.push(props.message.reasoning);
+  parts.push(...parsedThinking.value.segments);
+  if (parsedThinking.value.pending) parts.push(parsedThinking.value.pending);
+  return parts.join("\n\n");
+});
+
+const thinkingCharCount = computed(() => {
+  let count = countThinkingChars(parsedThinking.value);
+  if (props.message.reasoning) count += props.message.reasoning.length;
+  return count;
+});
+
+// Streaming-thinking state: an unclosed <think> tag remains, or reasoning has content but the body has not started yet.
+const thinkingStreamingNow = computed(() => {
+  if (!props.message.isStreaming) return false;
+  if (parsedThinking.value.pending !== null) return true;
+  if (hasReasoningField.value && !props.message.content) return true;
+  return false;
+});
+
+const thinkingOverride = ref<boolean | null>(null);
+
+const thinkingExpanded = computed(() => {
+  if (thinkingStreamingNow.value) return true;
+  if (thinkingOverride.value !== null) return thinkingOverride.value;
+  return !!settingsStore.display.show_reasoning;
+});
+
+function toggleThinking() {
+  thinkingOverride.value = !thinkingExpanded.value;
+}
+
+const nowTick = ref(Date.now());
+let tickTimer: number | null = null;
+
+function ensureTick() {
+  const ob = chatStore.getThinkingObservation(props.message.id);
+  const shouldTick = !!(
+    props.message.isStreaming &&
+    ob?.startedAt !== undefined &&
+    ob.endedAt === undefined
+  );
+  if (shouldTick && tickTimer === null) {
+    tickTimer = window.setInterval(() => {
+      nowTick.value = Date.now();
+    }, 1000);
+  } else if (!shouldTick && tickTimer !== null) {
+    window.clearInterval(tickTimer);
+    tickTimer = null;
+  }
+}
+
+watchEffect(ensureTick);
+
+onBeforeUnmount(() => {
+  if (tickTimer !== null) window.clearInterval(tickTimer);
+});
+
+const thinkingDurationMs = computed<number | null>(() => {
+  const ob = chatStore.getThinkingObservation(props.message.id);
+  if (!ob?.startedAt) return null;
+  const end = ob.endedAt ?? (props.message.isStreaming ? nowTick.value : ob.startedAt);
+  return Math.max(0, end - ob.startedAt);
+});
+
+function formatDuration(ms: number): string {
+  const s = Math.floor(ms / 1000);
+  if (s < 60) return `${s}s`;
+  const m = Math.floor(s / 60);
+  const r = s % 60;
+  return r === 0 ? `${m}m` : `${m}m ${r}s`;
+}
+
 const timeStr = computed(() => {
  const d = new Date(props.message.timestamp);
  return d.toLocaleTimeString([], { hour: "2-digit", minute: "2-digit" });
@@ -275,9 +366,46 @@ const renderedToolResult = computed(() => {
                </template>
              </div>
            </div>
+            <div
+              v-if="hasThinking"
+              class="thinking-block"
+              :class="{ expanded: thinkingExpanded }"
+            >
+              <div class="thinking-header" @click="toggleThinking">
+                <svg
+                  width="10"
+                  height="10"
+                  viewBox="0 0 24 24"
+                  fill="none"
+                  stroke="currentColor"
+                  stroke-width="2"
+                  class="thinking-chevron"
+                  :class="{ rotated: thinkingExpanded }"
+                >
+                  <polyline points="9 18 15 12 9 6" />
+                </svg>
+                <span class="thinking-icon">💭</span>
+                <span class="thinking-label">
+                  {{
+                    thinkingStreamingNow
+                      ? t('chat.thinkingInProgress')
+                      : t('chat.thinkingLabel')
+                  }}
+                </span>
+                <span v-if="thinkingDurationMs !== null && thinkingDurationMs > 0" class="thinking-meta">
+                  · {{ t('chat.thinkingDuration', { duration: formatDuration(thinkingDurationMs) }) }}
+                </span>
+                <span class="thinking-meta">
+                  · {{ t('chat.thinkingChars', { count: thinkingCharCount }) }}
+                </span>
+              </div>
+              <div v-if="thinkingExpanded" class="thinking-body">
+                <MarkdownRenderer :content="thinkingFullText" />
+              </div>
+            </div>
            <MarkdownRenderer
-              v-if="message.content"
-              :content="message.content"
+              v-if="parsedThinking.body"
+              :content="parsedThinking.body"
            />

            <span v-if="message.isStreaming && !message.content" class="streaming-dots">
@@ -427,6 +555,63 @@ const renderedToolResult = computed(() => {
  }
 }

+.thinking-block {
+  margin-bottom: 8px;
+  padding: 4px 0;
+  border-bottom: 1px dashed $border-light;
+
+  .thinking-header {
+    display: flex;
+    align-items: center;
+    gap: 6px;
+    font-size: 11px;
+    color: $text-muted;
+    cursor: pointer;
+    padding: 2px 4px;
+    border-radius: $radius-sm;
+    user-select: none;
+
+    &:hover {
+      background: rgba(0, 0, 0, 0.03);
+    }
+  }
+
+  .thinking-chevron {
+    flex-shrink: 0;
+    transition: transform 0.15s ease;
+
+    &.rotated {
+      transform: rotate(90deg);
+    }
+  }
+
+  .thinking-icon {
+    font-size: 11px;
+    flex-shrink: 0;
+  }
+
+  .thinking-label {
+    font-weight: 500;
+    flex-shrink: 0;
+  }
+
+  .thinking-meta {
+    color: $text-muted;
+    font-variant-numeric: tabular-nums;
+  }
+
+  .thinking-body {
+    margin-top: 6px;
+    padding: 6px 10px;
+    border-left: 2px solid $border-light;
+    font-size: 13px;
+    opacity: 0.85;
+    font-style: italic;
+
+    :deep(p) { margin: 0.3em 0; }
+  }
+}
+
 .message-time {
  font-size: 11px;
  color: $text-muted;
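
The duration shown in the thinking header comes from an observation record plus a ticking clock. The following is a minimal standalone sketch of that derivation, assuming a hypothetical `observedMs` helper and `label` wrapper that do not exist in the component (the real logic lives in the `thinkingDurationMs` computed); `formatDuration` is copied from the hunk above.

```typescript
// Hypothetical observation shape, mirroring the store's Map values.
interface Observation {
  startedAt?: number;
  endedAt?: number;
}

// Hypothetical pure version of the thinkingDurationMs computed.
function observedMs(
  ob: Observation | undefined,
  isStreaming: boolean,
  now: number,
): number | null {
  if (!ob?.startedAt) return null; // nothing observed, so no duration
  // Closed observation: fixed end. Still streaming: use the clock.
  // Neither: collapse to zero so the header hides the duration.
  const end = ob.endedAt ?? (isStreaming ? now : ob.startedAt);
  return Math.max(0, end - ob.startedAt);
}

function formatDuration(ms: number): string {
  const s = Math.floor(ms / 1000);
  if (s < 60) return `${s}s`;
  const m = Math.floor(s / 60);
  const r = s % 60;
  return r === 0 ? `${m}m` : `${m}m ${r}s`;
}

// Hypothetical wrapper applying the template's "> 0" display rule.
function label(ms: number | null): string {
  return ms !== null && ms > 0 ? formatDuration(ms) : "(hidden)";
}

const live = observedMs({ startedAt: 1000 }, true, 8500);
const done = observedMs({ startedAt: 1000, endedAt: 126000 }, false, 999999);
const stale = observedMs({ startedAt: 1000 }, false, 8500);
const none = observedMs(undefined, true, 8500);

console.log(label(live)); // "7s"
console.log(label(done)); // "2m 5s"
console.log(label(stale)); // "(hidden)"
console.log(label(none)); // "(hidden)"
```

The `stale` case is why the template checks `thinkingDurationMs > 0`: a message that stopped streaming without a recorded end yields zero rather than a misleading elapsed time.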
@@ -131,6 +131,12 @@ export default {
    arguments: 'Argumente',
    result: 'Ergebnis',
    truncated: '... (abgeschnitten)',
+    thinkingLabel: 'Denkprozess',
+    thinkingInProgress: 'Denkt…',
+    thinkingShow: 'Denkprozess anzeigen',
+    thinkingHide: 'Denkprozess ausblenden',
+    thinkingDuration: 'Beobachtet {duration}',
+    thinkingChars: '{count} Zeichen',
  },

  // Jobs
@@ -154,6 +154,12 @@ export default {
    arguments: 'Arguments',
    result: 'Result',
    truncated: '... (truncated)',
+    thinkingLabel: 'Thinking',
+    thinkingInProgress: 'Thinking…',
+    thinkingShow: 'Show thinking',
+    thinkingHide: 'Hide thinking',
+    thinkingDuration: 'Observed {duration}',
+    thinkingChars: '{count} chars',
  },

  // Jobs
@@ -131,6 +131,12 @@ export default {
    arguments: 'Argumentos',
    result: 'Resultado',
    truncated: '... (truncado)',
+    thinkingLabel: 'Pensamiento',
+    thinkingInProgress: 'Pensando…',
+    thinkingShow: 'Mostrar pensamiento',
+    thinkingHide: 'Ocultar pensamiento',
+    thinkingDuration: 'Observado {duration}',
+    thinkingChars: '{count} caracteres',
  },

  // Jobs
@@ -131,6 +131,12 @@ export default {
    arguments: 'Arguments',
    result: 'Resultat',
    truncated: '... (tronque)',
+    thinkingLabel: 'Raisonnement',
+    thinkingInProgress: 'En réflexion…',
+    thinkingShow: 'Afficher le raisonnement',
+    thinkingHide: 'Masquer le raisonnement',
+    thinkingDuration: 'Observé {duration}',
+    thinkingChars: '{count} caractères',
  },

  // Jobs
@@ -131,6 +131,12 @@ export default {
    arguments: '引数',
    result: '結果',
    truncated: '... (省略)',
+    thinkingLabel: '思考過程',
+    thinkingInProgress: '思考中…',
+    thinkingShow: '思考過程を表示',
+    thinkingHide: '思考過程を隠す',
+    thinkingDuration: '観測 {duration}',
+    thinkingChars: '{count} 文字',
  },

  // スケジュールジョブ
@@ -131,6 +131,12 @@ export default {
    arguments: '인수',
    result: '결과',
    truncated: '... (잘림)',
+    thinkingLabel: '사고 과정',
+    thinkingInProgress: '사고 중…',
+    thinkingShow: '사고 과정 펼치기',
+    thinkingHide: '사고 과정 접기',
+    thinkingDuration: '관측 {duration}',
+    thinkingChars: '{count}자',
  },

  // 예약 작업
@@ -131,6 +131,12 @@ export default {
    arguments: 'Argumentos',
    result: 'Resultado',
    truncated: '... (truncado)',
+    thinkingLabel: 'Raciocínio',
+    thinkingInProgress: 'Pensando…',
+    thinkingShow: 'Mostrar raciocínio',
+    thinkingHide: 'Ocultar raciocínio',
+    thinkingDuration: 'Observado {duration}',
+    thinkingChars: '{count} caracteres',
  },

  // Jobs
@@ -154,6 +154,12 @@ export default {
    arguments: '参数',
    result: '结果',
    truncated: '... (已截断)',
+    thinkingLabel: '思考过程',
+    thinkingInProgress: '思考中…',
+    thinkingShow: '展开思考过程',
+    thinkingHide: '收起思考过程',
+    thinkingDuration: '已观察 {duration}',
+    thinkingChars: '{count} 字',
  },

  // 定时任务
@@ -4,6 +4,7 @@ import { defineStore } from 'pinia'
 import { ref, computed } from 'vue'
 import { useAppStore } from './app'
 import { useProfilesStore } from './profiles'
+import { detectThinkingBoundary } from '@/utils/thinking-parser'

 export interface Attachment {
  id: string
@@ -26,6 +27,11 @@ export interface Message {
  toolStatus?: 'running' | 'done' | 'error'
  isStreaming?: boolean
  attachments?: Attachment[]
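+  // Thinking/reasoning text. Two sources:
+  //   1) History: the HermesMessage.reasoning field
+  //   2) Streaming: accumulated from reasoning.delta / thinking.delta events
+  //      (reasoning.available only marks the end of timing and does not write here)
+  // Excludes the <think> wrapper tags; the content itself may be multi-paragraph plain text.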
+  reasoning?: string
 }

 export interface Session {
@@ -141,6 +147,7 @@ function mapHermesMessages(msgs: HermesMessage[]): Message[] {
      role: msg.role,
      content: msg.content || '',
      timestamp: Math.round(msg.timestamp * 1000),
+      reasoning: msg.reasoning ? msg.reasoning : undefined,
    })
  }
  return result
@@ -305,6 +312,26 @@ function sanitizeForCache(msgs: Message[]): Message[] {
  })
 }

+// Heals assistant messages whose `reasoning` field was polluted by the
+// old bug where `reasoning.available` clobbered it with the assistant
+// content. Detection heuristic: reasoning is a prefix of content (the
+// bug always derived `reasoning` from `content[:500]` with tags stripped).
+// Legitimate reasoning is almost never a prefix of the final answer.
+function scrubBuggyReasoningInCache(msgs: Message[] | null | undefined): Message[] {
+  if (!msgs) return []
+  return msgs.map(m => {
+    if (m.role !== 'assistant' || !m.reasoning || !m.content) return m
+    const r = m.reasoning.trim()
+    const c = m.content.trim()
+    if (!r || !c) return m
+    if (c === r || c.startsWith(r)) {
+      const { reasoning: _drop, ...rest } = m
+      return rest as Message
+    }
+    return m
+  })
+}
+
 export const useChatStore = defineStore('chat', () => {
  const sessions = ref<Session[]>([])
  const activeSessionId = ref<string | null>(null)
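
To make the prefix heuristic in `scrubBuggyReasoningInCache` concrete, here is a small standalone sketch. The `Msg` shape, the `scrub` name, and the sample messages are assumptions for illustration only; the store's real `Message` type has more fields.

```typescript
// Hypothetical trimmed-down message shape.
interface Msg {
  role: "user" | "assistant" | "system";
  content: string;
  reasoning?: string;
}

// Drops `reasoning` when it merely repeats the start of `content`,
// which is the signature of the polluted cache entries.
function scrub(msgs: Msg[]): Msg[] {
  return msgs.map((m) => {
    if (m.role !== "assistant" || !m.reasoning || !m.content) return m;
    const r = m.reasoning.trim();
    const c = m.content.trim();
    if (!r || !c) return m;
    if (c.startsWith(r)) {
      // Rebuild the message without the reasoning key.
      const copy: Msg = { role: m.role, content: m.content };
      return copy;
    }
    return m;
  });
}

const cached: Msg[] = [
  // Polluted: reasoning is just the first characters of the answer.
  {
    role: "assistant",
    content: "The answer is 42 because 40 + 2 = 42.",
    reasoning: "The answer is 42",
  },
  // Legitimate: reasoning differs from the answer text.
  {
    role: "assistant",
    content: "The answer is 42.",
    reasoning: "Let me add 40 and 2.",
  },
];

const healed = scrub(cached);
console.log("reasoning" in healed[0]); // false
console.log(healed[1].reasoning); // "Let me add 40 and 2."
```

The heuristic can in principle discard genuine reasoning that happens to open the final answer verbatim; the comment in the hunk accepts that as a rare case.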
@@ -471,7 +498,7 @@ export const useChatStore = defineStore('chat', () => {
          const cachedActive = cachedSessions.find(s => s.id === savedId) || null
          if (cachedActive) {
            const cachedMsgs = loadJsonWithFallback<Message[]>(msgsCacheKey(savedId), legacyMsgsCacheKey(savedId))
-            if (cachedMsgs) cachedActive.messages = cachedMsgs
+            if (cachedMsgs) cachedActive.messages = scrubBuggyReasoningInCache(cachedMsgs)
            activeSession.value = cachedActive
            activeSessionId.value = savedId
          }
@@ -561,6 +588,7 @@ export const useChatStore = defineStore('chat', () => {
  }

  async function switchSession(sessionId: string, focusId?: string | null) {
+    clearThinkingObservationFor(sessionId)
    activeSessionId.value = sessionId
    focusMessageId.value = focusId ?? null
    setItemBestEffort(storageKey(), sessionId)
@@ -577,7 +605,7 @@ export const useChatStore = defineStore('chat', () => {
    if (!hasLocalMessages) {
      const cachedMsgs = loadJsonWithFallback<Message[]>(msgsCacheKey(sessionId), legacyMsgsCacheKey(sessionId))
      if (cachedMsgs?.length) {
-        activeSession.value.messages = cachedMsgs
+        activeSession.value.messages = scrubBuggyReasoningInCache(cachedMsgs)
      }
    }

@@ -823,16 +851,66 @@ export const useChatStore = defineStore('chat', () => {
            case 'run.started':
              break

+            case 'reasoning.delta':
+            case 'thinking.delta': {
+              const text = evt.text || evt.delta || ''
+              if (!text) break
+              const msgs = getSessionMsgs(sid)
+              const last = msgs[msgs.length - 1]
+              if (last?.role === 'assistant' && last.isStreaming) {
+                last.reasoning = (last.reasoning || '') + text
+                noteReasoningStart(last.id)
+              } else {
+                const newId = uid()
+                addMessage(sid, {
+                  id: newId,
+                  role: 'assistant',
+                  content: '',
+                  timestamp: Date.now(),
+                  isStreaming: true,
+                  reasoning: text,
+                })
+                noteReasoningStart(newId)
+              }
+              schedulePersist()
+              break
+            }
+
+            case 'reasoning.available': {
+              // Upstream run_agent.py fires reasoning.available with
+              // `assistant_message.content[:500]` as the preview — i.e.,
+              // the main answer, not real reasoning. Ignore the payload
+              // and only use this event as a "thinking ended" signal so
+              // the duration counter stops.
+              const msgs = getSessionMsgs(sid)
+              const last = msgs[msgs.length - 1]
+              if (last?.role === 'assistant' && last.isStreaming) {
+                // Mark the end only if a reasoning.delta event started the timer earlier;
+                // otherwise (upstream forwarded no delta, only this single available event) no duration is shown.
+                noteReasoningEnd(last.id)
+              }
+              schedulePersist()
+              break
+            }
+
            case 'message.delta': {
              const msgs = getSessionMsgs(sid)
              const last = msgs[msgs.length - 1]
              if (last?.role === 'assistant' && last.isStreaming) {
-                last.content += evt.delta || ''
+                const prev = last.content
+                const next = prev + (evt.delta || '')
+                noteThinkingDelta(last.id, prev, next)
+                // If reasoning has accumulated, the arrival of content marks the end of reasoning.
+                if (last.reasoning) noteReasoningEnd(last.id)
+                last.content = next
              } else {
+                const newId = uid()
+                const nextContent = evt.delta || ''
+                noteThinkingDelta(newId, '', nextContent)
                addMessage(sid, {
-                  id: uid(),
+                  id: newId,
                  role: 'assistant',
-                  content: evt.delta || '',
+                  content: nextContent,
                  timestamp: Date.now(),
                  isStreaming: true,
                })
@@ -1013,6 +1091,52 @@ export const useChatStore = defineStore('chat', () => {
    })
  }

+  // Transient observation of <think> boundaries during active streaming.
+  // Not persisted; cleared on session switch. See spec §5.3.
+  const thinkingObservation = new Map<string, { startedAt?: number; endedAt?: number }>()
+
+  function getThinkingObservation(messageId: string) {
+    return thinkingObservation.get(messageId)
+  }
+
+  function noteThinkingDelta(messageId: string, prevContent: string, nextContent: string) {
+    const { startedAtBoundary, endedAtBoundary } = detectThinkingBoundary(prevContent, nextContent)
+    if (!startedAtBoundary && !endedAtBoundary) return
+    const existing = thinkingObservation.get(messageId) || {}
+    if (startedAtBoundary && existing.startedAt === undefined) {
+      existing.startedAt = Date.now()
+    }
+    if (endedAtBoundary && existing.endedAt === undefined) {
+      existing.endedAt = Date.now()
+    }
+    thinkingObservation.set(messageId, existing)
+  }
+
+  /** Marks startedAt the first time reasoning text is seen for a message. */
+  function noteReasoningStart(messageId: string) {
+    const existing = thinkingObservation.get(messageId) || {}
+    if (existing.startedAt === undefined) {
+      existing.startedAt = Date.now()
+      thinkingObservation.set(messageId, existing)
+    }
+  }
+
+  /** Marks endedAt when content first arrives (treated as the end of reasoning) or when reasoning.available is received explicitly. */
+  function noteReasoningEnd(messageId: string) {
+    const existing = thinkingObservation.get(messageId)
+    if (!existing || existing.startedAt === undefined) return
+    if (existing.endedAt === undefined) {
+      existing.endedAt = Date.now()
+      thinkingObservation.set(messageId, existing)
+    }
+  }
+
+  function clearThinkingObservationFor(_sessionId: string) {
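+    // The messageId-to-sessionId association is not tracked separately, so everything is cleared on a session switch.
+    // This matches the spec: observation is transient state scoped to the current session.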
+    thinkingObservation.clear()
+  }
+
  return {
    sessions,
    activeSessionId,
@@ -1034,5 +1158,10 @@ export const useChatStore = defineStore('chat', () => {
    stopStreaming,
    loadSessions,
    refreshActiveSession,
+    getThinkingObservation,
+    noteThinkingDelta,
+    noteReasoningStart,
+    noteReasoningEnd,
+    clearThinkingObservationFor,
  }
 })
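
The interplay of the three reasoning events and `message.delta` can be summarized as a small reducer. This is a simplified sketch under stated assumptions: `Evt`, `Draft`, and `apply` are hypothetical names, timestamps are passed in instead of read from `Date.now()`, and `<think>` tag boundary tracking (`noteThinkingDelta`) is left out.

```typescript
// Hypothetical minimal event and state shapes.
interface Evt {
  event: string;
  text?: string;
  delta?: string;
}

interface Draft {
  content: string;
  reasoning: string;
  startedAt: number | null;
  endedAt: number | null;
}

function apply(draft: Draft, evt: Evt, now: number): Draft {
  switch (evt.event) {
    case "reasoning.delta":
    case "thinking.delta": {
      const text = evt.text || evt.delta || "";
      if (!text) return draft;
      // Accumulate reasoning; the first delta starts the timer.
      return {
        ...draft,
        reasoning: draft.reasoning + text,
        startedAt: draft.startedAt ?? now,
      };
    }
    case "reasoning.available":
      // Payload is ignored: it only closes a timer that was started.
      if (draft.startedAt === null || draft.endedAt !== null) return draft;
      return { ...draft, endedAt: now };
    case "message.delta": {
      // First body text after reasoning also closes the timer.
      const closes =
        draft.reasoning !== "" &&
        draft.startedAt !== null &&
        draft.endedAt === null;
      return {
        ...draft,
        content: draft.content + (evt.delta || ""),
        endedAt: closes ? now : draft.endedAt,
      };
    }
    default:
      return draft;
  }
}

const timeline: Array<[Evt, number]> = [
  [{ event: "reasoning.delta", text: "Let me " }, 100],
  [{ event: "reasoning.delta", text: "check." }, 200],
  [{ event: "message.delta", delta: "Hi" }, 900],
  // Upstream preview carries the answer text; it must not leak in.
  [{ event: "reasoning.available", text: "Hi" }, 950],
];

let state: Draft = { content: "", reasoning: "", startedAt: null, endedAt: null };
for (const [evt, at] of timeline) state = apply(state, evt, at);
console.log(state);
```

In this run the timer spans 100 to 900: `message.delta` closes it first, and the later `reasoning.available` neither moves the end timestamp nor overwrites `reasoning`.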
@@ -0,0 +1,99 @@
+export interface ParsedThinking {
+  segments: string[]
+  pending: string | null
+  body: string
+  hasThinking: boolean
+}
+
+export interface ParseOptions {
+  streaming: boolean
+}
+
+const TAG_RE = /<(think|thinking|reasoning)>([\s\S]*?)<\/\1>/gi
+
+const PLACEHOLDER_PREFIX = '\u0000THKCODE'
+const PLACEHOLDER_SUFFIX = '\u0000'
+
+const FENCED_RE = /(```|~~~)([\s\S]*?)\1/g
+const INLINE_CODE_RE = /`[^`\n]*`/g
+
+function protectCodeBlocks(input: string): { masked: string; blocks: string[] } {
+  const blocks: string[] = []
+  let masked = input.replace(FENCED_RE, (m) => {
+    blocks.push(m)
+    return `${PLACEHOLDER_PREFIX}${blocks.length - 1}${PLACEHOLDER_SUFFIX}`
+  })
+  masked = masked.replace(INLINE_CODE_RE, (m) => {
+    blocks.push(m)
+    return `${PLACEHOLDER_PREFIX}${blocks.length - 1}${PLACEHOLDER_SUFFIX}`
+  })
+  return { masked, blocks }
+}
+
+function restoreCodeBlocks(text: string, blocks: string[]): string {
+  if (blocks.length === 0) return text
+  return text.replace(
+    new RegExp(`${PLACEHOLDER_PREFIX}(\\d+)${PLACEHOLDER_SUFFIX}`, 'g'),
+    (_, idx) => blocks[Number(idx)] ?? '',
+  )
+}
+
+export function parseThinking(content: string, opts: ParseOptions): ParsedThinking {
+  const { masked, blocks } = protectCodeBlocks(content)
+
+  const segments: string[] = []
+  let pending: string | null = null
+  let body = ''
+  let lastIndex = 0
+
+  TAG_RE.lastIndex = 0
+  let m: RegExpExecArray | null
+  while ((m = TAG_RE.exec(masked)) !== null) {
+    body += masked.slice(lastIndex, m.index)
+    segments.push(m[2])
+    lastIndex = m.index + m[0].length
+  }
+  const rest = masked.slice(lastIndex)
+
+  const openRe = /<(think|thinking|reasoning)>([\s\S]*)$/i
+  const openMatch = rest.match(openRe)
+  if (openMatch) {
+    body += rest.slice(0, openMatch.index)
+    if (opts.streaming) {
+      pending = openMatch[2]
+    } else {
+      body += rest.slice(openMatch.index!)
+    }
+  } else {
+    body += rest
+  }
+
+  return {
+    segments: segments.map(s => restoreCodeBlocks(s, blocks)),
+    pending: pending === null ? null : restoreCodeBlocks(pending, blocks),
+    body: restoreCodeBlocks(body, blocks),
+    hasThinking: segments.length > 0 || pending !== null,
+  }
+}
+
+export function countThinkingChars(parsed: ParsedThinking): number {
+  const len = (s: string) => [...s].length
+  return parsed.segments.reduce((a, s) => a + len(s), 0) + len(parsed.pending || '')
+}
+
+export interface ThinkingBoundary {
+  startedAtBoundary: boolean
+  endedAtBoundary: boolean
+}
+
+const ANY_OPEN_RE = /<(think|thinking|reasoning)>/i
+const ANY_CLOSE_RE = /<\/(think|thinking|reasoning)>/i
+
+export function detectThinkingBoundary(prev: string, next: string): ThinkingBoundary {
+  const prevMasked = protectCodeBlocks(prev).masked
+  const nextMasked = protectCodeBlocks(next).masked
+  return {
+    startedAtBoundary: !ANY_OPEN_RE.test(prevMasked) && ANY_OPEN_RE.test(nextMasked),
+    endedAtBoundary: !ANY_CLOSE_RE.test(prevMasked) && ANY_CLOSE_RE.test(nextMasked),
+  }
+}
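
As a usage illustration of the boundary detection, the sketch below replays a chunked stream through a condensed stand-in for `detectThinkingBoundary`. The `maskCode` and `boundary` helpers and the sample chunks are assumptions; the masking is simplified to a single NUL placeholder because nothing is restored afterwards.

```typescript
// Fenced blocks are masked first, then inline code spans, so tags
// quoted as code are never treated as real thinking tags.
const FENCED = new RegExp("(`{3}|~{3})[\\s\\S]*?\\1", "g");
const INLINE = /`[^`\n]*`/g;

function maskCode(input: string): string {
  return input.replace(FENCED, "\u0000").replace(INLINE, "\u0000");
}

const OPEN = /<(think|thinking|reasoning)>/i;
const CLOSE = /<\/(think|thinking|reasoning)>/i;

// Reports boundaries present in `next` that were absent from `prev`.
function boundary(prev: string, next: string) {
  const p = maskCode(prev);
  const n = maskCode(next);
  return {
    started: !OPEN.test(p) && OPEN.test(n),
    ended: !CLOSE.test(p) && CLOSE.test(n),
  };
}

// Hypothetical stream where the opening tag is split across chunks.
const chunks = ["<thi", "nk>plan the ", "answer</think>", "Done."];
let content = "";
const events: string[] = [];
for (const chunk of chunks) {
  const next = content + chunk;
  const b = boundary(content, next);
  if (b.started) events.push(`start@${next.length}`);
  if (b.ended) events.push(`end@${next.length}`);
  content = next;
}
console.log(events); // ["start@16", "end@30"]

// A tag mentioned inside inline code is not a boundary.
const quoted = boundary("", "Use the `<think>` tag.");
console.log(quoted.started); // false
```

Because detection compares the full accumulated content before and after each delta, a tag split across chunk boundaries is still caught once the completing chunk arrives, and each boundary fires only once per message.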