feat(chat): support real-time streaming and history display of thinking blocks (#191)

* feat: add file download with support for multiple terminal backends

  Implements file download on top of a FileProvider abstraction, supporting four backends: local, Docker, SSH, and Singularity. Main changes:
  - Add the FileProvider interface and four backend implementations (with SSH command-injection protection)
  - Add the GET /api/hermes/download route (with MIME type detection)
  - Frontend: intercept Markdown file links for download, plus an attachment download button
  - Chinese and English i18n translations
  - Update README, CLAUDE.md, and the design docs

* feat: add a file browser with download, supporting directory browsing, file editing, preview, and upload

* build: add prepare script so 'npm install git+url' auto-builds dist/

  Allows installing this package directly from git without a pre-built dist/. When cloned via npm, prepare runs 'npm run build' if dist/ is missing, producing the artifacts declared in the files[] field before packing.

* fix: use clipboard fallback for non-secure HTTP contexts

  navigator.clipboard is undefined on HTTP intranet deployments (only available in secure contexts). The previous synchronous calls threw silently and the success toast still fired, making 'copy' actions appear broken.
  - Add packages/client/src/utils/clipboard.ts with execCommand fallback via a hidden textarea
  - Use the helper in FileContextMenu (copy file path), CodexLoginModal (copy user code), NousLoginModal (copy user code), ChatPanel (copy session id)
  - Each call now awaits the result and shows success/failure based on the actual outcome

* i18n: backfill files/download translations for de, es, fr, ja, ko, pt

  Add nav.files, files.* (39 keys), and download.* (9 keys) so the file browser UI is fully localized in these six locales instead of falling back to English.

* fix(files): close preview when navigating or affected file changes

  Opening a preview and then navigating directories, deleting the previewed file, or renaming it left the preview pane stuck on stale content because previewFile was never cleared.
  - stores/hermes/files.ts:
    - fetchEntries clears previewFile on path change (in-place refresh keeps the preview).
    - deleteEntry / renameEntry clear preview/editor state when the affected entry matches the previewed/edited file or its parent.
    - Add isAffected(target, changed, isDir) helper.
  - components/hermes/files/FilePreview.vue: replace the misleading common.cancel close button with a dedicated files.closePreview key plus an X icon and quaternary style.
  - i18n: add files.closePreview to all 8 locales.

* chore: clean up plan and design docs for completed features

  The file browser and file download features have both been merged upstream, so their development plans and design drafts no longer need to be kept in the fork:
  - plans/2025-07-20-file-browser.md
  - plans/2026-04-20-file-download.md
  - specs/2025-07-20-file-browser-design.md
  - specs/2026-04-20-file-download-design.md

  After the cleanup, this fork is fully aligned with upstream/main at the code level.

* docs: add design draft for separating and collapsing thinking blocks (#164)

  For upstream issue #164, designs how <think>/<thinking>/<reasoning> tags in assistant messages are recognized, separated, and shown as collapsible blocks. Key decisions (revised after rubber-duck review):
  - Do not modify Message.content or persisted fields, so existing localStorage data stays compatible
  - The duration summary is derived purely at runtime (a Map inside the store), sidestepping fields that would be lost on refresh/reconnect
  - Code-block protection ships in the first version to avoid false positives
  - Unclosed tags at end of stream degrade to body text so the answer is never swallowed
  - The parse computed and the duration interval are separated to avoid performance risks
  - The parser lives in packages/client/src/utils/ to avoid a reverse dependency
  - Same-name nesting is explicitly unsupported (a rare case, documented)

* docs: add implementation plan for separating and collapsing thinking blocks (#164)

  A 12-task TDD plan:
  - Tasks 1-7: utils/thinking-parser.ts pure-function module + unit tests
  - Tasks 8-9: wire the chat store thinkingObservation Map into SSE
  - Task 10: add 6 i18n keys in 8 languages
  - Task 11: MessageItem.vue collapsible UI rendering + SCSS
  - Task 12: build / test / manual verification / push

* feat(thinking-parser): split on the first closed <think> tag

* test(thinking-parser): cover multiple segments / tag variants / case / empty input

* test(thinking-parser): streaming pending and terminal-state degradation

* feat(thinking-parser): code-block protection to avoid recognizing pseudo tags

* test(thinking-parser): same-name nesting and chunk-boundary behavior

* feat(thinking-parser): countThinkingChars helper

* feat(thinking-parser): detectThinkingBoundary boundary detection

* feat(chat-store): add the thinkingObservation runtime Map

* feat(chat-store): record thinking boundaries on message.delta + clean up on switchSession

* i18n: add 6 thinking-block keys (8 languages)

* feat(chat): render the thinking collapsible area in MessageItem
  - Reuse the tool-line style chevron
  - Two reactive chains: parse computed + duration interval
  - Force-expand while streaming with pending content
  - show_reasoning controls the default state

* feat(chat): support real-time streaming and history display of thinking blocks
  - Extend the Message interface with a reasoning field; mapHermesMessages passes through the thinking content of historical sessions from HermesMessage.reasoning.
  - Add a text field to the RunEvent type; the chat store handles three new SSE events: reasoning.delta / thinking.delta / reasoning.available.
  - Thinking-duration observation: the start timestamp is recorded only while reasoning.delta accumulates, and the end timestamp is recorded on reasoning.available; no duration is shown when there are no live deltas.
  - MessageItem renders from two sources (the reasoning field takes priority, <think> tags are the fallback); the elapsed time is shown only when duration > 0.
  - Add 3 unit tests covering the three SSE events; tests pass 32/32.

* fix(chat): the reasoning block no longer briefly shows the answer body

  Root cause: upstream hermes-agent run_agent.py:11275 uses assistant content[:500] as the preview payload of reasoning.available at the end of every model response. The Web UI therefore wrote the answer body into last.reasoning, and the thinking block briefly showed the body until session polling/refresh read the correct reasoning field back from the session DB.

  Fix:
  - The reasoning.available event no longer writes last.reasoning; it is only used to mark the end of timing (noteReasoningEnd). Real reasoning comes from reasoning.delta or the session DB
  - Add scrubBuggyReasoningInCache: on hydration, heal assistant messages in localStorage that were already polluted (reasoning is dropped when it equals content or is a prefix of it)
  - Both cache-loading entry points (loadSessions / switchSession) go through the scrubber

  Tests: 4 new unit tests; the full suite passes 280/280.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-25 08:46:50 +08:00
parent c1e72942ad
commit 369001824e
18 changed files with 2369 additions and 8 deletions
@@ -0,0 +1,310 @@
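+# Separating Think Blocks from the Body with Collapsible Display: Design Draft
+
+- **Issue**: upstream #164, [Feature] separate think blocks from the body
+- **Date**: 2026-04-23
+- **Branch**: `feat/thinking-block-collapse`
+- **Status**: design draft (revised after rubber-duck review feedback)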
+
+---
+
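+## 1. Background
+
+Currently, the chain-of-thought (reasoning/think) content of an assistant reply is embedded directly in `Message.content` as raw tags such as `<think>...</think>` and rendered verbatim by `MarkdownRenderer`. User feedback:
+
+1. think blocks are mixed in with the body, so the actual answer is hard to locate in a long text;
+2. once the answer has finished streaming, the think content cannot be collapsed or viewed separately;
+3. the existing `settings.display.show_reasoning` switch does not actually affect rendering.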
+
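+## 2. Goals
+
+- In assistant messages, **recognize and separate** think blocks from the body;
+- Show think blocks as a **collapsible header**, reusing the project's existing `tool-line` collapse style;
+- The header shows a **character-count summary**; messages observed while streaming additionally show the **observed duration**;
+- The default expanded/collapsed state is controlled by `settings.display.show_reasoning`, and each message can override it independently (transient runtime state);
+- Unclosed tags are **parsed tolerantly** while streaming; if a tag is still unclosed when the stream ends, it is **degraded and kept as body text**;
+- Pseudo `<think>` tags inside code blocks or inline code are **not recognized**;
+- Do not change the upstream SSE protocol, do not modify `Message.content`, and do not break old localStorage data.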
+
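+## 3. Non-goals
+
+- No changes to the backend gateway protocol and no new SSE event types;
+- The thinking duration summary is not persisted: messages restored from history or after a refresh show only the character count;
+- The manual collapse state of each message is not persisted: it is transient and returns to the default after a refresh;
+- Same-name nested tags are not supported (in `<think><think>...</think>...</think>` the inner tag is treated as plain text).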
+
+## 4. Recognition rules
+
+### 4.1 Tag scope
+
+A regex matches the following three kinds of tags (case-insensitive):
+
+```
+<think>...</think>
+<thinking>...</thinking>
+<reasoning>...</reasoning>
+```
+
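+> Rationale: covers mainstream reasoning models such as DeepSeek R1, GLM reasoner, Tongyi Qwen reasoning, and Claude thinking.
+
+### 4.2 Code-block protection (required in the first version)
+
+Before parsing, markdown code content is replaced with placeholders to avoid false positives:
+
+1. **Fenced code block**: match `` ```lang\n...\n``` `` (including the `~~~` variant);
+2. **Inline code**: match `` `...` `` (single backticks, not escaped);
+
+Replace with a `\u0000CODE_N\u0000` placeholder → run 4.3 on the remaining text → after parsing, restore the placeholders in place in both `body` and `segments` (code is not normally expected inside segments, but restoring uniformly is simpler and does not affect the result).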
+
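+### 4.3 Parsing algorithm
+
+For one assistant `content` with its code blocks masked out:
+
+1. Non-greedily match every `<(think|thinking|reasoning)>[\s\S]*?</\1>` (case-insensitive);
+2. Same-name nesting is **not supported**: the first `</tag>` closes the outer block, so an inner same-name opening tag ends up inside the segment text; the parser does not handle the leftover dangling `</tag>`, which remains in the body (acceptable for such a rare case);
+3. If one **unclosed** `<think|thinking|reasoning>` still remains:
+   - **While streaming** (caller passes `streaming=true`) → everything from that tag to the end is treated as `pending` thinking;
+   - **Not streaming** (`streaming=false`) → **degrade**: keep it as body text (tag characters included verbatim), with `pending=null`;
+4. The remaining plain text is concatenated in order to form `body`;
+5. Restore all code-block placeholders.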
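The five steps above can be sketched as a small pure function. This is a minimal illustration and not the shipped `thinking-parser.ts`: the name `parseThinkingSketch`, the masking regex, and the choice to report `hasThinking` for an empty `pending` string are assumptions made for the example.

```typescript
interface ParsedThinkingSketch {
  segments: string[];
  pending: string | null;
  body: string;
  hasThinking: boolean;
}

// Hypothetical constant: the three tag names from section 4.1.
const SKETCH_TAG_NAMES = "think|thinking|reasoning";

function parseThinkingSketch(
  content: string,
  opts: { streaming: boolean },
): ParsedThinkingSketch {
  // Section 4.2: mask fenced and inline code so pseudo tags inside are ignored.
  const stash: string[] = [];
  const masked = content.replace(
    /`{3}[\s\S]*?`{3}|~{3}[\s\S]*?~{3}|`[^`\n]+`/g,
    (code: string) => {
      stash.push(code);
      return `\u0000CODE_${stash.length - 1}\u0000`;
    },
  );
  const restore = (s: string): string =>
    s.replace(/\u0000CODE_(\d+)\u0000/g, (m: string, i: string) => stash[Number(i)] ?? m);

  // Step 1: collect every closed block (non-greedy, case-insensitive).
  const segments: string[] = [];
  const closed = new RegExp(`<(${SKETCH_TAG_NAMES})>([\\s\\S]*?)</\\1>`, "gi");
  let rest = masked.replace(closed, (_m: string, _tag: string, inner: string) => {
    segments.push(restore(inner));
    return "";
  });

  // Step 3: an unclosed opening tag is pending only while streaming;
  // otherwise it is left in `rest` and ends up in the body verbatim.
  let pending: string | null = null;
  if (opts.streaming) {
    const open = new RegExp(`<(?:${SKETCH_TAG_NAMES})>`, "i").exec(rest);
    if (open) {
      pending = restore(rest.slice(open.index + open[0].length));
      rest = rest.slice(0, open.index);
    }
  }

  // Steps 4 and 5: what remains is the body, with code restored.
  return {
    segments,
    pending,
    body: restore(rest),
    hasThinking: segments.length > 0 || pending !== null,
  };
}
```

Calling `parseThinkingSketch("<think>x", { streaming: false })` keeps `<think>x` in `body`, which is the terminal-state degradation described in step 3. An unterminated code fence in a partially streamed message is not handled by this sketch.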
+
+### 4.4 TypeScript signatures
+
+File location: **`packages/client/src/utils/thinking-parser.ts`** (a neutral utils directory, avoiding a store → components reverse dependency)
+
+```ts
+export interface ParsedThinking {
+  /** Plain text of every closed thinking segment (tags excluded) */
+  segments: string[]
+  /** Unclosed thinking while streaming; always null when not streaming */
+  pending: string | null
+  /** Body text (thinking removed) */
+  body: string
+  /** Whether any thinking content exists (segments non-empty or pending non-empty) */
+  hasThinking: boolean
+}
+
+export function parseThinking(content: string, opts: { streaming: boolean }): ParsedThinking
+
+/** Detects whether content, going from prev to next, crossed the boundary of the "first appearance of an opening/closing tag" */
+export function detectThinkingBoundary(prev: string, next: string): {
+  startedAtBoundary: boolean
+  endedAtBoundary: boolean
+}
+```
+
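+## 5. Data model
+
+### 5.1 `Message.content` stays unchanged
+
+The raw string is stored and persisted as-is. localStorage data and sessions exports remain compatible.
+
+### 5.2 No new persisted fields
+
+**Adopting rubber-duck review feedback #4**: no `thinkingStartedAt/EndedAt` fields are added to the `Message` interface. Reasons:
+
+- `mapHermesMessages()` only maps fields known to the server, so new fields would be overwritten and lost on refresh/reconnect;
+- `switchSession` / `startPolling` / `refreshActiveSession` overwrite local messages with server data;
+- the thinking duration is by nature "wall-clock time observed by the frontend", not the model's real thinking time, so persisting it would be misleading.
+
+### 5.3 Runtime observation state (a Map inside the store)
+
+Add to `stores/hermes/chat.ts`: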
+
+```ts
+/** Map<messageId, { startedAt, endedAt }>; only records timestamps observed while streaming in the current session */
+const thinkingObservation = reactive(new Map<string, { startedAt?: number; endedAt?: number }>())
+```
+
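+- In the `message.delta` event handler, call `detectThinkingBoundary(prev, next)`; the first "started" writes `startedAt`, and the first "ended" writes `endedAt`;
+- The entry is not cleared after `run.completed` / `run.failed` (so the "observed duration for this session" can still be shown after streaming ends, until the user refreshes or switches sessions);
+- The Map is cleared on `switchSession` (nothing is kept across sessions);
+- Historical messages, messages restored after a refresh, and messages fetched by polling have **no** entry: no duration is shown, only the character count.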
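The write-once behaviour of these bullets can be sketched in isolation. The class name `ObservationLogSketch` and the injected `now` parameter are hypothetical, chosen so the example runs without a store or a real clock; the actual store keeps a plain Map and calls `Date.now()`.

```typescript
type ThinkingObservationSketch = { startedAt?: number; endedAt?: number };

class ObservationLogSketch {
  private readonly entries = new Map<string, ThinkingObservationSketch>();

  // Records a boundary for one message; each timestamp is written at most once.
  note(
    messageId: string,
    boundary: { startedAtBoundary: boolean; endedAtBoundary: boolean },
    now: number,
  ): void {
    if (!boundary.startedAtBoundary && !boundary.endedAtBoundary) return;
    const entry: ThinkingObservationSketch = this.entries.get(messageId) ?? {};
    if (boundary.startedAtBoundary && entry.startedAt === undefined) entry.startedAt = now;
    if (boundary.endedAtBoundary && entry.endedAt === undefined) entry.endedAt = now;
    this.entries.set(messageId, entry);
  }

  // Historical or polled messages were never noted, so this returns undefined.
  get(messageId: string): ThinkingObservationSketch | undefined {
    return this.entries.get(messageId);
  }

  // Mirrors the "clear on switchSession" rule.
  clear(): void {
    this.entries.clear();
  }
}
```

Passing the clock value in as an argument keeps the sketch deterministic, which also makes the "first started / first ended" rule easy to unit test.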
+
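+## 6. UI design
+
+### 6.1 Position
+
+At the **top inside** the assistant bubble, **before** `MarkdownRenderer` (which renders the body), a separate collapsible thinking area is rendered. It is rendered only when `parsedThinking.hasThinking === true`.
+
+### 6.2 Visual style
+
+Reuse the existing `tool-line` collapse style: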
+
+```
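+▸ 💭 Thinking · 412 chars                  (historical message, character count only)
+▸ 💭 Thinking · Observed 3s · 412 chars    (message that finished streaming in this session)
+▾ 💭 Thinking… · 128 chars                 (streaming in progress)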
+```
+
+When expanded:
+
+```
+▾ 💭 ...
+  ┌─────────────────────────
+  │ thinking content (rendered as Markdown,
+  │ slightly smaller font, low-contrast color)
+  └─────────────────────────
+```
+
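+Add a new SCSS class `.thinking-block` that reuses the layout of `.tool-line` / `.tool-details`, with de-emphasized text (opacity 0.85, italic optional).
+
+### 6.3 Default expanded state
+
+- **Streaming in progress** (`message.isStreaming && parsedThinking.pending`) → **force-expanded**;
+- **Not streaming**:
+  - `settings.display.show_reasoning === true` → expanded by default;
+  - `settings.display.show_reasoning === false` → collapsed by default;
+- After the user clicks the chevron to toggle, the override is recorded in a component-local `ref<boolean | null>(null)` (null = follow the default). **Transient**: it returns to the default after a refresh, a session switch, or a remount.
+
+### 6.4 Header summary computation
+
+Two independent reactive chains avoid performance problems (adopting rubber-duck #7):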
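The precedence in the list above (streaming beats the manual override, which beats the setting) can be written as one pure function. This is a sketch with hypothetical names (`resolveThinkingExpanded` and its input shape); the component itself expresses the same rule with a `computed`.

```typescript
interface ExpandInputs {
  // True while the message is streaming and thinking is still pending.
  streamingWithPending: boolean;
  // Per-message manual toggle; null means "follow the default".
  override: boolean | null;
  // Value of settings.display.show_reasoning.
  showReasoning: boolean;
}

function resolveThinkingExpanded(inputs: ExpandInputs): boolean {
  if (inputs.streamingWithPending) return true; // forced open while thinking streams
  if (inputs.override !== null) return inputs.override; // manual toggle wins next
  return inputs.showReasoning; // otherwise follow the setting
}
```

Keeping the override as `boolean | null` rather than `boolean` is what lets a message fall back to the global setting when the user has never touched the chevron.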
+
+```ts
+// Depends only on content changes; re-parses on change
+const parsed = computed(() => parseThinking(message.content, { streaming: message.isStreaming }))
+
+// Character count: number of Unicode characters
+const thinkingChars = computed(() => {
+  const len = (s: string) => [...s].length
+  return parsed.value.segments.reduce((a, s) => a + len(s), 0) + len(parsed.value.pending || '')
+})
+
+// Duration: only an active streaming message runs a stopwatch; otherwise use the fixed value or show nothing
+const observation = chatStore.getThinkingObservation(message.id) // may be undefined
+const liveNowTick = /* useInterval(1000), enabled only while isStreaming */
+const durationMs = computed(() => {
+  if (!observation?.startedAt) return null
+  const end = observation.endedAt ?? (message.isStreaming ? liveNowTick.value : observation.startedAt)
+  return end - observation.startedAt
+})
+```
+
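+- The character count is cheap to compute and follows content changes;
+- The duration interval is enabled only when `message.isStreaming && observation?.startedAt && !observation?.endedAt`, so inactive messages cost no CPU.
+
+### 6.5 Terminal-state degradation (adopting rubber-duck #2)
+
+After the SSE `run.completed` / `run.failed` event fires:
+
+- the message's `isStreaming` becomes `false`;
+- parsing is called with `streaming: false`;
+- an unclosed `<think>` in `parseThinking` is no longer treated as pending and is **kept as part of the body**;
+- this avoids "the answer is collapsed forever and cannot be seen".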
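The duration rule of this section can be isolated as a pure function of the observation, the streaming flag, and the current time. The name `observedDurationMs` is hypothetical, and the sketch clamps at zero the way the component does; it treats `startedAt === undefined` (rather than any falsy value) as "no observation".

```typescript
type ObservationSketch = { startedAt?: number; endedAt?: number };

function observedDurationMs(
  observation: ObservationSketch | undefined,
  isStreaming: boolean,
  now: number,
): number | null {
  // No start timestamp: historical or polled message, so no duration at all.
  if (observation?.startedAt === undefined) return null;
  // Finished: fixed value. Still streaming: live clock. Neither: zero-length.
  const end = observation.endedAt ?? (isStreaming ? now : observation.startedAt);
  return Math.max(0, end - observation.startedAt);
}
```

Because `now` only matters in the "started but not ended, still streaming" branch, a one-second ticker needs to run only for that state, which is the condition the second bullet gives for enabling the interval.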
+
+### 6.6 New i18n keys (8 languages)
+
+```
+chat.thinkingLabel          "思考过程" / "Thinking"
+chat.thinkingInProgress     "思考中…" / "Thinking…"
+chat.thinkingShow           "展开思考过程" / "Show thinking"
+chat.thinkingHide           "收起思考过程" / "Hide thinking"
+chat.thinkingDuration       "已观察 {duration}" / "Observed {duration}"
+chat.thinkingChars          "{count} 字" / "{count} chars"
+```
+
+## 7. Files involved
+
+| File | Change |
+|---|---|
+| `packages/client/src/utils/thinking-parser.ts` | **New**: pure-function parser + boundary detection |
+| `packages/client/src/components/hermes/chat/MessageItem.vue` | Render the collapsible thinking area; split the computed chains |
+| `packages/client/src/stores/hermes/chat.ts` | Add the `thinkingObservation` Map; write boundaries in `message.delta`; clean up in `switchSession`; export `getThinkingObservation(messageId)` |
+| `packages/client/src/i18n/locales/{en,zh,de,es,fr,ja,ko,pt}.ts` | Add 6 i18n keys |
+| `tests/client/utils/thinking-parser.test.ts` | **New**: parser unit tests |
+| `tests/client/stores/chat-thinking-boundary.test.ts` | **New**: boundary detection / switchSession cleanup tests |
+
+## 8. Test strategy
+
+### 8.1 Parser (required, covers edge cases)
+
+- A single closed `<think>...</think>` → segments=[...], body=''
+- Multiple closed segments, in order
+- Unclosed `<think>x` with `streaming=true` → pending='x'
+- Unclosed `<think>x` with `streaming=false` (terminal-state degradation) → body keeps `<think>x` verbatim, pending=null
+- `<thinking>` / `<reasoning>` variants
+- Case variants `<Think>`, `<REASONING>`
+- **Same-name nesting** `<think>a<think>b</think>c</think>` → segments=['a<think>b'], body='c</think>' (this behavior is explicitly documented)
+- **Fenced code block protection** `\`\`\`\n<think>not real</think>\n\`\`\`` → not recognized
+- **Inline code protection** `` `<think>` `` → not recognized
+- Empty content → hasThinking=false
+- Body only → hasThinking=false, body unchanged
+- Chunk-boundary case (first half `<thin`, second half `k>hi</think>`) → parsed correctly from the accumulated content
+
+### 8.2 Boundary detection (required)
+
+- `detectThinkingBoundary('', '<think>hi')` → startedAtBoundary=true
+- `detectThinkingBoundary('<think>hi', '<think>hi</think>')` → endedAtBoundary=true
+- `detectThinkingBoundary('abc', 'abcdef')` → both false
+- Pseudo tags inside code blocks do not trigger a boundary
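One way to satisfy these four cases is to compare "tag present" before and after the delta, on text with code removed. This is only a sketch under that assumption: the name `detectBoundarySketch` and its regexes are hypothetical, and the real `detectThinkingBoundary` may be implemented differently.

```typescript
const SKETCH_OPEN_TAG = /<(?:think|thinking|reasoning)>/i;
const SKETCH_CLOSE_TAG = /<\/(?:think|thinking|reasoning)>/i;

// Drops fenced and inline code so tags shown as code samples are ignored.
function stripCodeSketch(text: string): string {
  return text.replace(/`{3}[\s\S]*?`{3}|~{3}[\s\S]*?~{3}|`[^`\n]+`/g, "");
}

function detectBoundarySketch(
  prev: string,
  next: string,
): { startedAtBoundary: boolean; endedAtBoundary: boolean } {
  const before = stripCodeSketch(prev);
  const after = stripCodeSketch(next);
  return {
    // True only on the delta in which the first opening tag becomes visible.
    startedAtBoundary: !SKETCH_OPEN_TAG.test(before) && SKETCH_OPEN_TAG.test(after),
    // True only on the delta in which the first closing tag becomes visible.
    endedAtBoundary: !SKETCH_CLOSE_TAG.test(before) && SKETCH_CLOSE_TAG.test(after),
  };
}
```

Because the check runs on the accumulated strings rather than on the delta alone, a tag split across two chunks (`<thin` then `k>`) is still detected on the chunk that completes it.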
+
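+### 8.3 Store behavior (required)
+
+- First opening tag in `message.delta` → the `thinkingObservation` Map records startedAt
+- First closing tag → records endedAt
+- `switchSession` → the Map is cleared
+- After `refreshActiveSession` overwrites the messages, entries already written to the Map are kept (that is, only switchSession clears them)
+
+### 8.4 Component (if Vue test-utils infrastructure already exists)
+
+- `.thinking-block` is not rendered when there is no thinking
+- `show_reasoning=true` expands by default; `=false` collapses by default
+- Force-expanded while streaming with pending content (ignores show_reasoning)
+- Clicking to toggle does not change the setting
+- With an observation, the duration is shown; without one, only the character count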
+
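+## 9. Compatibility and migration
+
+### 9.1 Data layer (fully compatible)
+
+- **The `Message.content` field is unchanged**: it is still the raw string (including tags such as `<think>...</think>`);
+- **The `Message` interface has no new persisted fields** (adopting rubber-duck #4);
+- **Old localStorage data**: no schema migration, readable as-is;
+- **Sessions export/import JSON format**: unchanged;
+- **Upstream hermes CLI `sessions export`**: still reads the content string, no side effects.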
+
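+### 9.2 Rendering behavior changes (this is the requirement itself)
+
+**After upgrading to a version that includes this feature, the visual appearance of old messages changes**; this is the intended effect:
+
+| Scenario | Old rendering | New rendering |
+|---|---|---|
+| `<think>x</think>body` | think tags appear verbatim in the body (or are ignored by Markdown as HTML) | Recognized as a separate collapsible block + body |
+| Only `<think>x</think>`, no body | The whole message shows the think content (tags included) | Only the collapsible thinking is shown; the body is empty |
+| A `<think>` literal demonstrated inside a code block | Also shown verbatim | **Not recognized**, left as-is |
+
+**Limitations for historical messages in the new version**:
+
+- No `thinkingObservation` entry → **no duration is shown**; the header text degrades to `💭 Thinking · X chars`;
+- This limitation is deliberate: the duration means "wall-clock time observed by the frontend in this session", which cannot be reconstructed for historical messages, so any number shown would be misleading.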
+
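+### 9.3 Edge cases
+
+- **An unclosed `<think>` in an old message** (extremely rare, for example an old frontend whose stream was interrupted before the message was fully saved) → the terminal-state degradation of §6.5 keeps it as body text and does not swallow the answer;
+- **Same-name nesting / pseudo tags in code blocks / chunk boundaries** → explicitly handled by the parsing rules in §4.
+
+### 9.4 Future extension
+
+If upstream adds a dedicated `reasoning.delta` SSE event, the chat store can append the deltas separately to a virtual segments field, with no changes needed in the UI layer (the parser stays compatible with the tag form).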
+
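+## 10. Risks and resolutions
+
+| Risk | Mitigation |
+|---|---|
+| Regex performance on very long content | `computed` caching; the cost is on the order of one code-block replacement pass plus one regex scan |
+| Pseudo tags inside code blocks recognized by mistake | Code-block protection ships in the first version (4.2) |
+| Tag unclosed when the stream ends, swallowing the body | Terminal-state degradation (6.5) keeps it as body text |
+| Nested tags mismatched | Explicitly unsupported and documented; extremely rare in practice |
+| Timestamps lost after a refresh | Derived purely at runtime, not persisted; historical messages show only the character count |
+| Distorted duration with multiple reasoning segments | Aggregate as "first started / last ended"; character counts are summed |
+| Duration stopwatch costing CPU on inactive messages | The interval starts only when `isStreaming && hasStartedAt && !hasEndedAt` |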
+
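+## 11. Implementation phases (to be detailed by writing-plans)
+
+1. Implement `utils/thinking-parser.ts` + parser unit tests (TDD)
+2. Implement boundary detection + switchSession cleanup + store unit tests
+3. Integrate the collapsible UI into `MessageItem.vue` (two reactive chains + transient state)
+4. SCSS reusing the tool-line style
+5. i18n in 8 languages
+6. Integration self-test (real conversations with DeepSeek R1 / GLM) + manual verification of refresh and session-switch scenarios
+7. Verify with `npm run build` + `npm run test`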
@@ -23,6 +23,8 @@ export interface RunEvent {
  event: string
  run_id?: string
  delta?: string
+  /** Payload text for `reasoning.delta` / `thinking.delta` / `reasoning.available` events. */
+  text?: string
  tool?: string
  name?: string
  preview?: string
@@ -1,10 +1,13 @@
 <script setup lang="ts">
 import type { Message } from "@/stores/hermes/chat";
-import { computed, ref } from "vue";
+import { computed, onBeforeUnmount, ref, watchEffect } from "vue";
 import { useI18n } from "vue-i18n";
 import { useMessage } from "naive-ui";
 import { downloadFile } from "@/api/hermes/download";
 import MarkdownRenderer from "./MarkdownRenderer.vue";
+import { parseThinking, countThinkingChars } from "@/utils/thinking-parser";
+import { useChatStore } from "@/stores/hermes/chat";
+import { useSettingsStore } from "@/stores/hermes/settings";
 import {
  copyTextToClipboard,
  handleCodeBlockCopyClick,
@@ -20,6 +23,94 @@ const toast = useMessage();
 const isSystem = computed(() => props.message.role === "system");
 const toolExpanded = ref(false);

+const chatStore = useChatStore();
+const settingsStore = useSettingsStore();
+
+const parsedThinking = computed(() =>
+  parseThinking(props.message.content || "", { streaming: !!props.message.isStreaming }),
+);
+
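+// Prefer thinking text from the reasoning field/events; otherwise fall back to <think> tags parsed from content.
+// If both exist, show them concatenated (rare, but keeps all information).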
+const hasReasoningField = computed(() => !!(props.message.reasoning && props.message.reasoning.length > 0));
+
+const hasThinking = computed(() => hasReasoningField.value || parsedThinking.value.hasThinking);
+
+const thinkingFullText = computed(() => {
+  const parts: string[] = [];
+  if (props.message.reasoning) parts.push(props.message.reasoning);
+  parts.push(...parsedThinking.value.segments);
+  if (parsedThinking.value.pending) parts.push(parsedThinking.value.pending);
+  return parts.join("\n\n");
+});
+
+const thinkingCharCount = computed(() => {
+  let count = countThinkingChars(parsedThinking.value);
+  if (props.message.reasoning) count += props.message.reasoning.length;
+  return count;
+});
+
+// Streaming-thinking state: an unclosed <think> tag remains, or reasoning has content but the body has not started yet.
+const thinkingStreamingNow = computed(() => {
+  if (!props.message.isStreaming) return false;
+  if (parsedThinking.value.pending !== null) return true;
+  if (hasReasoningField.value && !props.message.content) return true;
+  return false;
+});
+
+const thinkingOverride = ref<boolean | null>(null);
+
+const thinkingExpanded = computed(() => {
+  if (thinkingStreamingNow.value) return true;
+  if (thinkingOverride.value !== null) return thinkingOverride.value;
+  return !!settingsStore.display.show_reasoning;
+});
+
+function toggleThinking() {
+  thinkingOverride.value = !thinkingExpanded.value;
+}
+
+const nowTick = ref(Date.now());
+let tickTimer: number | null = null;
+
+function ensureTick() {
+  const ob = chatStore.getThinkingObservation(props.message.id);
+  const shouldTick = !!(
+    props.message.isStreaming &&
+    ob?.startedAt !== undefined &&
+    ob.endedAt === undefined
+  );
+  if (shouldTick && tickTimer === null) {
+    tickTimer = window.setInterval(() => {
+      nowTick.value = Date.now();
+    }, 1000);
+  } else if (!shouldTick && tickTimer !== null) {
+    window.clearInterval(tickTimer);
+    tickTimer = null;
+  }
+}
+
+watchEffect(ensureTick);
+
+onBeforeUnmount(() => {
+  if (tickTimer !== null) window.clearInterval(tickTimer);
+});
+
+const thinkingDurationMs = computed<number | null>(() => {
+  const ob = chatStore.getThinkingObservation(props.message.id);
+  if (!ob?.startedAt) return null;
+  const end = ob.endedAt ?? (props.message.isStreaming ? nowTick.value : ob.startedAt);
+  return Math.max(0, end - ob.startedAt);
+});
+
+function formatDuration(ms: number): string {
+  const s = Math.floor(ms / 1000);
+  if (s < 60) return `${s}s`;
+  const m = Math.floor(s / 60);
+  const r = s % 60;
+  return r === 0 ? `${m}m` : `${m}m ${r}s`;
+}
+
 const timeStr = computed(() => {
  const d = new Date(props.message.timestamp);
  return d.toLocaleTimeString([], { hour: "2-digit", minute: "2-digit" });
@@ -275,9 +366,46 @@ const renderedToolResult = computed(() => {
                </template>
              </div>
            </div>
+            <div
+              v-if="hasThinking"
+              class="thinking-block"
+              :class="{ expanded: thinkingExpanded }"
+            >
+              <div class="thinking-header" @click="toggleThinking">
+                <svg
+                  width="10"
+                  height="10"
+                  viewBox="0 0 24 24"
+                  fill="none"
+                  stroke="currentColor"
+                  stroke-width="2"
+                  class="thinking-chevron"
+                  :class="{ rotated: thinkingExpanded }"
+                >
+                  <polyline points="9 18 15 12 9 6" />
+                </svg>
+                <span class="thinking-icon">💭</span>
+                <span class="thinking-label">
+                  {{
+                    thinkingStreamingNow
+                      ? t('chat.thinkingInProgress')
+                      : t('chat.thinkingLabel')
+                  }}
+                </span>
+                <span v-if="thinkingDurationMs !== null && thinkingDurationMs > 0" class="thinking-meta">
+                  · {{ t('chat.thinkingDuration', { duration: formatDuration(thinkingDurationMs) }) }}
+                </span>
+                <span class="thinking-meta">
+                  · {{ t('chat.thinkingChars', { count: thinkingCharCount }) }}
+                </span>
+              </div>
+              <div v-if="thinkingExpanded" class="thinking-body">
+                <MarkdownRenderer :content="thinkingFullText" />
+              </div>
+            </div>
            <MarkdownRenderer
-              v-if="message.content"
-              :content="message.content"
+              v-if="parsedThinking.body"
+              :content="parsedThinking.body"
            />

            <span v-if="message.isStreaming && !message.content" class="streaming-dots">
@@ -427,6 +555,63 @@ const renderedToolResult = computed(() => {
  }
 }

+.thinking-block {
+  margin-bottom: 8px;
+  padding: 4px 0;
+  border-bottom: 1px dashed $border-light;
+
+  .thinking-header {
+    display: flex;
+    align-items: center;
+    gap: 6px;
+    font-size: 11px;
+    color: $text-muted;
+    cursor: pointer;
+    padding: 2px 4px;
+    border-radius: $radius-sm;
+    user-select: none;
+
+    &:hover {
+      background: rgba(0, 0, 0, 0.03);
+    }
+  }
+
+  .thinking-chevron {
+    flex-shrink: 0;
+    transition: transform 0.15s ease;
+
+    &.rotated {
+      transform: rotate(90deg);
+    }
+  }
+
+  .thinking-icon {
+    font-size: 11px;
+    flex-shrink: 0;
+  }
+
+  .thinking-label {
+    font-weight: 500;
+    flex-shrink: 0;
+  }
+
+  .thinking-meta {
+    color: $text-muted;
+    font-variant-numeric: tabular-nums;
+  }
+
+  .thinking-body {
+    margin-top: 6px;
+    padding: 6px 10px;
+    border-left: 2px solid $border-light;
+    font-size: 13px;
+    opacity: 0.85;
+    font-style: italic;
+
+    :deep(p) { margin: 0.3em 0; }
+  }
+}
+
 .message-time {
  font-size: 11px;
  color: $text-muted;
@@ -131,6 +131,12 @@ export default {
    arguments: 'Argumente',
    result: 'Ergebnis',
    truncated: '... (abgeschnitten)',
+    thinkingLabel: 'Denkprozess',
+    thinkingInProgress: 'Denkt…',
+    thinkingShow: 'Denkprozess anzeigen',
+    thinkingHide: 'Denkprozess ausblenden',
+    thinkingDuration: 'Beobachtet {duration}',
+    thinkingChars: '{count} Zeichen',
  },

  // Jobs
@@ -154,6 +154,12 @@ export default {
    arguments: 'Arguments',
    result: 'Result',
    truncated: '... (truncated)',
+    thinkingLabel: 'Thinking',
+    thinkingInProgress: 'Thinking…',
+    thinkingShow: 'Show thinking',
+    thinkingHide: 'Hide thinking',
+    thinkingDuration: 'Observed {duration}',
+    thinkingChars: '{count} chars',
  },

  // Jobs
@@ -131,6 +131,12 @@ export default {
    arguments: 'Argumentos',
    result: 'Resultado',
    truncated: '... (truncado)',
+    thinkingLabel: 'Pensamiento',
+    thinkingInProgress: 'Pensando…',
+    thinkingShow: 'Mostrar pensamiento',
+    thinkingHide: 'Ocultar pensamiento',
+    thinkingDuration: 'Observado {duration}',
+    thinkingChars: '{count} caracteres',
  },

  // Jobs
@@ -131,6 +131,12 @@ export default {
    arguments: 'Arguments',
    result: 'Resultat',
    truncated: '... (tronque)',
+    thinkingLabel: 'Raisonnement',
+    thinkingInProgress: 'En réflexion…',
+    thinkingShow: 'Afficher le raisonnement',
+    thinkingHide: 'Masquer le raisonnement',
+    thinkingDuration: 'Observé {duration}',
+    thinkingChars: '{count} caractères',
  },

  // Jobs
@@ -131,6 +131,12 @@ export default {
    arguments: '引数',
    result: '結果',
    truncated: '... (省略)',
+    thinkingLabel: '思考過程',
+    thinkingInProgress: '思考中…',
+    thinkingShow: '思考過程を表示',
+    thinkingHide: '思考過程を隠す',
+    thinkingDuration: '観測 {duration}',
+    thinkingChars: '{count} 文字',
  },

  // スケジュールジョブ
@@ -131,6 +131,12 @@ export default {
    arguments: '인수',
    result: '결과',
    truncated: '... (잘림)',
+    thinkingLabel: '사고 과정',
+    thinkingInProgress: '사고 중…',
+    thinkingShow: '사고 과정 펼치기',
+    thinkingHide: '사고 과정 접기',
+    thinkingDuration: '관측 {duration}',
+    thinkingChars: '{count}자',
  },

  // 예약 작업
@@ -131,6 +131,12 @@ export default {
    arguments: 'Argumentos',
    result: 'Resultado',
    truncated: '... (truncado)',
+    thinkingLabel: 'Raciocínio',
+    thinkingInProgress: 'Pensando…',
+    thinkingShow: 'Mostrar raciocínio',
+    thinkingHide: 'Ocultar raciocínio',
+    thinkingDuration: 'Observado {duration}',
+    thinkingChars: '{count} caracteres',
  },

  // Jobs
@@ -154,6 +154,12 @@ export default {
    arguments: '参数',
    result: '结果',
    truncated: '... (已截断)',
+    thinkingLabel: '思考过程',
+    thinkingInProgress: '思考中…',
+    thinkingShow: '展开思考过程',
+    thinkingHide: '收起思考过程',
+    thinkingDuration: '已观察 {duration}',
+    thinkingChars: '{count} 字',
  },

  // 定时任务
@@ -4,6 +4,7 @@ import { defineStore } from 'pinia'
 import { ref, computed } from 'vue'
 import { useAppStore } from './app'
 import { useProfilesStore } from './profiles'
+import { detectThinkingBoundary } from '@/utils/thinking-parser'

 export interface Attachment {
  id: string
@@ -26,6 +27,11 @@ export interface Message {
  toolStatus?: 'running' | 'done' | 'error'
  isStreaming?: boolean
  attachments?: Attachment[]
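+  // Thinking/reasoning text. Two sources:
+  //   1) Historical messages: from the HermesMessage.reasoning field
+  //   2) Streaming: accumulated from reasoning.delta / thinking.delta events
+  //      (reasoning.available only marks the end of timing and does not write here)
+  // Does not include the <think> wrapper tags; the content itself may be multi-paragraph plain text.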
+  reasoning?: string
 }

 export interface Session {
@@ -141,6 +147,7 @@ function mapHermesMessages(msgs: HermesMessage[]): Message[] {
      role: msg.role,
      content: msg.content || '',
      timestamp: Math.round(msg.timestamp * 1000),
+      reasoning: msg.reasoning ? msg.reasoning : undefined,
    })
  }
  return result
@@ -305,6 +312,26 @@ function sanitizeForCache(msgs: Message[]): Message[] {
  })
 }

+// Heals assistant messages whose `reasoning` field was polluted by the
+// old bug where `reasoning.available` clobbered it with the assistant
+// content. Detection heuristic: reasoning is a prefix of content (the
+// bug always derived `reasoning` from `content[:500]` with tags stripped).
+// Legitimate reasoning is almost never a prefix of the final answer.
+function scrubBuggyReasoningInCache(msgs: Message[] | null | undefined): Message[] {
+  if (!msgs) return []
+  return msgs.map(m => {
+    if (m.role !== 'assistant' || !m.reasoning || !m.content) return m
+    const r = m.reasoning.trim()
+    const c = m.content.trim()
+    if (!r || !c) return m
+    if (c === r || c.startsWith(r)) {
+      const { reasoning: _drop, ...rest } = m
+      return rest as Message
+    }
+    return m
+  })
+}
+
 export const useChatStore = defineStore('chat', () => {
  const sessions = ref<Session[]>([])
  const activeSessionId = ref<string | null>(null)
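The prefix heuristic used by `scrubBuggyReasoningInCache` in the hunk above can be exercised on its own. The sketch below restates the same rule against a reduced message shape; `CachedMessageSketch` and `scrubReasoningSketch` are hypothetical names for the example, and the real `Message` interface has more fields.

```typescript
interface CachedMessageSketch {
  id: string;
  role: "user" | "assistant" | "system" | "tool";
  content: string;
  reasoning?: string;
}

function scrubReasoningSketch(
  msgs: CachedMessageSketch[] | null | undefined,
): CachedMessageSketch[] {
  if (!msgs) return [];
  return msgs.map((m) => {
    if (m.role !== "assistant" || !m.reasoning || !m.content) return m;
    const r = m.reasoning.trim();
    const c = m.content.trim();
    if (!r || !c) return m;
    // Polluted entries hold a copy of the answer's beginning, so the
    // reasoning text equals the content or is a prefix of it.
    if (c === r || c.startsWith(r)) {
      const copy = { ...m };
      delete copy.reasoning;
      return copy;
    }
    return m;
  });
}

// Hypothetical cached data: the first entry is polluted, the second is legitimate.
const cachedExample: CachedMessageSketch[] = [
  { id: "a", role: "assistant", content: "The answer is 42. Details follow.", reasoning: "The answer is 42." },
  { id: "b", role: "assistant", content: "The answer is 42.", reasoning: "Let me compute 6 * 7 first." },
];
const healedExample = scrubReasoningSketch(cachedExample);
```

After the call, the first message no longer has a `reasoning` key while the second keeps its reasoning. The heuristic would also drop legitimate reasoning that happens to be a prefix of the answer, which the comment in the hunk treats as an acceptable rarity.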
@@ -471,7 +498,7 @@ export const useChatStore = defineStore('chat', () => {
          const cachedActive = cachedSessions.find(s => s.id === savedId) || null
          if (cachedActive) {
            const cachedMsgs = loadJsonWithFallback<Message[]>(msgsCacheKey(savedId), legacyMsgsCacheKey(savedId))
-            if (cachedMsgs) cachedActive.messages = cachedMsgs
+            if (cachedMsgs) cachedActive.messages = scrubBuggyReasoningInCache(cachedMsgs)
            activeSession.value = cachedActive
            activeSessionId.value = savedId
          }
@@ -561,6 +588,7 @@ export const useChatStore = defineStore('chat', () => {
  }

  async function switchSession(sessionId: string, focusId?: string | null) {
+    clearThinkingObservationFor(sessionId)
    activeSessionId.value = sessionId
    focusMessageId.value = focusId ?? null
    setItemBestEffort(storageKey(), sessionId)
@@ -577,7 +605,7 @@ export const useChatStore = defineStore('chat', () => {
    if (!hasLocalMessages) {
      const cachedMsgs = loadJsonWithFallback<Message[]>(msgsCacheKey(sessionId), legacyMsgsCacheKey(sessionId))
      if (cachedMsgs?.length) {
-        activeSession.value.messages = cachedMsgs
+        activeSession.value.messages = scrubBuggyReasoningInCache(cachedMsgs)
      }
    }

@@ -823,16 +851,66 @@ export const useChatStore = defineStore('chat', () => {
            case 'run.started':
              break

+            case 'reasoning.delta':
+            case 'thinking.delta': {
+              const text = evt.text || evt.delta || ''
+              if (!text) break
+              const msgs = getSessionMsgs(sid)
+              const last = msgs[msgs.length - 1]
+              if (last?.role === 'assistant' && last.isStreaming) {
+                last.reasoning = (last.reasoning || '') + text
+                noteReasoningStart(last.id)
+              } else {
+                const newId = uid()
+                addMessage(sid, {
+                  id: newId,
+                  role: 'assistant',
+                  content: '',
+                  timestamp: Date.now(),
+                  isStreaming: true,
+                  reasoning: text,
+                })
+                noteReasoningStart(newId)
+              }
+              schedulePersist()
+              break
+            }
+
+            case 'reasoning.available': {
+              // Upstream run_agent.py fires reasoning.available with
+              // `assistant_message.content[:500]` as the preview — i.e.,
+              // the main answer, not real reasoning. Ignore the payload
+              // and only use this event as a "thinking ended" signal so
+              // the duration counter stops.
+              const msgs = getSessionMsgs(sid)
+              const last = msgs[msgs.length - 1]
+              if (last?.role === 'assistant' && last.isStreaming) {
+                // Mark the end only if a reasoning.delta event started the timer earlier;
+                // otherwise (upstream forwarded no deltas and only sent this one available event) no duration is shown.
+                noteReasoningEnd(last.id)
+              }
+              schedulePersist()
+              break
+            }
+
            case 'message.delta': {
              const msgs = getSessionMsgs(sid)
              const last = msgs[msgs.length - 1]
              if (last?.role === 'assistant' && last.isStreaming) {
-                last.content += evt.delta || ''
+                const prev = last.content
+                const next = prev + (evt.delta || '')
+                noteThinkingDelta(last.id, prev, next)
+                // If reasoning was accumulated earlier, the arrival of content is treated as the end of reasoning.
+                if (last.reasoning) noteReasoningEnd(last.id)
+                last.content = next
              } else {
+                const newId = uid()
+                const nextContent = evt.delta || ''
+                noteThinkingDelta(newId, '', nextContent)
                addMessage(sid, {
-                  id: uid(),
+                  id: newId,
                  role: 'assistant',
-                  content: evt.delta || '',
+                  content: nextContent,
                  timestamp: Date.now(),
                  isStreaming: true,
                })
@@ -1013,6 +1091,52 @@ export const useChatStore = defineStore('chat', () => {
    })
  }

+  // Transient observation of <think> boundaries during active streaming.
+  // Not persisted; cleared on session switch. See spec §5.3.
+  const thinkingObservation = new Map<string, { startedAt?: number; endedAt?: number }>()
+
+  function getThinkingObservation(messageId: string) {
+    return thinkingObservation.get(messageId)
+  }
+
+  function noteThinkingDelta(messageId: string, prevContent: string, nextContent: string) {
+    const { startedAtBoundary, endedAtBoundary } = detectThinkingBoundary(prevContent, nextContent)
+    if (!startedAtBoundary && !endedAtBoundary) return
+    const existing = thinkingObservation.get(messageId) || {}
+    if (startedAtBoundary && existing.startedAt === undefined) {
+      existing.startedAt = Date.now()
+    }
+    if (endedAtBoundary && existing.endedAt === undefined) {
+      existing.endedAt = Date.now()
+    }
+    thinkingObservation.set(messageId, existing)
+  }
+
+  /** Stamp startedAt the first time reasoning text is seen for a message. */
+  function noteReasoningStart(messageId: string) {
+    const existing = thinkingObservation.get(messageId) || {}
+    if (existing.startedAt === undefined) {
+      existing.startedAt = Date.now()
+      thinkingObservation.set(messageId, existing)
+    }
+  }
+
+  /** Stamp endedAt when content first arrives (treated as the end of reasoning) or when reasoning.available is received explicitly. */
+  function noteReasoningEnd(messageId: string) {
+    const existing = thinkingObservation.get(messageId)
+    if (!existing || existing.startedAt === undefined) return
+    if (existing.endedAt === undefined) {
+      existing.endedAt = Date.now()
+      thinkingObservation.set(messageId, existing)
+    }
+  }
+
+  function clearThinkingObservationFor(_sessionId: string) {
+    // No messageId-to-sessionId mapping is kept, so switching sessions clears everything.
+    // This matches the spec: an observation is transient state scoped to the current session.
+    thinkingObservation.clear()
+  }
+
  return {
    sessions,
    activeSessionId,
@@ -1034,5 +1158,10 @@ export const useChatStore = defineStore('chat', () => {
    stopStreaming,
    loadSessions,
    refreshActiveSession,
+    getThinkingObservation,
+    noteThinkingDelta,
+    noteReasoningStart,
+    noteReasoningEnd,
+    clearThinkingObservationFor,
  }
 })
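The observation map above stores only `startedAt` and `endedAt`, so any displayed duration has to be derived at read time. A minimal sketch of that derivation follows, assuming a hypothetical helper named `thinkingDurationMs` (it is not part of this diff) and a caller-supplied clock value:

```typescript
// Hypothetical helper, not part of this diff. It derives a display duration
// from an entry shaped like the values held in thinkingObservation.
interface Observation {
  startedAt?: number
  endedAt?: number
}

function thinkingDurationMs(ob: Observation | undefined, now: number): number | null {
  // No observation, or thinking never started: there is nothing to display.
  if (!ob || ob.startedAt === undefined) return null
  // While thinking is still in progress, measure against the caller's clock.
  const end = ob.endedAt ?? now
  // Clamp to zero in case the system clock moved backwards.
  return Math.max(0, end - ob.startedAt)
}

// Usage: a component could call this from its own interval tick.
const inProgress = thinkingDurationMs({ startedAt: 100 }, 350)
const finished = thinkingDurationMs({ startedAt: 100, endedAt: 180 }, 999)
```

Passing `now` in as a parameter keeps the helper pure, so a component-side interval can drive re-rendering without the parser or store recomputing anything.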
@@ -0,0 +1,99 @@
+export interface ParsedThinking {
+  segments: string[]
+  pending: string | null
+  body: string
+  hasThinking: boolean
+}
+
+export interface ParseOptions {
+  streaming: boolean
+}
+
+const TAG_RE = /<(think|thinking|reasoning)>([\s\S]*?)<\/\1>/gi
+
+const PLACEHOLDER_PREFIX = '\u0000THKCODE'
+const PLACEHOLDER_SUFFIX = '\u0000'
+
+const FENCED_RE = /(```|~~~)([\s\S]*?)\1/g
+const INLINE_CODE_RE = /`[^`\n]*`/g
+
+function protectCodeBlocks(input: string): { masked: string; blocks: string[] } {
+  const blocks: string[] = []
+  let masked = input.replace(FENCED_RE, (m) => {
+    blocks.push(m)
+    return `${PLACEHOLDER_PREFIX}${blocks.length - 1}${PLACEHOLDER_SUFFIX}`
+  })
+  masked = masked.replace(INLINE_CODE_RE, (m) => {
+    blocks.push(m)
+    return `${PLACEHOLDER_PREFIX}${blocks.length - 1}${PLACEHOLDER_SUFFIX}`
+  })
+  return { masked, blocks }
+}
+
+function restoreCodeBlocks(text: string, blocks: string[]): string {
+  if (blocks.length === 0) return text
+  return text.replace(
+    new RegExp(`${PLACEHOLDER_PREFIX}(\\d+)${PLACEHOLDER_SUFFIX}`, 'g'),
+    (_, idx) => blocks[Number(idx)] ?? '',
+  )
+}
+
+export function parseThinking(content: string, opts: ParseOptions): ParsedThinking {
+  const { masked, blocks } = protectCodeBlocks(content)
+
+  const segments: string[] = []
+  let pending: string | null = null
+  let body = ''
+  let lastIndex = 0
+
+  TAG_RE.lastIndex = 0
+  let m: RegExpExecArray | null
+  while ((m = TAG_RE.exec(masked)) !== null) {
+    body += masked.slice(lastIndex, m.index)
+    segments.push(m[2])
+    lastIndex = m.index + m[0].length
+  }
+  const rest = masked.slice(lastIndex)
+
+  const openRe = /<(think|thinking|reasoning)>([\s\S]*)$/i
+  const openMatch = rest.match(openRe)
+  if (openMatch) {
+    body += rest.slice(0, openMatch.index)
+    if (opts.streaming) {
+      pending = openMatch[2]
+    } else {
+      body += rest.slice(openMatch.index!)
+    }
+  } else {
+    body += rest
+  }
+
+  return {
+    segments: segments.map(s => restoreCodeBlocks(s, blocks)),
+    pending: pending === null ? null : restoreCodeBlocks(pending, blocks),
+    body: restoreCodeBlocks(body, blocks),
+    hasThinking: segments.length > 0 || pending !== null,
+  }
+}
+
+export function countThinkingChars(parsed: ParsedThinking): number {
+  const len = (s: string) => [...s].length
+  return parsed.segments.reduce((a, s) => a + len(s), 0) + len(parsed.pending || '')
+}
+
+export interface ThinkingBoundary {
+  startedAtBoundary: boolean
+  endedAtBoundary: boolean
+}
+
+const ANY_OPEN_RE = /<(think|thinking|reasoning)>/i
+const ANY_CLOSE_RE = /<\/(think|thinking|reasoning)>/i
+
+export function detectThinkingBoundary(prev: string, next: string): ThinkingBoundary {
+  const prevMasked = protectCodeBlocks(prev).masked
+  const nextMasked = protectCodeBlocks(next).masked
+  return {
+    startedAtBoundary: !ANY_OPEN_RE.test(prevMasked) && ANY_OPEN_RE.test(nextMasked),
+    endedAtBoundary: !ANY_CLOSE_RE.test(prevMasked) && ANY_CLOSE_RE.test(nextMasked),
+  }
+}
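The `streaming` flag in `parseThinking` decides what happens to a trailing unclosed tag: it is shown as pending while the stream is live, and degraded to body text once the stream has ended. The sketch below isolates that one decision. It is a simplified stand-in, assuming a hypothetical `parseThinkingSimple` that deliberately omits the code-block masking done in the real parser:

```typescript
// Simplified sketch, not the parser from this diff: no code-block masking.
interface SimpleParsed {
  segments: string[]
  pending: string | null
  body: string
}

function parseThinkingSimple(content: string, streaming: boolean): SimpleParsed {
  // A fresh regex per call avoids sharing lastIndex state between calls.
  const closed = /<(think|thinking|reasoning)>([\s\S]*?)<\/\1>/gi
  const segments: string[] = []
  let body = ''
  let lastIndex = 0
  let m: RegExpExecArray | null
  while ((m = closed.exec(content)) !== null) {
    body += content.slice(lastIndex, m.index)
    segments.push(m[2])
    lastIndex = m.index + m[0].length
  }
  const rest = content.slice(lastIndex)
  const open = /<(think|thinking|reasoning)>([\s\S]*)$/i.exec(rest)
  // No dangling opening tag: everything left over is body text.
  if (!open) return { segments, pending: null, body: body + rest }
  // Live stream: hold the unclosed text back as pending thinking.
  if (streaming) {
    return { segments, pending: open[2], body: body + rest.slice(0, open.index) }
  }
  // Finished stream: keep the orphan tag in the body so no answer text is lost.
  return { segments, pending: null, body: body + rest }
}

const live = parseThinkingSimple('body<think>in-progress', true)
const done = parseThinkingSimple('body<think>in-progress', false)
```

The terminal branch keeps the raw `<think>` text visible on purpose: hiding it would swallow an answer whenever a model forgets to close the tag.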
@@ -0,0 +1,191 @@
+// @vitest-environment jsdom
+import { beforeEach, describe, expect, it, vi } from 'vitest'
+import { createPinia, setActivePinia } from 'pinia'
+
+const mockChatApi = vi.hoisted(() => ({
+  startRun: vi.fn(),
+  streamRunEvents: vi.fn(),
+}))
+
+const mockSessionsApi = vi.hoisted(() => ({
+  fetchSessions: vi.fn(),
+  fetchSession: vi.fn(),
+  deleteSession: vi.fn(),
+  renameSession: vi.fn(),
+  fetchSessionUsageSingle: vi.fn(),
+}))
+
+vi.mock('@/api/hermes/chat', () => mockChatApi)
+vi.mock('@/api/hermes/sessions', () => mockSessionsApi)
+
+import { useChatStore } from '@/stores/hermes/chat'
+
+const PROFILE = 'default'
+
+async function flush() {
+  for (let i = 0; i < 4; i += 1) await Promise.resolve()
+}
+
+type EventHandler = (evt: any) => void
+
+function setupStream(events: Array<any>) {
+  mockChatApi.streamRunEvents.mockImplementation((
+    _runId: string,
+    onEvent: EventHandler,
+  ) => {
+    // Fire all events back to back inside a single microtask so they land on
+    // the same streaming message that sendMessage just created.
+    queueMicrotask(() => {
+      for (const e of events) onEvent(e)
+    })
+    return { abort: vi.fn() }
+  })
+}
+
+describe('chat store — reasoning.available should not clobber content', () => {
+  beforeEach(() => {
+    setActivePinia(createPinia())
+    vi.clearAllMocks()
+    window.localStorage.clear()
+    mockSessionsApi.fetchSessions.mockResolvedValue([])
+    mockSessionsApi.fetchSession.mockResolvedValue(null)
+    mockSessionsApi.fetchSessionUsageSingle?.mockResolvedValue?.(null)
+    mockChatApi.startRun.mockResolvedValue({ run_id: 'run-1', status: 'queued' })
+  })
+
+  it('keeps streamed reasoning.delta when a later reasoning.available carries the assistant content (upstream bug)', async () => {
+    // Simulates the bug path from hermes-agent run_agent.py:11275, which
+    // fires reasoning.available with `assistant_message.content[:500]` as
+    // the preview — i.e., the *main answer*, not real reasoning.
+    // The store must not replace the already-accumulated reasoning with
+    // the content payload.
+    setupStream([
+      { event: 'run.started', run_id: 'run-1' },
+      { event: 'reasoning.delta', run_id: 'run-1', text: 'Let me think ' },
+      { event: 'reasoning.delta', run_id: 'run-1', text: 'about this.' },
+      { event: 'message.delta', run_id: 'run-1', delta: 'The answer is 42.' },
+      // Upstream misclassification: text == the assistant content
+      { event: 'reasoning.available', run_id: 'run-1', text: 'The answer is 42.' },
+      { event: 'run.completed', run_id: 'run-1' },
+    ])
+
+    const store = useChatStore()
+    await flush()
+    await store.sendMessage('hi')
+    await flush()
+    await flush()
+
+    const asst = store.messages.find(m => m.role === 'assistant')
+    expect(asst).toBeDefined()
+    expect(asst!.content).toBe('The answer is 42.')
+    expect(asst!.reasoning).toBe('Let me think about this.')
+  })
+
+  it('also rejects reasoning.available when delta-less stream already flushed content', async () => {
+    // Upstream main (no PR #15169) does not emit reasoning.delta at all.
+    // The only reasoning-flavored event is the misclassified reasoning.available
+    // carrying content as the text. We still must not write it into the
+    // thinking block, because content has already arrived — that's a strong
+    // signal the payload is the content-misclassification bug.
+    setupStream([
+      { event: 'run.started', run_id: 'run-1' },
+      { event: 'message.delta', run_id: 'run-1', delta: 'Plain answer.' },
+      { event: 'reasoning.available', run_id: 'run-1', text: 'Plain answer.' },
+      { event: 'run.completed', run_id: 'run-1' },
+    ])
+
+    const store = useChatStore()
+    await flush()
+    await store.sendMessage('hi')
+    await flush()
+    await flush()
+
+    const asst = store.messages.find(m => m.role === 'assistant')
+    expect(asst).toBeDefined()
+    expect(asst!.content).toBe('Plain answer.')
+    // No delta events arrived and content already present → still must not
+    // hijack the thinking block. Leave it empty so the UI simply doesn't show
+    // a thinking block (better than showing the answer twice).
+    expect(asst!.reasoning ?? '').toBe('')
+  })
+
+  it('marks reasoning end-of-thinking observation even when the payload is ignored', async () => {
+    // We drop reasoning.available's text payload because upstream misclassifies
+    // content as reasoning preview (see run_agent.py:11275). But we still want
+    // the event to serve as an "end-of-thinking" signal so the UI can stop
+    // the thinking-duration counter for messages that had reasoning.delta.
+    setupStream([
+      { event: 'run.started', run_id: 'run-1' },
+      { event: 'reasoning.delta', run_id: 'run-1', text: 'pondering…' },
+      { event: 'message.delta', run_id: 'run-1', delta: 'done' },
+      { event: 'reasoning.available', run_id: 'run-1', text: 'done' },
+      { event: 'run.completed', run_id: 'run-1' },
+    ])
+
+    const store = useChatStore()
+    await flush()
+    await store.sendMessage('hi')
+    await flush()
+    await flush()
+
+    const asst = store.messages.find(m => m.role === 'assistant')
+    expect(asst).toBeDefined()
+    // reasoning preserved (not clobbered)
+    expect(asst!.reasoning).toBe('pondering…')
+    // thinking observation must have endedAt stamped
+    const ob = store.getThinkingObservation(asst!.id)
+    expect(ob?.endedAt).toBeDefined()
+  })
+
+  it('heals old localStorage cache where reasoning was clobbered with content', async () => {
+    // Users who ran the previous buggy version have sessions in
+    // localStorage where assistant.reasoning === assistant.content (or
+    // reasoning is a prefix of content because the bug truncated to 500
+    // chars). Hydration must drop such stale reasoning so the UI doesn't
+    // flash the wrong thinking block before fetchSession completes.
+    const sid = 'sess-cache'
+    window.localStorage.setItem(`hermes_active_session_${PROFILE}`, sid)
+    window.localStorage.setItem(
+      `hermes_sessions_cache_v1_${PROFILE}`,
+      JSON.stringify([
+        {
+          id: sid,
+          title: 'Corrupted',
+          source: 'api_server',
+          messages: [],
+          createdAt: 1,
+          updatedAt: 1,
+        },
+      ]),
+    )
+    window.localStorage.setItem(
+      `hermes_session_msgs_v1_${PROFILE}_${sid}_`,
+      JSON.stringify([
+        { id: 'u', role: 'user', content: 'ask', timestamp: 1 },
+        {
+          id: 'a',
+          role: 'assistant',
+          content: 'The capital of France is Paris. It sits on the Seine.',
+          reasoning: 'The capital of France is Paris.', // prefix of content — buggy
+          timestamp: 2,
+        },
+        {
+          id: 'b',
+          role: 'assistant',
+          content: 'Another answer.',
+          reasoning: 'Real thinking that happens before the answer.', // legitimate
+          timestamp: 3,
+        },
+      ]),
+    )
+
+    const store = useChatStore()
+    await store.loadSessions()
+
+    const hydrated = store.messages
+    const a = hydrated.find(m => m.id === 'a')!
+    const b = hydrated.find(m => m.id === 'b')!
+    expect(a.reasoning).toBeUndefined()
+    expect(b.reasoning).toBe('Real thinking that happens before the answer.')
+  })
+})
@@ -0,0 +1,90 @@
+// @vitest-environment jsdom
+import { describe, it, expect, beforeEach } from 'vitest'
+import { setActivePinia, createPinia } from 'pinia'
+import { useChatStore } from '@/stores/hermes/chat'
+
+describe('chat store thinkingObservation', () => {
+  beforeEach(() => {
+    setActivePinia(createPinia())
+  })
+
+  it('starts empty', () => {
+    const store = useChatStore()
+    expect(store.getThinkingObservation('any-id')).toBeUndefined()
+  })
+
+  it('records startedAt when delta first introduces an opening tag', () => {
+    const store = useChatStore()
+    store.noteThinkingDelta('msg-1', '', '<think>hi')
+    const ob = store.getThinkingObservation('msg-1')
+    expect(ob).toBeDefined()
+    expect(typeof ob!.startedAt).toBe('number')
+    expect(ob!.endedAt).toBeUndefined()
+  })
+
+  it('records endedAt when delta first introduces closing tag', () => {
+    const store = useChatStore()
+    store.noteThinkingDelta('msg-1', '', '<think>hi')
+    store.noteThinkingDelta('msg-1', '<think>hi', '<think>hi</think>done')
+    const ob = store.getThinkingObservation('msg-1')
+    expect(ob!.startedAt).toBeDefined()
+    expect(typeof ob!.endedAt).toBe('number')
+  })
+
+  it('is idempotent for subsequent openings/closings', () => {
+    const store = useChatStore()
+    store.noteThinkingDelta('m', '', '<think>a</think>')
+    const first = store.getThinkingObservation('m')!
+    const firstStarted = first.startedAt
+    const firstEnded = first.endedAt
+    store.noteThinkingDelta(
+      'm',
+      '<think>a</think>',
+      '<think>a</think><think>b</think>',
+    )
+    const second = store.getThinkingObservation('m')!
+    expect(second.startedAt).toBe(firstStarted)
+    expect(second.endedAt).toBe(firstEnded)
+  })
+
+  it('is ignored when delta is inside a code block', () => {
+    const store = useChatStore()
+    store.noteThinkingDelta('m', '', '```\n<think>fake</think>\n```')
+    expect(store.getThinkingObservation('m')).toBeUndefined()
+  })
+
+  it('clears observations on clearThinkingObservationFor', () => {
+    const store = useChatStore()
+    store.noteThinkingDelta('m', '', '<think>hi</think>')
+    expect(store.getThinkingObservation('m')).toBeDefined()
+    store.clearThinkingObservationFor('any-session')
+    expect(store.getThinkingObservation('m')).toBeUndefined()
+  })
+
+  it('noteReasoningStart records startedAt only once', () => {
+    const store = useChatStore()
+    store.noteReasoningStart('r1')
+    const t1 = store.getThinkingObservation('r1')!.startedAt
+    expect(typeof t1).toBe('number')
+    store.noteReasoningStart('r1')
+    expect(store.getThinkingObservation('r1')!.startedAt).toBe(t1)
+  })
+
+  it('noteReasoningEnd requires prior start', () => {
+    const store = useChatStore()
+    store.noteReasoningEnd('r2')
+    expect(store.getThinkingObservation('r2')).toBeUndefined()
+    store.noteReasoningStart('r2')
+    store.noteReasoningEnd('r2')
+    expect(store.getThinkingObservation('r2')!.endedAt).toBeDefined()
+  })
+
+  it('noteReasoningEnd is idempotent', () => {
+    const store = useChatStore()
+    store.noteReasoningStart('r3')
+    store.noteReasoningEnd('r3')
+    const end1 = store.getThinkingObservation('r3')!.endedAt
+    store.noteReasoningEnd('r3')
+    expect(store.getThinkingObservation('r3')!.endedAt).toBe(end1)
+  })
+})
@@ -1,6 +1,7 @@
 // @vitest-environment jsdom
 import { beforeEach, describe, expect, it, vi } from 'vitest'
 import { mount } from '@vue/test-utils'
+import { createPinia, setActivePinia } from 'pinia'

 vi.mock('vue-i18n', () => ({
  useI18n: () => ({
@@ -22,6 +23,7 @@ import type { Message } from '@/stores/hermes/chat'

 describe('MessageItem tool details', () => {
  beforeEach(() => {
+    setActivePinia(createPinia())
    Object.defineProperty(navigator, 'clipboard', {
      configurable: true,
      value: {
@@ -0,0 +1,167 @@
+import { describe, it, expect } from 'vitest'
+import { parseThinking, countThinkingChars, detectThinkingBoundary } from '@/utils/thinking-parser'
+
+describe('parseThinking', () => {
+  it('splits a single closed <think> block from body', () => {
+    const r = parseThinking('<think>inner</think>body', { streaming: false })
+    expect(r.segments).toEqual(['inner'])
+    expect(r.body).toBe('body')
+    expect(r.pending).toBeNull()
+    expect(r.hasThinking).toBe(true)
+  })
+
+  it('collects multiple closed blocks in order', () => {
+    const r = parseThinking('<think>a</think>mid<thinking>b</thinking>end', { streaming: false })
+    expect(r.segments).toEqual(['a', 'b'])
+    expect(r.body).toBe('midend')
+  })
+
+  it('supports the <reasoning> variant', () => {
+    const r = parseThinking('<reasoning>r</reasoning>body', { streaming: false })
+    expect(r.segments).toEqual(['r'])
+    expect(r.body).toBe('body')
+  })
+
+  it('is case-insensitive on tag names', () => {
+    const r = parseThinking('<Think>x</Think><REASONING>y</REASONING>z', { streaming: false })
+    expect(r.segments).toEqual(['x', 'y'])
+    expect(r.body).toBe('z')
+  })
+
+  it('returns hasThinking=false and body unchanged for plain text', () => {
+    const r = parseThinking('hello world', { streaming: false })
+    expect(r.hasThinking).toBe(false)
+    expect(r.body).toBe('hello world')
+    expect(r.segments).toEqual([])
+  })
+
+  it('returns hasThinking=false for empty content', () => {
+    const r = parseThinking('', { streaming: false })
+    expect(r.hasThinking).toBe(false)
+    expect(r.body).toBe('')
+  })
+
+  it('treats trailing unclosed tag as pending when streaming', () => {
+    const r = parseThinking('body<think>in-progress', { streaming: true })
+    expect(r.pending).toBe('in-progress')
+    expect(r.body).toBe('body')
+    expect(r.segments).toEqual([])
+    expect(r.hasThinking).toBe(true)
+  })
+
+  it('degrades trailing unclosed tag to body when NOT streaming (terminal state)', () => {
+    const r = parseThinking('body<think>orphan', { streaming: false })
+    expect(r.pending).toBeNull()
+    expect(r.body).toBe('body<think>orphan')
+    expect(r.segments).toEqual([])
+    expect(r.hasThinking).toBe(false)
+  })
+
+  it('combines closed segments with trailing pending (streaming)', () => {
+    const r = parseThinking('<think>done</think>mid<thinking>now', { streaming: true })
+    expect(r.segments).toEqual(['done'])
+    expect(r.pending).toBe('now')
+    expect(r.body).toBe('mid')
+  })
+
+  it('does NOT recognize <think> inside fenced code block', () => {
+    const src = 'before\n```\n<think>fake</think>\n```\nafter'
+    const r = parseThinking(src, { streaming: false })
+    expect(r.hasThinking).toBe(false)
+    expect(r.body).toBe(src)
+  })
+
+  it('does NOT recognize <think> inside tilde-fenced code block', () => {
+    const src = '~~~\n<think>fake</think>\n~~~'
+    const r = parseThinking(src, { streaming: false })
+    expect(r.hasThinking).toBe(false)
+    expect(r.body).toBe(src)
+  })
+
+  it('does NOT recognize <think> inside inline code', () => {
+    const src = 'the tag `<think>x</think>` is a literal'
+    const r = parseThinking(src, { streaming: false })
+    expect(r.hasThinking).toBe(false)
+    expect(r.body).toBe(src)
+  })
+
+  it('parses real <think> outside code blocks even when code blocks contain fake ones', () => {
+    const src = '<think>real</think>text\n```\n<think>fake</think>\n```'
+    const r = parseThinking(src, { streaming: false })
+    expect(r.segments).toEqual(['real'])
+    expect(r.body).toBe('text\n```\n<think>fake</think>\n```')
+  })
+
+  it('same-name nesting: inner tag absorbed into first segment (documented limitation)', () => {
+    const r = parseThinking('<think>a<think>b</think>c</think>', { streaming: false })
+    expect(r.segments).toEqual(['a<think>b'])
+    expect(r.body).toBe('c</think>')
+  })
+
+  it('handles chunk boundary: partial opening tag not yet identified', () => {
+    const mid = parseThinking('<thin', { streaming: true })
+    expect(mid.hasThinking).toBe(false)
+    expect(mid.body).toBe('<thin')
+
+    const after = parseThinking('<think>hi</think>done', { streaming: true })
+    expect(after.segments).toEqual(['hi'])
+    expect(after.body).toBe('done')
+  })
+})
+
+describe('countThinkingChars', () => {
+  it('counts all segments + pending as Unicode chars', () => {
+    const n = countThinkingChars({
+      segments: ['abc', '你好'],
+      pending: '🎉!',
+      body: '',
+      hasThinking: true,
+    })
+    expect(n).toBe(7)
+  })
+
+  it('returns 0 when no thinking', () => {
+    expect(countThinkingChars({ segments: [], pending: null, body: 'x', hasThinking: false })).toBe(0)
+  })
+})
+
+describe('detectThinkingBoundary', () => {
+  it('detects first appearance of opening tag', () => {
+    const r = detectThinkingBoundary('', '<think>x')
+    expect(r.startedAtBoundary).toBe(true)
+    expect(r.endedAtBoundary).toBe(false)
+  })
+
+  it('detects first appearance of closing tag', () => {
+    const r = detectThinkingBoundary('<think>hi', '<think>hi</think>')
+    expect(r.startedAtBoundary).toBe(false)
+    expect(r.endedAtBoundary).toBe(true)
+  })
+
+  it('detects both when both emerge in one delta', () => {
+    const r = detectThinkingBoundary('', '<think>x</think>')
+    expect(r.startedAtBoundary).toBe(true)
+    expect(r.endedAtBoundary).toBe(true)
+  })
+
+  it('reports no boundary when neither crossed', () => {
+    const r = detectThinkingBoundary('abc', 'abcdef')
+    expect(r.startedAtBoundary).toBe(false)
+    expect(r.endedAtBoundary).toBe(false)
+  })
+
+  it('ignores fake tags inside code blocks', () => {
+    const r = detectThinkingBoundary('', '```\n<think>fake</think>\n```')
+    expect(r.startedAtBoundary).toBe(false)
+    expect(r.endedAtBoundary).toBe(false)
+  })
+
+  it('is idempotent for repeated open/close after initial', () => {
+    const r = detectThinkingBoundary(
+      '<think>a</think><think>b',
+      '<think>a</think><think>b</think>',
+    )
+    expect(r.startedAtBoundary).toBe(false)
+    expect(r.endedAtBoundary).toBe(false)
+  })
+})
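The code-block cases above all use a fence that is already closed. During streaming a fence can still be open when a delta arrives, and the masking regexes only match closed fences. The sketch below reproduces the masking and opening-tag patterns from the diff in a self-contained form (the names `mask` and `opensThinking` are illustrative, and `\x60` stands for a backtick) to show what appears to happen in that case:

```typescript
// Patterns copied from the diff, with backticks written as \x60.
const FENCED = /(\x60{3}|~~~)([\s\S]*?)\1/g
const INLINE = /\x60[^\x60\n]*\x60/g
const OPEN = /<(think|thinking|reasoning)>/i

// Illustrative stand-in for protectCodeBlocks: replace code spans with a marker.
function mask(input: string): string {
  return input.replace(FENCED, '\u0000').replace(INLINE, '\u0000')
}

// Illustrative stand-in for the startedAtBoundary half of detectThinkingBoundary.
function opensThinking(prev: string, next: string): boolean {
  return !OPEN.test(mask(prev)) && OPEN.test(mask(next))
}

const fence = '\x60'.repeat(3)

// Closed fence: the whole block is masked, so no boundary is reported.
const closedFence = opensThinking('', `${fence}\n<think>fake</think>\n${fence}`)

// Open fence mid-stream: nothing masks the tag, so a boundary is reported.
const openFence = opensThinking('', `${fence}\n<think>fake`)
```

Under these assumptions `closedFence` is `false` and `openFence` is `true`, which suggests a streamed code sample could stamp `startedAt` before its closing fence arrives. Whether that matters in practice depends on how often models emit these tags inside code, so treat it as a case worth a test rather than a confirmed defect.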