AI Search Evaluation ④ Answer Length: How Many Characters It Returns to the User on Average

04-answer-length.md · updated 2026-07-16 · 2394 words

Applied Sciences · Engineering

AI Search Metric ④ — Answer Length

Answer Length is the average character count of an AI search engine's answer text, measured in Unicode code points of the Japanese-language response. It indicates each engine's explanatory stance and is used to analyze trade-offs with accuracy and latency.

Why It Matters

Too long → users stop reading (click drop-off); too short → low information density.
The longer the answer, the more room there is for misinformation to creep in (trade-off with ③ accuracy).
Weak positive correlation with ⑧ latency (response time); HELM (Liang et al. 2023) also records it as an auxiliary metric.
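
One way to check the claimed weak positive correlation with latency is a Pearson coefficient over per-answer (length, latency) pairs. The sketch below is illustrative only: the `pearson` helper and the sample numbers are made up for this example and are not measurements from the project.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


# Hypothetical per-answer data: length in code points, latency in seconds.
lengths = [320, 850, 1400, 600, 1100]
latencies = [2.1, 3.0, 2.8, 3.4, 2.5]

r = pearson(lengths, latencies)
print(f"length vs latency: r = {r:.2f}")
```

A value of r near 0 with a positive sign would match the "weak positive" description; with only a handful of answers the estimate is noisy, so it is best computed over the full query set.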

Formula

Answer Length = Σ(character count per answer) ÷ (number of answers).
Measured in Unicode code points of the Japanese text, not tokens; a half-width alphanumeric also counts as 1 character.
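
The formula can be sketched in a few lines of Python, where `len()` on a `str` counts Unicode code points, so half-width and full-width characters each count as 1. The function name `answer_length` and the sample answers are assumptions made for illustration.

```python
def answer_length(answers):
    """Mean answer length in Unicode code points (not tokens, not bytes)."""
    if not answers:
        raise ValueError("need at least one answer")
    # len() on a Python str counts code points, so "A" and "東" are both 1.
    return sum(len(answer) for answer in answers) / len(answers)


# Illustrative answers: 11 and 15 code points respectively.
sample = ["東京は日本の首都です。", "GPT-4は2023年に公開。"]
print(answer_length(sample))  # → 13.0
```

Counting code points means a character built from several code points (for example, an emoji with a modifier or a base letter plus a combining mark) counts as more than 1. Whether to normalize the text first is a design decision the formula above does not settle.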

Engine Benchmarks (Hypothetical)

ChatGPT Search: ~850 chars (moderate); Gemini Pro: ~1,400 chars (verbose).
AI Overview: ~320 chars — short enough to finish reading on the SERP itself.
AI Overview's brevity weakly correlates with a lower ① citation rate.

Usage in the ai-search Project

Measured across all 600 target queries; the distribution is broken down by engine × intent category.
Answers to factual queries tend to be short; answers to comparison queries tend to be long (length is intent-dependent).
Serves as a precondition for C4 (Agent Task Completion Fidelity); not a standalone construct.
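
The engine × intent breakdown described above could be computed along these lines. This is a minimal sketch: the record layout `(engine, intent, answer)`, the function name, and the placeholder engine and intent labels are hypothetical, not the project's actual schema.

```python
from collections import defaultdict
from statistics import mean


def length_by_engine_and_intent(records):
    """Mean code-point length per (engine, intent) pair.

    `records` is an iterable of (engine, intent, answer_text) tuples.
    """
    groups = defaultdict(list)
    for engine, intent, answer in records:
        groups[(engine, intent)].append(len(answer))
    return {key: mean(lengths) for key, lengths in groups.items()}


# Hypothetical records; the answer texts are placeholders of known length.
records = [
    ("engine_a", "factual", "あ" * 10),
    ("engine_a", "factual", "あ" * 20),
    ("engine_a", "comparison", "あ" * 60),
    ("engine_b", "factual", "あ" * 30),
]

for (engine, intent), avg in sorted(length_by_engine_and_intent(records).items()):
    print(engine, intent, avg)
```

Reporting the median or quartiles alongside the mean in each cell would show the spread of the distribution rather than a single average.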

→ Answer Length is an auxiliary signal: pair it with accuracy and latency to understand each engine's explanatory stance.
