init: import team knowledge base content

yueqian-ai
2026-05-14 16:56:48 +08:00
commit acca2041f0
1681 changed files with 285734 additions and 0 deletions
@@ -0,0 +1,544 @@
---
name: deep-research
description: |
Generate format-controlled research reports with evidence tracking, citations, source governance, and multi-pass synthesis.
This skill should be used when users request a research report, literature review, market or industry analysis,
competitive landscape, policy or technical brief. Triggers: "帮我调研一下", "深度研究", "综述报告", "深入分析",
"research this topic", "write a report on", "survey the literature on", "competitive analysis of",
"技术选型分析", "竞品研究", "政策分析", "行业报告".
V6 adds: source-type governance, AS_OF freshness checks, mandatory counter-review, and citation registry. V6.1 adds: source accessibility (circular verification forbidden, exclusive advantage encouraged).
---
# Deep Research
Create high-fidelity research reports with strict format control, evidence mapping, source governance, and multi-pass synthesis.
## Architecture: Lead Agent + Subagents
```
Lead Agent (coordinator — minimizes raw search context)
|
P0: Environment + source policy setup
|
P1: Research Task Board (roles, queries, parallel groups)
|
P2: Dispatch ──→ Subagent A ──→ writes task-a.md ──┐
──→ Subagent B ──→ writes task-b.md ──┤ (parallel)
──→ Subagent C ──→ writes task-c.md ──┘
| |
| research-notes/ <────────────────────────┘
|
P3: Build citation registry with source_type + as_of + authority
P4: Evidence-mapped outline with counter-claim flags
P5: Draft from notes (never from raw search results)
P6: Counter-review (claims, confidence, alternatives)
P7: Verify (every [n] in registry, traceability check)
Polish → final report with confidence markers
```
**Context efficiency:** Subagents' raw search results stay in their context and are discarded. Lead agent sees only distilled notes (~60-70% context reduction).
## Mode Selection
Determine the research mode before starting:
| Dimension | Options |
|-----------|---------|
| **Topic Mode** | Enterprise Research (company/corporation) OR General Research (industry/policy/tech) |
| **Depth Mode** | Standard (5-6 tasks, 3000-8000 words) OR Lightweight (3-4 tasks, 2000-4000 words) |
- **Enterprise Research Mode**: Six-dimension data collection with structured analysis frameworks (SWOT, risk matrix, competitive barrier quantification)
- **General Research Mode**: Standard P0-P7 research pipeline with source governance
- **Depth Selection**: Lightweight when the request names a single entity or concept and is under 30 words; Standard for multi-entity comparisons or requests containing "深入" or "comprehensive"
## Source Governance (V6)
### Source Accessibility Classification
**CRITICAL RULE**: Every source must be classified by accessibility:
| Accessibility | Definition | Examples | Usage Rule |
|--------------|------------|----------|------------|
| `public` | Available to any external researcher without authentication | Public websites, news articles, WHOIS (without privacy), academic papers | ✅ Always allowed |
| `semi-public` | Requires registration or limited access | LinkedIn profiles, Crunchbase basic, industry reports (free tier) | ✅ Allowed with disclosure |
| `exclusive-user-provided` | User's paid subscriptions, private APIs, proprietary databases | Crunchbase Pro, PitchBook, private data feeds, internal databases | ✅ **ALLOWED** for third-party research |
| `private-user-owned` | User's own accounts when researching themselves | User's registrar for user's own company, user's bank for user's own finances | ❌ **FORBIDDEN** - circular verification |
**⚠️ CIRCULAR VERIFICATION BAN**: You must NOT:
- Use user's private data to "discover" what they already know about themselves
- Research user's own company by accessing user's private accounts
- Present user's private knowledge as "research findings"
**✅ EXCLUSIVE INFORMATION ADVANTAGE**: You SHOULD:
- Use user's Crunchbase Pro to research competitors
- Use user's proprietary databases for market research
- Use user's private APIs for investment analysis
- Leverage any exclusive source user provides for third-party research
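The accessibility table and the two rules above can be read as a small decision function. The sketch below is one possible encoding, not part of the skill itself: the names `Accessibility` and `usage_rule` are hypothetical, and rejecting `exclusive-user-provided` sources when the research subject is the user is an assumption drawn from the "third-party research" wording.

```python
from enum import Enum


class Accessibility(Enum):
    PUBLIC = "public"
    SEMI_PUBLIC = "semi-public"
    EXCLUSIVE_USER_PROVIDED = "exclusive-user-provided"
    PRIVATE_USER_OWNED = "private-user-owned"


def usage_rule(accessibility: Accessibility, subject_is_user: bool) -> str:
    """Return 'allow', 'allow-with-disclosure', or 'reject' for one source."""
    if accessibility is Accessibility.PRIVATE_USER_OWNED:
        # Circular verification: the user already knows this about themselves.
        return "reject"
    if accessibility is Accessibility.EXCLUSIVE_USER_PROVIDED:
        # Assumption: exclusive sources are only valid for third-party research.
        return "reject" if subject_is_user else "allow"
    if accessibility is Accessibility.SEMI_PUBLIC:
        return "allow-with-disclosure"
    return "allow"
```

For example, `usage_rule(Accessibility.EXCLUSIVE_USER_PROVIDED, subject_is_user=False)` covers the "use the user's Crunchbase Pro to research competitors" case.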
### Source Type Labels
Every source MUST also be tagged with:
| Label | Definition | Examples |
|-------|------------|----------|
| `official` | Primary source, official documentation | Company SEC filings, government reports, official blog |
| `academic` | Peer-reviewed research | Journal articles, conference papers, dissertations |
| `secondary-industry` | Professional analysis | Industry reports, analyst coverage, trade publications |
| `journalism` | News reporting | Reputable media outlets, investigative journalism |
| `community` | User-generated content | Forums, reviews, social media, Q&A sites |
| `other` | Uncategorized or mixed | Aggregators, unverified sources |
**Quality Gates:**
- Standard mode: ≥30% official sources in final approved set
- Lightweight mode: ≥20% official sources
- Maximum single-source share: ≤25% (Standard), ≤30% (Lightweight)
- Minimum unique domains: 5 (Standard), 3 (Lightweight)
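As a rough illustration, the gates above could be checked mechanically once the approved set is known. This is a minimal sketch under stated assumptions: `Source`, `GATES`, and `check_gates` are hypothetical names, and "single-source share" is interpreted here as the share of sources coming from one domain.

```python
from collections import Counter
from dataclasses import dataclass
from urllib.parse import urlparse

# Thresholds copied from the Quality Gates list.
GATES = {
    "standard": {"min_official": 0.30, "max_share": 0.25, "min_domains": 5},
    "lightweight": {"min_official": 0.20, "max_share": 0.30, "min_domains": 3},
}


@dataclass
class Source:
    url: str
    source_type: str  # one of the Source Type Labels, e.g. "official"


def check_gates(sources: list, mode: str = "standard") -> list:
    """Return human-readable gate failures; an empty list means all gates pass."""
    if not sources:
        return ["no approved sources"]
    gate = GATES[mode]
    total = len(sources)
    # Assumption: one "source" for the share limit means one domain.
    domains = Counter(urlparse(s.url).netloc.lower() for s in sources)
    official_share = sum(s.source_type == "official" for s in sources) / total
    failures = []
    if official_share < gate["min_official"]:
        failures.append(f"official share {official_share:.0%} below minimum")
    if max(domains.values()) / total > gate["max_share"]:
        failures.append("one domain exceeds the maximum single-source share")
    if len(domains) < gate["min_domains"]:
        failures.append(f"only {len(domains)} unique domains")
    return failures
```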
## AS_OF Date Policy
Set `AS_OF` date explicitly at P0. For all time-sensitive claims:
- Include source publication date with every citation
- Downgrade confidence if the source is older than the freshness horizon relevant to the claim
- Flag stale sources in registry (studies >3 years, news >6 months for fast-moving topics)
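A staleness flag of this kind might be computed as follows. This is only a sketch: `HORIZON_DAYS` and `is_stale` are hypothetical names, and mapping "studies" to the `academic` label and "news" to the `journalism` label is an assumption, as is treating 6 months as 183 days.

```python
from datetime import date

# Assumed mapping from source-type label to freshness horizon in days.
HORIZON_DAYS = {
    "academic": 3 * 365,  # studies older than 3 years
    "journalism": 183,    # news older than roughly 6 months
}


def is_stale(published: date, as_of: date, source_type: str) -> bool:
    """Return True when a source is older than the horizon for its type."""
    horizon = HORIZON_DAYS.get(source_type)
    if horizon is None:
        # No horizon is stated for other source types, so nothing is flagged.
        return False
    return (as_of - published).days > horizon
```

Passing `AS_OF` explicitly, rather than calling `date.today()` inside the function, keeps the check reproducible when a report is re-verified later.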
## P0: Environment & Policy Setup
Check capabilities before starting:
| Check | Requirement | Impact if Missing |
|-------|-------------|-------------------|
| web_search available | Required | Stop - cannot proceed |
| web_fetch available | Required for DEEP tasks | SCAN-only mode |
| Subagent dispatch | Preferred | Degrade to sequential |
| Filesystem writable | Preferred | In-memory notes only |
Set policy variables:
- `AS_OF`: Today's date (YYYY-MM-DD) - mandatory for time-sensitive topics
- `MODE`: Standard (default) or Lightweight
- `SOURCE_TYPE_POLICY`: Enforce official/academic/secondary-industry/journalism/community/other labels
- `COUNTER_REVIEW_PLAN`: What opposing interpretation to test
Report: `[P0 complete] Subagent: {yes/no}. Mode: {standard/lightweight}. AS_OF: {YYYY-MM-DD}.`
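The capability table and status line can be sketched as a small planning step. The function names `p0_plan` and `p0_report` and the boolean inputs are hypothetical; how each capability is actually detected depends on the host environment and is not shown.

```python
from datetime import date


def p0_plan(web_search: bool, web_fetch: bool, subagents: bool, fs_writable: bool) -> dict:
    """Map available capabilities to the degradations listed in the P0 table."""
    if not web_search:
        raise RuntimeError("web_search unavailable: cannot proceed")
    return {
        "depth": "DEEP+SCAN" if web_fetch else "SCAN-only",
        "dispatch": "parallel" if subagents else "sequential",
        "notes": "files" if fs_writable else "in-memory",
    }


def p0_report(subagents: bool, mode: str, as_of: date) -> str:
    """Format the P0 status line."""
    flag = "yes" if subagents else "no"
    return f"[P0 complete] Subagent: {flag}. Mode: {mode}. AS_OF: {as_of.isoformat()}."
```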
When researching a specific company/enterprise, follow this specialized workflow that ensures six-dimension coverage, quantified analysis frameworks, and three-level quality control.
### Enterprise Workflow Overview
```
Enterprise Research Progress:
- [ ] E1: Intake — confirm company entity, research depth, format contract
- [ ] E2: Six-dimension data collection (parallel where possible)
- [ ] D1: Company fundamentals (entity, founding, funding, ownership)
- [ ] D2: Business & products (segments, products, revenue structure)
- [ ] D3: Competitive position (industry rank, competitors, barriers)
- [ ] D4: Financial & operations (3-year financials, efficiency metrics)
- [ ] D5: Recent developments (6-month events, strategic signals)
- [ ] D6: Internal/proprietary sources (or note limitation)
- [ ] E3: Structured analysis frameworks
- [ ] SWOT analysis (evidence-backed, 4 quadrants × 3-5 entries)
- [ ] Competitive barrier quantification (7 dimensions, weighted score)
- [ ] Risk matrix (8 categories, probability × impact)
- [ ] Comprehensive scorecard (6 dimensions, weighted total)
- [ ] E4: L1/L2/L3 quality checks at each stage transition
- [ ] E5: Draft report using 7-chapter enterprise template
- [ ] E6: Multi-pass drafting + UNION merge (same as general P4-P7)
- [ ] E7: Present draft for human review and iterate
```
## P1: Research Task Board
Decompose the research question into 5-6 investigation tasks (Standard) or 3-4 tasks (Lightweight).
Each task assignment includes:
- **Expert Role**: Specialist persona (e.g., "Policy Historian", "Ecosystem Mapper")
- **Objective**: One-sentence investigation goal
- **Queries**: 2-3 pre-planned search queries
- **Depth**: DEEP (fetch 2-3 full articles) or SCAN (snippets sufficient)
- **Output**: Path to research notes file
- **Parallel Group**: Group A (independent) or Group B (depends on Group A)
### Task Decomposition Rules
1. Each task covers one coherent sub-topic a specialist would own
2. Group A tasks must be independent and source-diverse
3. Max 3 tasks per parallel group (concurrency limit)
4. Every task must flag time-sensitive claims and expected citation aging risk
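A task board following these rules could be represented and sanity-checked as below. This is a sketch, not the skill's own data model: the `Task` fields mirror the assignment list above, while `validate_board` and its messages are hypothetical. It checks only the mechanical rules (query count, depth value, group size); independence and source diversity still need human judgment.

```python
from collections import Counter
from dataclasses import dataclass

MAX_TASKS_PER_GROUP = 3  # concurrency limit from rule 3


@dataclass
class Task:
    task_id: str
    role: str
    objective: str
    queries: list
    depth: str  # "DEEP" or "SCAN"
    group: str  # e.g. "A" or "B"


def validate_board(tasks: list) -> list:
    """Return a list of rule violations; an empty list means the board is valid."""
    problems = []
    for t in tasks:
        if not 2 <= len(t.queries) <= 3:
            problems.append(f"task-{t.task_id}: expected 2-3 queries, got {len(t.queries)}")
        if t.depth not in ("DEEP", "SCAN"):
            problems.append(f"task-{t.task_id}: unknown depth {t.depth!r}")
    for group, size in Counter(t.group for t in tasks).items():
        if size > MAX_TASKS_PER_GROUP:
            problems.append(f"group {group}: {size} tasks exceeds limit of {MAX_TASKS_PER_GROUP}")
    return problems
```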
### Enterprise Research Integration
When in Enterprise Research Mode, task board maps to six dimensions:
- Task A: Company fundamentals (entity, founding, funding, ownership)
- Task B: Business & products (segments, products, revenue structure)
- Task C: Competitive position (industry rank, competitors, barriers)
- Task D: Financial & operations (3-year financials, efficiency metrics)
- Task E: Recent developments (6-month events, strategic signals)
- Task F: Internal/proprietary sources (or document limitation)
Report: `[P1 complete] {N} tasks in {M} groups. Dispatching Group A.`
---
## Enterprise Research Mode (Specialized Pipeline)
When researching a specific company/enterprise, follow this specialized workflow that ensures six-dimension coverage, quantified analysis frameworks, and three-level quality control.
### E1: Intake
Same as P0/P1 above, plus:
- Confirm the exact legal entity being researched (parent vs subsidiary)
- Select research depth: Quick scan (3-5 pages) / Standard (10-20 pages) / Deep (20-40 pages)
- Identify any specific comparison targets (benchmark companies)
## P2: Dispatch + Investigate
Subagents execute tasks using [references/subagent_prompt.md](references/subagent_prompt.md) and output to [references/research_notes_format.md](references/research_notes_format.md).
### With Subagents (Claude Code / Cowork / DeerFlow)
1. Dispatch Group A tasks in parallel (max 3 concurrent)
2. Each subagent searches, fetches, and tags source types
3. Every source line includes `Source-Type` and `As Of`
4. Wait for Group A completion
5. Dispatch Group B (can read Group A notes)
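The group ordering and the concurrency cap can be illustrated with `asyncio`. This is a sketch under assumptions: real subagent dispatch depends on the host tool, so `fake_subagent` is a hypothetical stand-in that only returns a notes path, and `run_group` and `dispatch` are illustrative names.

```python
import asyncio

MAX_CONCURRENT = 3  # at most 3 subagents at once


async def fake_subagent(task_id: str) -> str:
    """Hypothetical stand-in for a subagent; returns the notes file it would write."""
    await asyncio.sleep(0)
    return f"task-{task_id}.md"


async def run_group(task_ids, run_task):
    """Run one parallel group, never exceeding MAX_CONCURRENT tasks at a time."""
    sem = asyncio.Semaphore(MAX_CONCURRENT)

    async def guarded(task_id):
        async with sem:
            return await run_task(task_id)

    return await asyncio.gather(*(guarded(t) for t in task_ids))


async def dispatch(group_a, group_b, run_task=fake_subagent):
    """Group B starts only after every Group A task has finished."""
    notes_a = await run_group(group_a, run_task)
    notes_b = await run_group(group_b, run_task)
    return list(notes_a) + list(notes_b)
```

Calling `asyncio.run(dispatch(["a", "b", "c"], ["d"]))` runs the three Group A tasks concurrently and then task `d`.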
### Subagent Output Requirements
Each task-{id}.md must contain:
- **Sources section**: URLs from actual search results with Source-Type, As Of, Authority (1-10)
- **Findings section**: Max 10 one-sentence facts with source numbers
- **Deep Read Notes** (DEEP tasks): 2-3 sources read in full with key data/insights
- **Gaps section**: What was searched but NOT found, alternative interpretations
### Without Subagents (Degraded Mode)
Lead agent executes tasks sequentially, acting as each specialist. Raw search results are discarded after writing notes.
### Enterprise Research: Six-Dimension Collection
Follow [references/enterprise_research_methodology.md](references/enterprise_research_methodology.md) for:
- Detailed collection workflow per dimension (query strategies, data fields, validation)
- Data source priority matrix (P0-P3 ranking)
- Cross-validation rules (min sources, max deviation thresholds)
**Key principles**:
- Evidence-driven: every conclusion must trace to a citable source
- Multi-source validation: key data requires ≥2 independent sources
- Restrained judgment: mark speculation explicitly, avoid unsubstantiated claims
- Structured presentation: complex information via tables, lists, hierarchies
Run L1 quality check after completing each dimension (see enterprise_quality_checklist.md).
Status per task: `[P2 task-{id} complete] {N} sources, {M} findings.`
Status all: `[P2 complete] {N} tasks done, {M} total sources. Building registry.`
### E3: Structured Analysis Frameworks
Apply frameworks from [references/enterprise_analysis_frameworks.md](references/enterprise_analysis_frameworks.md) in order:
1. **SWOT analysis** — each entry with evidence + source + impact assessment
2. **Competitive barrier quantification** — 7 dimensions with weighted scoring → A+/A/B+/B/C+/C rating
3. **Risk matrix** — 8 mandatory categories, probability × impact → Red/Yellow/Green
4. **Comprehensive scorecard** — 6-dimension weighted total → X/10
Run L2 quality check after analysis is complete.
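The arithmetic behind the risk matrix and the scorecard is simple enough to sketch. The actual scales, weights, and colour thresholds live in enterprise_analysis_frameworks.md and are not reproduced here; the 1-5 scales and the cut-offs of 15 and 6 below are purely hypothetical placeholders, as are the function names.

```python
def risk_color(probability: int, impact: int) -> str:
    """Rate one risk as probability x impact (both on an assumed 1-5 scale)."""
    score = probability * impact
    if score >= 15:  # hypothetical threshold
        return "Red"
    if score >= 6:   # hypothetical threshold
        return "Yellow"
    return "Green"


def weighted_total(scores: dict, weights: dict) -> float:
    """Combine per-dimension scores (0-10) into a weighted total out of 10."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same dimensions")
    total_weight = sum(weights.values())
    return round(sum(scores[k] * weights[k] for k in scores) / total_weight, 1)
```

Dividing by the sum of the weights means the weights do not need to add up to 1.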
### E4: Quality Control
Three-level checks from [references/enterprise_quality_checklist.md](references/enterprise_quality_checklist.md):
- **L1 (Data)**: Source count, attribution, cross-validation, timeliness
- **L2 (Analysis)**: SWOT completeness, risk coverage, barrier scoring, conclusion support
- **L3 (Document)**: Structure compliance, format consistency, readability, appendices
### E5: Draft Using Enterprise Template
Use the 7-chapter enterprise report template from enterprise_quality_checklist.md:
1. Company Overview
2. Business & Product Structure
3. Market & Competitive Position
4. Financial & Operations Analysis
5. Risks & Concerns
6. Recent Developments
7. Comprehensive Assessment & Conclusion
Plus appendices: Data Source Index, Glossary, Disclaimer.
### E3-E7: Enterprise Analysis, Drafting, and Review
- **E3: Structured Analysis** — Apply frameworks from [references/enterprise_analysis_frameworks.md](references/enterprise_analysis_frameworks.md)
- **E4: Quality Control** — Run L1/L2/L3 checks per [references/enterprise_quality_checklist.md](references/enterprise_quality_checklist.md)
- **E5: Draft** — Use 7-chapter enterprise template
- **E6-E7: Multi-Pass Drafting and Review** — Same as P4-P7 below
---
## P3: Citation Registry + Source Governance
Lead agent reads all task notes and builds unified registry.
### Registry Process
1. Read every task file's `## Sources` section
2. Merge all sources, deduplicate by URL
3. Assign sequential [n] numbers by first appearance
4. Tag: source_type, as_of date, authority score (1-10), task id
5. **Apply quality gates:**
- Standard: ≥12 approved sources, ≥5 unique domains, ≥30% official
- Lightweight: ≥6 approved sources, ≥3 unique domains, ≥20% official
- Max single-source share: ≤25% (Standard), ≤30% (Lightweight)
6. **Drop sources** below threshold and list them explicitly
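Steps 1 to 4 of the registry process (merge, deduplicate by URL, number by first appearance, tag with task id) might look like this. It is a sketch only: `build_registry` is a hypothetical name, the input is assumed to be already parsed from each task file's `## Sources` section, and treating URLs that differ only by a trailing slash as duplicates is an assumption.

```python
def build_registry(task_notes: dict) -> list:
    """Merge per-task source lists into one numbered registry.

    task_notes maps a task id to a list of source dicts, each with a 'url' key.
    """
    registry = []
    seen = set()
    for task_id, sources in task_notes.items():
        for src in sources:
            # Assumption: a trailing slash does not make a URL a different source.
            url = src["url"].strip().rstrip("/")
            if url in seen:
                continue
            seen.add(url)
            # [n] is assigned once, in order of first appearance, and never changes.
            registry.append({**src, "url": url, "n": len(registry) + 1, "task": task_id})
    return registry
```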
### Registry Output Format
```
CITATION REGISTRY
Approved:
[1] Author/Org — Title | URL | Source-Type: official | Accessibility: public | Date: 2026-03-01 | Auth: 8 | task-a
[2] ...
Dropped:
x Source | URL | Source-Type: community | Accessibility: private-user-owned | Auth: 3 | Reason: CIRCULAR - USER ALREADY KNOWS
Stats: {approved}/{total}, {N} domains, official_share {xx}%
Privileged sources rejected: {N}
```
**Critical rule:** These [n] are FINAL. P5 may only cite from Approved list. Dropped sources never reappear.
**Circular verification handling**: When researching the user's own company/assets, if you discover data in user's private accounts (e.g., user's domain registrar showing they own domains), you MUST:
1. Reject it from the registry (user already knows this)
2. Note it as "CIRCULAR - USER ALREADY KNOWS" in Dropped
3. Search for equivalent PUBLIC sources (e.g., public WHOIS, news articles)
4. Report from external investigator perspective only
**Exclusive source handling**: When user EXPLICITLY PROVIDES their paid subscriptions or private APIs for third-party research (e.g., "Use my Crunchbase Pro to research competitors"), you SHOULD:
1. Accept it as "exclusive-user-provided" accessibility
2. Use it as competitive advantage
3. Cite it properly in registry
4. If no public equivalent exists, mark as [unverified] or omit the claim
Report: `[P3 complete] {approved}/{total} sources. {N} domains. Official share: {xx}%. Privileged rejected: {N}.`
### Handling Information Black Box
When researching entities with no public footprint (like the "字节跳动子公司", i.e. "ByteDance subsidiary", example):
**What an external researcher would find:**
- WHOIS: Privacy protected → No owner info
- Web search: No news, no press releases
- Social media: No company pages
- Business registries: No public API or requires local access
- Result: **Complete information black box**
**Correct response:**
```
Findings: NO PUBLIC INFORMATION AVAILABLE
Sources checked:
- WHOIS (public): Privacy protected [failed]
- Company registry (public): Access denied/No API [failed]
- News media: No coverage [failed]
- Corporate website: Placeholder only [minimal]
Verdict: UNABLE TO VERIFY COMPANY EXISTENCE from external perspective
Sources found: 0 (or minimal, e.g., only WHOIS showing domain exists)
Confidence: N/A - Insufficient evidence
```
**DO NOT:**
- ❌ Use user's own credentials to "fill in the gaps"
- ❌ Assume the company exists based on domain registration alone
- ❌ Fill missing data with speculation
- ❌ Claim to have "verified" information you accessed through privileged means
**DO:**
- ✅ Clearly state what an external researcher can/cannot verify
- ✅ Document all failed search attempts
- ✅ Mark claims as [unverified] or omit entirely
- ✅ Downgrade mode to Lightweight or stop if insufficient public sources
- ✅ Recommend direct contact for due diligence
---
## P4: Evidence-Mapped Outline
Lead agent reads notes + registry to build outline.
1. Identify cross-task patterns
2. Design sections topic-first, not task-order-first
3. Map each section to specific findings with source numbers
4. Flag sections needing counter-review
5. Mark recency-sensitive claims with AS_OF checks
Outline format:
```
## N. {Section Title}
Sources: [1][3][7] from tasks a, b
Claims: {claim from task-a finding 3}, {claim from task-b finding 1}
Counter-claim candidates: {alternative explanations}
Recency checks: {source dates + AS_OF}
Gaps: {limited official evidence}
```
---
## P5: Draft from Notes
Write section by section using [references/report_template_v6.md](references/report_template_v6.md).
**Rules:**
- Every factual claim needs citation [n]
- Numbers/percentages must have source
- Add **confidence marker** per section: High/Medium/Low with rationale
- Add **counter-claim sentence** when evidence conflicts
- No new sources may be introduced
- Use [unverified] for unsupported statements
**Anti-hallucination:**
- Lead agent never invents URLs — only from subagent notes
- Lead agent never fabricates data — mark [unverified] if number not in notes
Status: `[P5 in progress] {N}/{M} sections, ~{words} words.`
---
## P6: Counter-Review (Mandatory)
For each major conclusion, perform opposite-view checks:
1. **Could the conclusion be wrong?**
2. **Which high-impact claims depend on a single source?**
3. **Which claims lack official/academic support?**
4. **Are stale sources used for time-sensitive claims?**
5. **Find ≥3 issues** (re-examine the draft if fewer than 3 are found)
### Using Counter-Review Team (Recommended)
For comprehensive parallel review, use the Counter-Review Team:
```
# 1. Prepare inputs
counter-review-inputs/
├── draft_report.md
├── citation_registry.md
├── task-notes/
└── p0_config.md
# 2. Dispatch to 4 specialist agents in parallel
SendMessage to: claim-validator
SendMessage to: source-diversity-checker
SendMessage to: recency-validator
SendMessage to: contradiction-finder
# 3. Wait for all specialists to complete
# 4. Send to coordinator for synthesis
SendMessage to: counter-review-coordinator
inputs: [4 specialist reports]
# 5. Receive final P6 Counter-Review Report
```
See [references/counter_review_team_guide.md](references/counter_review_team_guide.md) for detailed usage.
### Manual Counter-Review (Fallback)
If Counter-Review Team is unavailable, perform manual checks:
- Verify every high-confidence claim has ≥2 sources
- Check official/academic backing for key claims
- Verify AS_OF dates on time-sensitive claims
- Document opposing interpretations
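The first manual check lends itself to a quick script. The sketch below flags High-confidence claims that cite fewer than two distinct registry numbers; `weakly_supported` and the claim dictionary shape are hypothetical, and it counts citation numbers only, so it cannot tell whether two cited sources are truly independent.

```python
import re

CITATION = re.compile(r"\[(\d+)\]")


def weakly_supported(claims: list) -> list:
    """Return the text of High-confidence claims citing fewer than 2 sources.

    Each claim is a dict with 'text' and 'confidence' ('High', 'Medium', 'Low').
    """
    flagged = []
    for claim in claims:
        cited = set(CITATION.findall(claim["text"]))
        if claim["confidence"] == "High" and len(cited) < 2:
            flagged.append(claim["text"])
    return flagged
```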
### Output
Include in final report:
```
## 核心争议 / Key Controversies
- **Controversy 1:** [Claim A versus counter-evidence B] [n][m]
- **Controversy 2:** ...
```
Report: `[P6 complete] {N} issues found: {critical} critical, {high} high, {medium} medium.`
---
## P7: Verify
Cross-check before finalization:
1. **Registry cross-check:** List every [n] in report vs approved registry
2. **Spot-check 5+ claims:** Trace to task notes
3. **Remove/fix non-traceable claims**
4. **Validate no dropped source resurrected**
5. **Check source concentration** for key claims
Report: `[P7 complete] {N} spot-checks, {M} violations fixed.`
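The registry cross-check and the dropped-source check can be partly automated. This is a minimal sketch with hypothetical names (`cross_check`, `approved_numbers`, `dropped_urls`); it assumes citations appear as `[n]` with a numeric `n` and that a resurrected source would show up as its URL in the report text. Spot-checking claims against task notes still has to be done by reading.

```python
import re


def cross_check(report_text: str, approved_numbers: set, dropped_urls: set) -> dict:
    """Find citations missing from the registry and dropped URLs that reappear."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", report_text)}
    return {
        # Every [n] in the report must exist in the Approved list.
        "not_in_registry": sorted(cited - approved_numbers),
        # Dropped sources must never reappear in the report.
        "resurrected": sorted(u for u in dropped_urls if u in report_text),
    }
```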
---
## Output Requirements
- Match the requested language and tone
- Preserve technical terms in English
- Respect the report spec and formatting rules
- Include a references section or bibliography
## Reference Files
### Core V6 Pipeline References
| File | When to Load |
| --- | --- |
| [source_accessibility_policy.md](references/source_accessibility_policy.md) | **P0 (CRITICAL)**: Source classification rules - read first |
| [subagent_prompt.md](references/subagent_prompt.md) | P2: Task dispatch to subagents |
| [research_notes_format.md](references/research_notes_format.md) | P2: Subagent output format |
| [report_template_v6.md](references/report_template_v6.md) | P5: Draft with confidence markers and counter-review |
| [quality_gates.md](references/quality_gates.md) | All phases: Quality thresholds and anti-hallucination checks |
### General Research References
| File | When to Load |
| --- | --- |
| [research_report_template.md](references/research_report_template.md) | Build outline and draft structure |
| [formatting_rules.md](references/formatting_rules.md) | Enforce section formatting and citation rules |
| [source_quality_rubric.md](references/source_quality_rubric.md) | Score and triage sources |
| [research_plan_checklist.md](references/research_plan_checklist.md) | Build research plan and query set |
| [completeness_review_checklist.md](references/completeness_review_checklist.md) | Review for coverage, citations, and compliance |
### Enterprise Research References (load when in Enterprise Research Mode)
| File | When to Load |
| --- | --- |
| [enterprise_research_methodology.md](references/enterprise_research_methodology.md) | Six-dimension data collection workflow, source priority, cross-validation rules |
| [enterprise_analysis_frameworks.md](references/enterprise_analysis_frameworks.md) | SWOT template, competitive barrier quantification, risk matrix, comprehensive scoring |
| [enterprise_quality_checklist.md](references/enterprise_quality_checklist.md) | L1/L2/L3 quality checks, per-dimension checklists, 7-chapter report template |
## Anti-Patterns
- Single-pass drafting without parallel complete passes
- Splitting passes by section instead of full report drafts
- Ignoring the format contract or user template
- Claims without citations or evidence table mapping
- Mixing conflicting dates without calling out discrepancies
- Copying external AI output without verification
- Deleting intermediate drafts or raw research outputs
- **Lead agent reading raw search results** — only read subagent notes
- **Inventing URLs** — only use URLs from actual search results
- **Resurrecting dropped sources** — dropped in P3 never reappear
- **Missing AS_OF for time-sensitive claims** — always include source date
- **Skipping counter-review** — mandatory P6 must find ≥3 issues
- **CIRCULAR VERIFICATION** — never use user's private data to "discover" what they already know about themselves
- **IGNORING EXCLUSIVE SOURCES** — when user provides Crunchbase Pro etc. for competitor research, USE IT
## Next Step: Verify and Deliver
After completing research, suggest verification and output:
```
Research report complete: [N] sources cited, [M] claims made.
Options:
A) Verify facts — run /fact-checker on the report (Recommended)
B) Create slides — run /ppt-creator from the findings
C) Export as PDF — run /pdf-creator for formal delivery
D) No thanks — the report is ready as-is
```