# QA Log — ObesityPreprintRadar

Date: 2026-05-21
Disclaimer: This tool is for reference and research use only. It must not replace clinical decisions. All data are synthetic.

## Automated check results

### 1. app.py AST parse
- Command: `python3 -c "import ast; ast.parse(open('app.py').read())"`
- Result: PASS (`AST OK`)

### 2. JSON load
- `data/preprints.json` — PASS (preprints count = 51)
- `data/publications.json` — PASS (publications count = 25)
- `data/topics.json` — PASS (label rules count = 13)
- `data/kol_seed.json` — PASS (kols count = 10)
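
The load check above could be reproduced with a short script along these lines. This is a minimal sketch, not the project's actual checker: the helper names `count_items` and `load_and_count` are hypothetical, and the file layout (a top-level array, or an object holding the array under a named key) is an assumption, since this log records only the resulting counts.

```python
import json
from pathlib import Path
from typing import Any, Optional


def count_items(payload: Any, key: Optional[str] = None) -> int:
    """Count records in an already-parsed JSON payload.

    Assumed layouts (not confirmed by the log): a top-level list,
    or a dict that stores the list under `key`.
    """
    if key is not None:
        if not isinstance(payload, dict) or key not in payload:
            raise KeyError(f"expected a JSON object with key {key!r}")
        payload = payload[key]
    if not isinstance(payload, list):
        raise TypeError("expected a JSON array of records")
    return len(payload)


def load_and_count(path: Path, key: Optional[str] = None) -> int:
    """Parse the file and count its records.

    json.load raises json.JSONDecodeError on malformed input,
    which is what a FAIL in this check would look like.
    """
    with Path(path).open(encoding="utf-8") as fh:
        return count_items(json.load(fh), key)
```

If `data/preprints.json` is a top-level array, `load_and_count(Path("data/preprints.json"))` would be expected to match the count of 51 logged above; if the records sit under a key, pass that key name instead.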

### 3. CLI summary run
- Command: `python3 app.py --summary`
- Result: PASS
- Output:
  - Total preprints: 51
  - Server distribution: bioRxiv 22, medRxiv 14, Research Square 10, ChemRxiv 5 (all 4 servers included)
  - Average availability score: 2.49/5
  - Top 5 topics: incretin(18), leptin_axis(9), MC4R_pathway(8), sarcopenic_obesity(7), uncategorized(6)
  - Matched PubMed publications: 25/51
  - Average preprint->publication lag (days): 27.6
  - Cross-server dedup candidates: 0
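
The matched-publication count and average lag in this output could be derived roughly as follows. This is an illustrative sketch only: the function name `summarize_lag`, the ISO date strings, and the idea of pairing each preprint with an optional publication date are assumptions. The log confirms only that `posted_date` is in the schema and that `compute_publication_lag()` exists.

```python
from datetime import date
from typing import Iterable, Optional, Tuple


def summarize_lag(
    pairs: Iterable[Tuple[str, Optional[str]]],
) -> Tuple[int, int, Optional[float]]:
    """Return (matched, total, average lag in days).

    Each pair is (posted_date, published_date), both ISO "YYYY-MM-DD";
    published_date is None for preprints with no matched publication.
    """
    lags = []
    total = 0
    for posted, published in pairs:
        total += 1
        if published is None:
            continue  # unpublished: contributes to total, not to the lag
        delta = date.fromisoformat(published) - date.fromisoformat(posted)
        lags.append(delta.days)
    average = round(sum(lags) / len(lags), 1) if lags else None
    return len(lags), total, average
```

Only matched preprints enter the average, so a result such as 25/51 matched means the lag is averaged over 25 records, not 51.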

### 4. requirements.txt pinned versions
- Result: PASS
- streamlit==1.36.0, pandas==2.2.2, python-docx==1.1.2 (all pinned with `==`)
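
A pin check like this one can be automated with a few lines of Python. The sketch below is an assumption about how such a check might look, not the script used for this log; `unpinned_requirements` is a hypothetical helper, and it deliberately handles only simple `name==version` lines (environment markers, URLs, and pip options would be reported as unpinned).

```python
import re
from typing import List

# Package name, optional extras in brackets, then exactly "==" and a version.
# The version string itself is not validated against PEP 440.
_PIN = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^=\s]+$")


def unpinned_requirements(text: str) -> List[str]:
    """Return every requirement line that is not pinned with '=='."""
    offenders = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # blank or comment-only line
        if not _PIN.match(line):
            offenders.append(line)
    return offenders
```

An empty return value corresponds to PASS; any returned line is a requirement that still needs an exact version.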

## Functional checks (static code inspection)
- Multi-server preprint collection: the data covers 4 servers, bioRxiv/medRxiv/Research Square/ChemRxiv (see the server distribution output)
- Common schema: id, server, doi, title, authors, affiliations, abstract, posted_date, version, subject, data_links, code_links, protocol_links
- Availability scoring, 0 to 5: `availability_score()` function (adds points for data/code/protocol/Zenodo or OSF/v2+)
- Same-author PubMed tracking: `compute_publication_lag()` branches on published/unpublished and computes the average lag
- Watchlist (sqlite): `watchlist_conn`, `add_watch`, `remove_watch`, `list_watches`
- Dedup: `detect_duplicates` (cross-server Jaccard, threshold 0.72)
- Digest (ko/en) + docx export: `build_digest`, `digest_to_docx`, `trend_to_docx`
- External network calls: none (all data is local JSON)
- Every Streamlit tab calls `_render_disclaimer(st)`
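
Two of the items above, availability scoring and cross-server dedup, lend themselves to a short illustration. The sketch below is not the code in `app.py`. It assumes one point per criterion (data, code, protocol, a Zenodo or OSF link, version 2 or later), an integer `version` field, title-token Jaccard similarity, and an inclusive `>=` threshold; the `_sketch` function names are hypothetical.

```python
import re
from itertools import combinations
from typing import Dict, List, Set, Tuple


def availability_score_sketch(record: Dict) -> int:
    """Score 0-5: one point per criterion (assumed weighting)."""
    data = record.get("data_links") or []
    code = record.get("code_links") or []
    protocol = record.get("protocol_links") or []
    links = [*data, *code, *protocol]
    # Assumption: a repository link is detected by its host name.
    has_repo = any("zenodo.org" in u.lower() or "osf.io" in u.lower() for u in links)
    revised = int(record.get("version") or 1) >= 2
    return sum([bool(data), bool(code), bool(protocol), has_repo, revised])


def _tokens(title: str) -> Set[str]:
    """Lowercase alphanumeric tokens of a title."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """|a & b| / |a | b|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def detect_duplicates_sketch(
    records: List[Dict], threshold: float = 0.72
) -> List[Tuple[str, str, float]]:
    """Return (id, id, similarity) for similar titles on different servers."""
    candidates = []
    for x, y in combinations(records, 2):
        if x["server"] == y["server"]:
            continue  # only cross-server pairs are dedup candidates
        sim = jaccard(_tokens(x["title"]), _tokens(y["title"]))
        if sim >= threshold:
            candidates.append((x["id"], y["id"], round(sim, 3)))
    return candidates
```

Comparing every pair is quadratic in the number of records, which is unproblematic at the 51 preprints logged here but would need blocking or indexing for much larger sets.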

## Final verdict
- All automated checks PASS; passed on the first run (no retry needed)