File format matters. Here’s the reliability ranking for how well AI reads different formats, from best to worst:

1. .txt / .md: Minimal noise, clear structure (best)
2. JSON / CSV: Great for structured data
3. DOCX: Fine if formatting is simple
4. Digital PDFs: Extraction can mix headers, footers, columns
5. PPTX: Text order can be unpredictable
6. Scanned PDFs / images: Worst; requires OCR, highly error-prone
How AI reads file formats and what they are good for
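
One way to put the ranking to work is to sort a batch of files so the most reliably read formats are handed to the model first. The sketch below is only an illustration: the tier numbers mirror the ranking above, but the extension table, the `reliability_tier` and `sort_by_reliability` helper names, and the "unknown" tier are assumptions made for this example. A file extension also cannot tell a digital PDF from a scanned one, so `.pdf` is assumed to be digital here.

```python
from pathlib import Path

# Tiers follow the ranking above: 1 = most reliable, 6 = least reliable.
# The extension lists are illustrative assumptions, not an exhaustive mapping.
RELIABILITY_TIER = {
    ".txt": 1, ".md": 1,
    ".json": 2, ".csv": 2,
    ".docx": 3,
    ".pdf": 4,   # assumed digital; a scanned PDF really belongs in tier 6
    ".pptx": 5,
    ".png": 6, ".jpg": 6, ".jpeg": 6,
}
UNKNOWN_TIER = 7  # hypothetical bucket for extensions not covered by the ranking


def reliability_tier(filename: str) -> int:
    """Return the tier for a file name, based only on its extension."""
    return RELIABILITY_TIER.get(Path(filename).suffix.lower(), UNKNOWN_TIER)


def sort_by_reliability(filenames):
    """Order file names from most to least reliably read (stable sort)."""
    return sorted(filenames, key=reliability_tier)


if __name__ == "__main__":
    files = ["slides.pptx", "notes.md", "scan.png", "data.csv", "report.pdf"]
    print(sort_by_reliability(files))
```

A lookup table keeps the ranking easy to adjust. Where a cleaner export of the same content is available (for example, plain text instead of a scanned image), the lower tier number indicates which version to prefer.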