Tools · Complete

Switch to MD

Convert common document, image, audio, and code formats to Markdown so that AI agents can read their content. One line of code; supports both cloud and local models.

TypeScript · pnpm · Monorepo · PDF · OpenAI

Overview

Switch to MD is a library that converts a wide range of file formats to structured Markdown so that AI agents can read document content reliably. The core idea is to model conversion as a "sniff → route → adapt → normalize" pipeline, with plugins for local or cloud-based OCR and semantic-analysis models.
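
To make the "sniff → route → adapt → normalize" idea concrete, here is a minimal sketch of such a pipeline. It is not taken from the switch-to-md source: the names `Kind`, `Adapter`, `sniff`, `route`, `normalize`, and `runPipeline` are all hypothetical, and the sketch assumes sniffing works on magic bytes and that adapters are plain functions.

```typescript
// Hypothetical sketch: none of these names come from switch-to-md itself.
type Kind = 'pdf' | 'png' | 'text';
type Adapter = (bytes: Uint8Array) => string;

// True when the buffer starts with the given byte signature.
function hasPrefix(bytes: Uint8Array, signature: number[]): boolean {
  if (bytes.length < signature.length) return false;
  for (let i = 0; i < signature.length; i++) {
    if (bytes[i] !== signature[i]) return false;
  }
  return true;
}

// Sniff: detect the format from leading magic bytes, falling back to text.
function sniff(bytes: Uint8Array): Kind {
  if (hasPrefix(bytes, [0x25, 0x50, 0x44, 0x46])) return 'pdf'; // "%PDF"
  if (hasPrefix(bytes, [0x89, 0x50, 0x4e, 0x47])) return 'png'; // "\x89PNG"
  return 'text';
}

// Route: pick the adapter registered for the detected kind.
function route(kind: Kind, adapters: Map<Kind, Adapter>): Adapter {
  const adapter = adapters.get(kind);
  if (!adapter) throw new Error(`No adapter registered for "${kind}"`);
  return adapter;
}

// Normalize: unify line endings, strip trailing spaces, collapse blank runs.
function normalize(markdown: string): string {
  return (
    markdown
      .replace(/\r\n/g, '\n')
      .replace(/[ \t]+$/gm, '')
      .replace(/\n{3,}/g, '\n\n')
      .trim() + '\n'
  );
}

// Adapt happens when the routed adapter is called on the raw bytes.
function runPipeline(bytes: Uint8Array, adapters: Map<Kind, Adapter>): string {
  const kind = sniff(bytes);
  const adapter = route(kind, adapters);
  return normalize(adapter(bytes));
}
```

Keeping each stage a separate function means a new format only needs a new signature in `sniff` and a new entry in the adapter map; the normalization step stays shared across all formats.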

Installation

# Core package (documents work out of the box)
npm i switch-to-md
 
# Image processing (optional)
npm i @switch-to-md/vision-openai  # OpenAI Vision
npm i @switch-to-md/vision-local   # Local Tesseract
 
# Audio processing (optional)
npm i @switch-to-md/audio-openai   # OpenAI Whisper
npm i @switch-to-md/audio-local    # Local Whisper

Quick Start

import { convert } from 'switch-to-md';
 
// Convert PDF
const pdfMd = await convert('report.pdf');
 
// Convert Word
const docxMd = await convert('document.docx');
 
// Output to file
await convert('report.pdf', { output: 'report.md' });
 
// Batch conversion
const results = await convert(['a.pdf', 'b.docx']);

Supported Formats

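| Category  | Formats                                     | Notes                  |
|-----------|---------------------------------------------|------------------------|
| Documents | PDF, DOCX, XLSX, PPTX, TXT, HTML, EPUB      | No config needed       |
| Images    | PNG, JPG, GIF, BMP, TIFF, WebP              | Requires vision plugin |
| Audio     | MP3, WAV, FLAC, OGG, M4A                    | Requires audio plugin  |
| Code      | JSON, YAML, XML, CSV, multi-language source | No config needed       |
| Vector    | SVG                                         | No config needed       |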
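
The table above can be turned into a small pre-flight check that tells a caller whether a file needs an optional plugin before conversion is attempted. This is only an illustrative sketch: `pluginRequirement` and the `Requirement` type are hypothetical helpers, not part of the switch-to-md API, and the lookup covers only the extensions listed explicitly in the table (source-code extensions are left out because the table does not enumerate them).

```typescript
// Hypothetical helper derived from the Supported Formats table.
type Requirement = 'none' | 'vision' | 'audio';

const REQUIREMENTS = new Map<string, Requirement>();

// Register every extension in a group under the same requirement.
function register(requirement: Requirement, extensions: string[]): void {
  for (const ext of extensions) REQUIREMENTS.set(ext, requirement);
}

register('none', ['pdf', 'docx', 'xlsx', 'pptx', 'txt', 'html', 'epub']);
register('vision', ['png', 'jpg', 'gif', 'bmp', 'tiff', 'webp']);
register('audio', ['mp3', 'wav', 'flac', 'ogg', 'm4a']);
register('none', ['json', 'yaml', 'xml', 'csv', 'svg']);

// Returns the plugin requirement for a file name,
// or undefined when the extension is not in the table.
function pluginRequirement(filename: string): Requirement | undefined {
  const dot = filename.lastIndexOf('.');
  if (dot === -1) return undefined;
  return REQUIREMENTS.get(filename.slice(dot + 1).toLowerCase());
}
```

A caller could use the result to print an install hint for `@switch-to-md/vision-openai` or `@switch-to-md/audio-local` instead of failing partway through a batch.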

Configuration

import { configure } from 'switch-to-md';
 
// OpenAI
configure({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY,
});
 
// Local models
configure({
  vision: { provider: 'local', model: 'tesseract' },
});

CLI Usage

# Convert file
npx switch-to-md convert report.pdf
 
# Batch convert
npx switch-to-md convert ./docs/**/*.pdf -o ./output/
 
# Setup wizard
npx switch-to-md setup

Output Format

---
source: report.pdf
type: PDF
pages: 12
extracted_at: 2026-04-29T00:00:00Z
backend: pdf-parse
---
 
# Document Title
 
## Body
[Content]
 
## Tables
| Col1 | Col2 |
|------|------|
| Data | Data |
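
Because the output starts with a front-matter header, a consumer can separate the metadata from the Markdown body before handing it to an agent. The following is a minimal sketch under the assumption that the header always consists of simple `key: value` lines between two `---` markers, as in the sample above; `splitFrontMatter` and `ConversionResult` are hypothetical names, not part of the library.

```typescript
// Hypothetical consumer-side helper for the output format shown above.
interface ConversionResult {
  meta: Record<string, string>;
  body: string;
}

function splitFrontMatter(markdown: string): ConversionResult {
  const lines = markdown.split(/\r?\n/);

  // No opening marker: treat the whole input as body.
  if (lines[0].trim() !== '---') return { meta: {}, body: markdown };

  // Find the closing marker after the first line.
  let end = -1;
  for (let i = 1; i < lines.length; i++) {
    if (lines[i].trim() === '---') {
      end = i;
      break;
    }
  }
  if (end === -1) return { meta: {}, body: markdown };

  const meta: Record<string, string> = {};
  for (const line of lines.slice(1, end)) {
    // Split on the first colon only, so timestamps keep their colons.
    const idx = line.indexOf(':');
    if (idx === -1) continue;
    meta[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }

  // Drop blank lines between the header and the first heading.
  const body = lines.slice(end + 1).join('\n').replace(/^\s+/, '');
  return { meta, body };
}
```

All values are kept as strings (for example `pages` is `'12'`); a stricter consumer would use a real YAML parser and convert types explicitly.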

Status

The core implementation is complete and converts PDF, Word, Excel, PowerPoint, image, audio, and other mainstream formats to Markdown.