HTML to Markdown Converter API — Examples & Documentation

Convert HTML to clean Markdown with one API call. Turn HTML from CMS exports, web scrapers, WYSIWYG editors, and emails into deterministic GitHub Flavored Markdown. No per-project libraries. No maintenance. Same input → same output every time.

Try the API on RapidAPI →

What is the HTML to Markdown Converter API?

The HTML to Markdown Converter API is a REST API that converts arbitrary HTML into clean, deterministic Markdown. It strips scripts, styles, layout noise, and tracking attributes while preserving semantic structure—headings, lists, tables, links, images, and code blocks. Output is GitHub Flavored Markdown (GFM) compatible with GitHub, Notion, static site generators, and LLM pipelines.

Key Features

Feature	Description
Deterministic	Same HTML + options → same Markdown every time
Clean output	Strips scripts, event handlers, `data-*` attributes, inline styles
Malformed HTML	Best-effort parsing; handles unclosed tags, invalid nesting
Stateless	No data stored or logged; 25MB max per request
Three modes	`strict`, `readable` (default), `llm-friendly`

Use Cases

CMS migration — WordPress, Notion, Drupal HTML exports → Markdown
Web scraping — Normalize scraped HTML before search indexing or analytics
WYSIWYG output — TinyMCE, Quill, CKEditor HTML → version-controlled Markdown
LLM pipelines — Clean text for embeddings, RAG, or prompt context
Documentation — Migrate HTML docs to Markdown for GitHub, MkDocs, Docusaurus
Email processing — Extract readable content from HTML emails

Also Searchable As

HTML to Markdown API • HTML Markdown converter • CMS HTML to MD • WYSIWYG to Markdown • GitHub Flavored Markdown API • scraped HTML converter • document conversion API • content migration Markdown • LLM text preprocessing API

Quick Start

Endpoint: POST /convert
Try it: RapidAPI — HTML to Markdown Converter

cURL

curl -X POST "https://html-to-markdown-converter1.p.rapidapi.com/convert" \
  -H "Content-Type: application/json" \
  -H "x-rapidapi-key: YOUR_RAPIDAPI_KEY" \
  -H "x-rapidapi-host: html-to-markdown-converter1.p.rapidapi.com" \
  -d '{"html":"<h1>Hello</h1><p>World <strong>bold</strong></p>","mode":"readable"}'

JavaScript / Node.js

const response = await fetch('https://html-to-markdown-converter1.p.rapidapi.com/convert', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'x-rapidapi-key': 'YOUR_RAPIDAPI_KEY',
    'x-rapidapi-host': 'html-to-markdown-converter1.p.rapidapi.com'
  },
  body: JSON.stringify({
    html: '<h1>Hello</h1><p>World <strong>bold</strong></p>',
    mode: 'readable'
  })
});
const { markdown } = await response.json();
console.log(markdown); // "# Hello\n\nWorld **bold**"

Python

import requests

url = "https://html-to-markdown-converter1.p.rapidapi.com/convert"
headers = {
    "Content-Type": "application/json",
    "x-rapidapi-key": "YOUR_RAPIDAPI_KEY",
    "x-rapidapi-host": "html-to-markdown-converter1.p.rapidapi.com"
}
payload = {
    "html": "<h1>Hello</h1><p>World <strong>bold</strong></p>",
    "mode": "readable"
}
response = requests.post(url, json=payload, headers=headers)
data = response.json()
print(data["markdown"])

Raw HTML (text/html)

Send HTML directly as the request body. Mode defaults to readable.

curl -X POST "https://html-to-markdown-converter1.p.rapidapi.com/convert" \
  -H "Content-Type: text/html" \
  -H "x-rapidapi-key: YOUR_RAPIDAPI_KEY" \
  -H "x-rapidapi-host: html-to-markdown-converter1.p.rapidapi.com" \
  -d '<h1>Hello</h1><p>World</p>'
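
The same raw-HTML request can be built in Python with only the standard library. This is a minimal sketch: `build_raw_html_request` is a hypothetical helper name, not part of the API, and the request is only constructed here, not sent.

```python
import urllib.request

API_URL = "https://html-to-markdown-converter1.p.rapidapi.com/convert"
API_HOST = "html-to-markdown-converter1.p.rapidapi.com"


def build_raw_html_request(html: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request whose body is raw HTML."""
    return urllib.request.Request(
        API_URL,
        data=html.encode("utf-8"),  # the HTML is the entire request body
        method="POST",
        headers={
            "Content-Type": "text/html",
            "x-rapidapi-key": api_key,
            "x-rapidapi-host": API_HOST,
        },
    )


# To send it (requires a valid key and network access):
#   import json
#   with urllib.request.urlopen(build_raw_html_request(html, key), timeout=30) as resp:
#       markdown = json.loads(resp.read())["markdown"]
```

Because no `mode` field can be sent this way, the conversion runs in the default `readable` mode.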

Real-World Examples

Example 1: CMS Content Pipeline

Ingest HTML from a CMS or third-party API and convert to Markdown for your search index or storage layer.

const API_URL = 'https://html-to-markdown-converter1.p.rapidapi.com/convert';
const RAPIDAPI_HEADERS = {
  'Content-Type': 'application/json',
  'x-rapidapi-key': process.env.RAPIDAPI_KEY,
  'x-rapidapi-host': 'html-to-markdown-converter1.p.rapidapi.com'
};

async function ingestCmsArticle(cmsHtml) {
  const res = await fetch(API_URL, {
    method: 'POST',
    headers: RAPIDAPI_HEADERS,
    body: JSON.stringify({ html: cmsHtml, mode: 'readable' })
  });
  // Surface HTTP failures (e.g. 413, 422) instead of returning undefined
  if (!res.ok) throw new Error(`Conversion failed: HTTP ${res.status}`);
  const { markdown } = await res.json();
  return markdown; // Ready for Elasticsearch, DB, or file storage
}

Example 2: LLM Pipeline — RAG & Embeddings

Prepare clean text from web or CMS content for embeddings or RAG. Use llm-friendly mode for link references and predictable structure.

// Reuses RAPIDAPI_HEADERS from Example 1
async function prepareForEmbedding(htmlContent) {
  const res = await fetch('https://html-to-markdown-converter1.p.rapidapi.com/convert', {
    method: 'POST',
    headers: RAPIDAPI_HEADERS,
    body: JSON.stringify({
      html: htmlContent,
      mode: 'llm-friendly',
      includeMetadata: true
    })
  });
  const { markdown, metadata } = await res.json();
  // Feed markdown to OpenAI, Cohere, or your embedding model
  return { markdown, charCount: metadata?.outputCharacterCount };
}

Example 3: Batch Doc Migration (WordPress → Markdown)

Convert WordPress or Notion exports in parallel. Each chunk is sent as raw text/html, so the mode defaults to readable.

async function migrateDocsToMarkdown(htmlChunks) {
  const results = await Promise.all(
    htmlChunks.map(html =>
      fetch('https://html-to-markdown-converter1.p.rapidapi.com/convert', {
        method: 'POST',
        headers: {
          'Content-Type': 'text/html',
          'x-rapidapi-key': process.env.RAPIDAPI_KEY,
          'x-rapidapi-host': 'html-to-markdown-converter1.p.rapidapi.com'
        },
        body: html
      }).then(r => r.json())
    )
  );
  return results.map(r => r.markdown);
}
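
Example 3 above starts every request at once, which may run into rate limits on large exports. One alternative is a bounded worker pool, sketched here in Python. `migrate_docs` and the `convert_one` callback are hypothetical names: `convert_one` stands for whatever function sends a single chunk to POST /convert and returns its Markdown, and the default of 4 workers is an arbitrary choice.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def migrate_docs(
    html_chunks: Iterable[str],
    convert_one: Callable[[str], str],
    max_workers: int = 4,
) -> List[str]:
    """Convert chunks with at most `max_workers` requests in flight.

    Results come back in the same order as the input chunks.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order even if calls finish out of order.
        return list(pool.map(convert_one, html_chunks))
```

If any call raises, the exception propagates when its result is collected, so a failed chunk is not silently dropped.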

Example 4: Web Scraper → Search Index

Normalize scraped HTML before indexing. Strip ads, scripts, and layout; keep structure.

import os

import requests

RAPIDAPI_KEY = os.environ["RAPIDAPI_KEY"]  # read the key from the environment

def scrape_and_convert(url: str) -> str:
    html = requests.get(url, timeout=30).text
    res = requests.post(
        "https://html-to-markdown-converter1.p.rapidapi.com/convert",
        headers={
            "Content-Type": "application/json",
            "x-rapidapi-key": RAPIDAPI_KEY,
            "x-rapidapi-host": "html-to-markdown-converter1.p.rapidapi.com"
        },
        json={"html": html, "mode": "strict"},
        timeout=30,
    )
    res.raise_for_status()  # fail on 4xx/5xx rather than with a KeyError below
    return res.json()["markdown"]

API Reference

POST /convert

Convert HTML to Markdown.

Request body (application/json):

Field	Type	Required	Default	Description
`html`	string	Yes	—	Raw HTML to convert
`mode`	string	No	`readable`	`strict`, `readable`, or `llm-friendly`
`includeMetadata`	boolean	No	`false`	Include character counts, tags removed

Alternative: Send Content-Type: text/html with raw HTML as body. Mode defaults to readable.

Modes:

strict — Minimal Markdown; maximum cleanup
readable — Balanced; human-readable (default)
llm-friendly — Link references; predictable structure for LLMs
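
The request fields and modes above can be checked client-side before a call is made. The following is a minimal sketch, assuming the field names from the table; `build_payload` and `VALID_MODES` are hypothetical names introduced for illustration.

```python
import json

VALID_MODES = ("strict", "readable", "llm-friendly")


def build_payload(html: str, mode: str = "readable", include_metadata: bool = False) -> str:
    """Return the JSON request body for POST /convert."""
    if not html:
        raise ValueError("html is required")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    body = {"html": html, "mode": mode}
    if include_metadata:
        # includeMetadata defaults to false, so it is only sent when enabled
        body["includeMetadata"] = True
    return json.dumps(body)
```

Rejecting an unknown mode locally avoids spending a request on a call that is likely to fail.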

Response (200):

{
  "markdown": "# Hello\n\nWorld **bold**"
}

With includeMetadata: true:

{
  "markdown": "# Hello\n\nWorld",
  "metadata": {
    "originalCharacterCount": 45,
    "outputCharacterCount": 18,
    "mode": "readable",
    "tagsRemoved": 2
  }
}
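
The metadata block can be used to track how much markup a conversion removed. This sketch assumes the response shape shown above; `size_reduction` is a hypothetical helper, and the sample values are copied from the example response.

```python
def size_reduction(response: dict) -> float:
    """Fraction of characters removed by conversion, from the metadata block."""
    meta = response.get("metadata")
    if not meta:
        raise ValueError("response has no metadata; was includeMetadata set to true?")
    original = meta["originalCharacterCount"]
    if original == 0:
        return 0.0
    return 1 - meta["outputCharacterCount"] / original


sample = {
    "markdown": "# Hello\n\nWorld",
    "metadata": {
        "originalCharacterCount": 45,
        "outputCharacterCount": 18,
        "mode": "readable",
        "tagsRemoved": 2,
    },
}
print(f"{size_reduction(sample):.0%}")  # prints 60%
```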

Error codes: MISSING_HTML, INVALID_HTML, PAYLOAD_TOO_LARGE (HTTP 413), UNRECOVERABLE_PARSE_FAILURE (HTTP 422)
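
A client might map each error code to a follow-up action. The sketch below is illustrative only: the action names are suggestions from this example, not documented API behavior, and how the code is extracted from an error response is left to the caller because the error body format is not specified here.

```python
# Error codes listed in the reference, mapped to a suggested client action.
# The actions are assumptions made for this sketch.
ERROR_ACTIONS = {
    "MISSING_HTML": "fix-request",
    "INVALID_HTML": "fix-request",
    "PAYLOAD_TOO_LARGE": "split-input",
    "UNRECOVERABLE_PARSE_FAILURE": "skip-document",
}


def action_for(error_code: str) -> str:
    """Return a suggested action for an error code, or a fallback for unknown codes."""
    return ERROR_ACTIONS.get(error_code, "inspect-manually")
```

None of these codes suggests a transient failure, so retrying the identical request is unlikely to help.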

GET /health

Returns { "status": "ok" }.
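
A monitoring script could validate the health response body as follows. This is a sketch: `is_healthy` is a hypothetical helper that only parses a body already fetched from GET /health, and it treats anything other than the documented `{ "status": "ok" }` shape as unhealthy.

```python
import json


def is_healthy(body: bytes) -> bool:
    """True if a /health response body is a JSON object with status == "ok"."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        # Not valid JSON, or JSON that is not an object
        return False
```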

Conversion Coverage

HTML	Markdown
Headings (h1–h6)	`#`, `##`, ...
Paragraphs	Blank-line separated
strong, em	`**bold**`, `*italic*`
Links	`[text](url)`
Images	`![alt](src)`
Lists (ul, ol)	`-` or `1.`
Tables	GitHub-flavored tables
Code blocks	Fenced ``` blocks
Blockquotes	`>`

Stripped: Scripts, styles, event handlers, data-* attributes, inline styles, javascript: links, input/button elements, SVG. Iframes and videos are converted to links.

Related APIs

Explore more developer tools from Precision Solutions Tech on RapidAPI:

API	Description
HTML to Markdown Converter	This API — convert HTML to Markdown
JSON Schema Validator	Validate JSON against structural schemas
JSON Diff Checker	Detect breaking changes between JSON versions
JSON Payload Consistency Checker	Detect data consistency issues in JSON
API Error & Status Normalization	Canonical error taxonomy and retry guidance
Sensitive Data Detection & Redaction	Detect and redact PII in text
Job Posting Normalization	Normalize job postings from 15+ job boards
Calendar Event Normalization	Normalize calendar events from Google, Outlook, Apple

View all APIs →

FAQ

Can I send raw HTML without JSON?

Yes. Use Content-Type: text/html and send the HTML as the request body. Mode defaults to readable.

Does the API fetch URLs?

No. You must send the HTML in the request body. URL fetching is not supported.

What if the HTML is malformed?

The API uses best-effort parsing. It handles unclosed tags, invalid nesting, and partial fragments. If parsing fails entirely, it returns 422 with UNRECOVERABLE_PARSE_FAILURE.

Is the output deterministic?

Yes. Same input + same options always produce the same Markdown. No randomness or timestamps.
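
Determinism makes client-side caching straightforward: identical input and options can reuse a stored result instead of triggering another request. A minimal in-memory sketch follows; `cache_key`, `convert_cached`, and the `convert` callback are hypothetical names, and `convert` stands for whatever function actually calls the API.

```python
import hashlib
import json
from typing import Callable, Dict

_cache: Dict[str, str] = {}


def cache_key(html: str, mode: str = "readable") -> str:
    """Stable key derived from every input that affects the output."""
    blob = json.dumps({"html": html, "mode": mode}, sort_keys=True)
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def convert_cached(html: str, mode: str, convert: Callable[[str, str], str]) -> str:
    """Call `convert` only on a cache miss; reuse the stored Markdown otherwise."""
    key = cache_key(html, mode)
    if key not in _cache:
        _cache[key] = convert(html, mode)
    return _cache[key]
```

The mode is part of the key because the same HTML may convert differently under `strict`, `readable`, and `llm-friendly`.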

What Markdown dialect is used?

GitHub Flavored Markdown (GFM)—tables, fenced code blocks, standard syntax. Compatible with GitHub, Notion, MkDocs, Docusaurus, and most renderers.

Is my data stored or logged?

No. The API is stateless. HTML is processed in memory and discarded.

Can I use this for LLM pipelines?

Yes. Use mode: "llm-friendly" for link references and predictable structure. Output is suitable for embeddings, RAG, and prompt context.
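
For embeddings, the returned Markdown is often split into smaller chunks first. One possible approach, sketched below, is to split at heading lines. `split_by_heading` is a hypothetical helper and is not part of the API; it does not account for `#` lines inside fenced code blocks, which it would treat as headings.

```python
import re
from typing import List


def split_by_heading(markdown: str) -> List[str]:
    """Split Markdown into chunks that each start at an ATX heading line."""
    chunks, current = [], []
    for line in markdown.splitlines():
        # A new heading closes the previous chunk, if one has been started.
        if re.match(r"#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]
```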

What's the maximum payload size?

25MB per request. Larger payloads return 413 PAYLOAD_TOO_LARGE.
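
A pre-flight size check can avoid a wasted upload. This sketch rests on two assumptions: "25MB" is read as 25,000,000 bytes (the more conservative interpretation), and the limit is compared against the UTF-8 size of the HTML alone. A JSON request body is somewhat larger than the HTML it carries because of escaping, so leave some headroom. `fits_in_one_request` is a hypothetical helper.

```python
MAX_BYTES = 25 * 1000 * 1000  # assumption: "25MB" read as decimal megabytes


def fits_in_one_request(html: str, limit: int = MAX_BYTES) -> bool:
    """Check the UTF-8 encoded size of the HTML against the request limit."""
    return len(html.encode("utf-8")) <= limit
```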

Try HTML to Markdown Converter API on RapidAPI · All APIs by Precision Solutions Tech
