Python bindings for the htmd Rust library, a fast HTML to Markdown converter.
pip install htmd-py
Requires Python 3.9+.
You can customise the HTML to Markdown conversion with the following options:
- heading_style: Style for headings (values from htmd.HeadingStyle)
- hr_style: Style for horizontal rules (values from htmd.HrStyle)
- br_style: Style for line breaks (values from htmd.BrStyle)
- link_style: Style for links (values from htmd.LinkStyle)
- link_reference_style: Style for referenced links (values from htmd.LinkReferenceStyle)
- code_block_style: Style for code blocks (values from htmd.CodeBlockStyle)
- code_block_fence: Fence style for code blocks (values from htmd.CodeBlockFence)
- bullet_list_marker: Marker for unordered lists (values from htmd.BulletListMarker)
- preformatted_code: Whether to preserve whitespace in inline code (boolean)
- skip_tags: List of HTML tags to skip during conversion (list of strings)
Options are set as attributes on an htmd.Options object and passed to htmd.convert_html:
import htmd
# Simple conversion with default options
markdown = htmd.convert_html("<h1>Hello World</h1>")
print(markdown) # "# Hello World"
# Using custom options
options = htmd.Options()
options.heading_style = htmd.HeadingStyle.SETEX
options.bullet_list_marker = htmd.BulletListMarker.DASH
markdown = htmd.convert_html("<h1>Hello World</h1><ul><li>Item 1</li></ul>", options)
print(markdown)
# Skip specific HTML tags
options = htmd.create_options_with_skip_tags(["script", "style"])
markdown = htmd.convert_html("<h1>Hello</h1><script>alert('Hi');</script>", options)
print(markdown) # "# Hello" (script tag is skipped)
Refer to the htmd docs for all available options.
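When option values come from a config file or command-line flags, it can be convenient to apply them in one step. The sketch below is not part of htmd: `apply_settings` and `KNOWN_OPTIONS` are hypothetical names, and it relies only on what the examples above show, namely that options are set as plain attributes on an options object. Whether every listed option (for example skip_tags) can be assigned this way is an assumption.

```python
# Hypothetical helper, not part of htmd: apply several option values at once,
# rejecting names that are not in the documented option list.
KNOWN_OPTIONS = frozenset({
    "heading_style", "hr_style", "br_style", "link_style",
    "link_reference_style", "code_block_style", "code_block_fence",
    "bullet_list_marker", "preformatted_code", "skip_tags",
})


def apply_settings(options, **settings):
    """Set each keyword argument as an attribute on `options` and return it."""
    unknown = sorted(set(settings) - KNOWN_OPTIONS)
    if unknown:
        # Fail early on typos instead of silently setting an unused attribute.
        raise ValueError(f"unknown option(s): {', '.join(unknown)}")
    for name, value in settings.items():
        setattr(options, name, value)
    return options


# Usage, assuming htmd is installed:
#   import htmd
#   options = apply_settings(htmd.Options(), heading_style=htmd.HeadingStyle.SETEX)
#   markdown = htmd.convert_html("<h1>Hello World</h1>", options)
```

Checking names against a fixed set catches misspelled options before they reach the converter, at the cost of having to update the set if the option list changes.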
The module provides enumeration-like objects for all option values:
import htmd
# HeadingStyle
htmd.HeadingStyle.ATX # "atx"
htmd.HeadingStyle.SETEX # "setex"
# HrStyle
htmd.HrStyle.DASHES # "dashes"
htmd.HrStyle.ASTERISKS # "asterisks"
htmd.HrStyle.UNDERSCORES # "underscores"
# BrStyle
htmd.BrStyle.TWO_SPACES # "two_spaces"
htmd.BrStyle.BACKSLASH # "backslash"
# LinkStyle
htmd.LinkStyle.INLINED # "inlined"
htmd.LinkStyle.REFERENCED # "referenced"
# LinkReferenceStyle
htmd.LinkReferenceStyle.FULL # "full"
htmd.LinkReferenceStyle.COLLAPSED # "collapsed"
htmd.LinkReferenceStyle.SHORTCUT # "shortcut"
# CodeBlockStyle
htmd.CodeBlockStyle.INDENTED # "indented"
htmd.CodeBlockStyle.FENCED # "fenced"
# CodeBlockFence
htmd.CodeBlockFence.TILDES # "tildes"
htmd.CodeBlockFence.BACKTICKS # "backticks"
# BulletListMarker
htmd.BulletListMarker.ASTERISK # "asterisk"
htmd.BulletListMarker.DASH # "dash"
Tested with small (12 lines) and medium (1000 lines) markdown strings:
- vs. markdownify: 10x (small) to 30x (medium) faster
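A comparison like this can be repeated on your own inputs with a small timing harness. The sketch below uses only the standard library's timeit module. `seconds_per_call` and `speedup` are hypothetical helper names, and the converters are passed in as plain callables, so nothing is assumed about either library beyond the `htmd.convert_html` call shown earlier. It is not the benchmark that produced the figures above.

```python
import timeit


def seconds_per_call(convert, html, number=100, repeat=5):
    """Return the best observed average time (seconds) of one convert(html) call."""
    timer = timeit.Timer(lambda: convert(html))
    # Taking the minimum over repeats reduces noise from other processes.
    return min(timer.repeat(repeat=repeat, number=number)) / number


def speedup(baseline, candidate, html, **kwargs):
    """Return how many times faster `candidate` is than `baseline` on `html`."""
    baseline_time = seconds_per_call(baseline, html, **kwargs)
    candidate_time = seconds_per_call(candidate, html, **kwargs)
    return baseline_time / candidate_time


# Usage, assuming both packages are installed:
#   import htmd
#   from markdownify import markdownify
#   html = "<h1>Hello World</h1><ul><li>Item 1</li></ul>" * 100
#   print(f"{speedup(markdownify, htmd.convert_html, html):.1f}x")
```

Results depend on the input, the machine, and the library versions, so treat any single ratio as indicative only.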
Maintained by lmmx. Contributions welcome!
- Issues & Discussions: Please open a GitHub issue or discussion for bugs, feature requests, or questions.
- Pull Requests: PRs are welcome!
- Install the dev extra (e.g. with uv: uv pip install -e .[dev])
- Run tests (when available) and include updates to docs or examples if relevant.
- If reporting a bug, please include the version and the error message/traceback if available.
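To collect the version details for a bug report, something like the following may help. It is a sketch using only the standard library. `environment_summary` is a hypothetical helper, and the distribution name `htmd-py` is taken from the install command above.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def environment_summary(distribution="htmd-py"):
    """Return a one-line summary of the package and Python versions."""
    try:
        package_version = version(distribution)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        package_version = "not installed"
    return f"{distribution} {package_version}, Python {platform.python_version()}"


print(environment_summary())
```

Paste the printed line into the issue together with the full traceback.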
- htmd - The underlying Rust library
- Inspired by comrak - Python bindings for Comrak, a fast Markdown to HTML converter.
Licensed under the Apache License, Version 2.0.