StripHTML (formerly known as PlainHTML) is a simple extension that runs at the end of post-processing. It searches the final output and strips out unwanted comments and/or tag attributes. Though it does its best to be loaded at the very end of the process, it helps to include this one last when loading up your extensions.
For example, this Markdown input:

<!-- We are only allowing strip_comments and strip_js_on_attributes in this example. -->
Here is a <strong onclick="myFunction();">test</strong>.

produces this HTML output:

<p>Here is a <strong>test</strong>.</p>
Because comments aren't stripped until the final post-processing step, they are present throughout the entire Markdown conversion process and could affect parsing, so take care with where you insert comments.
/// caution | Warning
This is not meant to be a sanitizer for HTML. This is just meant to try and strip out style, script, classes, etc. to provide a plain HTML output for the times this is desired; this is not meant as a security extension. If you want something to secure the output, you should consider running a sanitizer like Bleach.
///
The StripHTML extension can be included in Python Markdown by using the following:
import markdown
md = markdown.Markdown(extensions=['pymdownx.striphtml'])
By default, StripHTML strips JavaScript event attributes of the form `on<name>` (for example, `onclick`). StripHTML also strips HTML comments. If desired, its behavior can be configured to strip less or even more, but it is limited to attributes and comments.
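| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `strip_comments` | bool | `True` | Strip HTML comments during post process. |
| `strip_attributes` | [string] | `[]` | A list of tag attribute names to strip. |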