Skip to content

Blocks Extension API

Block Structure

The various different block types are created via the Block class. The Block class can define all the various parts and handling of the various block parts. The basic structure of a block is shown below:

/// name | argument
    options: per block options

Markdown content.
///

Block Extension Anatomy

Normally with Python Markdown, you'd create a processor derived from the various processor types available in the library. You'd then derive an extension from markdown.Extension that would register the the processor. Block extensions are very similar.

A Block extension is comprised of two parts: the Block object and the BlocksExtension. It should be noted that we do not use markdown.Extension, but BlocksExtension which is derived from it. This is done so we can abstract away the management of all the various registered Block extensions. It is important to note though that when using the BlocksExtension that we do not override the extendMarkdown method, but instead override extendMarkdownBlocks. In all other respects, BlocksExtension is just like markdown.Extension and you can register traditional processors via the md object or register Block objects via the block_mgr object.

Below we have the very bare minimum required to create an extension.

from pymdownx.blocks import BlocksExtension
from pymdownx.blocks.block import Block
import xml.etree.ElementTree as etree

class MyBlock(Block):
    NAME = 'my-block'

    def on_create(self, parent):

        return etree.SubElement(parent, 'div')


class MyBlockExtension(BlocksExtension):

    def extendMarkdownBlocks(self, md, block_mgr):

        block_mgr.register(MyBlock, self.getConfigs())


def makeExtension(*args, **kwargs):
    """Return extension."""

    return MyBlockExtension(*args, **kwargs)

Then we can register and run it:

import markdown

MD = """
/// my-block
content
///
"""

print(markdown.markdown(MD, extensions=[MyBlockExtension()]))
<div>
<p>content</p>
</div>

The Block Object

The block object allows us to define the name of a block, whether an argument and/or options are allowed, and how we generally handle the content.

Global Options

Global options are often set in the BlocksExtension object like traditional extensions. They are meant to globally control the behavior of a specific block type:

class AdmonitionExtension(BlocksExtension):
    """Admonition Blocks Extension."""

    def __init__(self, *args, **kwargs):
        """Initialize."""

        self.config = {
            "types": [
                ['note', 'attention', 'caution', 'danger', 'error', 'tip', 'hint', 'warning'],
                "Generate Admonition block extensions for the given types."
            ]
        }

        super().__init__(*args, **kwargs)

These options are available in the instantiated Block object via the self.config attribute.

Tracking Data Across Blocks

There are times when it can be useful to store data across multiple blocks. Each block instance has access to a tracker that is specific to a specific block type and persists across all blocks. It is only cleared when reset is called on the Markdown object.

The tracker is accessed by a given Block extension via the self.tracker attribute. The attribute contains a dictionary where various keys and values can be stored. This can be used to count blocks on a page or anything else you can think of.

Accessing the Markdown Object

Some plugins occasionally need access to the current Markdown object. If this is needed, it can be accessed via the class attribute self.md.

Argument

The argument is used to declare a common block specific input for a particular block type. This is often, but not exclusively, used for things like titles. It is specified on the same line as the initial block deceleration.

Blocks are not required to use an argument and it is not required by default and must be declared as either optional or required in order for the block to accept an argument. An argument is declared by setting ARGUMENT to True if it is required, None if it is optionally allowed, or False if it is not allowed.

The argument is always parsed as a single string, if it is desired to validate the format of the argument or even to process it as multiple arguments, this can be done in the on_validate event.

class MyBlock(Block):
    # Name used for the block
    NAME = 'my-block'
    ARGUMENT = True

Options

Options is how a block specifies any per block features via an indented YAML block immediately after the block declaration. The YAML indented block is considered a part of the header and is great for options that don't make sense as part of the first line declaration.

An option consists of a keyword to specify the option name, and then a list containing the default value and a validator callback. The callback function should take the input and validate the type and/or coerce the value to an appropriate value. If the input, for whatever reason, is deemed invalid, the callback function should raise an error.

After processing, all options will be available as a dictionary via the instance attribute self.options. Options will be accessible via the keyword and will return the resolved value.

Built-in validators

A number of Built-in validators are provided. Check out Built-in Validators to learn more, or feel free to write your own.

class MyBlock(Block):
    # Name used for the block
    NAME = 'my-block'
    OPTIONS = {
        'tag_name': ['default', type_html_indentifier]
    }

Warning

attrs is a reserved option that is automatically applied to all Block extensions. This should not be overridden. attrs takes a dictionary of str keys and str values describing the attributes to apply to the outer element of the block as returned by the on_create.

The attrs input input is sent through type_html_attribute_dict and is accessible to developers via self.options['attrs']. The result is a dictionary of key/value pairs where the key is a str and the value is a str (or list[str] in the special case of class).

is_raw

def is_raw(self, tag: Element) -> bool:
    ...

This method, given a tag will determine if the block should be considered a "raw" tag based on the Blocks extension's internal logic.

is_block

def is_block(self, tag: Element) -> bool:
    ...

This method, given a tag will determine if the block should be considered a "block" tag based on the Blocks extension's internal logic.

html_escape

def html_escape(self, text: str) -> str:
    ...

Takes a string intended for an HTML tag's content and returns it after applying HTML escaping on it. Escapes &, <, and >.

on_init Event

def on_init(self) -> None:
    ...

The on_init event is run every time a new block class is instantiated. This is usually where a specific block type would handle global options and initialize class variables that are needed. If the specified block name in Markdown matches the name of a registered block, that block class will be instantiated, triggering the on_init event to execute. Each block in a document that is encountered generates its own, new instance.

Only the global config is available at this time via self.config. The Markdown object is also available via self.md.

The can be a good way to perform setup based on on global or local options.

on_validate Event

def on_validate(self, parent: Element) -> bool:
    ...

Executed right after the per block argument and option parsing occurs. The argument and options are accessible via self.argument and self.options. parent is the current parent element.

on_validate is a hook meant to allow the developer to invalidate a block if the options, argument, or even the parent element do not meet some arbitrary criteria. This hook can also be used to make adjustments variables and even do some initialization of class variables based on the results of specific options, arguments, or even the parent element.

If validation fails, False should be returned and the block will not be parsed as a generic block.

on_create Event

    def on_create(self, parent: Element) -> Element:
        ...

Called when a block is initially found and initialized. The on_create method should create the container for the block under the parent element. Other child elements can be created on the root of the container, but outer element of the created container should be returned.

on_add Event

def on_add(self, block: Element) -> Element:
    ...

When any calls occur to process new content, on_add is called. This gives the block a chance to return the element where the content is desired.

This can be useful if the outer element is not the element where the content should go. Keep in mind that content can also be rearranged if needed in the on_end event.

on_markdown Event

def on_markdown(self) -> str:
    """Check how element should be treated by the Markdown parser."""
    ...

The on_markdown event is used to declare how the content of the block should be handled by the Markdown parser. A string with one of the following values must be returned. All content is treated as HTML content and is stored under the etree element returned via the on_add event.

Only during the on_end event will all the content be fully accumulated and processed by relevant block processors, and only during the on_inline_end event will both block and inline processing be completed.

Result Value Description
block Parsed block content will be handled by the Markdown parser as content under a block element.
inline Parsed block content will be handled by the Markdown parser as content under an inline element.
raw Parsed block content will be preserved as is. No additional Markdown parsing will be applied. Content is expected to be indented and should be documented as such.
auto Depending on whether the wrapping parent is a block element, inline element, or something like a code element, Blocks will choose the best approach for the content. Decision is made based on the element returned by the on_add event.

When using raw mode, all text will be accumulated under the specified element as an AtomicString. If nothing is done with the content during the on_end event, all the content will be HTML escaped by the Python Markdown parser. If desired, the content can be placed into the Python Markdown HTML stash which will protect it from any other rouge Treeprocessors. Keep in mind, if the content is stashed HTML escaping will not be applied automatically, so HTML escape if it is required.

Indent Raw Content

Because Python Markdown implements HTML processing as a preprocessor, content for a raw block must be indented 4 spaces to avoid the HTML processing step. The content will not be indented when it reaches the on_end event. Failure to indent will still allow the code to be processed, but it may not process as expected. An extension that uses raw should make clear that this is a requirement to avoid unexpected results.

on_end Event

def on_end(self, block: Element) -> None:
    ...

When a block is parsed to completion, the on_end event is executed. This allows an extension to perform any post processing on the elements. You could save the data as raw text and then parse it special at the end or you could walk the HTML elements and move content around, add attributes, or whatever else is needed.

on_inline_end Event

def on_inline_end(self, block: Element) -> None:
    ...

When a block is parsed to completion and all inline parsing has been applied, the on_inline_end event is executed. It is the very last event for a block. This allows an extension to perform any post processing on an element after inline processing.

Built-in Validators

A number of validators are provided via for the purpose of validating YAML option inputs. If what you need is not present, feel free to write your own. All validators are imported from pymdownx.blocks.block.

type_any

def type_any(value: Any) -> Any:
    ...

This takes a YAML input and simply passes it through. If you do not want to validate the input because it does not need to be checked, or if you just want to do it manually in the on_validate event, then this is what you'd want to use.

class Block:
    OPTIONS = {'name': [{}, type_any]}

type_none

def type_none(value: Any) -> None:
    ...

This takes a YAML input and ensures it is None (or null) in YAML. This is most useful paired with other types to indicate the option is "unset". See type_multi to learn how to combine multiple existing types.

class Block:
    OPTIONS = {'name': [none, type_multi(type_none, type_string)]}

type_number

def type_number(value: Any) -> int | float:
    ...

Takes a YAML input value and verifies that it is a float or int.

Returns the valid number (float or int) or raises a ValueError.

class Block:
    OPTIONS = {'keyword': [0.0, type_number]}

type_integer

def type_integer(value: Any) -> int:
    ...

Takes a YAML input value and verifies that it is an int.

Returns the valid int or raises a ValueError.

class Block:
    OPTIONS = {'keyword': [0, type_integer]}

type_ranged_number

def type_ranged_number(minimum: int | float = None, maximum: int | float = None) -> Callable[[Any], int | float]:

Takes a minimum and/or maximum and returns a type function that accepts an input and validates that it is a number (float or int) that is within the specified range. If None is provided for either minimum or maximum, they will be unbounded.

Returns the valid number (float or int) or raises a ValueError.

class Block:
    OPTIONS = {'keyword': [0.0, type_ranged_number(0.0, 100.0)]}

type_ranged_integer

def type_ranged_integer(minimum: int = None, maximum: int = None) -> Callable[[Any], int]:
    ...

Takes a minimum and/or maximum and returns a type function that accepts an input and validates that it is an int that is within the specified range. If None is provided for either minimum or maximum, they will be unbounded.

Returns the valid int or raises a ValueError.

class Block:
    OPTIONS = {'keyword': [0, type_ranged_integer(0, 100)]}

type_boolean

def type_boolean(value: Any) -> bool:
    ...

Takes a YAML input and validates that it is a boolean value.

Returns the valid boolean or raises a ValueError.

class Block:
    OPTIONS = {'keyword': [False, type_boolean]}

type_ternary

def type_ternary(value: Any) -> bool | None:
    ...

Takes a YAML input and validates that it is a bool value or None.

Returns the valid bool or None or raises a ValueError.

class Block:
    OPTIONS = {'keyword': [None, type_ternary]}

type_string

def type_string(value: Any) -> str:
    ...

Takes a YAML input and validates that it is a str value.

Returns the valid str or raises a ValueError.

class Block:
    OPTIONS = {'keyword': ['default', type_string]}

type_insensitive_string

def type_insensitive_string(value: Any) -> str:
    ...

Takes a YAML input and validates that it is a str value and normalizes it by lower casing it.

Returns the valid, lowercase str or raises a ValueError.

class Block:
    OPTIONS = {'keyword': ['default', type_insensitive_string]}

type_string_in

def type_string_in(value: list[str], insensitive: bool = True) -> Callable[[Any], str]:
    ...

Takes a list of acceptable string inputs and a boolean indicating whether comparison should be case insensitive. Returns a type function that takes an input and then validates that it is a str and that the str value is found in the acceptable string list.

Returns the valid str or raises a ValueError.

class Block:
    OPTIONS = {'keyword': ['this', type_string_in(['this', 'that'], type_insensitive_string)]}

type_string_delimiter

def type_string_delimiter(value: str, string_type: Callable[[Any], str] = type_string) -> str:
    ...

Takes a delimiter and string type callback and returns a function that takes an input, verifies that it is a str, splits it by the delimiter, and ensures that each part validates with the given string type callback.

Returns a list of valid str values or raises a ValueError.

class Block:
    OPTIONS = {'keyword': ['default', type_string_delimiter(',' type_insensitive_string)]}

type_html_identifier

def type_html_identifier(value: Any) -> str:
    ...

Tests that a string is an "identifier" as described in CSS. This would normally match tag names, IDs, classes, and attribute names. This is useful if you'd like to validate such HTML constructs.

Returns a str that is a valid identifier or raises ValueError.

class Block:
    OPTIONS = {'keyword': ['default', type_html_indentifier]}

type_html_classes

def type_html_classes(value: Any) -> list[str]:
    ...

Takes a YAML input value and verifies that it is a str and treats it as a space delimited input. The input will be split by spaces and each part will be run through type_html_identifier.

Returns a list of str that are valid CSS classes or raises ValueError.

class Block:
    OPTIONS = {'keyword': ['default', type_html_classes]}

type_html_attribute_dict

def type_html_classes(value: Any) -> dict[str, Any]:
    ...

Note

The returned dictionary will have all values set to string except classes which will be a list of strings. The class attribute is processed with type_html_classes.

The id attribute is also run through type_html_identifier to ensure a good ID that can be targeted with traditional CSS selectors: #id.

Takes a YAML input value and verifies that it is a dict. Keys will be verified to be HTML identifiers and the values to be strings.

Returns a dict[str, Any] where the values will either be str or list[str] as previously noted or raises ValueError.

class Block:
    OPTIONS = {'attributes': [{}, type_html_attribute_dict]}

type_multi

def type_multi(*args: Any) -> Callable[[Any], Any]:
    ...

Takes a multiple type functions and returns a single type function that takes a YAML input and validates it with all the provided type functions. If the input fails all the validation functions, a ValueError is raised.