Spelling Pipeline

Overview

PySpelling's pipeline uses plugins to filter text and to control how that text flows down the pipeline. Plugins can be arranged in any order and can even be included multiple times. The only restriction is that the pipeline cannot start with a FlowControl plugin: the first plugin must be a Filter plugin.
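The ordering rule can be illustrated with a small, self-contained sketch. It does not use PySpelling's own classes: `Plugin`, its `kind` field, and `check_pipeline_order` are hypothetical names introduced only for this example, and the check models nothing beyond the "first plugin must be a Filter" restriction.

```python
from dataclasses import dataclass


@dataclass
class Plugin:
    """Hypothetical stand-in for a configured pipeline step."""
    name: str
    kind: str  # assumed to be either "filter" or "flow_control"


def check_pipeline_order(plugins):
    """Raise ValueError if the pipeline starts with a flow control plugin."""
    if plugins and plugins[0].kind != "filter":
        raise ValueError(
            f"pipeline must start with a Filter plugin, got {plugins[0].name!r}"
        )


# Filters may repeat and appear in any order after the first step.
pipeline = [
    Plugin("pyspelling.filters.python", "filter"),
    Plugin("pyspelling.flow_control.wildcard", "flow_control"),
    Plugin("pyspelling.filters.context", "filter"),
    Plugin("pyspelling.filters.url", "filter"),
]
check_pipeline_order(pipeline)  # passes silently
```

An empty list passes unchanged here, since the sketch only checks ordering.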

A number of plugins are included with PySpelling, but additional plugins can be written using the plugin API.

Filter

Filter plugins augment and/or filter a given chunk of text, returning only the portions that are desired. Once a plugin is done with the text, it passes it down the pipeline. A filter may return one or many chunks, each with a little contextual information. Some filters return a single chunk containing the entire file, while others return context-specific chunks: one for each docstring, one for each comment, etc. The metadata associated with the chunks can also be used by FlowControl plugins to allow certain types of text to skip certain filters.
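As a rough illustration of "one chunk per comment", the sketch below splits source text into chunks that each carry a context string and a category. `Chunk`, `extract_comments`, and the category name `comment` are assumptions made for this example and are not PySpelling's API; the comment detection is deliberately naive and would also match a `#` inside a string literal.

```python
import re
from typing import NamedTuple


class Chunk(NamedTuple):
    """Hypothetical text chunk: the text plus a little metadata."""
    text: str
    context: str   # where the text came from, for error reporting
    category: str  # what kind of text it is, usable by flow control


def extract_comments(source, filename):
    """Return one chunk per line that contains a `#` comment (naive match)."""
    chunks = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = re.search(r"#\s?(.*)$", line)
        if match:
            chunks.append(Chunk(match.group(1), f"{filename}:{lineno}", "comment"))
    return chunks


source = "x = 1  # set the counter\ny = 2\n# done\n"
chunks = extract_comments(source, "demo.py")
for chunk in chunks:
    print(chunk.context, chunk.category, repr(chunk.text))
```

A later step that only cares about comments could then decide what to do based on `chunk.category` alone, without re-parsing the text.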

Aside from filtering the text, the first filter in the pipeline is always responsible for initially reading the file from disk and getting the file content into a Unicode buffer that PySpelling can work with. It is also responsible for setting the default encoding and, if the filter has special logic for doing so, identifying the encoding from the file header.
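One common form of "identifying the encoding from the file header" is checking for a byte order mark before falling back to a default. The following is a minimal sketch of that idea using only the standard library; `decode_bytes` and its `default_encoding` parameter are hypothetical names, and this is not a description of how any particular PySpelling filter detects encodings.

```python
import codecs

# UTF-32 marks are checked before UTF-16 because the UTF-32 LE mark
# begins with the same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_bytes(raw, default_encoding="utf-8"):
    """Decode raw file bytes, preferring a byte order mark when present."""
    for bom, encoding in BOMS:
        if raw.startswith(bom):
            # These codecs consume the mark while decoding.
            return raw.decode(encoding), encoding
    return raw.decode(default_encoding), default_encoding


text, encoding = decode_bytes(codecs.BOM_UTF8 + b"hello")
print(encoding, repr(text))
```

In a real first filter, `raw` would come from opening the file in binary mode; the sketch starts from bytes to stay self-contained.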

The following Filter plugins are included:

Name	Include Path
Context	`pyspelling.filters.context`
CPP	`pyspelling.filters.cpp`
HTML	`pyspelling.filters.html`
JavaScript	`pyspelling.filters.javascript`
Markdown	`pyspelling.filters.markdown`
ODF	`pyspelling.filters.odf`
OOXML	`pyspelling.filters.ooxml`
Python	`pyspelling.filters.python`
Stylesheets	`pyspelling.filters.stylesheets`
Text	`pyspelling.filters.text`
URL	`pyspelling.filters.url`
XML	`pyspelling.filters.xml`

Flow Control

FlowControl plugins are responsible for controlling the flow of the text down the pipeline. The category of a text chunk is passed to the plugin, and it will return one of three directives:

ALLOW: the text chunk(s) may be evaluated by the next filter.
SKIP: the text chunk(s) should skip the next filter.
HALT: the text chunk(s) stop progressing down the pipeline and are sent directly to the spell checker.
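The three directives can be modelled with a toy pipeline runner. Everything here is an assumption made for illustration: the string constants, the `("filter", func)` / `("flow", func)` step tuples, and `run_pipeline` are not PySpelling's implementation, and the sketch follows a single chunk whose category never changes.

```python
ALLOW, SKIP, HALT = "allow", "skip", "halt"


def run_pipeline(text, category, steps):
    """Pass one chunk through the steps and return the text to spell check."""
    skip_next = False
    for kind, func in steps:
        if kind == "flow":
            directive = func(category)
            if directive == HALT:
                break  # go straight to the spell checker
            skip_next = directive == SKIP
        elif skip_next:
            skip_next = False  # this filter is skipped, later ones still run
        else:
            text = func(text)
    return text


def skip_code(category):
    """Hypothetical flow control: code chunks skip the next filter."""
    return SKIP if category == "code" else ALLOW


def halt_all(category):
    """Hypothetical flow control: stop every chunk here."""
    return HALT


steps = [("filter", str.strip), ("flow", skip_code), ("filter", str.upper)]
print(run_pipeline("  hello ", "comment", steps))  # upper-cased by the last filter
print(run_pipeline("  hello ", "code", steps))     # last filter skipped
```

Because a flow control decision applies only to the filter that follows it, a pipeline that needs several filters skipped would place a flow control step before each of them.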

The following FlowControl plugins are included:

Name	Include Path
Wildcard	`pyspelling.flow_control.wildcard`