Changelog

2.12.1

NEW: Don't disallow the size option in Aspell.

2.12

NEW: When jobs is set to 0, use maximum available cores.
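
As a rough illustration of the jobs behavior, the sketch below maps 0 to the number of available cores. The helper name resolve_jobs is hypothetical and is not taken from PySpelling's source; using os.cpu_count() for the core count is an assumption.

```python
import os


def resolve_jobs(jobs):
    """Return the worker count to use (hypothetical helper, not PySpelling's code)."""
    if jobs == 0:
        # Assumption: "maximum available cores" is approximated with os.cpu_count(),
        # which can return None, so fall back to a single worker.
        return os.cpu_count() or 1
    return jobs
```

Any non-zero value is passed through unchanged in this sketch.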

2.11

NEW: Add new command line option --skip-dict-compile which will skip the dictionary compiling step if the dictionary already exists. Changes to a custom dictionary will be ignored.
NEW: Drop support for Python 3.8.
NEW: Add official support for Python 3.13.
NEW: Add official support for Python 3.14.
NEW: Python filter is now aware of Python 3.14+ t-strings.
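
The --skip-dict-compile behavior described above can be sketched as a simple existence check. The function and parameter names are hypothetical; only the rule (skip compiling when the dictionary already exists) comes from the entry.

```python
import os


def should_compile(dictionary_path, skip_dict_compile=False):
    """Decide whether to compile the dictionary (hypothetical helper)."""
    if skip_dict_compile and os.path.exists(dictionary_path):
        # The existing dictionary is reused as-is, which is why later changes
        # to a custom wordlist are not picked up.
        return False
    return True
```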

2.10

NEW: Allow specifying parallel processes to speed up spell checking. Number of jobs can be specified either by command line or via the config. Command line overrides the config.
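
A minimal sketch of the precedence rule, assuming both values are optional and that a single job is used when neither is given (that default is an assumption, as are all names here):

```python
def effective_jobs(cli_jobs=None, config_jobs=None, default=1):
    """Pick the job count: command line first, then config, then a default."""
    if cli_jobs is not None:
        # The command line always wins over the config.
        return cli_jobs
    if config_jobs is not None:
        return config_jobs
    # Assumption: fall back to a single process when nothing is specified.
    return default
```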

2.9

NEW: Officially support Python 3.11 and 3.12.
NEW: Drop support for Python 3.7.

2.8.2

FIX: Ensure that Aspell actually uses the encoding passed to it for dictionaries.
FIX: Use a disallow list for problematic or unsupported arguments to the underlying spell checker instead of using a more restrictive allow list.
FIX: Fix logic bug in JavaScript filter.

2.8.1

FIX: Fix missing command line application after migration to new build system.

2.8.0

NEW: Officially drop support for Python 3.6 and add support for Python 3.10.
NEW: Switch build system to Hatchling.
FIX: Fix typo in Aspell accepted parameters.

2.7.3

FIX: Fix context reporting in the XML, HTML, and other filters derived from the XML filter.

2.7.2

FIX: Add note in --help option about --source behavior.
FIX: Better documentation on language options and Unicode normalization in international languages.

2.7.1

FIX: Allow camel case options in Aspell.

2.7

NEW: Check for .pyspelling.yml or .pyspelling.yaml by default.
FIX: Fix documentation about how to specify languages in Aspell and how to specify languages when compiling custom wordlists. In short, d should be used for specifying languages in general, but when using custom wordlists, lang should be specified, and it should reference the .dat file name.
FIX: Fix spelling in help output.
FIX: Raise error in cases where pipeline options are not indented enough and parsed as an additional pipeline name.
FIX: Drop Python 3.5 support and officially support Python 3.9.
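
The default configuration lookup could be sketched as below. The lookup order between the two names is an assumption (the entry does not say which one wins if both exist), and find_default_config is a hypothetical name.

```python
import os

# Assumed order; the changelog does not state which name takes priority.
DEFAULT_CONFIG_NAMES = (".pyspelling.yml", ".pyspelling.yaml")


def find_default_config(directory="."):
    """Return the first default config file found in directory, or None."""
    for name in DEFAULT_CONFIG_NAMES:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```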

2.6.1

FIX: Upgrade to wcmatch 6.0.3 which fixes issues dealing with dot files and globstar (**) when dot globbing is not enabled. Also fixes a small logic error with symlink following and globstar. 6.0.3 is now the minimum requirement.

2.6

NEW: Add support for wcmatch version 6.0.
NEW: wcmatch version 6.0 adds a default pattern limit of 1000 to help protect against really large pattern expansions such as {1..1000000}. If you wish to control this default, or disable it entirely, you can do so via the new glob_pattern_limit configuration option.
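
To show why a pattern limit matters, here is a small standalone sketch that estimates how many patterns a simple numeric brace sequence expands to. It does not use wcmatch and does not reproduce its algorithm; the names, the regular expression, and the convention that a limit of 0 disables the check are all assumptions made for illustration.

```python
import re

# Matches a simple numeric brace sequence such as {1..1000000} (no step, no nesting).
SEQUENCE = re.compile(r"\{(-?\d+)\.\.(-?\d+)\}")


def expansion_count(pattern):
    """Estimate how many patterns the numeric brace sequences would expand to."""
    count = 1
    for start, end in SEQUENCE.findall(pattern):
        count *= abs(int(end) - int(start)) + 1
    return count


def check_limit(pattern, limit=1000):
    """Raise ValueError if the estimate exceeds limit (assumption: 0 disables the check)."""
    if limit and expansion_count(pattern) > limit:
        raise ValueError("pattern expands to more than {} patterns".format(limit))
```

With this sketch, {1..1000000} is rejected under the default limit of 1000 without ever being expanded.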

2.5.1

FIX: Add workaround for wcmatch version 5.0.

2.5

NEW: Add expect_match option to prevent a rule from failing if it finds no matching files.
NEW: Formally support Python 3.8.

2.4

NEW: Disable Aspell filters by default. Users must explicitly set the mode parameter under the aspell option to enable default Aspell filters.
NEW: Throw an exception with a message if no configuration is found or there is some other issue.
NEW: Throw an exception with a message when no tasks are found in the matrix or when no tasks match a given name or group.
NEW: Throw an exception with a message when a task is run but no files are found.

2.3.1

FIX: Properly handle docstring content and detection in files that have single line functions.

2.3

NEW: Support new wcmatch glob feature flags and upgrade to wcmatch 4.0.
FIX: Don't use recursion when parsing XML or HTML documents.

2.2.6

FIX: Require wcmatch 3.0 for glob related fixes.

2.2.5

FIX: Rework comment extraction in XML plugin.
FIX: Newer versions of Soup Sieve will not compile an empty string, so adjust XML and HTML plugin logic to account for this behavior.

2.2.4

FIX: Explicitly require Beautiful Soup 4 dependency.

2.2.3

FIX: There is no need to un-escape content for HTML/XML as it is already un-escaped in the bs4 objects.
FIX: Upgrade to latest beta of Soup Sieve.

2.2.2

FIX: Fix :empty, :root, and :nth-* selectors not working properly when no tag name is specified before them. Selector matching is now done via our external library, soupsieve, which is the same homegrown CSS library that we were using internally.
FIX: Potential infinite loop when using :nth-child().

2.2.1

FIX: Comments in HTML/XML should be returned regardless of whether they are in an ignored tag or not.

2.2

NEW: Add support for CSS4 selectors: :empty, :first-child, :last-child, :only-child, :first-of-type, :last-of-type, :only-of-type, :nth-child(an+b [of S]?), :nth-last-child(an+b [of S]?), :nth-of-type(an+b), and :nth-last-of-type(an+b). (#58)

2.1.1

FIX: CSS4 allows :not(), :has(), and :is() to be nested in :not(). (#62)

2.1

NEW: Add support for div p, div>p, div+p, div~p in the HTML/XML filter's CSS selectors. (#51)
NEW: Add support for the :root CSS selector. (#57)
NEW: Add support for experimental :has() selector. (#54)
FIX: According to the CSS4 specification, :is() is the final name for :matches(), but :matches() is an allowed alias. (#53)
FIX: Allow :not() to be nested in :is()/:matches(). (#56)

2.0

NEW: (Breaking change) Task names should be unique and using --name from the command line will only target one name (the last task defined with that name). If you were not using name to run a group of tasks, you will not notice any changes.
NEW: Task option group has been added to target multiple tasks with the --group command line option. A group name can be shared across different tasks.
NEW: Add XML filter (PySpelling now has a dependency on lxml).
NEW: Add Open Document Format (ODF) filter for .odt, .ods, and .odp files.
NEW: Add Office Open XML (newer Microsoft document format) filter for .docx, .xlsx, and .pptx files.
NEW: CSS selectors in XML and HTML filters now support the :not() and :matches() pseudo-classes.
NEW: CSS selectors now support , in patterns.
NEW: CSS selectors now support i in attribute selectors: [attr=value i].
NEW: CSS selectors now support namespaces (some configuration required).
NEW: For better HTML context, display a tag's ancestry (just tag name of parents).
NEW: Captured tags are now configurable via captures, but tags that are not captured still have their children crawled unless they are under ignores.
NEW: Support modes added for HTML filter: html, html5, and xhtml.
NEW: CHECK_BOM plugin attribute has been deprecated in favor of overriding the exposed has_bom function.
NEW: Tasks can be hidden with the hidden configuration option. Tasks with hidden enabled will only run if they are explicitly called by name.
NEW: Add normal string support to Python filter.
NEW: Add string and template literal support for JavaScript filter.
NEW: Add string support for CPP filter.
NEW: Add generic_mode option to CPP to allow for generic C/C++ comment style capture from non C/C++ file types.
NEW: Context will normalize line endings before applying context (can be disabled).
NEW: CPP, Stylesheet, and JavaScript plugins now normalize line endings of block comments.
NEW: UTF-16 and UTF-32 are not really supported by Aspell and Hunspell, so at the end of the pipeline, Unicode strings that have an associated encoding of UTF-16 or UTF-32 will be encoded with the compatible UTF-8 instead. This does not apply to files being processed with a disabled pipeline. When the pipeline is disabled, files are sent directly to the spell checker with no modifications.
FIX: Case related issues when comparing tags and attributes in HTML.
FIX: CSS selectors should only compare case insensitive for ASCII characters A-Z and a-z.
FIX: Allow CSS escapes in selectors.
FIX: Don't send empty strings (or strings that are just whitespace) to the spell checker, to prevent Aspell 0.50 series from crashing (also to increase performance).
FIX: Catch and bubble up errors better.
FIX: Fix issue where Python module docstrings would not get spell checked if they followed a shebang.
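
Two of the 2.0 entries above (the UTF-16/UTF-32 handling and the skipping of empty strings) describe what is finally sent to the spell checker. A minimal sketch of that last step follows; the function name and return shape are hypothetical, and matching encodings by codec-name prefix is an assumption.

```python
import codecs


def prepare_for_spell_checker(text, encoding):
    """Return (bytes, encoding) to send, or None if there is nothing to check."""
    if not text.strip():
        # Empty or whitespace-only strings are skipped entirely.
        return None
    # Normalize aliases such as "UTF16" or "utf_16_le" to Python's codec name.
    name = codecs.lookup(encoding).name
    if name.startswith(("utf-16", "utf-32")):
        # Swap the unsupported encoding for the compatible UTF-8.
        name = "utf-8"
    return text.encode(name), name
```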

1.1

NEW: Add URL/email address filter. (#30)
NEW: If pipeline configuration key is set to null, do not use any filters, and send the filename, not the content, to the spell checker.
NEW: Add encoding option to dictionary configuration for the purpose of communicating what encoding the main dictionary is when compiling wordlists (only Aspell takes advantage of this).
FIX: Fix Hunspell -O option which was mistakenly -o. (#31)

1.0

NEW: Allow multiple names on command line via: pyspelling -n name1 -n name2.
FIX: Fix empty HTML tags not properly having their attributes evaluated.
FIX: Fix case where a deprecation warning for filters is shown when it shouldn't.
FIX: Better docstring recognition in Python filter.
FIX: Catch comments outside of the <HTML> tag.
FIX: Filter out Doctype, CData, and other XML or non-content type information.

1.0b2

FIX: Fix CPP comment regular expression.

1.0b1

NEW: Better context for HTML elements. HTML is now returned by block level elements, and the element's selector is given as context. Attributes also return a selector as context and are returned individually. HTML comments are returned as individual hunks.
NEW: Add Stylesheet and CPP filters (#17)
NEW: JavaScript is now derived from CPP.
NEW: PySpelling looks for .spelling.yml or .pyspelling.yml with a priority for the latter. (#12)
NEW: Spelling pipeline adjustments: you can now explicitly allow only certain categories, skip categories, or halt them in the pipeline. Pipeline flow control is now done via a new FlowControl plugin. When avoiding, including, or skipping categories, they are now done with wildcard patterns. (#16)
NEW: Drop scanning of normal Python strings in the Python plugin.
NEW: Use get_plugin instead of get_filter, but allow a backwards compatible path for now.
NEW: In configuration, documents is now matrix and filters is now pipeline, but a deprecation path has been added. (#15)
NEW: Provide a class attribute that will cause a Filter object to avoid BOM detection if it is not appropriate for the given file.
NEW: Wordlists should get the desired language/dictionary from the spell checker specific options.
NEW: Add global configuration option to specify the preferred spell checker, but it is still overridable via command line.
FIX: Internal cleanup of error handling and debug output.
FIX: Fix context issue when no escapes are defined.

0.2a4

NEW: Text filter can handle Unicode normalization and converting to other encodings.
NEW: Default encoding is now utf-8 for all filters.
FIX: Internal encoding handling.

0.2a3

FIX: Text filter was returning old Parser name instead of new Filter name.

0.2a2

NEW: Incorporate the Decoder class into the filter class.
NEW: Add Hunspell support.
NEW: Drop specifying spell checker in configuration file. It must be set from command line.
FIX: Add missing documentation about Context filter.

0.2a1

NEW: Better filters (combine filters and parsers into just filters).
NEW: Drop Python 2 support.
NEW: Better Python encoding detection.
NEW: Better HTML encoding detection.
NEW: Drop file_extensions option and parser option.
NEW: Filters no longer define file extensions. Sources must specify a wildcard path that matches desired files.
NEW: Drop regular expression support for sources.
NEW: Drop raw filter.

0.1a3

NEW: Add JavaScript parser.

0.1a2

NEW: Add option to group consecutive Python comments.
FIX: Properly return error.
FIX: Only retry with default encoding if exception thrown was a UnicodeDecodeError.
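
The retry rule in the last entry can be sketched as follows. The function name and its parameters are hypothetical; the point is only that the fallback is limited to UnicodeDecodeError while every other exception propagates.

```python
def decode_with_fallback(data, encoding, default_encoding):
    """Decode bytes, retrying with default_encoding only on UnicodeDecodeError."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        # Other errors (for example LookupError for an unknown codec name)
        # are deliberately not caught here.
        return data.decode(default_encoding)
```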

0.1a1

NEW: Initial alpha release.