Changelog
5.6.post1
- FIX: Update project metadata to indicate Python 3.12 support.
5.6
- NEW: Officially support Python 3.12.
5.5.1
- FIX: Fix some flag issues in bregex.
5.5
- NEW: \e and \h have both been deprecated in 6.0. Please migrate to using \x1b and \p{Horiz_Space} in their respective places.
- FIX: Fix flag issue with sub functions.
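As a rough illustration of the \e migration: \x1b is also understood by the standard library's re module, so a pattern written with it needs no Backrefs-specific escape. The sketch below uses only stdlib re; the helper name strip_ansi and the sample text are hypothetical. \p{Horiz_Space} is not shown because stdlib re has no \p{...} support.

```python
import re

# ESC, '[', digits and semicolons, then 'm' (an ANSI color sequence).
# \x1b is the portable spelling of the escape character.
ANSI_COLOR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text: str) -> str:
    """Remove ANSI color sequences from text (hypothetical helper)."""
    return ANSI_COLOR.sub("", text)


print(strip_ansi("\x1b[31mred\x1b[0m text"))
```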
5.4
- NEW: Officially support Python 3.11.
- NEW: Add to Bre compatible custom Unicode properties \p{Vert_Space} and \p{Horiz_Space} that match Regex's new custom properties. This helps to expose vertical space shorthand that was not previously present.
5.3
- NEW: Drop Python 3.6 support.
- NEW: Update build backend to use Hatch.
5.2
- NEW: Add type annotations.
- FIX: Re format replacement captures behave more like Regex in that you can technically index into the captures of a given group in Re, but in Re there is only ever one or zero captures. Documentation was never really explicit on what one should expect if indexing a group in Re occurred. The documentation seemed to vaguely insinuate that it would behave like a Regex capture list, just with one or zero values in the list. In reality, the value was a simple string or None. This caused a bug in some cases where you'd have None inserted for a group if a group was optional, but referenced in the replacement template. Now the implementation matches the description in the documentation with the documentation now being more explicit about behavior.
- FIX: Match Re and Regex handling when doing a non-format replacement that references a group that is present in the search pattern but has no actual captures. Such a case should not fail, but simply return an empty string for the group.
- FIX: Format replacements that have groups with no captures will yield an empty string as the only capture as long as the user does not try to index into any captures, as there are no actual captures. This behavior was a bug in Regex that we duplicated and should now be fixed in the latest Regex (mrabarnett/mrab-regex#439) as well as in Backrefs.
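The "one or zero captures" model described above can be sketched with the standard library alone. This is not the Backrefs implementation; the helper name captures is hypothetical and only models a Re group as a list with at most one entry. The last line relies on stdlib re (Python 3.5+) substituting an empty string for an unmatched group in a non-format replacement.

```python
import re


def captures(match, group):
    """Model a Re group as a capture list: [] if unmatched, else [value]."""
    value = match.group(group)
    return [] if value is None else [value]


m = re.match(r"(a)?(b)", "b")

optional = captures(m, 1)  # group 1 did not participate in the match
required = captures(m, 2)  # group 2 matched "b"

# Non-format replacement: the unmatched group contributes an empty string.
plain = re.sub(r"(a)?(b)", r"<\1|\2>", "b")
```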
5.1
- NEW: Add support for Python 3.10.
5.0.1
- FIX: Fix wheel names.
5.0
- NEW: Significant improvements to Unicode handling. A lot of testing was implemented to catch existing bugs and to improve results.
- NEW: POSIX style properties now handle all existing Unicode properties.
- NEW: POSIX properties now follow the Unicode specification for POSIX compatibility. Read the documentation to learn more.
- NEW: Unicode properties are now sensitive to the ASCII flag and will properly restrict the range of properties to the ASCII range even in Unicode strings.
- NEW: Removed the old deprecated search references: \l, \L, \c, and \C. These are available in various other forms: [[:lower:]], \p{lower}, etc.
- NEW: To reduce conflicts of naming, Binary properties are evaluated before Block properties when using short names. Block has conflicts with some other properties of various types, so using short names for blocks is discouraged.
- FIX: Numerous fixes to existing Unicode properties: missing values, incorrect values, etc.
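For readers unfamiliar with what the ASCII flag restricts, the standard library offers a close analogy: re.ASCII narrows \w to the ASCII range even on Unicode strings. The sketch below shows only that stdlib behavior, as an analogy; it does not exercise Backrefs' \p{...} properties.

```python
import re

word = "caf\u00e9"  # "café": the last letter is outside the ASCII range

# Default Unicode matching: \w accepts the accented letter.
unicode_word = re.fullmatch(r"\w+", word)

# With the ASCII flag, \w is limited to [a-zA-Z0-9_], so the match fails.
ascii_word = re.fullmatch(r"\w+", word, flags=re.ASCII)
```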
4.6
- NEW: Provide wheels for all officially supported versions of Python.
4.5
- NEW: Added new back reference \h to Re. To get similar functionality with Regex, users must update to the latest Regex release.
4.4
- NEW: Added the following binary properties for Unicode 13.0 support (Python 3.9): emoji, emojicomponent, emojimodifier, emojimodifierbase, and emojipresentation. Associated aliases are also included: ecomp, emod, ebase, and epres.
4.3
- NEW: Install Regex library alongside Backrefs via pip install backrefs[extras].
- NEW: Remove version and __version__ and remove associated deprecation code.
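Because Regex is an optional extra, calling code may want to check whether it is installed before choosing a backend. A minimal sketch using only the standard library; the helper name regex_available and the backend variable are hypothetical.

```python
import importlib.util


def regex_available() -> bool:
    """Return True if the third-party 'regex' package can be imported."""
    return importlib.util.find_spec("regex") is not None


# Fall back to the standard library's re when Regex is not installed.
backend = "regex" if regex_available() else "re"
print(backend)
```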
4.2.1
- FIX: Fix Python 3.8 installation issue due to Unicode bundle having an incorrect encoding in some files.
4.2
- NEW: Deprecate the search references \l, \L, \c, and \C. The POSIX alternatives (which these were shortcuts for) should be used instead: [[:lower:]], [[:^lower:]], [[:upper:]], and [[:^upper:]] respectively.
- NEW: Formally drop support for Python 3.4.
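The mapping above can be applied mechanically to existing patterns. The following is a naive sketch, not a tool shipped with Backrefs: the names POSIX_EQUIVALENTS and migrate_case_references are hypothetical, and the rewrite does not account for references that appear inside a character class or a \Q...\E span.

```python
import re

# Deprecated shortcut letter -> POSIX alternative, as listed in the entry above.
POSIX_EQUIVALENTS = {
    "l": "[[:lower:]]",
    "L": "[[:^lower:]]",
    "c": "[[:upper:]]",
    "C": "[[:^upper:]]",
}


def migrate_case_references(pattern: str) -> str:
    """Rewrite \\l, \\L, \\c, \\C; leave every other escape untouched."""

    def swap(m):
        return POSIX_EQUIVALENTS.get(m.group(1), m.group(0))

    # Consuming a backslash together with the next character keeps an
    # escaped backslash (\\) from being misread as the start of \l.
    return re.sub(r"\\(.)", swap, pattern, flags=re.DOTALL)


print(migrate_case_references(r"\l+\C"))
```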
4.1.1
- FIX: Later pre-release versions of Python 3.8 will support Unicode 12.1.0.
4.1
- NEW: Add official support for Python 3.8.
- NEW: Vendor the Pep562 library instead of requiring it as a dependency.
- NEW: Input parameters accept *args and **kwargs instead of specifying every parameter in order to allow Backrefs to work even when the Re or Regex API changes. The change was made to support the new Regex timeout parameter.
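The pass-through design can be sketched as a thin wrapper that forwards whatever it receives to the underlying engine. This is an illustration over stdlib re, not Backrefs' code; preprocess is a hypothetical stand-in for the back reference translation step.

```python
import re


def preprocess(pattern):
    """Hypothetical placeholder for pattern translation; returns input as-is."""
    return pattern


def search(pattern, string, *args, **kwargs):
    """Forward all extra arguments untouched to the underlying engine."""
    return re.search(preprocess(pattern), string, *args, **kwargs)


# Flags may be given by keyword or by position; the wrapper does not care.
m = search(r"abc", "xABCx", flags=re.IGNORECASE)
print(m.group(0))
```

Since the wrapper names no optional parameters itself, a new keyword added by the engine would be forwarded without any change to the wrapper.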
4.0.2
- FIX: Fix compatibility issues with latest Regex versions.
4.0.1
- FIX: Ensure that the property files are read in with UTF-8 encoding when generating the Unicode property tables.
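The general technique is to pass the encoding explicitly rather than rely on the platform default. A minimal sketch; the function name read_property_file and the sample file content are made up for illustration and are not taken from the Backrefs build scripts.

```python
import os
import tempfile


def read_property_file(path):
    """Read a text file as UTF-8 regardless of the platform default."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read().splitlines()


with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")
    # Made-up sample line containing a non-ASCII character.
    with open(sample, "w", encoding="utf-8") as f:
        f.write("00E9 ; LATIN SMALL LETTER E WITH ACUTE # \u00e9\n")
    lines = read_property_file(sample)
```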
4.0
- NEW: Drop support for new features in Python 2. Python 2 support is limited to the 3.X.X series and will only receive bug fixes up to 2020. All new features moving forward will be on the 4.X.X series and will be for Python 3+ only.
3.6
- NEW: Make version available via the new, and more standard, __version__ attribute and add the __version_info__ attribute as well. Deprecate the old version and version_info attribute for future removal.
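One way to keep old attribute names working while warning about them is a module-level __getattr__ (PEP 562, Python 3.7+). The sketch below builds a throwaway module object to show the idea; the module name demo and its version value are hypothetical, and this is not necessarily how Backrefs implemented its deprecation.

```python
import types
import warnings

# Throwaway module standing in for a real package (hypothetical values).
demo = types.ModuleType("demo")
demo.__version_info__ = (3, 6, 0)
demo.__version__ = ".".join(str(part) for part in demo.__version_info__)

_DEPRECATED = {"version": "__version__", "version_info": "__version_info__"}


def _module_getattr(name):
    """Called only when normal attribute lookup on the module fails."""
    if name in _DEPRECATED:
        replacement = _DEPRECATED[name]
        warnings.warn(
            f"'{name}' is deprecated, use '{replacement}' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(demo, replacement)
    raise AttributeError(f"module 'demo' has no attribute '{name}'")


demo.__getattr__ = _module_getattr
```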
3.5.2
- FIX: Include zip for Unicode 11 (Python 3.7) to make installation more reliable.
3.5.1
- FIX: POSIX character classes should not be part of a range.
- FIX: Replace string casing logic properly follows other implementations like Boost etc. \L, \C, and \E should all terminate \L and \C. \l and \c will be ignored if followed by \C or \L.
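The termination rules above can be modeled with a small state machine. This is a simplified sketch over literal template text only (no group references), and the function name apply_case is hypothetical. It assumes \c and \l affect a single following character and take precedence over an active \C or \L span for that character.

```python
def apply_case(template: str) -> str:
    """Apply \\c, \\l, \\C, \\L and \\E casing to literal template text."""
    out = []
    span = None    # active span mode: None, "upper" or "lower"
    single = None  # pending single-character mode
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "\\" and i + 1 < len(template) and template[i + 1] in "cClLE":
            code = template[i + 1]
            i += 2
            if code == "C":
                # A new span ends the previous one and drops a pending \c or \l.
                span, single = "upper", None
            elif code == "L":
                span, single = "lower", None
            elif code == "E":
                span = None
            elif code == "c":
                single = "upper"
            else:
                single = "lower"
            continue
        mode = single or span
        if mode == "upper":
            ch = ch.upper()
        elif mode == "lower":
            ch = ch.lower()
        out.append(ch)
        single = None
        i += 1
    return "".join(out)


print(apply_case(r"\Cabc\LDEF\Eghi"))
```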
3.5
- NEW: Use a more advanced format string implementation that implements all string features, including those found in format_spec.
- FIX: Relax validation so as not to exclude valid named Unicode values.
- FIX: Caching issues where byte string patterns were confused with Unicode patterns.
- FIX: More protection against using conflicting string type combinations with search and replace.
3.4
- NEW: Add support for generic line breaks (\R) to Re.
- NEW: Add support for an overly simplified form of grapheme clusters (\X) to Re. Roughly equivalent to (?>\PM\pM*).
- NEW: Add support for Vertical_Orientation property for Unicode 10.0.0 on Python 3.7.
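To show what a generic line break covers, the sketch below spells one out by hand for stdlib re. The expansion used here (CRLF as a unit, otherwise any single vertical whitespace character) is the conventional definition of \R and is an assumption about what Backrefs emits, not a copy of its output. The names GENERIC_NEWLINE and split_lines are hypothetical.

```python
import re

# CRLF is tried first so it counts as one break, then any single
# line-breaking character (LF, VT, FF, CR, NEL, LS, PS).
GENERIC_NEWLINE = r"(?:\r\n|[\n\v\f\r\x85\u2028\u2029])"


def split_lines(text: str):
    """Split text on any generic line break."""
    return re.split(GENERIC_NEWLINE, text)


print(split_lines("a\r\nb\nc\u2028d\re"))
```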
3.3
- NEW: Add support for Indic_Positional_Category/Indic_Matra_Category and Indic_Syllabic_Category properties.
3.2.1
- FIX: Bidi_Paired_Bracket_Type property's None value should be equivalent to all characters that are not open or close characters.
3.2
- NEW: Add support for Script_Extensions Unicode properties (Python 3 only as Python 2, Unicode 5.2.0 does not define these). Can be accessed via \p{scripts_extensions: kana} or \p{scx: kana}.
- NEW: When defining scripts with just their name \p{Kana}, use Script_Extensions instead of Scripts. To get Scripts results, you must specify \p{scripts: kana} or \p{sc: scripts}.
- NEW: Add Bidi_Paired_Bracket_Type Unicode property (Python 3.4+ only).
- NEW: Add support for IsBinary for binary properties: \p{IsAlphabetic} == \p{Alphabetic: Y}.
- FIX: Tweaks/improvements to string iteration.
3.1.2
- FIX: Properly escape any problematic characters in Unicode tables.
3.1.1
- FIX: bregex.compile now supports additional keyword arguments for named lists like bregex.compile_search does.
3.1
- NEW: Start and end word boundary back references are now specified with \m and \M like Regex does. \< and \> have been removed from Regex.
- FIX: Escaped \< and \> are no longer processed as Re is known to escape these in versions less than Python 3.7.
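For readers without Backrefs at hand, start and end word boundaries can be approximated in stdlib re with a boundary plus a lookaround. The two constants below are hand-written approximations offered as an assumption about the intended semantics, not the exact text Backrefs generates; the names START_OF_WORD and END_OF_WORD are hypothetical.

```python
import re

START_OF_WORD = r"\b(?=\w)"  # a boundary with a word character after it
END_OF_WORD = r"\b(?<=\w)"   # a boundary with a word character before it

marked_starts = re.sub(START_OF_WORD, "<", "hi there")
marked_ends = re.sub(END_OF_WORD, ">", "hi there")
print(marked_starts, marked_ends)
```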
3.0.5
- FIX: Process non raw string equivalent escaped Unicode on Python 2.7.
- FIX: Compiled objects should return the pattern string, not the pattern object via the property pattern.
3.0.4
- FIX: Formally enable Python 3.7 support.
- FIX: Tweak to Unicode wide character handling.
3.0.3
- FIX: Compiled search and replace objects should be hashable.
- FIX: Handle cases where a new compiled pattern object is passed back through compile functions.
3.0.2
- FIX: Bregex purge was calling Re's purge instead of Regex's purge.
3.0.1
- FIX: Do not accidentally treat \. as a group in replace strings (don't use isdigit string method).
- FIX: Group names can start with _ in replace strings.
- FIX: Do not rely on Re for parsing string.
- FIX: Match behavior in \g<group> parsing better.
- FIX: Raise some exceptions in a few places we weren't.
3.0
- NEW: Added new compile function that returns a pattern object that feels like Re's and Regex's pattern object.
- NEW: Add some caching of search and replace patterns.
- NEW: Completely refactored algorithm for search and replace pattern augmentation.
- NEW: Add support for \e for escape character \x1b in both Re and Regex.
- NEW: Add support for \R for generic newlines in the Regex module (Regex only).
- NEW: Support Unicode property form \pP and \PP.
- NEW: Add support for properly handling per group, scoped verbose flags in the preprocess step (Regex).
- NEW: Handle (?#comments) properly in the preprocess step.
- NEW: Add support for \N in byte strings (characters out of range won't be included).
- NEW: Add support for \p and \P in byte strings (characters out of range won't be included).
- NEW: Add support for \< and \> word boundary escapes.
- FIX: Missing block properties on narrow systems when the property starts beyond the narrow limit.
- FIX: Fix issue where an invalid general category could sometimes pass and return no characters.
- FIX: Fix \Q...\E behavior so it is applied first as a separate step. No longer avoids \Q...\E in things like character groups or comments.
- FIX: Flag related parsing issues in Regex and Re Python 3.6+.
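Applying \Q...\E "first as a separate step" can be pictured as a pre-pass that escapes every quoted span before any other parsing happens, so the span is honored even inside a character group. The sketch below is an illustration over stdlib re.escape, not the Backrefs implementation; the function name expand_quotes is hypothetical, and an unterminated \Q is assumed to quote through the end of the pattern.

```python
import re


def expand_quotes(pattern: str) -> str:
    """Replace each \\Q...\\E span with its escaped literal text."""
    out = []
    i, n = 0, len(pattern)
    while i < n:
        if pattern.startswith("\\Q", i):
            end = pattern.find("\\E", i + 2)
            if end == -1:
                end = n  # assumption: unterminated \Q quotes to the end
            out.append(re.escape(pattern[i + 2:end]))
            i = end + 2
        elif pattern[i] == "\\" and i + 1 < n:
            # Copy other escapes as a pair so an escaped backslash
            # followed by Q is not mistaken for \Q.
            out.append(pattern[i:i + 2])
            i += 2
        else:
            out.append(pattern[i])
            i += 1
    return "".join(out)


print(expand_quotes(r"\Q1+1\E=\d"))
```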
2.2
- NEW: Proper support for \N{Unicode Name}.
- FIX: Incomplete escapes will not be passed through, but will instead throw an error. For instance \p should only be passed through if it is complete \p{category}. Python 3.7 will error on this if we pass it through, and Python 3.6 will generate warnings. We should just consistently fail on it for all Python versions.
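Failing consistently on an incomplete escape amounts to validating the pattern before handing it to the engine. A minimal sketch of such a check for the braced \p{...} and \P{...} forms only; the function name check_property_escapes and the choice of ValueError are hypothetical and do not reflect the exception Backrefs actually raises.

```python
import re

_BRACED = re.compile(r"\{[^{}]+\}")


def check_property_escapes(pattern: str) -> None:
    """Raise ValueError if \\p or \\P is not followed by {name}."""
    i, n = 0, len(pattern)
    while i < n:
        if pattern[i] != "\\":
            i += 1
        elif i + 1 < n and pattern[i + 1] in "pP":
            m = _BRACED.match(pattern, i + 2)
            if m is None:
                raise ValueError(f"incomplete property escape at position {i}")
            i = m.end()
        else:
            i += 2  # skip any other escape, including an escaped backslash


check_property_escapes(r"\p{Lu}+\\p")  # complete escape: passes silently
```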
2.1
- NEW: Handle Unicode and byte notation in Re replace templates.
- NEW: Rework algorithm to handle replace casing back references in Python 3.7 development builds in preparation for Python 3.7 release.
- NEW: Add support for case back references when using the Regex module's subf and subfn.
- NEW: Add new convenience method expandf to Regex that can take a format string and apply format style replaces.
- NEW: Add FORMAT flag to compile_replace to apply format style replaces when applicable.
- NEW: Add the same support that Regex has in relation to format style replacements to Re.
- NEW: Compiled replacements are now immutable.
- NEW: Various logic checking proper types and values.
- NEW: Compiled replacements are now immutable.
- NEW: Various logic checking proper types and values.
- FIX: Fix octal/group logic in Regex and Re.
- FIX: Fix issue dealing with trailing backslashes in replace templates.
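The idea of a format style replace can be sketched with str.format over a stdlib match object. This is a simplified stand-in and not the Backrefs or Regex expandf: it treats unmatched groups as empty strings and does not support indexing into captures, where an index such as {1[0]} would simply index the string.

```python
import re


def expandf(match, template):
    """Expand a format-style template: {0} is the whole match, {1}.. are groups."""

    def clean(value):
        return "" if value is None else value

    positional = [match.group(0)] + [clean(g) for g in match.groups()]
    named = {k: clean(v) for k, v in match.groupdict().items()}
    return template.format(*positional, **named)


m = re.match(r"(\w+) (?P<last>\w+)", "John Smith")
print(expandf(m, "{last}, {1}"))
```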
2.0
- NEW: First attempt at bringing Python 3.7 support, fixing back reference logic, and adding new back reference. Released and then removed due to very poor behavior.
1.0.2
- FIX: Issues related to downloading Unicode data and Unicode table generation. Include Unicode data in release.
1.0.1
- FIX: Fixes for Python 3.6.
1.0
- NEW: Initial release.
Last update: September 2, 2023