Changelog
2.6
- NEW: Add official support for Python 3.13.
- NEW: Add support for
&
as scoping root per the CSS Nesting Module, Level 1. When&
is used outside the context of nesting, it is treated as the scoping root (equivalent to:scope
). - FIX: Improve error message when an unrecognized pseudo-class is used.
2.5
- NEW: Update to support Python 3.12.
- NEW: Drop support for Python 3.7.
2.4.1
- FIX: Attribute syntax for case insensitive flag optionally allows a space, it does not require one.
2.4
- NEW: Update to support changes related to
:lang()
in the official CSS spec.:lang("")
should match unspecified languages, e.g.lang=""
, but notlang=und
. - NEW: Only
:is()
and:where()
should allow forgiving selector lists according to latest CSS (as far as Soup Sieve supports "forgiving" which is limited to empty selectors). - NEW: Formally drop Python 3.6.
- NEW: Formally declare support for Python 3.11.
2.3.2.post1
- FIX: Documentation for installation from source is outdated.
2.3.2
- FIX: Fix some typos in error messages.
2.3.1
- FIX: Ensure attribute selectors match tags that have new line characters in attributes. (#233)
2.3
- NEW: Officially support Python 3.10.
- NEW: Add static typing.
- NEW:
:has()
,:is()
, and:where()
now use use a forgiving selector list. While not as forgiving as CSS might be, it will forgive such things as empty sets and empty slots due to multiple consecutive commas, leading commas, or trailing commas. Essentially, these pseudo-classes will match all non-empty selectors and ignore empty ones. As the scraping environment is different than a browser environment, it was chosen not to aggressively forgive bad syntax and invalid features to ensure the user is alerted that their program may not perform as expected. - NEW: Add support to output a pretty print format of a compiled
SelectorList
for debug purposes. - FIX: Some small corner cases discovered with static typing.
2.2.1
- FIX: Fix an issue with namespaces when one of the keys is
self
.
2.2
- NEW:
:link
and:any-link
no longer include<link>
due to a change in the level 4 selector specification. This actually yields more sane results. - FIX: BeautifulSoup, when using
find
, is quite forgiving of odd types that a user may place in an element's attribute value. Soup Sieve will also now be more forgiving and attempt to match these unexpected values in a sane manner by normalizing them before compare. (#212)
2.1
- NEW: Officially support Python 3.9.
- NEW: Drop official support for Python 3.5.
- NEW: In order to avoid conflicts with future CSS specification changes, non-standard pseudo classes will now start with the
:-soup-
prefix. As a consequence,:contains()
will now be known as:-soup-contains()
, though for a time the deprecated form of:contains()
will still be allowed with a warning that users should migrate over to:-soup-contains()
. - NEW: Added new non-standard pseudo class
:-soup-contains-own()
which operates similar to:-soup-contains()
except that it only looks at text nodes directly associated with the currently scoped element and not its descendants. - FIX: Import
bs4
globally instead of in local functions as it appears there are no adverse affects due to circular imports asbs4
does not immediately referencesoupsieve
functions andsoupsieve
does not immediately referencebs4
functions. This should give a performance boost to functions that had previously includedbs4
locally.
2.0.1
- FIX: Remove unused code.
2.0
- NEW:
SelectorSyntaxError
is derived fromException
notSyntaxError
. - NEW: Remove deprecated
comments
andicomments
from the API. - NEW: Drop support for EOL Python versions (Python 2 and Python < 3.5).
- FIX: Corner case with splitting namespace and tag name that that have an escaped
|
.
1.9.6
Last version for Python 2.7
- FIX: Prune dead code.
- FIX: Corner case with splitting namespace and tag name that that have an escaped
|
.
1.9.5
- FIX:
:placeholder-shown
should not match if the element has content that overrides the placeholder.
1.9.4
- FIX:
:checked
rule was too strict withoption
elements. The specification for:checked
does not require anoption
element to be under aselect
element. - FIX: Fix level 4
:lang()
wildcard match handling with singletons. Implicit wildcard matching should not match any singleton. Explicit wildcard matching (*
in the language range:*-US
) is allowed to match singletons.
1.9.3
- FIX:
[attr!=value]
pattern was mistakenly using:not([attr|=value])
logic instead of:not([attr=value])
. - FIX: Remove undocumented
_QUIRKS
mode flag. Beautiful Soup was meant to use it to help with transition to Soup Sieve, but never released with it. Help with transition at this point is no longer needed.
1.9.2
- FIX: Shortcut last descendant calculation if possible for performance.
- FIX: Fix issue where
Doctype
strings can be mistaken for a normal text node in some cases. - FIX: A top level tag is not a
:root
tag if it has sibling text nodes or tag nodes. This is an issue that mostly manifests when usinghtml.parser
as the parser will allow multiple root nodes.
1.9.1
- FIX:
:root
,:contains()
,:default
,:indeterminate
,:lang()
, and:dir()
will properly account for HTMLiframe
elements in their logic when selecting or matching an element. Their logic will be restricted to the document for which the element under consideration applies. - FIX: HTML pseudo-classes will check that all key elements checked are in the XHTML namespace (HTML parsers that do not provide namespaces will assume the XHTML namespace).
- FIX: Ensure that all pseudo-class names are case insensitive and allow CSS escapes.
1.9
- NEW: Allow
:contains()
to accept a list of text to search for. (#115) - NEW: Add new
escape
function for escaping CSS identifiers. (#125) - NEW: Deprecate
comments
andicomments
functions in the API to ensure Soup Sieve focuses only on CSS selectors.comments
andicomments
will most likely be removed in 2.0. (#130) - NEW: Add Python 3.8 support. (#133)
- FIX: Don't install test files when installing the
soupsieve
package. (#111) - FIX: Improve efficiency of
:contains()
comparison. - FIX: Null characters should translate to the Unicode REPLACEMENT CHARACTER (
U+FFFD
) according to the specification. This applies to CSS escaped NULL characters as well. (#124) - FIX: Escaped EOF should translate to
U+FFFD
outside of CSS strings. In a string, they should just be ignored, but as there is no case where we could resolve such a string and still have a valid selector, string handling remains the same. (#128)
1.8
- NEW: Add custom selector support. (#92)(#108)
- FIX: Small tweak to CSS identifier pattern to ensure it matches the CSS specification exactly. Specifically, you can't have an identifier of only
-
. (#107) - FIX: CSS string patterns should allow escaping newlines to span strings across multiple lines. (#107)
- FIX: Newline regular expression for CSS newlines should treat
\r\n
as a single character, especially in cases such as string escapes:\\\r\n
. (#107) - FIX: Allow
--
as a valid identifier or identifier start. (#107) - FIX: Bad CSS syntax now raises a
SelectorSyntaxError
, which is still currently derived fromSyntaxError
, but will most likely be derived fromException
in the future.
1.7.3
- FIX: Fix regression with tag names in regards to case sensitivity, and ensure there are tests to prevent breakage in the future.
- FIX: XHTML should always be case sensitive like XML.
1.7.2
- FIX: Fix HTML detection
type
selector. - FIX: Fixes for
:enabled
and:disabled
. - FIX: Provide a way for Beautiful Soup to parse selectors in a quirks mode to mimic some of the quirks of the old select method prior to Soup Sieve, but with warnings. This is to help old scripts to not break during the transitional period with newest Beautiful Soup. In the future, these quirks will raise an exception as Soup Sieve requires selectors to follow the CSS specification.
1.7.1
- FIX: Fix issue with
:has()
selector where a leading combinator can only be provided in the first selector in a relative selector list.
1.7
- NEW: Add support for
:in-range
and:out-of-range
selectors. (#60) - NEW: Add support for
:defined
selector. (#76) - FIX: Fix pickling issue when compiled selector contains a
NullSelector
object. (#70) - FIX: Better exception messages in the CSS selector parser and fix a position reporting issue that can occur in some exceptions. (#72, #73)
- FIX: Don't compare prefixes when evaluating attribute namespaces, compare the actual namespace. (#75)
- FIX: Split whitespace attribute lists by all whitespace characters, not just space.
- FIX:
:nth-*
patterns were converting numbers to base 16 when they should have been converting to base 10.
1.6.2
- FIX: Fix pattern compile issues on Python < 2.7.4.
- FIX: Don't use
\d
in UnicodeRe
patterns as they will contain characters outside the range of[0-9]
.
1.6.1
- FIX: Fix warning about not importing
Mapping
fromcollections.abc
.
1.6
- NEW: Add
closest
method to the API that matches closest ancestor. - FIX: Add missing
select_one
reference to module's__all__
.
1.5
- NEW: Add
select_one
method like Beautiful Soup has. - NEW: Add
:dir()
selector (HTML only). - FIX: Fix issues when handling HTML fragments (elements without a
BeautifulSoup
object as a parent). - FIX: Fix internal
nth
range check.
1.4.0
- NEW: Throw
NotImplementedError
for at-rules:@page
, etc. - NEW: Match nothing for
:host
,:host()
, and:host-context()
. - NEW: Add support for
:read-write
and:read-only
. - NEW: Selector patterns can be annotated with CSS comments.
- FIX:
\r
,\n
, and\f
cannot be escaped with\
in CSS. You must use Unicode escapes.
1.3.1
- FIX: Fix issue with undefined namespaces.
1.3
- NEW: Add support for
:scope
. - NEW:
:user-invalid
,:playing
,:paused
, and:local-link
will not cause a failure, but all will match nothing as their use cases are not possible in an environment outside a web browser. - FIX: Fix
[attr~=value]
handling of whitespace. According to the spec, if the value contains whitespace, or is an empty string, it should not match anything. - FIX: Precompile internal patterns for pseudo-classes to prevent having to parse them again.
1.2.1
- FIX: More descriptive exceptions. Exceptions will also now mention position in the pattern that is problematic.
- FIX:
filter
ignoresNavigableString
objects in normal iterables andTag
iterables. Basically, it filters all Beautiful Soup document parts regardless of iterable type where as it used to only filter out aNavigableString
in aTag
object. This is viewed as fixing an inconsistency. - FIX:
DEBUG
flag has been added to help with debugging CSS selector parsing. This is mainly for development. - FIX: If forced to search for language in
meta
tag, and no language is found, cache that there is no language in themeta
tag to prevent searching again during the current select. - FIX: If a non
BeautifulSoup
/Tag
object is given to the API to compare against, raise aTypeError
.
1.2
- NEW: Add Python 2.7 support.
- NEW: Remove old pre 1.0 deprecations.
1.1
- NEW: Adds support for
[attr!=value]
which is equivalent to:not([attr=value])
. - NEW: Add support for
:active
,:focus
,:hover
,:visited
,:target
,:focus-within
,:focus-visible
,:target-within
,:current()
/:current
,:past
, and:future
, but they will never match as these states don't exist in the Soup Sieve environment. - NEW: Add support for
:checked
,:enabled
,:disabled
,:required
,:optional
,:default
, and:placeholder-shown
which will only match in HTML documents as these concepts are not defined in XML. - NEW: Add support for
:link
and:any-link
, both of which will target all<a>
,<area>
, and<link>
elements with anhref
attribute as all links will be treated as unvisited in Soup Sieve. - NEW: Add support for
:lang()
(CSS4) which works in XML and HTML. - NEW: Users must install Beautiful Soup themselves. This requirement is removed in the hopes that Beautiful Soup may use this in the future.
- FIX: Attributes in the form
prefix:attr
can be matched with the form[prefix\:attr]
without specifying a namespaces if desired. - FIX: Fix exception when
[type]
is used (with no value).
1.0.2
- FIX: Use proper CSS identifier patterns for tag names, classes, ids, etc. Things like
#3
or#-3
should not match and should require#\33
or#-\33
. - FIX: Do not raise
NotImplementedError
for supported pseudo classes/elements with bad syntax, instead raiseSyntaxError
.
1.0.1
- FIX: When giving a tag to
select
, it should only return the children of that tag, never the tag itself. - FIX: For informational purposes, raise a
NotImplementedError
when an unsupported pseudo class is used.
1.0
- NEW: Official 1.0.0 release.
1.0.0b2
- NEW: Drop document flags. Document type can be detected from the Beautiful Soup object directly.
- FIX: CSS selectors should be evaluated with CSS whitespace rules.
- FIX: Processing instructions, CDATA, and declarations should all be ignored in
:contains
and child considerations for:empty
. - FIX: In Beautiful Soup, the document itself is the first tag. Do not match the "document" tag by returning false for any tag that doesn't have a parent.
1.0.0b1
- NEW: Add support for non-standard
:contains()
selector. - FIX: Compare pseudo class names case insensitively when matching unexpected cases.
- FIX: Don't allow attribute case flags when no attribute value is defined.
0.6
- NEW:
mode
attribute is now calledflags
to allow for other options in the future. - FIX: More corner cases for
nth
selectors.
0.5.3
- FIX: Previously, all pseudo classes' selector lists were evaluated as one big group, but now each pseudo class's selector lists are evaluated separately.
- FIX: CSS selector tokens are not case sensitive.
0.5.2
- FIX: Add missing
s
flag to attribute selector for forced case sensitivity of attribute values. - FIX: Relax attribute pattern matching to allow non-essential whitespace.
- FIX: Attribute selector flags themselves are not case sensitive.
- FIX:
type
attribute in HTML is handled special. While all other attributes values are case sensitive,type
in HTML is usually treated special and is insensitive. In XML, this is not the case.
0.5.1
- FIX: Fix namespace check for
:nth-of-type
.
0.5
- NEW: Deprecate
commentsiter
andselectiter
in favor oficomments
andiselect
. Expect removal in version 1.0.
0.4
- NEW: Initial prerelease.