Changelog
2.8
- NEW: Drop support for Python 3.8.
- NEW: Add support for Python 3.14.
- NEW: Deploy with PyPI's "Trusted Publisher".
2.7
- NEW: Add :open pseudo selector.
- NEW: Add :muted pseudo selector.
- NEW: Recognize the following pseudo selectors: :autofill, :buffering, :fullscreen, :picture-in-picture, :popover-open, :seeking, :stalled, and :volume-locked. These selectors, while recognized, will not match any element as they require a live environment to check element states and browser states. This just prevents Soup Sieve from failing when any of these selectors are specified.
- NEW: A number of existing pseudo-classes are no longer noted as experimental.
- FIX: Typing fixes.
2.6
- NEW: Add official support for Python 3.13.
- NEW: Add support for & as scoping root per the CSS Nesting Module, Level 1. When & is used outside the context of nesting, it is treated as the scoping root (equivalent to :scope).
- FIX: Improve error message when an unrecognized pseudo-class is used.
2.5
- NEW: Update to support Python 3.12.
- NEW: Drop support for Python 3.7.
2.4.1
- FIX: Attribute syntax for the case insensitive flag optionally allows a space; it does not require one.
2.4
- NEW: Update to support changes related to :lang() in the official CSS spec. :lang("") should match unspecified languages, e.g. lang="", but not lang=und.
- NEW: Only :is() and :where() should allow forgiving selector lists according to the latest CSS specification (as far as Soup Sieve supports "forgiving", which is limited to empty selectors).
- NEW: Formally drop Python 3.6.
- NEW: Formally declare support for Python 3.11.
2.3.2.post1
- FIX: Documentation for installation from source is outdated.
2.3.2
- FIX: Fix some typos in error messages.
2.3.1
- FIX: Ensure attribute selectors match tags that have new line characters in attributes. (#233)
2.3
- NEW: Officially support Python 3.10.
- NEW: Add static typing.
- NEW: :has(), :is(), and :where() now use a forgiving selector list. While not as forgiving as CSS might be, it will forgive such things as empty sets and empty slots due to multiple consecutive commas, leading commas, or trailing commas. Essentially, these pseudo-classes will match all non-empty selectors and ignore empty ones. As the scraping environment is different than a browser environment, it was chosen not to aggressively forgive bad syntax and invalid features to ensure the user is alerted that their program may not perform as expected.
- NEW: Add support to output a pretty print format of a compiled SelectorList for debug purposes.
- FIX: Some small corner cases discovered with static typing.
2.2.1
- FIX: Fix an issue with namespaces when one of the keys is self.
2.2
- NEW: :link and :any-link no longer include <link> due to a change in the level 4 selector specification. This actually yields more sane results.
- FIX: BeautifulSoup, when using find, is quite forgiving of odd types that a user may place in an element's attribute value. Soup Sieve will also now be more forgiving and attempt to match these unexpected values in a sane manner by normalizing them before comparison. (#212)
2.1
- NEW: Officially support Python 3.9.
- NEW: Drop official support for Python 3.5.
- NEW: In order to avoid conflicts with future CSS specification changes, non-standard pseudo classes will now start with the :-soup- prefix. As a consequence, :contains() will now be known as :-soup-contains(), though for a time the deprecated form of :contains() will still be allowed with a warning that users should migrate over to :-soup-contains().
- NEW: Added new non-standard pseudo class :-soup-contains-own() which operates similar to :-soup-contains() except that it only looks at text nodes directly associated with the currently scoped element and not its descendants.
- FIX: Import bs4 globally instead of in local functions as it appears there are no adverse effects due to circular imports as bs4 does not immediately reference soupsieve functions and soupsieve does not immediately reference bs4 functions. This should give a performance boost to functions that had previously included bs4 locally.
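The difference between :-soup-contains() and :-soup-contains-own() can be sketched like this. It assumes Beautiful Soup with a Soup Sieve release of 2.1 or later; the markup and variable names are invented for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the word "needle" lives only in the nested <span>.
html = "<div>outer <span>needle</span></div>"
soup = BeautifulSoup(html, "html.parser")

# :-soup-contains() considers the text of descendants as well.
any_depth = [t.name for t in soup.select(':-soup-contains("needle")')]

# :-soup-contains-own() only considers text nodes directly under the element.
own_only = [t.name for t in soup.select(':-soup-contains-own("needle")')]

print(any_depth, own_only)
```

The div matches the first selector only because its descendant span carries the text.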
2.0.1
- FIX: Remove unused code.
2.0
- NEW: SelectorSyntaxError is derived from Exception not SyntaxError.
- NEW: Remove deprecated comments and icomments from the API.
- NEW: Drop support for EOL Python versions (Python 2 and Python < 3.5).
- FIX: Corner case with splitting namespace and tag name that have an escaped |.
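A short sketch of what the SelectorSyntaxError change means for calling code, assuming Soup Sieve 2.0 or later. The invalid selector is just an example of bad syntax (two combinators in a row).

```python
import soupsieve as sv

try:
    # Two combinators in a row is invalid CSS, so compilation fails.
    sv.compile("div > > p")
except sv.SelectorSyntaxError as exc:
    caught = exc
else:
    caught = None

# The error is an ordinary Exception and is no longer a SyntaxError.
is_plain_exception = isinstance(caught, Exception) and not isinstance(caught, SyntaxError)
print(type(caught).__name__, is_plain_exception)
```

Code that previously relied on `except SyntaxError` would need to catch SelectorSyntaxError instead.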
1.9.6
Last version for Python 2.7
- FIX: Prune dead code.
- FIX: Corner case with splitting namespace and tag name that have an escaped |.
1.9.5
- FIX: :placeholder-shown should not match if the element has content that overrides the placeholder.
1.9.4
- FIX: :checked rule was too strict with option elements. The specification for :checked does not require an option element to be under a select element.
- FIX: Fix level 4 :lang() wildcard match handling with singletons. Implicit wildcard matching should not match any singleton. Explicit wildcard matching (* in the language range: *-US) is allowed to match singletons.
1.9.3
- FIX: [attr!=value] pattern was mistakenly using :not([attr|=value]) logic instead of :not([attr=value]).
- FIX: Remove undocumented _QUIRKS mode flag. Beautiful Soup was meant to use it to help with transition to Soup Sieve, but never released with it. Help with transition at this point is no longer needed.
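The equivalence between [attr!=value] and :not([attr=value]) can be checked with a sketch like the one below, assuming Beautiful Soup and Soup Sieve are installed. The data-kind attribute and the ids are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one matching value, one different value, one missing attribute.
html = '<p id="a" data-kind="x"></p><p id="b" data-kind="y"></p><p id="c"></p>'
soup = BeautifulSoup(html, "html.parser")

# The non-standard `!=` form ...
not_equal = [p["id"] for p in soup.select("p[data-kind!=x]")]

# ... should select the same elements as the standard negation.
negated = [p["id"] for p in soup.select("p:not([data-kind=x])")]

print(not_equal, negated)
```

Note that an element without the attribute at all is also selected, exactly as with the :not() form.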
1.9.2
- FIX: Shortcut last descendant calculation if possible for performance.
- FIX: Fix issue where Doctype strings can be mistaken for a normal text node in some cases.
- FIX: A top level tag is not a :root tag if it has sibling text nodes or tag nodes. This is an issue that mostly manifests when using html.parser as the parser will allow multiple root nodes.
1.9.1
- FIX: :root, :contains(), :default, :indeterminate, :lang(), and :dir() will properly account for HTML iframe elements in their logic when selecting or matching an element. Their logic will be restricted to the document for which the element under consideration applies.
- FIX: HTML pseudo-classes will check that all key elements checked are in the XHTML namespace (HTML parsers that do not provide namespaces will assume the XHTML namespace).
- FIX: Ensure that all pseudo-class names are case insensitive and allow CSS escapes.
1.9
- NEW: Allow :contains() to accept a list of text to search for. (#115)
- NEW: Add new escape function for escaping CSS identifiers. (#125)
- NEW: Deprecate comments and icomments functions in the API to ensure Soup Sieve focuses only on CSS selectors. comments and icomments will most likely be removed in 2.0. (#130)
- NEW: Add Python 3.8 support. (#133)
- FIX: Don't install test files when installing the soupsieve package. (#111)
- FIX: Improve efficiency of :contains() comparison.
- FIX: Null characters should translate to the Unicode REPLACEMENT CHARACTER (U+FFFD) according to the specification. This applies to CSS escaped NULL characters as well. (#124)
- FIX: Escaped EOF should translate to U+FFFD outside of CSS strings. In a string, they should just be ignored, but as there is no case where we could resolve such a string and still have a valid selector, string handling remains the same. (#128)
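A minimal sketch of the escape function and the NULL handling described above, assuming Beautiful Soup and Soup Sieve 1.9 or later. The id value is a made-up example that is not a valid CSS identifier as written.

```python
from bs4 import BeautifulSoup
import soupsieve as sv

# Hypothetical id that starts with a digit and contains a dot.
raw_id = "3rd.item"
html = '<p id="3rd.item">target</p>'
soup = BeautifulSoup(html, "html.parser")

# escape() turns the raw value into a valid CSS identifier.
escaped = sv.escape(raw_id)
match = soup.select_one("#" + escaped)

# A NULL character is translated to U+FFFD (REPLACEMENT CHARACTER).
null_escaped = sv.escape("\x00")

print(repr(escaped), match.get_text(), repr(null_escaped))
```

Escaping the raw value before building the selector avoids hand-writing sequences such as the code point escape for a leading digit.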
1.8
- NEW: Add custom selector support. (#92, #108)
- FIX: Small tweak to CSS identifier pattern to ensure it matches the CSS specification exactly. Specifically, you can't have an identifier of only -. (#107)
- FIX: CSS string patterns should allow escaping newlines to span strings across multiple lines. (#107)
- FIX: Newline regular expression for CSS newlines should treat \r\n as a single character, especially in cases such as string escapes: \\\r\n. (#107)
- FIX: Allow -- as a valid identifier or identifier start. (#107)
- FIX: Bad CSS syntax now raises a SelectorSyntaxError, which is still currently derived from SyntaxError, but will most likely be derived from Exception in the future.
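Custom selector support can be sketched as below, assuming Beautiful Soup and Soup Sieve 1.8 or later. The custom name :--heading, the markup, and the variable names are hypothetical; custom names are expected to start with ":--".

```python
from bs4 import BeautifulSoup
import soupsieve as sv

# Hypothetical markup for illustration.
html = "<h1>Title</h1><p>text</p><h2>Sub</h2>"
soup = BeautifulSoup(html, "html.parser")

# Map a custom pseudo-class name to the selector it stands for.
custom = {":--heading": "h1, h2, h3"}

headings = [t.name for t in sv.select(":--heading", soup, custom=custom)]
print(headings)
```

Defining the alias once keeps long selector lists out of every call site.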
1.7.3
- FIX: Fix regression with tag names in regards to case sensitivity, and ensure there are tests to prevent breakage in the future.
- FIX: XHTML should always be case sensitive like XML.
1.7.2
- FIX: Fix HTML detection type selector.
- FIX: Fixes for :enabled and :disabled.
- FIX: Provide a way for Beautiful Soup to parse selectors in a quirks mode to mimic some of the quirks of the old select method prior to Soup Sieve, but with warnings. This is to help old scripts to not break during the transitional period with newest Beautiful Soup. In the future, these quirks will raise an exception as Soup Sieve requires selectors to follow the CSS specification.
1.7.1
- FIX: Fix issue with :has() selector where a leading combinator can only be provided in the first selector in a relative selector list.
1.7
- NEW: Add support for :in-range and :out-of-range selectors. (#60)
- NEW: Add support for :defined selector. (#76)
- FIX: Fix pickling issue when compiled selector contains a NullSelector object. (#70)
- FIX: Better exception messages in the CSS selector parser and fix a position reporting issue that can occur in some exceptions. (#72, #73)
- FIX: Don't compare prefixes when evaluating attribute namespaces, compare the actual namespace. (#75)
- FIX: Split whitespace attribute lists by all whitespace characters, not just space.
- FIX: :nth-* patterns were converting numbers to base 16 when they should have been converting to base 10.
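The :in-range and :out-of-range selectors added in this release can be sketched like this, assuming Beautiful Soup and Soup Sieve 1.7 or later. The input elements and their ids are invented for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical form controls: two with range limits, one without any.
html = (
    '<input id="ok" type="number" min="1" max="10" value="5">'
    '<input id="high" type="number" min="1" max="10" value="50">'
    '<input id="free" type="number" value="5">'
)
soup = BeautifulSoup(html, "html.parser")

in_range = [i["id"] for i in soup.select(":in-range")]
out_of_range = [i["id"] for i in soup.select(":out-of-range")]

print(in_range, out_of_range)
```

An input without min or max has no range to evaluate, so it appears in neither result.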
1.6.2
- FIX: Fix pattern compile issues on Python < 2.7.4.
- FIX: Don't use \d in Unicode Re patterns as they will contain characters outside the range of [0-9].
1.6.1
- FIX: Fix warning about not importing Mapping from collections.abc.
1.6
- NEW: Add closest method to the API that matches closest ancestor.
- FIX: Add missing select_one reference to module's __all__.
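A brief sketch of the closest method, assuming Beautiful Soup and Soup Sieve 1.6 or later. The nested markup is hypothetical.

```python
from bs4 import BeautifulSoup
import soupsieve as sv

# Hypothetical nesting for illustration.
html = '<div id="outer"><div id="inner"><p><b>x</b></p></div></div>'
soup = BeautifulSoup(html, "html.parser")
b = soup.b

# Walk up from <b> until a matching element is found.
nearest_div = sv.closest("div", b)

# The starting element itself is considered first.
self_match = sv.closest("b", b)

# Nothing matches, so None is returned.
no_match = sv.closest("table", b)

print(nearest_div["id"], self_match is b, no_match)
```

This mirrors the behavior of Element.closest in browsers, where the element itself is a candidate.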
1.5
- NEW: Add select_one method like Beautiful Soup has.
- NEW: Add :dir() selector (HTML only).
- FIX: Fix issues when handling HTML fragments (elements without a BeautifulSoup object as a parent).
- FIX: Fix internal nth range check.
1.4.0
- NEW: Throw NotImplementedError for at-rules: @page, etc.
- NEW: Match nothing for :host, :host(), and :host-context().
- NEW: Add support for :read-write and :read-only.
- NEW: Selector patterns can be annotated with CSS comments.
- FIX: \r, \n, and \f cannot be escaped with \ in CSS. You must use Unicode escapes.
1.3.1
- FIX: Fix issue with undefined namespaces.
1.3
- NEW: Add support for :scope.
- NEW: :user-invalid, :playing, :paused, and :local-link will not cause a failure, but all will match nothing as their use cases are not possible in an environment outside a web browser.
- FIX: Fix [attr~=value] handling of whitespace. According to the spec, if the value contains whitespace, or is an empty string, it should not match anything.
- FIX: Precompile internal patterns for pseudo-classes to prevent having to parse them again.
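The [attr~=value] whitespace rule noted above can be illustrated with a small sketch, assuming Beautiful Soup and Soup Sieve are installed. The data-tags attribute is a hypothetical whitespace-separated list.

```python
from bs4 import BeautifulSoup

# Hypothetical element with a whitespace-separated attribute value.
html = '<p id="x" data-tags="red blue"></p>'
soup = BeautifulSoup(html, "html.parser")

# A single word matches one entry of the list.
single_word = soup.select("[data-tags~=red]")

# A value containing whitespace can never match.
with_space = soup.select('[data-tags~="red blue"]')

# An empty value can never match either.
empty_value = soup.select('[data-tags~=""]')

print(len(single_word), len(with_space), len(empty_value))
```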
1.2.1
- FIX: More descriptive exceptions. Exceptions will also now mention position in the pattern that is problematic.
- FIX: filter ignores NavigableString objects in normal iterables and Tag iterables. Basically, it filters all Beautiful Soup document parts regardless of iterable type, whereas it used to only filter out a NavigableString in a Tag object. This is viewed as fixing an inconsistency.
- FIX: DEBUG flag has been added to help with debugging CSS selector parsing. This is mainly for development.
- FIX: If forced to search for language in a meta tag, and no language is found, cache that there is no language in the meta tag to prevent searching again during the current select.
- FIX: If a non BeautifulSoup/Tag object is given to the API to compare against, raise a TypeError.
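The filter behavior described above can be sketched as follows, assuming Beautiful Soup and a current Soup Sieve release. The markup is hypothetical and mixes a text node with tags.

```python
from bs4 import BeautifulSoup
import soupsieve as sv

# Hypothetical markup: a text node followed by several tags.
html = "<div>text<p>a</p><span>b</span><p>c</p></div>"
soup = BeautifulSoup(html, "html.parser")
div = soup.div

# Passing a Tag filters its direct children.
from_tag = [t.get_text() for t in sv.filter("p", div)]

# Passing a plain list gives the same result; the text node is skipped.
from_list = [t.get_text() for t in sv.filter("p", list(div.children))]

print(from_tag, from_list)
```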
1.2
- NEW: Add Python 2.7 support.
- NEW: Remove old pre 1.0 deprecations.
1.1
- NEW: Adds support for [attr!=value] which is equivalent to :not([attr=value]).
- NEW: Add support for :active, :focus, :hover, :visited, :target, :focus-within, :focus-visible, :target-within, :current()/:current, :past, and :future, but they will never match as these states don't exist in the Soup Sieve environment.
- NEW: Add support for :checked, :enabled, :disabled, :required, :optional, :default, and :placeholder-shown which will only match in HTML documents as these concepts are not defined in XML.
- NEW: Add support for :link and :any-link, both of which will target all <a>, <area>, and <link> elements with an href attribute as all links will be treated as unvisited in Soup Sieve.
- NEW: Add support for :lang() (CSS4) which works in XML and HTML.
- NEW: Users must install Beautiful Soup themselves. This requirement is removed in the hopes that Beautiful Soup may use this in the future.
- FIX: Attributes in the form prefix:attr can be matched with the form [prefix\:attr] without specifying a namespace if desired.
- FIX: Fix exception when [type] is used (with no value).
1.0.2
- FIX: Use proper CSS identifier patterns for tag names, classes, ids, etc. Things like #3 or #-3 should not match and should require #\33 or #-\33.
- FIX: Do not raise NotImplementedError for supported pseudo classes/elements with bad syntax, instead raise SyntaxError.
1.0.1
- FIX: When giving a tag to select, it should only return the children of that tag, never the tag itself.
- FIX: For informational purposes, raise a NotImplementedError when an unsupported pseudo class is used.
1.0
- NEW: Official 1.0.0 release.
1.0.0b2
- NEW: Drop document flags. Document type can be detected from the Beautiful Soup object directly.
- FIX: CSS selectors should be evaluated with CSS whitespace rules.
- FIX: Processing instructions, CDATA, and declarations should all be ignored in :contains and child considerations for :empty.
- FIX: In Beautiful Soup, the document itself is the first tag. Do not match the "document" tag by returning false for any tag that doesn't have a parent.
1.0.0b1
- NEW: Add support for non-standard :contains() selector.
- FIX: Compare pseudo class names case insensitively when matching unexpected cases.
- FIX: Don't allow attribute case flags when no attribute value is defined.
0.6
- NEW: mode attribute is now called flags to allow for other options in the future.
- FIX: More corner cases for nth selectors.
0.5.3
- FIX: Previously, all pseudo classes' selector lists were evaluated as one big group, but now each pseudo class's selector lists are evaluated separately.
- FIX: CSS selector tokens are not case sensitive.
0.5.2
- FIX: Add missing s flag to attribute selector for forced case sensitivity of attribute values.
- FIX: Relax attribute pattern matching to allow non-essential whitespace.
- FIX: Attribute selector flags themselves are not case sensitive.
- FIX: type attribute in HTML is handled specially. While all other attribute values are case sensitive, type in HTML is usually treated specially and is case insensitive. In XML, this is not the case.
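The case handling of attribute values described in this release can be sketched as below, assuming Beautiful Soup and a current Soup Sieve release with an HTML document. The data-kind attribute is hypothetical and stands in for any attribute other than type.

```python
from bs4 import BeautifulSoup

# Hypothetical element: both attributes hold the lowercase value "text".
html = '<input id="t" type="text" data-kind="text">'
soup = BeautifulSoup(html, "html.parser")

# In HTML, the value of `type` is compared case insensitively.
type_upper = soup.select("input[type=TEXT]")

# Other attribute values are case sensitive by default ...
other_upper = soup.select("input[data-kind=TEXT]")

# ... unless the `i` flag is given.
other_flag = soup.select("input[data-kind=TEXT i]")

print(len(type_upper), len(other_upper), len(other_flag))
```

The s flag does the opposite of i and forces a case sensitive comparison.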
0.5.1
- FIX: Fix namespace check for :nth-of-type.
0.5
- NEW: Deprecate commentsiter and selectiter in favor of icomments and iselect. Expect removal in version 1.0.
0.4
- NEW: Initial prerelease.