Why does the :contains() pseudo-class in CSS selectors work the way it does?

StackOverflow https://stackoverflow.com/questions/9334539

Question

I am using CSS selectors with Selenium and Cucumber. When a locator doesn't work, I test it in the console of the Chrome Developer Tools. I keep running into a behavior I don't understand: why does it do what it does, and not what I need it to do? Please look at these locators:

  1. div.view_header ~ div input.my_button

  2. div:contains(My Header Title) ~ div input.my_button

  3. div:contains(My Header Title) ~ div div div input.my_button

In my DOM, the first part of each of those locators matches the same element:

<div class="view_header foo">    My Header Title  </div>

The issue is that only locators 1 and 3 above actually match anything. Does anybody know why? I realize that div:contains(foo) matches not only the div that directly contains foo but all of its ancestor divs as well, but it seems to me that the rest of the locator should narrow things down so that it still works.

I'm looking for any insight, and possibly suggestions for a way to make sure that the 'my_button' I am clicking is the one under 'My Header Title' and not a 'my_button' somewhere else on the page. The only easy way to distinguish them is by the header they are under. I would also like to drop the seemingly excess DOM structure from the locator so that it is more likely to be reusable.

<head>
<body class="bp">
  <div style="left: -100em; position: absolute; width: 100em;"></div>
  <input class="refresh_marker" type="text" value="no" style="display:none">
  <div class="container">
    <div id="nav_bar">
    <div id="user_bar">
    <div id="wrapper" style="border-radius: 10px 10px 10px 10px;">
      <div class="content">
        <div class="page_title"> Title </div>
        <div></div>
        <a class="change_tracker_link"> &nbsp; </a>
        <div class="breadcrumb_trail">
        <style type="text/css">
        <div id="dialog_no_new_assoc" class="hide" title="No Associations Selected"></div>
        <div class="organizer_widget root_organizer" title="WorkflowItem" style="">
          <input id="data_classifier" type="hidden" value="Workflow::WorkflowItem">
          <input id="data_id" type="hidden" value="34">
          <input id="data_getter" type="hidden">
          <input id="collection_vertex_id" type="hidden" value="4cb1ecc300fa5f77844b1e87431d0a25390c1c77">
          <input id="view-name" type="hidden" value="EnterPaperInformation">
          <div class="object organizer">
            <div class="clear"></div>
            <div class="interior">
              <form method="POST" enctype="multipart/form-data">
                <input type="hidden" value="4cb1ecc300fa5f77844b1e87431d0a25390c1c77" name="vertex_id">
                <input type="submit" value="Save" style="display: none;" name="submit_form">
                <div class="organizer_header view_header"> My Header Title </div>
                <div class="organizer_widget" title="Citation" style="">
                  <input id="data_classifier" type="hidden" value="Bibliography::Citation">
                  <input id="data_id" type="hidden" value="10">
                  <input id="data_getter" type="hidden" value="citation">
                  <input id="collection_vertex_id" type="hidden" value="5376dcc81102a5d76bf829513b096be8f67e560d">
                  <input id="view-name" type="hidden" value="CitationEntrySummary">
                  <div id="citation" class="object organizer">
                    <div class="clear"></div>
                    <div class="interior">
                      <div id="Citation___id_widget" class="widget_row numeric">
                      <div id="Citation___title_widget" class="widget_row string">
                      <div id="Citation___abbreviated_title_widget" class="widget_row string">
                      <div id="Citation___authors_display_string_widget" class="widget_row string">
                      <div id="Citation___language_widget" class="widget_row choice">
                      <div id="Citation___link_widget" class="widget_row link">
                      <input type="hidden" value="Bibliography::JournalArticle___10" name="check_5376dcc81102a5d76bf829513b096be8f67e560d[]">
                      <input id="ba_citation" class="my_button" type="button" value="Break Associations" name="break_assoc_5376dcc81102a5d76bf829513b096be8f67e560d">
                      <div class="clear"></div>
                      <input type="hidden" value="5376dcc81102a5d76bf829513b096be8f67e560d" name="vertices[]">
                    </div>
                  ...

Solution

The usual thing I do when I find myself in this kind of trouble is to look at the spec.

As you probably know, :contains() is not in the current Selectors spec, so you are relying on an undocumented, unspecified feature of a particular browser or parser. It should work, but it doesn't, so evidently the implementation was incomplete. The pseudo-class appeared in an early CSS3 Selectors draft and has since been removed.

Could you use XPath instead, either through Selenium's own methods or through JavaScript? This XPath is equivalent to your CSS selector number 2:

//div[contains(text(),'My Header Title')]/following-sibling::div//input[contains(@class,'my_button')]
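If you go the JavaScript route, the XPath can be evaluated with the browser's standard document.evaluate API. The following is only a sketch: buildButtonXPath and findButton are hypothetical helper names, not part of Selenium or Sizzle, and the builder assumes that neither argument contains a single quote.

```javascript
// Hypothetical helper: assembles the XPath shown above from a header text
// and a button class. Assumes neither argument contains a single quote.
function buildButtonXPath(headerText, buttonClass) {
  return "//div[contains(text(),'" + headerText + "')]" +
         "/following-sibling::div" +
         "//input[contains(@class,'" + buttonClass + "')]";
}

// Hypothetical helper: evaluates the XPath against a document (browser only)
// and returns the first matching element, or null if nothing matches.
function findButton(doc, headerText, buttonClass) {
  const result = doc.evaluate(
    buildButtonXPath(headerText, buttonClass),
    doc,
    null,
    XPathResult.FIRST_ORDERED_NODE_TYPE,
    null
  );
  return result.singleNodeValue;
}
```

In a browser console, findButton(document, 'My Header Title', 'my_button') would return the button element if the page has the structure from the question.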

EDIT

After your comment showed me that we're talking about Selenium RC and, therefore, Sizzle, I dug deeper.

I took your example HTML, stripped it of the hidden and (seemingly) unneeded elements, and was left with this:

<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="utf-8" />
    <script src="sizzle.js" type="text/javascript"></script>
</head>

<body class="bp">
  <div class="container">
    <div id="nav_bar">
    <div id="user_bar">
    <div id="wrapper" style="border-radius: 10px 10px 10px 10px;">
      <div class="content">
        <div class="breadcrumb_trail">
        <div class="organizer_widget root_organizer" title="WorkflowItem" style="">
          <div class="object organizer">
            <div class="interior">
              <form method="POST" enctype="multipart/form-data">
                <div class="organizer_header view_header">    My Header Title  </div>
                <div class="organizer_widget" title="Citation" style="">
                  <div id="citation" class="object organizer">
                    <div class="clear"></div>
                    <div class="interior">
                      <div id="Citation___id_widget" class="widget_row numeric">
                      <div id="Citation___title_widget" class="widget_row string">
                      <div id="Citation___abbreviated_title_widget" class="widget_row string">
                      <div id="Citation___authors_display_string_widget" class="widget_row string">
                      <div id="Citation___language_widget" class="widget_row choice">
                      <div id="Citation___link_widget" class="widget_row link">
                      <input id="ba_citation" class="my_button" type="button" value="Break Associations" name="break_assoc_5376dcc81102a5d76bf829513b096be8f67e560d" />
                      <div class="clear"></div>
                      </div></div></div></div></div></div>
                    </div>
                  </div>
                </div>
              </form>
            </div>
          </div>
        </div>
        </div>
      </div>
    </div>
    </div>
    </div>
  </div>
</body>

</html>

I downloaded the latest Sizzle and also obtained the version of Sizzle that the current Selenium release actually uses.

It turns out the two are very different.

E.g. the contains implementation of current Sizzle:

return ~( elem.textContent || elem.innerText || getText( elem ) ).indexOf( match[3] );

and the implementation Selenium uses:

return (elem.textContent || elem.innerText || getText([ elem ]) || "").indexOf(match[3]) >= 0;
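As a side note, the two return expressions above are equivalent as substring tests, which suggests (though it does not prove) that the different results come from other parts of the older Sizzle. A minimal sketch, where containsCurrent and containsSelenium are simplified stand-ins written for illustration and not the actual Sizzle functions:

```javascript
// Simplified stand-ins for the two return expressions. `text` plays the role
// of the element's text content and `needle` the role of match[3].
function containsCurrent(text, needle) {
  // ~(-1) is 0 (falsy); ~n is non-zero (truthy) for every index n >= 0.
  return ~text.indexOf(needle);
}

function containsSelenium(text, needle) {
  return (text || "").indexOf(needle) >= 0;
}

const headerText = "    My Header Title  ";
console.log(Boolean(containsCurrent(headerText, "My Header Title"))); // true
console.log(containsSelenium(headerText, "My Header Title"));          // true
console.log(Boolean(containsCurrent(headerText, "Other Title")));      // false
console.log(containsSelenium(headerText, "Other Title"));              // false
```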

I tried both implementations on my test document, with these results:

Current Sizzle - matches all perfectly [image: Current Sizzle Results]

Selenium's Sizzle - matches 1 out of 4 [image: Selenium's Sizzle Results]


The results say it all. Selenium uses an old version of Sizzle whose handling of the :contains() pseudo-class is buggy. The current Sizzle version doesn't suffer from the bug and finds all the elements.
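For reference, here is why :contains() is expected to match ancestors as well, as the question notes: the test runs against an element's full text content, which includes the text of all its descendants. This is a rough sketch on mock nodes (plain objects, not real DOM); node, textContent, and divsContaining are hypothetical names used only for illustration.

```javascript
// Mock nodes (plain objects, not real DOM): a tag, own text, and children.
function node(tag, text, children) {
  return { tag: tag, text: text || "", children: children || [] };
}

// Rough analogue of textContent: own text plus the text of all descendants.
function textContent(n) {
  return n.text + n.children.map(textContent).join("");
}

// Collects, in document order, every div whose text content contains needle.
function divsContaining(root, needle) {
  const hits = [];
  (function walk(n) {
    if (n.tag === "div" && textContent(n).indexOf(needle) >= 0) hits.push(n);
    n.children.forEach(walk);
  })(root);
  return hits;
}

const headerDiv = node("div", " My Header Title ");
const widgetDiv = node("div", "", [node("input")]);
const form = node("form", "", [headerDiv, widgetDiv]);
const interiorDiv = node("div", "", [form]);

const matches = divsContaining(interiorDiv, "My Header Title");
console.log(matches.length);               // 2
console.log(matches[0] === interiorDiv);   // true (the ancestor div)
console.log(matches[1] === headerDiv);     // true (the header div itself)
```

The sibling div without the text is not matched, so only the header div and its ancestors are candidates for the first part of the locator.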

Now, you can do any of these:

  1. File a Selenium bug.
  2. Use XPath as a workaround.
  3. Replace the sizzle.js file in your Selenium package with the current version.

OTHER TIPS

Selenium WebDriver handles only HTML elements, but with a JavaScript executor it is possible to read pseudo-elements such as :after and :before.

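// '.my_class' is a placeholder: replace it with the selector of your own element,
// and use ':before' instead of ':after' as needed.
String script = "return window.getComputedStyle(document.querySelector('.my_class'), ':after').getPropertyValue('content');";
JavascriptExecutor js = (JavascriptExecutor) driver; // driver is your existing WebDriver instance
String content = (String) js.executeScript(script);  // CSS 'content' value of the pseudo-element
System.out.println(content);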
Licensed under: CC-BY-SA with attribution