Identifying elements in the page | Introduction to XPath locators

In the Previous Tutorial, we learned to locate elements on the page by using the following attributes –

  • id
  • class
  • name

We discussed some challenges in that tutorial where we are not able to uniquely identify an element by using its id, class or name attribute. Here comes XPath to our rescue.

In this tutorial, we’ll learn about XPath.

What is an XPath?

Every webpage is a document that consists of different HTML tags like <head>, <body>, <div>, <input>, <img> etc. 

<!DOCTYPE html>
<html>
<body>
<h1>My First Heading</h1>
<p>My first paragraph.</p>
</body>
</html>


As we know, XML documents also consist of elements and attributes.

<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

XPath is a query language for navigating through the elements and attributes of an XML document. Because an HTML page has the same tree structure, we can use XPath to query the page document as if it were an XML document. To locate a particular element, we write an XPath query that uses the element’s tag name and, optionally, one or more of its attributes. The query returns the matching element(s) from the document. Every modern browser has a built-in XPath engine.
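To try this idea outside the browser, here is a minimal sketch that queries the note document above with Python's standard xml.etree.ElementTree module. Using Python here is an assumption made for illustration; the tutorial itself relies on the browser's XPath engine. ElementTree supports only a limited subset of XPath, and its paths are written relative to the parsed root element, hence the leading dot.

```python
import xml.etree.ElementTree as ET

# The sample XML document shown above.
NOTE = """<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""

# fromstring() returns the root element, here <note>.
note = ET.fromstring(NOTE)

# "./to" selects a <to> element that is a direct child of the root.
recipient = note.find("./to")

# ".//heading" selects a <heading> element at any depth below the root.
heading = note.find(".//heading")

print(recipient.tag, recipient.text)
print(heading.tag, heading.text)
```

Each query returns an element object, so we can read its tag name and text just as a browser would hand back the matching node.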

How to write an XPath query?

Let us learn to write an XPath query through an example:

  • Open Chrome browser and navigate to www.google.com
  • Inspect the Google Search Box using Chrome’s Inspect Element. If you are not sure how to do that, you can check This Tutorial.
  • Let us now look closely at the source code for the Google Search input box. Inspect it again.
<table id="gs_id0" class="gstl_0 lst-t" cellspacing="0" cellpadding="0" style="height: 27px; padding: 0px;">
    <tbody>
        <tr>
            <td id="gs_ttc0" style="white-space: nowrap;" dir="ltr"></td>
            <td id="gs_tti0" class="gsib_a">
                <div id="gs_lc0" style="position: relative;">
                    <input id="gbqfq" class="gbqfif" type="text" value="" autocomplete="off" name="q" style="border: medium none; padding: 0px; margin: 0px; height: auto…; width: 100%; background: url('" dir="ltr" spellcheck="false">
                    </input>
                </div>
            </td>
            <td class="gsib_b"></td>
        </tr>
    </tbody>
</table>
  • We can construct an XPath query to locate this element –
//input[@name='q']

How to verify if the XPath query is correct?

We can use the search functionality of the Chrome developer tools. Right-click anywhere on the page and select ‘Inspect’ to open the developer tools. Click the ‘Elements’ tab and press Ctrl+F to open the search box. Type the locator (XPath, CSS selector etc.) and verify that the element you expect appears in the search results.

Understanding the XPath query

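What is that XPath doing man? //input[@name='q']

It is saying: find an input tag ANYWHERE (// indicates anywhere) in the document that has a name attribute whose value is q. Note that the quotes around q must be plain (straight) quotes. Curly quotes pasted in from a web page or word processor are not valid in XPath.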
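The same query can be tried in a few lines of Python. This is only a sketch under stated assumptions: PAGE is a simplified, well-formed copy of the search box markup (style attributes dropped, `<input>` self-closed so it parses as XML), and the query is prefixed with a dot because Python's xml.etree.ElementTree only accepts paths relative to the parsed root.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed copy of the search box markup (an assumption for
# illustration: style attributes removed, <input> self-closed).
PAGE = """<html>
  <body>
    <table id="gs_id0" class="gstl_0 lst-t">
      <tbody>
        <tr>
          <td id="gs_ttc0"></td>
          <td id="gs_tti0" class="gsib_a">
            <div id="gs_lc0">
              <input id="gbqfq" class="gbqfif" type="text" name="q"/>
            </div>
          </td>
          <td class="gsib_b"></td>
        </tr>
      </tbody>
    </table>
  </body>
</html>"""

root = ET.fromstring(PAGE)

# ".//input[@name='q']": an <input> at any depth whose name attribute is q.
matches_single = root.findall(".//input[@name='q']")

# The same query with straight double quotes around the value.
matches_double = root.findall('.//input[@name="q"]')

for element in matches_single:
    print(element.tag, element.get("id"))
```

Both spellings find the same single element, which matches the advice in the comments that either straight single or straight double quotes may be used inside the XPath.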

Difference between an XPath starting from ‘/’ and one starting from ‘//’?

A single slash at the start of an XPath instructs the XPath engine to look for elements starting from the root node. If we had written /html/body, the engine would have started at the root of the document, matched the html element there, and then looked for body directly inside it. A double slash at the start of an XPath instructs the XPath engine to look for matching elements ANYWHERE in the document.
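The difference can be sketched in Python with the standard xml.etree.ElementTree module. Two assumptions apply: PAGE is a simplified, well-formed stand-in for the real page, and because ElementTree does not accept absolute paths on an element, a path starting with "./" from the html root stands in for an XPath starting with /html/.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the page (an assumption for illustration).
PAGE = """<html>
  <body>
    <table id="gs_id0">
      <tbody>
        <tr>
          <td id="gs_tti0">
            <div id="gs_lc0">
              <input id="gbqfq" name="q"/>
            </div>
          </td>
        </tr>
      </tbody>
    </table>
  </body>
</html>"""

# The parsed root is the <html> element.
root = ET.fromstring(PAGE)

# Equivalent of /html/body: <body> sits directly under the root, so it matches.
direct_body = root.findall("./body")

# Equivalent of /html/input: no <input> sits directly under <html>, so no match.
direct_input = root.findall("./input")

# Equivalent of //input: search every level below the root.
anywhere_input = root.findall(".//input")

print(len(direct_body), len(direct_input), len(anywhere_input))
```

A path that walks from the root only matches when every step is spelled out, while the double-slash form finds the element wherever it is nested.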

What does single-slash ‘/’ mean if used inside the XPath?

A single slash / inside an XPath tells the engine to look for an element immediately inside its parent element. For example, for our search box source code, we can construct an XPath like this as well –

//table/tbody/tr/td[2]/div/input[@id='gbqfq']

It is saying something like this. Hey XPath Engine. Find an element with a table tag ANYWHERE (//) in the document. Make sure that element has an immediate child element named tbody. The tbody element should have an immediate child tr. The tr element can have many immediate td children. I am interested in the SECOND one (td[2]). This td element should have an immediate div child. And the next child down the line should be an input element. And this ‘input’ element should have an id attribute whose value is gbqfq.
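Here is a small Python sketch of that step-by-step path, again using the standard xml.etree.ElementTree module on a simplified, well-formed copy of the markup (both are assumptions for illustration, and the leading dot is required by ElementTree).

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed copy of the search box markup (assumption).
PAGE = """<html>
  <body>
    <table id="gs_id0">
      <tbody>
        <tr>
          <td id="gs_ttc0"></td>
          <td id="gs_tti0" class="gsib_a">
            <div id="gs_lc0">
              <input id="gbqfq" class="gbqfif" name="q"/>
            </div>
          </td>
          <td class="gsib_b"></td>
        </tr>
      </tbody>
    </table>
  </body>
</html>"""

root = ET.fromstring(PAGE)

# Each "/" steps exactly one level down; td[2] picks the second <td> child of <tr>.
second_cell = root.findall(".//table/tbody/tr/td[2]/div/input[@id='gbqfq']")

# The first <td> is empty, so the same path through td[1] matches nothing.
first_cell = root.findall(".//table/tbody/tr/td[1]/div/input[@id='gbqfq']")

print(len(second_cell), len(first_cell))
```

Changing a single step, here the index of the td, is enough to make the whole path miss the element, which is why long child-by-child paths tend to be brittle.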

What does double-slash ‘//’ mean if used inside the XPath?

A double slash // inside an XPath tells the engine to look for any descendant of the parent element: a child, a grandchild, a great-grandchild and so on. So for the same classic Google Search box, we can construct an XPath like this as well –

//table//input[@class='gbqfif']

It is saying: Hey, get me a table tag element ANYWHERE in the document. Make sure that inside that table tag element there is an ‘input’ tag element. I don’t care if the input tag element is the ‘table’ tag’s child, grandchild or great-grandchild. I just care that the ‘input’ tag is ENCLOSED by the ‘table’. And yes, don’t forget, the ‘input’ tag should have a ‘class’ attribute whose value is gbqfif.
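To compare the two separators side by side, the sketch below runs the double-slash query and a single-slash variant against a simplified, well-formed copy of the markup. As before, Python's xml.etree.ElementTree and the trimmed PAGE string are assumptions for illustration, not part of the original page.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed copy of the search box markup (assumption).
PAGE = """<html>
  <body>
    <table id="gs_id0">
      <tbody>
        <tr>
          <td id="gs_tti0" class="gsib_a">
            <div id="gs_lc0">
              <input id="gbqfq" class="gbqfif" name="q"/>
            </div>
          </td>
        </tr>
      </tbody>
    </table>
  </body>
</html>"""

root = ET.fromstring(PAGE)

# "//" between table and input: the input may sit at any depth inside the table.
enclosed = root.findall(".//table//input[@class='gbqfif']")

# "/" between table and input: the input would have to be a direct child.
direct_child = root.findall(".//table/input[@class='gbqfif']")

print(len(enclosed), len(direct_child))
```

The input is four levels below the table, so only the double-slash version finds it.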

Oh, man. You are driving me nuts.

I would suggest that you first carefully examine the source code for the Google Search Box and then read the above paragraphs again. Try to relate each sentence to the part of the source code it refers to.

Please don’t rush ahead until you have grasped the XPath basics properly.

In the Next Tutorial, we will dive deeper into the XPath ocean.

11 thoughts on “Identifying elements in the page | Introduction to XPath locators”

  1. If you are talking about quotes in input[@class='gbqfif']… you may use any one single or double quote… However if you are talking about quotes in By.tagName("input").. it should be double-quote

  2. Hi Shadab, When I used //input[@name=’q’] as you suggested above, Selenium IDE gave error – Locator not found: //input[@name=’q’].
    When I used double quotes i.e //input[@name="q"], it worked fine. So does that mean we should be using double quotes?

  3. Strange it is, After posting this comment when I reverted back to single quotes, it worked fine.
    Another Question: While I used /input[@name='q'], it is not able to locate. I suppose we can use single or double slash for this given page. Pls correct me

  4. Single slash won't work in this case.
    Please check this section and you will get your answer 🙂 –
    Difference between an XPath starting from ‘/’ and one starting from ‘//’?

    • Hi Hazel

      Apologies for the late response. Somehow your comment was filtered as spam.
      I would be more than happy if you could reach out to us at contact (at) teachmeselenium (dot) com

      Looking forward to hearing from you.

      Thanks
      Shadab

  5. Keep using this style of writing. Plain informal, direct to the point. And more importantly example oriented concept teaching. I learnt a lot in this and the next article.
