Looking Inside DOM Page Elements

This article shows you how to access the contents and attributes of any DOM element in a Web page. You'll also find out about node properties and relationships, and learn how to move around the DOM tree.

In our last DOM tutorial, you learned how to access the elements inside your Web page as JavaScript objects. Once you've done that, how do you find out more about each element? This tutorial shows you how to delve deep into any DOM element object.

Everything is a node

As you've seen in our previous tutorials, the Document Object Model breaks an entire Web page down into a tree of node objects. Elements are nodes; attributes are nodes; chunks of text are nodes. Even the document itself is a node.

All nodes are related to each other, with the document node at the top of the tree. For example, if you have a p element inside a div element at the top of your Web page, the p node is a child of the div node. In turn, the div node is a child of the body node.

Once you understand this concept, it's easy to manipulate DOM elements. For example, to access the text inside a paragraph, you retrieve the paragraph node's child text node, then read that node's value. To access an attribute of a paragraph, you retrieve its attribute node, and so on.

Finding out about a node

Each node in the DOM tree contains three properties that tell you about the node:

nodeType
An integer value representing the type of the node (element, attribute, text, and so on). See below for details.
nodeName
The name of the node, as a string. For example, the nodeName of an h1 element is "H1".
nodeValue
The value of the node. For element nodes, the value will be null. For text nodes, the value is the text itself. For attribute nodes, the value is the attribute's value, and so on.

The nodeType value corresponds to one of twelve constants defined on the Node object. Here are the ones you're most likely to use:

Value  Constant             Description
1      Node.ELEMENT_NODE    The node is an (X)HTML element
2      Node.ATTRIBUTE_NODE  The node is an attribute of an element
3      Node.TEXT_NODE       The node is a chunk of plain text
8      Node.COMMENT_NODE    The node is an (X)HTML comment
9      Node.DOCUMENT_NODE   The node is the Document node

Say you have the following p element in your Web page:


<p id="welcome">Welcome to the Widget Company!</p>

You could find out some info about this node as follows:


var element = document.getElementById( "welcome" );
alert ( element.nodeType );  // Displays "1"
alert ( element.nodeName );  // Displays "P"
alert ( element.nodeValue ); // Displays "null"
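
If you find yourself inspecting lots of nodes, it can help to report all three properties at once. The following is a minimal sketch rather than anything built into the DOM: summarizeNode() and NODE_TYPE_NAMES are hypothetical names made up for this example, and the table only covers the node types listed above. It uses the numeric values instead of the Node constants, in case the global Node object isn't available in a given environment.

```javascript
// Hypothetical lookup table (not part of the DOM API): maps the nodeType
// values listed above to readable names.
var NODE_TYPE_NAMES = {
  1: "element",
  2: "attribute",
  3: "text",
  8: "comment",
  9: "document"
};

// Hypothetical helper: describes any node in a single line of text.
function summarizeNode( node ) {
  var typeName = NODE_TYPE_NAMES[node.nodeType] || "other";
  return typeName + " node, nodeName=" + node.nodeName + ", nodeValue=" + node.nodeValue;
}

// In a browser, with the "welcome" paragraph above in the page:
//   var element = document.getElementById( "welcome" );
//   alert( summarizeNode( element ) );            // the p element node
//   alert( summarizeNode( element.firstChild ) ); // its child text node
```

The second call describes the paragraph's child text node, whose nodeName is "#text" and whose nodeValue is the paragraph's text.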

Node relationships

Once you have one node, you can access any other node that is related to it. All nodes have the following properties:

childNodes
A collection of all the children of the node
firstChild
The first node in the collection of child nodes
lastChild
The last node in the collection of child nodes
nextSibling
The next node that has the same parent as the node
previousSibling
The previous node that has the same parent as the node
parentNode
The node's parent

Let's say you have the following markup in your page:


<ul>
  <li id="widget1"><a href="superwidget.html">SuperWidget</a></li>
  <li id="widget2"><a href="megawidget.html">MegaWidget</a></li>
  <li id="widget3"><a href="wonderwidget.html">WonderWidget</a></li>
</ul>

The following JavaScript displays the text inside the second link ("MegaWidget"):


var widget2 = document.getElementById( "widget2" );
alert( widget2.firstChild.firstChild.nodeValue );

The first line uses a widget2 variable to store the li element node with the id of "widget2".

The second line displays the text inside the link. The first child of the "widget2" li element is the a element, and the first child of that a element is the text node, whose nodeValue is "MegaWidget".
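
The same relationship properties let you visit a whole branch of the tree, not just one known path. Here's a rough sketch of a depth-first walk that relies only on firstChild and nextSibling. The walkTree() function and its visit callback are hypothetical names invented for this example.

```javascript
// Hypothetical helper: calls visit( node, depth ) for a node and then for
// each of its descendants, depth first, using firstChild and nextSibling.
function walkTree( node, visit, depth ) {
  depth = depth || 0;
  visit( node, depth );
  for ( var child = node.firstChild; child; child = child.nextSibling ) {
    walkTree( child, visit, depth + 1 );
  }
}

// In a browser, with the widget list above in the page:
//   var names = [];
//   walkTree( document.getElementById( "widget2" ), function( node, depth ) {
//     names.push( depth + ": " + node.nodeName );
//   } );
//   alert( names.join( "\n" ) );
```

For the "widget2" list item, the walk visits the li element at depth 0, then the a element at depth 1, then the link's text node at depth 2.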

Example: Getting all paragraph text in a page

Here's a simple page containing, amongst other things, three paragraphs of text. There's also a JavaScript function, displayParas(), triggered by clicking the "Display paragraph text" link, that displays the text inside each paragraph in the page:


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>My Web Page</title>
    <script type="text/javascript">
      // <![CDATA[
        function displayParas() {
          var output = "";
          var paras = document.getElementsByTagName( "p" );

          for ( var i=0; i < paras.length; i++ ) {
            for ( var j=0; j < paras[i].childNodes.length; j++ ) {
              if ( paras[i].childNodes[j].nodeType == Node.TEXT_NODE ) {
                output += paras[i].childNodes[j].nodeValue + "\n";
              }
            }
          }

          alert( output );
        }
      // ]]>
    </script>
  </head>
  <body>
    <h1>The Widget Company</h1>
    <p>Welcome to the Widget Company!</p> 
    <p>We have lots of fantastic widgets for sale.</p>
    <p>Feel free to browse!</p>
    <p><a href="javascript:displayParas()">Display paragraph text</a></p>
  </body>
</html>


Clicking the link displays an alert box with the following contents:


Welcome to the Widget Company!
We have lots of fantastic widgets for sale.
Feel free to browse!

First, the code stores a list of the page's p elements in a paras variable. It then loops through the elements in paras. For each element, it loops through all the child nodes of the element. When it finds a text node, it adds its value (the text) to the output string, which is then displayed using alert() at the end of the script.

You may be wondering why the fourth paragraph's text, "Display paragraph text", isn't displayed. This is because the only child node of the fourth p node is the a element, which isn't a text node, so the code skips it. The "Display paragraph text" text node is a grandchild of the p node, so to access this text you have to dig deeper than the p node's children.
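
One way to dig deeper is to collect text recursively, so that text nodes at any depth are included. The function below is a sketch under that assumption; collectText() is a hypothetical name, not a DOM method.

```javascript
// Hypothetical helper: returns all the text inside a node, however deeply
// the text nodes are nested.
function collectText( node ) {
  var TEXT_NODE = 3; // same value as Node.TEXT_NODE
  if ( node.nodeType == TEXT_NODE ) return node.nodeValue;

  var text = "";
  var children = node.childNodes || [];
  for ( var i = 0; i < children.length; i++ ) {
    text += collectText( children[i] );
  }
  return text;
}
```

If you replaced the inner loop of displayParas() with output += collectText( paras[i] ) + "\n", the fourth paragraph's link text would be included too. Modern browsers also offer a textContent property on nodes that does much the same job.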

Be wary of whitespace!

One thing to watch out for when retrieving child nodes is any whitespace in the HTML markup. Consider the following markup:


    <div id="welcome">
      <p>Welcome to the Widget Company!</p>
    </div>

You might expect the following code to display the value "1" (Node.ELEMENT_NODE), because the first child node of the "welcome" div appears to be the p element:


var welcomeDiv = document.getElementById( "welcome" );
alert( welcomeDiv.firstChild.nodeType );

In fact it displays "3" (Node.TEXT_NODE). This is because the whitespace between the opening div tag and the opening p tag (the line break and the space or tab characters) is a text node in its own right, and this text node is the first child of the div node.

In order to accurately locate the paragraph element node, you need to loop through the child nodes until you find the correct node:


var welcomeDiv = document.getElementById( "welcome" );
var para = null;

// Skip whitespace text nodes until we reach the p element
for ( var i=0; i < welcomeDiv.childNodes.length; i++ ) {
  if ( welcomeDiv.childNodes[i].nodeType == Node.ELEMENT_NODE && welcomeDiv.childNodes[i].nodeName == "P" ) {
    para = welcomeDiv.childNodes[i];
    break;
  }
}

// Displays "Welcome to the Widget Company!"
if ( para ) alert( para.firstChild.nodeValue );

This is obviously tedious, so it's a good idea to wrap code such as this inside a function so that you can reuse it.
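
Such a reusable function might look something like the sketch below. The name firstChildElement() and its optional tagName parameter are assumptions made for this example. The tag name comparison ignores case, since nodeName isn't uppercase in every kind of document.

```javascript
// Hypothetical helper: returns the first child of parent that is an element
// node (optionally with the given tag name), or null if there isn't one.
function firstChildElement( parent, tagName ) {
  var ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE
  for ( var i = 0; i < parent.childNodes.length; i++ ) {
    var child = parent.childNodes[i];
    if ( child.nodeType != ELEMENT_NODE ) continue; // skips whitespace text nodes
    if ( !tagName || child.nodeName.toUpperCase() == tagName.toUpperCase() ) return child;
  }
  return null;
}

// In a browser, with the "welcome" div above in the page:
//   var para = firstChildElement( document.getElementById( "welcome" ), "p" );
//   if ( para ) alert( para.firstChild.nodeValue );
```

Leaving out the second argument returns the first child element of any type.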

Retrieving attributes

Attribute nodes are stored a bit differently to other nodes. An element's attribute nodes aren't children of the element; instead, you access the nodes through the element's attributes collection:


var attributes = element.attributes;

Say you have the following form field in your Web page:


<input type="text" name="widgetName" id="widgetName" />

The following code displays each of the attributes of the field:


var output = "";
var widgetName = document.getElementById( "widgetName" );
var attrs = widgetName.attributes;
for ( var i=0; i < attrs.length; i++ ) {
  output += ( attrs[i].name + "=" + attrs[i].value ) + "\n";
}
alert( output );

The code displays an alert box with contents similar to the following:

type=text
id=widgetName
name=widgetName

Note that the attribute nodes aren't in any particular order.
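
If you need the same listing every time, one option is to sort the attributes by name before displaying them. The following is only a sketch of that idea; listAttributes() is a hypothetical name chosen for this example.

```javascript
// Hypothetical helper: returns an element's attributes as "name=value" lines,
// sorted by attribute name so the order doesn't depend on the browser.
function listAttributes( element ) {
  var attrs = [];
  for ( var i = 0; i < element.attributes.length; i++ ) {
    attrs.push( element.attributes[i] );
  }

  attrs.sort( function( a, b ) {
    return a.name < b.name ? -1 : ( a.name > b.name ? 1 : 0 );
  } );

  var lines = [];
  for ( var j = 0; j < attrs.length; j++ ) {
    lines.push( attrs[j].name + "=" + attrs[j].value );
  }
  return lines.join( "\n" );
}

// In a browser, with the "widgetName" field above in the page:
//   alert( listAttributes( document.getElementById( "widgetName" ) ) );
```

For the "widgetName" field, the three attributes in the markup are then listed in the order id, name, type.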

You can also retrieve an attribute node directly if you know its name:


var widgetName = document.getElementById( "widgetName" );
alert( widgetName.attributes["type"].value ); // Displays "text"

The following methods also let you retrieve an attribute by name:

element.getAttributeNode ( name )
Returns the attribute node called name
element.getAttribute ( name )
Returns the value of the attribute node called name

For example:


var widgetName = document.getElementById( "widgetName" );
alert( widgetName.getAttribute( "type" )); // Displays "text"

By the way, you can test if an element contains a particular attribute with the element.hasAttribute() method:


result = element.hasAttribute( attributeName )

hasAttribute() returns true if the element contains the attribute named attributeName; false otherwise.
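
hasAttribute() and getAttribute() combine naturally when you'd like a default value for a missing attribute. Here's a small sketch of that pattern; getAttributeOr() and its fallback parameter are hypothetical names, not part of the DOM.

```javascript
// Hypothetical helper: returns the value of the named attribute, or the
// supplied fallback if the element doesn't have that attribute.
function getAttributeOr( element, name, fallback ) {
  return element.hasAttribute( name ) ? element.getAttribute( name ) : fallback;
}

// In a browser, with the "widgetName" field above in the page:
//   var field = document.getElementById( "widgetName" );
//   alert( getAttributeOr( field, "type", "unknown" ) );       // Displays "text"
//   alert( getAttributeOr( field, "maxlength", "no limit" ) ); // Displays "no limit"
```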

Now that you've read this tutorial, you can hop around from one node in the DOM tree to another, and you can dig deep inside any element node to view its contents: that is, its child nodes and its attributes. However, so far you've only retrieved element content. In the next tutorial you'll learn how to alter the contents of an element, as well as add and remove elements in a Web page.
