Wednesday, July 24, 2013

Automating Internet Explorer (Add DOM Navigation)


The previous two posts have covered creating an IE automation object. This post will build on this idea by allowing you to actually do things with the webpage that is loaded. As discussed in the previous posts, the DOM is the actual webpage that contains the data and information you see on the screen. You can think of the DOM as being akin to a notebook with tabs – go to a particular tab to get at information inside that tab. This is a very simplistic view, but it illustrates the idea of grabbing a tab and then using it to get the data inside of the tab. This tab may contain other tabs inside it which will be easier to get at once you get the main (parent) tab.

In this post we will discuss wrapper methods and the Internet Explorer (IE) document object, and briefly look at different ways of using the IE object to achieve the same end goal.

Wrapper Methods

A wrapper method is essentially a method that allows you to get at another method of an object inside another object. For example, our IEAgent has an IE object inside of it that has a whole slew of useful methods built in. Without a wrapper method there would be no way for us to access these methods.

Wrappers are vital to help preserve encapsulation, but more importantly they make it possible to use objects stored inside other objects without pulling them out of the parent. It is like being able to unlock your car from your phone – you tell something else to open the lock, and that thing does it for you, but the end result is the same.

One of the great advantages of creating wrapper classes when automating IE is the ability to add the waitForLoad method described in the previous post. This extends the functionality of the original IE object without having to write a whole lot of custom code.

IE Document Object

Before we discuss some useful code to add to your IEAgent, we will first discuss a bit more about the DOM and HTML. It should be noted that IE can do more than just read DOM objects. Indeed, IE is a very versatile program: it has built-in Visio document viewing, it can parse XML, JSON, and other data structures, and it can perform a wide variety of other tasks.

Part of the problem with IE is that for years Microsoft was the one pushing the envelope on web technology, so it had to define its own way of doing things. It wasn’t until other people came in and pushed for standards for the tech Microsoft had developed that MS was forced not only to support the systems it had built, but also to include the standards set by the industry if it wanted to stay competitive. For this reason there are lots of issues in using IE to do anything over the internet, including reading HTML documents.

Since IE is so extensible, it is hard to say that this is automating IE so much as it is automating HTML DOM traversal using IE. This post does not come close to tackling all the different ways you could automate IE, or even the different ways to automate HTML traversal in IE, since there is more than one way for the reasons discussed. We will cover only one way to traverse a DOM, but there are others.

The Document Object

With the previous discussion in mind, we will now cover the basic structure that provides all the useful methods we need in automating web browsing. As discussed multiple times, an HTML document is a Document Object Model, or something that represents objects that can be manipulated by a program.

Because of this fact, we can make assumptions about every DOM object in an HTML document, such as properties and methods that all DOM objects inherit. In IE, some of these are methods that allow you to grab all objects nested underneath a particular object of a given name, type, etc. Nesting is a crucial idea to understand, so I would read the attached article if you don’t know what this means.

Document models use the concept of tags to denote when an object begins and ends. From here on out, we will refer to the objects in the DOM as tags.

Grabbing Tags

The first thing you must be able to do in a DOM is grab a particular tag and perform a method on it. This could be anything from performing the click function on a button tag (which would correspond to clicking on a button on a webpage in the browser) to getting all nested tags within that tag. Both methods are essential in mimicking normal interaction with an HTML document.

The first method we will add grabs a tag by its id. Don’t worry if you don’t understand what an id is, or a class, or a name. Just know that an id is supposed to be unique on any webpage, so calling this method should return at most one object (or Nothing if no tag has that id), and that object will be of the type of the tag the id identifies (hence the term id). A name is not as specific because multiple tags can share the same name. The same goes for class. That doesn’t mean these can’t be useful in automating the process. Indeed, many developers do not add ids to their tags at all, so name and class become very useful because almost all developers use them.

Add the following code to your IEAgent:

Public Function getElementByIdWrapper(id As String) As Object
       Set getElementByIdWrapper = ie.document.getElementById(id)
End Function

As you can see, this function returns an Object. You could also declare the return type as IHTMLElement, but that level of specificity can cause issues in some code. As long as you assign the result to a variable of type IHTMLElement, you will still be able to use intellisense to help you code (we talked about intellisense in a previous post). In actuality, you don’t need intellisense at all: you could declare everything as Object and VBA will figure out what is supposed to be what at runtime, but that makes for difficult coding. Just know that Object is used here to help prevent some weird behavior in the IEAgent.
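To make the point about typed variables concrete, here is a minimal sketch of calling the wrapper from a standard module. It assumes the agent has already finished loading a page, that your project has a reference to the Microsoft HTML Object Library (needed for the MSHTML types), and that the page has a tag with the made-up id "loginButton". The sub name clickById is also just for illustration.

```vba
' Sketch only: "loginButton" is a hypothetical id, and agent is assumed to be
' an IEAgent instance that has already loaded a page.
' Requires a reference to "Microsoft HTML Object Library" for MSHTML.IHTMLElement.
Public Sub clickById(agent As Object)
    Dim button As MSHTML.IHTMLElement

    ' The wrapper returns an Object; assigning it to a typed variable
    ' gives you intellisense on button from here on.
    Set button = agent.getElementByIdWrapper("loginButton")

    ' getElementById gives back Nothing when no tag has that id.
    If button Is Nothing Then
        Debug.Print "No tag with that id was found."
    Else
        button.Click
    End If
End Sub
```

Checking for Nothing before using the tag avoids a runtime error on pages where the id is missing.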

As discussed, the above method will grab the tag of the DOM with the specified id. We will discuss in a later post how to determine the id of a tag. The following three methods will grab all tags that match the given parameter:

Public Function getElementsByTagNameWrapper(tagName As String) As Object
       ' All tags of the given type, e.g. "a" or "input"
       Set getElementsByTagNameWrapper = ie.document.getElementsByTagName(tagName)
End Function

Public Function getElementsByNameWrapper(name As String) As Object
       ' All tags whose name attribute matches
       Set getElementsByNameWrapper = ie.document.getElementsByName(name)
End Function

Public Function getElementsByClassNameWrapper(className As String) As Object
       ' All tags with the given class (needs IE9 or later in standards mode)
       Set getElementsByClassNameWrapper = ie.document.getElementsByClassName(className)
End Function

You will call each of these functions by setting an object equal to the output of this method:

Set someObject = ieAgent.getElement<rest of Name here>( parameter )

Each of these functions will return what is called an IHTMLElementCollection (again, we just used Object here). Since this is a collection, we will be able to go over each tag in a for loop (defined later) to get at the tags we want. Don’t worry right now about which method above to use; we will cover that later. Just know that each of these will provide the functionality we will need to navigate the DOM.
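As a small preview of going over a collection, the sketch below prints the text of every link on the loaded page. It assumes agent is an IEAgent with a page already loaded; the sub name printLinkText is made up for this example, and "a" is simply the HTML tag name for links.

```vba
' Sketch only: assumes agent is an IEAgent that has already loaded a page.
Public Sub printLinkText(agent As Object)
    Dim links As Object
    Dim link As Object

    ' Ask for every <a> tag in the document.
    Set links = agent.getElementsByTagNameWrapper("a")

    ' Guard in case nothing usable came back.
    If links Is Nothing Then Exit Sub

    ' Visit each tag in the collection, however many there are.
    For Each link In links
        Debug.Print link.innerText
    Next link
End Sub
```

The loop does not need to know the number of tags in advance, which is what makes collections convenient on pages whose content changes.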

Further Wrapper Methods

The above examples give a very good pattern to follow when creating a wrapper method. I will not go through each method as that is way beyond the scope of this blog. The three that I have listed here will be able to handle 95% of all your automating needs. If you want to add additional wrapper methods, you can use the set <functionName> = ie.document.<predefined function name> pattern and use this website to add whatever methods you feel will be useful.
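For instance, following that same pattern, a wrapper around the document's links collection might look like the sketch below. The name getLinksWrapper is my own choice, and links is a collection property rather than a function, but the wrapper has the same shape. It would go inside the IEAgent class next to the other wrappers.

```vba
' Sketch only: goes inside the IEAgent class module, where ie is the
' InternetExplorer object defined in the earlier posts.
Public Function getLinksWrapper() As Object
    ' document.links holds the link tags of the loaded page.
    Set getLinksWrapper = ie.document.links
End Function
```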

Other Methods to Achieve Document Automation

Due to the storied history of IE, there are different ways of handling DOM navigation. I won’t provide any sort of code here except to mention how you may access it.

One approach that a lot of people who do what is called page scraping (essentially what we are doing with these methods) use is Regular Expressions, which find the exact part of the DOM they want to manipulate. The great downside is that regular expressions are difficult and messy. They are very accurate, but the resulting code is not exactly clean or easy to read. They also require creating other objects to turn what you write into something useful, and then you are dealing more with strings than with objects. If your purpose is to actually click on things in the DOM, you are far better off using the wrapper methods above, as regular expressions would probably require more thinking and problem solving to get to what you want.

In addition to that complexity, regular expressions don’t cope well with pages where the number of rows you wish to search varies. With the methods above you don’t have to worry about regular expression rules limiting the number of fields you want to grab; you simply write a loop and let the computer determine the count for you every time.

Another approach is using the document.all collection on the DOM. This returns all tags, which you can then iterate over or pass to methods you write yourself. You are severely limited here if you don’t wish to write your own methods. However, if you are clever, this approach will make writing custom code much easier and provides greater flexibility. Because it is far more complicated than we will ever need for our purposes, we will not be using it.

Finalizing the IEAgent

With these methods you have essentially made the entire IEAgent that will be used in automating web browsing. There are a few additional methods we should define before moving on. Each method doesn’t really have a particular category to fit under except to describe them as being some useful functions to help determine the state of a webpage, return some values of the IE object, and so on.

The first method is used to return the IE object inside the IEAgent:

Public Property Get explorer() As Object
       Set explorer = ie
End Property

Remember that we defined the InternetExplorer object as being named ie. This simply returns that object when you call ieAgent.explorer. You will use this code like this:

Set someObject = ieAgent.explorer

The next method returns the handle of the IE object:

Public Property Get returnHandle() As Long
       ' Assumes handle was stored as a Long window handle (e.g. from ie.hwnd) in the earlier posts
       returnHandle = handle
End Property

And finally, this method will get the title of the webpage you are viewing (the text that appears at the top of a tab):

Public Function getDocumentTitle() As String
    getDocumentTitle = ie.document.title
End Function

This last method will be useful when writing a login method.
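One way the title could come in handy is checking whether you have landed on the page you expected, for example after submitting a login form. The helper below is only a sketch: the name titleContains and its parameters are hypothetical, and it assumes agent is an IEAgent with a page already loaded.

```vba
' Sketch only: returns True when the current page title contains expectedText.
' agent is assumed to be an IEAgent that has finished loading a page.
Public Function titleContains(agent As Object, expectedText As String) As Boolean
    ' vbTextCompare makes the comparison case-insensitive.
    titleContains = (InStr(1, agent.getDocumentTitle(), expectedText, vbTextCompare) > 0)
End Function
```

You would call it like this, where "Dashboard" stands in for whatever text the target page really shows in its title:

```vba
If titleContains(ieAgent, "Dashboard") Then Debug.Print "Login appears to have worked."
```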

Conclusion

This concludes the discussion on creating an IEAgent to automate IE. The next few posts will discuss how to navigate a DOM with the methods we defined, how to determine what tags you will want to use, and how to manipulate those tags.
