How to Make Robots More Robust

Web sites often change without notice. Unless you are careful, such changes can cause the robot to fail at its task. Robustness describes how well a robot copes with web site changes: the more changes the robot can deal with (and still work correctly), the more robust it is.

Robustness, however, comes at a price. Writing robust robots is more challenging and time-consuming than writing fragile ones. (The same is true of programs in any programming language.) It involves analyzing the web site in question and understanding how it responds in various situations, such as when a registration form is filled out incorrectly. In a sense, writing robust robots involves a kind of reverse engineering of the web site's logic, and usually the only way to do this is through exploration.

There are two different approaches to robustness, each serving a different purpose:

- Succeeding as much as possible
- Failing when things are not perfect

Let us look at each approach in turn.

Succeeding as much as possible might, for a robot extracting news type variables, mean that it should extract as many news items as possible. In Design Studio, you will use conditional actions, Try steps, and data converters to deal with different layouts, missing information, and strangely formatted content.
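To make the "succeed as much as possible" idea concrete, here is a minimal Python sketch (not Design Studio itself) of a tolerant news extractor. It accepts two hypothetical container layouts, allows either heading level to hold the title, tolerates a missing date, and silently skips items with no usable title. All class names and the sample page are invented for illustration.

```python
from html.parser import HTMLParser

# Hypothetical sample page: two layout variants plus one malformed item.
PAGE = """
<div class="news"><h2>Rates rise</h2><span class="date">2024-01-05</span></div>
<div class="article"><h3>Storm warning</h3></div>
<div class="news"><h2></h2></div>
"""

class NewsExtractor(HTMLParser):
    """Collects news items across layout variants, skipping broken ones."""
    def __init__(self):
        super().__init__()
        self.items = []
        self.current = None   # item being built, or None
        self.capture = None   # field name currently receiving text

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class", "")
        # Accept both known container layouts ("news" and "article").
        if tag == "div" and cls in ("news", "article"):
            self.current = {"title": "", "date": ""}
        elif self.current is not None and tag in ("h2", "h3"):
            self.capture = "title"   # either heading level may hold the title
        elif self.current is not None and tag == "span" and cls == "date":
            self.capture = "date"

    def handle_data(self, data):
        if self.current is not None and self.capture:
            self.current[self.capture] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("h2", "h3", "span"):
            self.capture = None
        elif tag == "div" and self.current is not None:
            # Keep the item only if it has a title; a missing date is tolerated.
            if self.current["title"]:
                self.items.append(self.current)
            self.current = None

parser = NewsExtractor()
parser.feed(PAGE)
for item in parser.items:
    print(item)
```

Running this extracts two items: the first with both title and date, the second with an empty date, while the third (title-less) item is dropped rather than aborting the whole extraction.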

Failing when things are not perfect might, for an order submission robot, mean that it should fail immediately if it cannot figure out how to enter a field correctly, or if the order result page does not match an exact layout. In this sense, failing does not mean generating an API exception. Instead, it means that the robot should return a value dedicated to describing errors and failure causes. Robots taking input variables will often choose to fail rather than to succeed as much as possible. In Design Studio, you will use dedicated error type variables, error handling, and conditional actions to detect and handle unexpected situations.
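The "fail fast with a dedicated error value" pattern can be sketched in plain Python (again, an illustrative analogue, not Design Studio). The field names, validation rules, and the `place_order_stub` helper are all invented; the point is that every failure path returns a descriptive error value instead of raising an exception.

```python
import re

def place_order_stub(order):
    # Stand-in for the real web submission and confirmation-page scrape.
    return f"Order #{hash(order['product_id']) % 1000} confirmed"

def submit_order(order):
    """Validate an order and return a result value rather than raising.

    Returns {"status": "ok", ...} on success, or {"status": "error",
    "cause": ...} as soon as anything is not exactly as expected.
    """
    # Fail immediately if a field cannot be entered correctly.
    if not re.fullmatch(r"\d{4}-\d{4}", order.get("product_id", "")):
        return {"status": "error", "cause": "malformed product_id"}
    if not isinstance(order.get("quantity"), int) or order["quantity"] <= 0:
        return {"status": "error", "cause": "quantity must be a positive integer"}

    confirmation = place_order_stub(order)

    # Fail if the result page does not match the exact expected layout.
    if not re.fullmatch(r"Order #\d+ confirmed", confirmation):
        return {"status": "error",
                "cause": f"unexpected confirmation page: {confirmation!r}"}
    return {"status": "ok", "confirmation": confirmation}

print(submit_order({"product_id": "1234-5678", "quantity": 2}))  # status ok
print(submit_order({"product_id": "bad", "quantity": 2}))        # error value, not an exception
```

The caller inspects the returned value and can log or report the cause, which mirrors how a robot with an error type variable hands failure information back instead of crashing.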

For more information on Design Studio techniques that can be used to make robots more robust, you should consult the following sections: How to Extract Content from HTML, How to Extract Content From an HTML Table, How to Handle Errors, and How to Use the Tag Finders.