How to Extract Content From an HTML Table

The content you want to extract is often located in tables. Unfortunately, HTML tables are frequently irregular in both content and structure. Design Studio is designed to handle such irregularities, as described below. (The techniques described in this section are not restricted to tables. They can be used for tag irregularities of any kind.)

If the table containing the content is perfectly regular in both content and structure, you can extract the content as described in How to Extract Content from HTML. The robot will typically look like this:

The first step contains a For Each Tag action that loops through the <tr>-tags in the <tbody>-tag of a <table>-tag. It is followed by several steps, one per column, each of which extracts the content of one cell in the current table row.
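
For readers who find it easier to think in code, the loop-and-extract pattern above can be approximated outside Design Studio. The following is only a rough Python sketch using the standard library's html.parser module; it is not how Design Studio implements the For Each Tag action. The names TableRowExtractor, extract_rows, and SAMPLE, as well as the sample table and column names, are hypothetical and chosen for illustration. The sketch assumes a perfectly regular table with no nested tables, where every row has one <td> per column.

```python
from html.parser import HTMLParser


class TableRowExtractor(HTMLParser):
    """Collect the text of each <td> cell, grouped per <tr> inside <tbody>."""

    def __init__(self):
        super().__init__()
        self.rows = []           # one list of cell strings per <tr>
        self._in_tbody = False   # True while inside a <tbody>-tag
        self._row = None         # cells of the row being read, or None
        self._cell = None        # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tbody":
            self._in_tbody = True
        elif tag == "tr" and self._in_tbody:
            self._row = []       # start of one loop iteration
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag == "tbody":
            self._in_tbody = False

    def handle_data(self, data):
        # Only text inside a <td> is kept; whitespace between tags is ignored.
        if self._cell is not None:
            self._cell.append(data)


def extract_rows(html, columns):
    """Return one dict per table row, pairing column names with cell text.

    Assumes a regular table: each row has exactly len(columns) cells.
    """
    parser = TableRowExtractor()
    parser.feed(html)
    parser.close()
    return [dict(zip(columns, row)) for row in parser.rows]


# Hypothetical sample table, for illustration only.
SAMPLE = """
<table>
  <thead><tr><th>Name</th><th>Price</th></tr></thead>
  <tbody>
    <tr><td>Widget</td><td>9.99</td></tr>
    <tr><td>Gadget</td><td>19.50</td></tr>
  </tbody>
</table>
"""

records = extract_rows(SAMPLE, ["name", "price"])
```

Each <tr>-tag corresponds to one iteration of the loop, and each entry in the column list plays the role of one extraction step. Because the header row sits in a <thead>-tag rather than the <tbody>-tag, it is skipped, just as it would be when looping only over the rows of the <tbody>-tag.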