Extract Target

This action extracts data from a target URL and stores it in a variable or in a file. If the selected variable cannot hold the extracted data (for example, a PDF file stored in an XML variable), an error may be generated when the action is executed.

The action can also store the content type and file name of the extracted data in user-specified variables.

The variable in which to store the extracted data must be of one of the following types: Binary, Image, PDF, Text, HTML, or XML.

Properties

The Extract Target action can be configured using the following properties:

Location:
This property specifies which target URL to extract from.
URL:
Enter the URL directly in the text field provided. Note that standard URLs using the HTTP protocol can be written in shorthand. For example, "http://www.kapowtech.com" can be written instead as "www.kapowtech.com".
URL in Found Tag:
Specifies that the found tag contains the URL.
URL in Variable:
Specifies that the URL should be read from a specified variable.
URL from Expression:
Specifies an expression as the URL to open.
URL from Converters:
Specifies a list of data converters whose output is used as the URL to open.
URL Loaded when Clicking:
Specifies that the found node should be clicked, and that the URL that would have been loaded as a result is used. For instance, if clicking the found node results in a form submission, the extracted data will be the result of the form submission.
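The shorthand rule described for the URL option can be illustrated with a minimal sketch. The function name `expand_shorthand` is hypothetical, and the test for a missing scheme (no "://" in the text) is an assumption, not a description of how the product itself parses URLs.

```python
def expand_shorthand(url):
    """Prepend http:// when the URL is written without a scheme."""
    # Assumption: a URL counts as shorthand when it contains no "://".
    if "://" in url:
        return url
    return "http://" + url
```

With this sketch, "www.kapowtech.com" becomes "http://www.kapowtech.com", while a URL that already names a protocol is returned unchanged.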
Store In:
Specifies where to store the extracted data. There are two choices:
Variable:
Specifies the variable in which to store the extracted data. The variable must be of type Binary, Image, PDF, Text, HTML, or XML.
File:
Specifies the file to which the data is written.
File Name:
Specifies the name and extension of the file.
Auto:
With this option, a file name is generated automatically using the following strategy:
1. First, the content disposition header of the response is inspected. If it has a filename parameter, that name is used.
2. Next, the URL is inspected. If it contains a file name, that name is used.
3. If neither of the above succeeds, an error is generated.
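The three steps above can be sketched roughly as follows. This is only an illustration of the lookup order, not the product's implementation: the function name `auto_file_name` is hypothetical, and treating the last segment of the URL path as "the file name in the URL" is an assumption.

```python
from email.message import Message
from urllib.parse import unquote, urlparse


def auto_file_name(url, content_disposition=None):
    """Pick a file name: content disposition header first, then the URL, else fail."""
    # Step 1: look for a filename parameter in the content disposition header.
    if content_disposition:
        header = Message()
        header["Content-Disposition"] = content_disposition
        name = header.get_filename()
        if name:
            return name
    # Step 2 (assumption): use the last segment of the URL path, if it is non-empty.
    name = unquote(urlparse(url).path.rsplit("/", 1)[-1])
    if name:
        return name
    # Step 3: neither source produced a name.
    raise ValueError("no file name could be determined for " + url)
```

For example, a response carrying the header value `attachment; filename="report.pdf"` yields report.pdf regardless of the URL, while a URL ending in /files/data.csv with no such header yields data.csv.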
Value, Variable, Expression and Converters:
The value can be specified in several ways using a Value Selector.
Directory:
Specifies the directory where the file will be placed. The value can be specified in several ways using a Value Selector.
Create Directories:
Specifies whether to create any directories in the specified path that do not already exist. If the option is selected, the missing directories are created. If the option is not selected, the directories must already exist; otherwise an error is generated.
Override Strategy:
Specifies a strategy for what to do when the selected file already exists.
Override File:
Any existing file will be replaced.
Never Override File:
Ensures that an existing file will never be replaced. If the file already exists then an error is generated.
Create a New File:
Ensures that a new file will always be created. If a file with the selected name already exists, a new unique file name is generated. The new name is the originally selected file name with a serial number added just before the extension, e.g. myData_1.dat, where _1 was added to the original file name myData.dat.
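The three override strategies can be sketched as a small helper that decides which path to write to. The function name `resolve_target` and the strategy keywords are hypothetical, and counting the serial number upward (_1, _2, and so on) until a free name is found is an assumption; the text above only shows the _1 case.

```python
import os

STRATEGIES = ("override", "never", "new")


def resolve_target(path, strategy):
    """Return the path to write to under one of the three override strategies."""
    if strategy not in STRATEGIES:
        raise ValueError("unknown strategy: " + strategy)
    # Override File, or no existing file at all: write to the selected path.
    if strategy == "override" or not os.path.exists(path):
        return path
    # Never Override File: an existing file is an error.
    if strategy == "never":
        raise FileExistsError(path)
    # Create a New File: add _1, _2, ... just before the extension
    # until the resulting name is not taken (assumed numbering scheme).
    stem, ext = os.path.splitext(path)
    serial = 1
    while os.path.exists(f"{stem}_{serial}{ext}"):
        serial += 1
    return f"{stem}_{serial}{ext}"
```

If myData.dat already exists in the target directory, `resolve_target(path, "new")` returns the same path with myData_1.dat as the file name.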
Store Meta Data In:
Specify variables to be used to store meta data about the extracted data.
Content Type:
This specifies an optional variable in which to store the content type of the data. For example, the content type could look like this for an image:

image/gif

and like this for plain text:

text/plain; charset=iso-8859-1
File Name:
This specifies an optional variable in which to store the file name of the extracted data. If the data is saved to a file, the file name is the full path of the file actually used. If the data is loaded into a variable, the file name is that of the original resource (obtained from the URL or from the content disposition header of the response).
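A stored content type value such as the ones shown above may need to be split into its media type and character set before further use. The following is a minimal sketch using Python's standard library; the function name `parse_content_type` is hypothetical and is not part of the product.

```python
from email.message import Message


def parse_content_type(value):
    """Split a content type value into (media type, charset or None)."""
    header = Message()
    header["Content-Type"] = value
    # get_content_charset() returns None when no charset parameter is present.
    return header.get_content_type(), header.get_content_charset()
```

Applied to the two examples above, the image value has no character set, while the plain text value separates into text/plain and iso-8859-1.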
Options:
Options:
The robot's options can be overridden with the step's own options. An option that is marked with an asterisk in the Options Dialog will override the one from the robot's configuration. All other options will be the same as specified for the robot.