Extract Number

This data converter finds and extracts a number and outputs it in the standard number format.

Note: To reformat a number that is already in the standard number format, use the Format Number data converter instead.

Properties

The Extract Number data converter is configured using the following properties:

Format Pattern:
Contains a pattern that specifies the format of the number to be extracted. Either use one of the default patterns, or see below for details about specifying a pattern.
Decimal Separator:
Contains the possible decimal separators in the number to be extracted, e.g. ".". More than one separator can be specified.
Thousands Separator:
Contains the possible thousands separators in the number to be extracted, e.g. ",". More than one separator can be specified.
Minus Sign:
Contains the character to use as the minus sign in the number, typically "-".
Multiply By:
Specifies a factor by which the extracted number is multiplied.
Convert to Integer:
If this field is checked, the extracted number will be converted to an integer.
Constants:
Contains definitions of constants which may occur before or after the number to be extracted. For each constant, the name (e.g. kilo) and the value (e.g. 1000) can be given as well as the position of the constant (before and/or after the number to be extracted).
Note that the name of a constant must match exactly what comes before or after the number to be extracted. For instance, let the configured constants be kilo=1000.0 and double=2.0. From the input "2 kilo", the number 2000.0 will be extracted, but from the input "2 double kilo", only the number 2.0 will be extracted, since no constant is named "double kilo".
Description:
Enter a description to be shown in the list of data converters. If no description is entered, one is generated.
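
The matching rule described under Constants can be sketched as follows. This is a minimal illustration, not the converter's actual implementation: the function name apply_constant_after and the CONSTANTS table are hypothetical, and the sketch assumes that a constant applies only when the text following the number, with surrounding whitespace removed, equals a constant name exactly.

```python
import re

# Hypothetical constants table mirroring the example in the text.
CONSTANTS = {"kilo": 1000.0, "double": 2.0}


def apply_constant_after(text, constants=CONSTANTS):
    """Extract the first number in text and apply a constant that follows it.

    Assumption: the constant is applied only when the text after the number,
    stripped of surrounding whitespace, equals a constant name exactly.
    """
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError("no number found")
    value = float(match.group())
    trailing = text[match.end():].strip()
    # Unknown trailing text leaves the number unchanged (factor 1.0).
    return value * constants.get(trailing, 1.0)


print(apply_constant_after("2 kilo"))         # 2000.0
print(apply_constant_after("2 double kilo"))  # 2.0
```

In the second call the trailing text is "double kilo", which is not a constant name, so the number is returned unscaled, as in the description above.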

Specifying a format pattern

The format pattern provides a very flexible way of specifying the number format. However, the rules for specifying a pattern can be difficult to understand. An easier approach is often to pick the default pattern that most closely matches the required format and then experiment with changes to it.

In a pattern, the following special characters can be used:

Special Character   Meaning
0                   A digit.
#                   A digit, but zero is not shown.
.                   The decimal separator, i.e. the character specified in the Decimal Separator field.
,                   The thousands separator, i.e. the character specified in the Thousands Separator field.
-                   The minus sign, i.e. the character specified in the Minus Sign field.
E                   In scientific notation, separates the mantissa and the exponent.

Note that in the pattern, the '.' character is always used to select the decimal separator, regardless of what is entered in the Decimal Separator field. The '.' character will then be replaced by the character in the Decimal Separator field when the number is formatted. The same applies to the thousands separator and minus sign.
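
To illustrate the distinction between the pattern characters and the configured characters, the sketch below rewrites an input number so that it uses '.', '-' and no grouping, which is the form the pattern always assumes. The function normalize_number and its parameter names are hypothetical; how the converter does this internally is not documented here.

```python
def normalize_number(raw, decimal_sep=".", thousands_sep=",", minus_sign="-"):
    """Rewrite raw so that it uses '.' and '-' and contains no grouping.

    decimal_sep and thousands_sep may each hold several candidate
    characters, as the property descriptions allow.
    """
    out = []
    for ch in raw:
        if ch in thousands_sep:
            continue              # grouping characters carry no value
        elif ch in decimal_sep:
            out.append(".")       # any configured decimal separator becomes '.'
        elif ch == minus_sign:
            out.append("-")       # the configured minus sign becomes '-'
        else:
            out.append(ch)
    return "".join(out)


# A number written with "," as decimal separator and "." as thousands separator.
print(float(normalize_number("33.555,77", decimal_sep=",", thousands_sep=".")))  # 33555.77
```

With this view, the same pattern (for example "#,##0.00") can be reused for inputs in different locales by changing only the separator fields.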

Separate patterns can be specified for positive and negative numbers. This is done by specifying two patterns separated by a semicolon (';'). For example, use the pattern "#,##0.00;(#,##0.00)" if you want negative numbers to be parenthesized. By default, the minus sign character is placed in front of negative numbers.
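
A rough sketch of how a two-part pattern could be interpreted when reading a number is shown below. It is an illustration under stated assumptions, not the converter's algorithm: the helper names split_pattern, affixes and parse_signed are hypothetical, the separators are fixed to "," and ".", and only the literal text around the negative sub-pattern is considered.

```python
def split_pattern(pattern):
    """Return (positive, negative) sub-patterns.

    When no negative sub-pattern is given, assume the default of a minus
    sign placed in front of the positive sub-pattern.
    """
    positive, _, negative = pattern.partition(";")
    return positive, negative or "-" + positive


def affixes(subpattern):
    """Return the literal text before and after the digit placeholders."""
    placeholders = "#0,."
    positions = [i for i, ch in enumerate(subpattern) if ch in placeholders]
    return subpattern[:positions[0]], subpattern[positions[-1] + 1:]


def parse_signed(text, pattern):
    """Parse text, treating it as negative if it carries the negative affixes."""
    _, negative = split_pattern(pattern)
    prefix, suffix = affixes(negative)
    body, sign = text, 1.0
    if (prefix or suffix) and text.startswith(prefix) and text.endswith(suffix):
        body = text[len(prefix):len(text) - len(suffix)]
        sign = -1.0
    return sign * float(body.replace(",", ""))


pattern = "#,##0.00;(#,##0.00)"
print(parse_signed("(1,234.50)", pattern))  # -1234.5
print(parse_signed("1,234.50", pattern))    # 1234.5
```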

Note

If the input uses scientific notation with a large exponent (e.g. the number 6.023E23), Convert to Integer should generally not be checked, since conversion of such large numbers to integers may give inappropriate results.
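
The following Python lines illustrate two reasons why integer conversion of such a number can be misleading. They are a general floating-point illustration, not a description of the converter's internals; in particular, the 64-bit integer limit is an assumption about how integers might be stored.

```python
value = float("6.023E23")

# A floating-point number stores the nearest binary value, so its exact
# integer value differs from the decimal digits that were written.
as_int = int(value)
print(as_int == 6023 * 10**20)   # False

# Assumption: if integers were held in a signed 64-bit field,
# the value would not fit at all.
INT64_MAX = 2**63 - 1
print(value > INT64_MAX)         # True
```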

Examples

Consider this input:

Price is USD 33,555.77.

With Format Pattern set to "###0.0", Decimal Separator set to ".", Thousands Separator set to ",", Minus Sign set to "-", Multiply By set to "1.0", Convert to Integer not checked, and no Constants configured, the number 33555.77 is extracted.

In the example above, if Convert to Integer is checked, the number 33556 is extracted.

Now, consider this input:

Price is USD 10.5 mill.

With Format Pattern set to "0.000", Decimal Separator set to ".", Thousands Separator set to ",", Minus Sign set to "-", Multiply By set to "1.0", Convert to Integer checked, and Constants set to mill.=1000000.0 and bill.=1000000000.0, the number 10500000 is extracted.

In the example above, if Convert to Integer is not checked, the number 1.05E7 is extracted.
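
The two examples can be reproduced with a small sketch that combines extraction, constants, multiplication and integer conversion. The function extract_number and its parameters are hypothetical and deliberately simplified: the separators are fixed to "," and ".", only constants after the number are handled, and rounding uses Python's round, whose treatment of exact halves may differ from the converter's.

```python
import re


def extract_number(text, constants=None, multiply_by=1.0, to_integer=False):
    """Find the first number in text and return it as a float or an int."""
    constants = constants or {}
    match = re.search(r"-?\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        raise ValueError("no number found")
    value = float(match.group().replace(",", ""))   # drop thousands separators
    trailing = text[match.end():].strip()
    # A constant applies only when the trailing text equals its name exactly.
    value *= constants.get(trailing, 1.0)
    value *= multiply_by
    return round(value) if to_integer else value


constants = {"mill.": 1000000.0, "bill.": 1000000000.0}

print(extract_number("Price is USD 33,555.77."))                   # 33555.77
print(extract_number("Price is USD 33,555.77.", to_integer=True))  # 33556
print(extract_number("Price is USD 10.5 mill.", constants, to_integer=True))  # 10500000
```

Without integer conversion, the last call returns the floating-point value 10500000.0, which is the same number the converter displays as 1.05E7.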