Operator IPAddressLocation
IPAddressLocation is an operator for the Streams product that finds the geographical location of IP addresses received in input tuples, based on the subnets they are part of, and emits output tuples containing the country, state or province, city, latitude, and longitude of the subnets. The operator may be configured with one or more output ports, and each port may be configured to emit different tuples, as specified by output filters.
The IPAddressLocation operator consumes input tuples containing IP version 4 and 6 addresses, selects which tuples to emit with output filter expressions, and assigns values to the output attributes with output attribute assignment expressions. Output filters and attribute assignments are SPL expressions. They may use any of the built-in SPL functions, and any of the location result functions that are specific to the IPAddressLocation operator.
The IPAddressLocation operator emits a tuple on each output port for each input tuple, optionally filtered by the 'outputFilters' parameter. Geographical location data is assigned to output attributes with the location result functions, based on the IP addresses specified with the functions. All attributes of all output ports must be assigned values, either with explicit assignment expressions, or implicitly by copy from input tuples.
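The subnet-based lookup described above can be pictured with a short Python sketch. It illustrates the idea only and is not the operator's C++ implementation: the `SUBNETS` table, the `locate` helper, and the location values are hypothetical sample data built on reserved documentation address ranges. Picking the most specific matching subnet is an assumption of this sketch.

```python
import ipaddress

# Hypothetical subnet table: (subnet, location). The entries are invented
# sample data on reserved documentation ranges, not real MaxMind records.
SUBNETS = [
    (ipaddress.ip_network("192.0.2.0/24"), {"country": "Exampleland", "city": "Sample City"}),
    (ipaddress.ip_network("192.0.2.128/25"), {"country": "Exampleland", "city": "Sample Harbor"}),
    (ipaddress.ip_network("2001:db8::/32"), {"country": "Testonia", "city": "Demo Town"}),
]

def locate(address):
    """Return the location of the most specific subnet containing address, or None."""
    ip = ipaddress.ip_address(address)
    best = None
    for network, location in SUBNETS:
        # Only compare addresses and subnets of the same IP version.
        if ip.version == network.version and ip in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, location)
    return best[1] if best else None

print(locate("192.0.2.200"))   # matches both IPv4 subnets; the /25 wins
print(locate("2001:db8::1"))   # matches the IPv6 subnet
print(locate("198.51.100.7"))  # no matching subnet, so None
```

An address that falls in no known subnet yields `None` here; how the operator itself reports a missing location is not covered by this sketch.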
This operator is part of the network toolkit. To use it in an application, include this statement in the SPL source file:
use com.teracloud.streams.network.location::*;
Dependencies
The IPAddressLocation operator depends upon geographical location data provided by MaxMind, Inc.:
- GeoLite2 data is available free from https://dev.maxmind.com/geoip/geoip2/geolite2/
- GeoIP2 data, which is more accurate, may be purchased from https://www.maxmind.com/en/geoip2-city
The required geographyDirectory parameter must point to a directory containing the geography files, after they have been downloaded and unpacked.
For GeoLite2 data, the geography files are:
GeoLite2-City-Locations-en.csv
GeoLite2-City-Blocks-IPv4.csv
GeoLite2-City-Blocks-IPv6.csv
For GeoIP2 data, the geography files are:
GeoIP2-City-Locations-en.csv
GeoIP2-City-Blocks-IPv4.csv
GeoIP2-City-Blocks-IPv6.csv
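Before starting an application, it can help to confirm that the unpacked directory really contains one complete set of the files listed above. The following Python sketch does that check. The helper names `missing_geography_files` and `complete_editions` are hypothetical, and treating all three files of an edition as required is an assumption, not documented operator behavior.

```python
from pathlib import Path

# File names taken from the lists above, grouped by data edition.
EDITIONS = {
    "GeoLite2": [
        "GeoLite2-City-Locations-en.csv",
        "GeoLite2-City-Blocks-IPv4.csv",
        "GeoLite2-City-Blocks-IPv6.csv",
    ],
    "GeoIP2": [
        "GeoIP2-City-Locations-en.csv",
        "GeoIP2-City-Blocks-IPv4.csv",
        "GeoIP2-City-Blocks-IPv6.csv",
    ],
}

def missing_geography_files(directory):
    """Map each edition to the expected files that are absent from directory."""
    root = Path(directory)
    return {
        edition: [name for name in names if not (root / name).is_file()]
        for edition, names in EDITIONS.items()
    }

def complete_editions(directory):
    """Return the editions whose three files are all present in directory."""
    missing = missing_geography_files(directory)
    return [edition for edition, names in missing.items() if not names]
```

Calling `complete_editions` on the directory intended for geographyDirectory returns a list such as `["GeoLite2"]` when that set is complete, or an empty list when files are missing.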
The format of the geography files is described here:
Threads
The IPAddressLocation operator runs on the thread of the upstream operator that sends input tuples to it. It does not start any threads of its own.
Exceptions
The IPAddressLocation operator will throw an exception and terminate in these situations:
- No output ports are specified.
- The outputFilters parameter is specified, and the number of expressions does not match the number of output ports.
Sample Applications
The network toolkit includes several sample applications that illustrate how to use this operator. See the samples directory in your Streams installation.
References
The result functions that can be used in boolean expressions for the outputFilters parameter and in output attribute assignment expressions are described here:
The format of the GeoIP2 and GeoLite2 CSV files provided by MaxMind, Inc. is described here:
The format of 'geohash' codes for latitude and longitude coordinates is described here:
Summary
- Ports
- This operator has 2 input ports and 0 or more output ports.
- Windowing
- This operator does not accept any windowing configurations.
- Parameters
- This operator supports 3 parameters.
Required: geographyDirectory
Optional: initOnTuple, outputFilters
- Metrics
- This operator does not report any metrics.
Properties
- Implementation
- C++
- Threading
- Always - Operator always provides a single threaded execution context.
- Ports (0)
-
The IPAddressLocation operator requires one input port. One or more input attributes must be of type uint32 or list<uint8>[16] containing IP version 4 or 6 addresses, respectively. These attributes are specified as arguments to the location result functions.
- Properties
-
- Optional: false
- ControlPort: false
- TupleMutationAllowed: false
- WindowingMode: NonWindowed
- WindowPunctuationInputMode: Oblivious
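To make the two input attribute representations concrete, the Python sketch below converts textual addresses into a 32-bit unsigned integer for IP version 4 and a list of 16 byte values for IP version 6, mirroring the uint32 and list<uint8>[16] types named above. The helper names are hypothetical, and the ordering used (most significant octet first) is an assumption of this sketch; check the toolkit's own conversion functions for the ordering the operator expects.

```python
import ipaddress

def ipv4_to_uint32(text):
    """Convert dotted-decimal IPv4 text to a 32-bit unsigned integer."""
    return int(ipaddress.IPv4Address(text))

def ipv6_to_bytes(text):
    """Convert IPv6 text to a list of 16 byte values, each in 0..255."""
    return list(ipaddress.IPv6Address(text).packed)

print(ipv4_to_uint32("192.0.2.1"))   # 3221225985
print(ipv6_to_bytes("2001:db8::1"))  # 16 values: 32, 1, 13, 184, eleven zeros, 1
```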
- Ports (1)
-
Control port that ingests a file path pointing to a MaxMind GeoIP2 (or GeoLite2) database CSV file. The operator determines whether the incoming file path refers to a "locations" file, an "IPv4 blocks" file, or an "IPv6 blocks" file based on the name of the file. The expected file names are:
- For the "locations" file: GeoIP2-City-Locations-en.csv or GeoLite2-City-Locations-en.csv
- For IPv4 "blocks" file: GeoIP2-City-Blocks-IPv4.csv or GeoLite2-City-Blocks-IPv4.csv
- For IPv6 "blocks" file: GeoIP2-City-Blocks-IPv6.csv or GeoLite2-City-Blocks-IPv6.csv
This control port can be used to dynamically update the operator's internal database. Each time a tuple is received containing a path to one of the files listed above, the operator updates its internal table with the data in that file.
This input port expects a tuple containing a single attribute of type rstring.
- Properties
-
- Optional: true
- ControlPort: true
- TupleMutationAllowed: false
- WindowingMode: NonWindowed
- WindowPunctuationInputMode: Oblivious
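The name-based classification that the control port description relies on can be sketched in a few lines of Python. This is an illustration under assumptions, not the operator's code: the `classify_geography_file` helper is hypothetical, and returning `None` for an unrecognized name is a choice made for the sketch, since the source does not say how the operator treats such paths.

```python
from pathlib import PurePosixPath

# Expected file names from the control port description, mapped to their role.
FILE_KINDS = {
    "GeoIP2-City-Locations-en.csv": "locations",
    "GeoLite2-City-Locations-en.csv": "locations",
    "GeoIP2-City-Blocks-IPv4.csv": "IPv4 blocks",
    "GeoLite2-City-Blocks-IPv4.csv": "IPv4 blocks",
    "GeoIP2-City-Blocks-IPv6.csv": "IPv6 blocks",
    "GeoLite2-City-Blocks-IPv6.csv": "IPv6 blocks",
}

def classify_geography_file(path):
    """Return the role implied by the file name, or None if it is not recognized."""
    return FILE_KINDS.get(PurePosixPath(path).name)

print(classify_geography_file("/data/geo/GeoLite2-City-Blocks-IPv4.csv"))  # IPv4 blocks
print(classify_geography_file("/data/geo/GeoIP2-City-Locations-en.csv"))   # locations
print(classify_geography_file("/data/geo/notes.txt"))                      # None
```

Only the final path component matters in this sketch, so the same file can be delivered from any directory.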
- Assignments
- This operator allows any SPL expression of the correct type to be assigned to output attributes.
- Ports (0...)
-
The IPAddressLocation operator requires one or more output ports.
Each output port will produce one output tuple for each input tuple if the corresponding expression in the outputFilters parameter evaluates true, or if no outputFilters parameter is specified.
Output attributes can be assigned values with any SPL expression that evaluates to the proper type, and the expressions may include any of the location result functions. Output attributes that match input attributes in name and type are copied automatically.
- Properties
-
- TupleMutationAllowed: false
- WindowPunctuationOutputMode: Preserving
Required: geographyDirectory
Optional: initOnTuple, outputFilters
- geographyDirectory
-
This required parameter specifies a directory containing GeoIP2 or GeoLite2 files downloaded from MaxMind, Inc.
- Properties
-
- Type: rstring
- Cardinality: 1
- Optional: false
- ExpressionMode: AttributeFree
- initOnTuple
-
This optional parameter takes an expression of type 'boolean' that specifies whether or not the operator should load the files specified with the geographyDirectory parameter on the first tuple.
The default value is 'false', in which case the operator loads the files at operator startup. If loading the files during startup causes timeouts, set this parameter to 'true' to defer loading until the first tuple arrives.
- Properties
-
- Type: boolean
- Optional: true
- ExpressionMode: Expression
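The difference between loading at startup and loading on the first tuple, as described for initOnTuple, is the familiar eager versus lazy initialization pattern. The Python sketch below shows that pattern only; the `LocationTable` class and its `load` callback are hypothetical and say nothing about how the operator is implemented internally.

```python
class LocationTable:
    """Holds lookup data that is loaded either eagerly or on first use."""

    def __init__(self, load, init_on_tuple=False):
        self._load = load    # callable that returns the lookup data
        self._data = None
        if not init_on_tuple:
            # Default behavior: pay the loading cost during startup.
            self._data = self._load()

    def process(self, key):
        if self._data is None:
            # Deferred behavior: the first tuple triggers the load.
            self._data = self._load()
        return self._data.get(key)

calls = []

def load():
    calls.append("load")
    return {"192.0.2.1": "Sample City"}  # invented sample data

table = LocationTable(load, init_on_tuple=True)
print(len(calls))                  # 0, nothing loaded at construction
print(table.process("192.0.2.1"))  # Sample City, loaded on first use
print(len(calls))                  # 1
```

The trade-off is that deferred loading shortens startup but makes the first tuple slower to process.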
- outputFilters
-
This optional parameter takes a list of SPL expressions that specify which input tuples should be emitted by the corresponding output port. The number of expressions in the list must match the number of output ports, and each expression must evaluate to a boolean value. The output filter expressions may include any of the location result functions.
The default value of the outputFilters parameter is an empty list, which causes each input tuple to be emitted by all output ports.
- Properties
-
- Type: boolean
- Optional: true
- ExpressionMode: Expression
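The per-port filtering rule for outputFilters can be summarized in a small Python sketch: one boolean filter per output port, a count check, and an empty list meaning every port emits every tuple. The `route_tuple` function and the dictionary-based tuples are hypothetical stand-ins. In the operator itself the filters are SPL boolean expressions, which may call the location result functions.

```python
def route_tuple(tuple_in, port_count, output_filters=None):
    """Return the indexes of the output ports that should emit tuple_in."""
    if port_count < 1:
        raise ValueError("at least one output port is required")
    if not output_filters:
        # Empty or absent filter list: every port emits every tuple.
        return list(range(port_count))
    if len(output_filters) != port_count:
        raise ValueError("number of filters must match number of output ports")
    return [i for i, keep in enumerate(output_filters) if keep(tuple_in)]

filters = [
    lambda t: t["country"] == "Exampleland",  # port 0: one country only
    lambda t: True,                           # port 1: every tuple
]

print(route_tuple({"country": "Exampleland"}, 2, filters))  # [0, 1]
print(route_tuple({"country": "Testonia"}, 2, filters))     # [1]
print(route_tuple({"country": "Testonia"}, 2))              # [0, 1]
```

The two `ValueError` cases correspond to the situations listed under Exceptions, where the operator terminates instead of processing tuples.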