Operator IPAddressLocation

IPAddressLocation is an operator for the Streams product that finds the geographical location of IP addresses received in input tuples, based on the subnets they are part of, and emits output tuples containing the country, state or province, city, latitude, and longitude of the subnets. The operator may be configured with one or more output ports, and each port may be configured to emit different tuples, as specified by output filters.

The IPAddressLocation operator consumes input tuples containing IP version 4 and 6 addresses, selects tuples to emit with output filter expressions, and assigns values to their attributes with output attribute assignment expressions. Output filters and attribute assignments are SPL expressions. They may use any of the built-in SPL functions, as well as the location result functions specific to the IPAddressLocation operator, such as locationCityName() and locationSubnet(), which appear in the example below.

The IPAddressLocation operator emits a tuple on each output port for each input tuple, optionally filtered by the outputFilters parameter. Geographical location data is assigned to output attributes with the location result functions, based on the IP addresses specified with the functions. All attributes of all output ports must be assigned values, either with explicit assignment expressions, or implicitly by copy from input tuples.

Dependencies

The IPAddressLocation operator depends upon geographical location data provided by MaxMind, Inc., in either GeoLite2 or GeoIP2 CSV format.

The required geographyDirectory parameter must point to a directory containing the geography files after they have been downloaded and unpacked.

For GeoLite2 data, the geography files are:


GeoLite2-City-Locations-en.csv
GeoLite2-City-Blocks-IPv4.csv   
GeoLite2-City-Blocks-IPv6.csv

For GeoIP2 data, the geography files are:


GeoIP2-City-Locations-en.csv
GeoIP2-City-Blocks-IPv4.csv   
GeoIP2-City-Blocks-IPv6.csv

The format of the geography files is described in the MaxMind documentation cited under References.

Threads

The IPAddressLocation operator runs on the thread of the upstream operator that sends input tuples to it. It does not start any threads of its own.

Exceptions

The IPAddressLocation operator will throw an exception and terminate in these situations:

  • No output ports are specified.
  • The outputFilters parameter is specified, and the number of expressions specified does not match the number of output ports specified.

Example


use com.teracloud.streams.network.location::*;
use com.teracloud.streams.network.source::*;
composite Main {
    type
      PacketType =
          uint64 packetNumber,                // sequence number of packet, as emitted by operator
          float64 captureTime,                // time that packet was captured, in seconds since Unix epoch
          uint32 ipSourceAddress,             // IPv4 source address, or empty if not IPv4 packet
          uint16 ipSourcePort,                // IP source port, or zero if not UDP or TCP packet
          uint32 ipDestinationAddress,        // IPv4 destination address, or empty if not IPv4 packet
          uint16 ipDestinationPort,           // IP destination port, or zero if not UDP or TCP packet
          uint32 packetLength;                // original length of packet (not necessarily all captured)

      LocatedPacketType =
          uint64 packetNumber,               // sequence number of packet, as emitted by operator
          float64 captureTime,               // time that packet was captured, in seconds since Unix epoch
          uint32 packetLength,
          rstring ipSourceAddress,
          rstring ipSourceSubnet,
          rstring ipSourceLabel,
          rstring ipSourceCoordinates,
          rstring ipDestinationAddress,
          rstring ipDestinationSubnet,
          rstring ipDestinationLabel,
          rstring ipDestinationCoordinates;

    graph
      stream<PacketType> PacketStream as Out = PacketFileSource() {
          param
              pcapFilename: getSubmissionTimeValue("pcapFilename", "data/sample_locations_ipv4_only.pcap" );
              outputFilters: IP_VERSION()==4ub;
          output Out:
              packetNumber = packetsProcessed() - 1ul,
              captureTime = (float64)CAPTURE_SECONDS() + (float64)CAPTURE_MICROSECONDS() / 1000000.0,
              ipSourceAddress = IPV4_SRC_ADDRESS(),
              ipSourcePort = IP_SRC_PORT(),
              ipDestinationAddress = IPV4_DST_ADDRESS(),
              ipDestinationPort = IP_DST_PORT(),
              packetLength = PACKET_LENGTH();
      }

      stream<LocatedPacketType> LocatedPacketStream as Out = IPAddressLocation(PacketStream) {
        param
          geographyDirectory: getSubmissionTimeValue("geographyDirectory", "./www.maxmind.com" );
          outputFilters: locationCityName(ipSourceAddress) != "" || locationCityName(ipDestinationAddress) != "";
        output Out:
          ipSourceAddress = convertIPV4AddressNumericToString(ipSourceAddress),
          ipSourceSubnet = locationSubnet(ipSourceAddress),
          ipSourceLabel = locationCityName(ipSourceAddress) + ", " + locationSubdivision1Name(ipSourceAddress) + ", " + locationCountryName(ipSourceAddress),
          ipSourceCoordinates = (rstring)locationLatitude(ipSourceAddress) + ", " + (rstring)locationLongitude(ipSourceAddress),
          ipDestinationAddress = convertIPV4AddressNumericToString(ipDestinationAddress),
          ipDestinationSubnet = locationSubnet(ipDestinationAddress),
          ipDestinationLabel = locationCityName(ipDestinationAddress) + ", " + locationSubdivision1Name(ipDestinationAddress) + ", " + locationCountryName(ipDestinationAddress),
          ipDestinationCoordinates = (rstring)locationLatitude(ipDestinationAddress) + ", " + (rstring)locationLongitude(ipDestinationAddress);
      }

      // See the $STREAMS_INSTALL/samples/com.teracloud.streams.network directory for more examples
}

Sample Applications

The network toolkit includes several sample applications that illustrate how to use this operator. See the samples directory in your Streams installation.

References

The format of the GeoIP2 and GeoLite2 CSV files provided by MaxMind, Inc. is described here:

The format of 'geohash' codes for latitude and longitude coordinates is described here:

Summary

Ports
This operator has 2 input ports and 0 or more output ports.
Windowing
This operator does not accept any windowing configurations.
Parameters
This operator supports 3 parameters.

Required: geographyDirectory

Optional: initOnTuple, outputFilters

Metrics
This operator does not report any metrics.

Properties

Implementation
C++
Threading
Always - Operator always provides a single threaded execution context.

Input Ports

Ports (0)

The IPAddressLocation operator requires one input port. One or more input attributes must be of type uint32 or list<uint8>[16] containing IP version 4 or 6 addresses, respectively. These attributes are specified as arguments to the location result functions.
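
The fragment below is a minimal sketch of an invocation that looks up one address of each kind. AddressStream, its attribute names, and the directory path are hypothetical and chosen only for illustration. The sketch assumes that locationCountryName() accepts both address types, as the description above implies for the location result functions.

```spl
// Hypothetical upstream stream, assumed to have the schema
//   tuple<uint32 ipv4Address, list<uint8>[16] ipv6Address>
// The two address attributes match the input in name and type,
// so they are copied to the output automatically.
stream<uint32 ipv4Address, list<uint8>[16] ipv6Address,
       rstring ipv4Country, rstring ipv6Country> CountryStream as Out = IPAddressLocation(AddressStream) {
    param
        geographyDirectory: "/data/geography/current";    // illustrative path
    output Out:
        ipv4Country = locationCountryName(ipv4Address),   // IPv4 address as uint32
        ipv6Country = locationCountryName(ipv6Address);   // IPv6 address as list<uint8>[16]
}
```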

Ports (1)

Control port that ingests a file path pointing to a MaxMind GeoIP2 (or GeoLite2) database CSV file. The operator determines whether the incoming file path refers to a "locations" file, an "IPv4 blocks" file, or an "IPv6 blocks" file based on the name of the file. The expected file names are:

  • For the "locations" file: GeoIP2-City-Locations-en.csv or GeoLite2-City-Locations-en.csv
  • For IPv4 "blocks" file: GeoIP2-City-Blocks-IPv4.csv or GeoLite2-City-Blocks-IPv4.csv
  • For IPv6 "blocks" file: GeoIP2-City-Blocks-IPv6.csv or GeoLite2-City-Blocks-IPv6.csv

This control port can be used to dynamically update the operator's internal database. Each time a tuple is received containing a path to one of the files listed above, the operator updates its internal table with the data in the file.

This input port expects a tuple containing a single attribute of type rstring.
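
One way the control port might be wired up is sketched below, using the standard DirectoryScan operator to emit the path of each geography file that appears in a directory. The stream names, both directory paths, and the file-name pattern are assumptions made for illustration. PacketStream is the stream from the example above.

```spl
// Hypothetical wiring: DirectoryScan emits one tuple per matching file,
// carrying the file's path in a single rstring attribute, which is the
// shape the control port expects.
stream<rstring geographyFile> GeographyFileStream = DirectoryScan() {
    param
        directory: "/data/geography/updates";    // illustrative path
        pattern: ".*-City-(Locations-en|Blocks-IPv4|Blocks-IPv6)\\.csv";
}

// The first input port carries the tuples to locate; the second is the control port.
stream<uint32 ipSourceAddress, rstring city> CityStream as Out = IPAddressLocation(PacketStream; GeographyFileStream) {
    param
        geographyDirectory: "/data/geography/current";    // illustrative path
    output Out:
        city = locationCityName(ipSourceAddress);
}
```

With this arrangement, placing a new copy of one of the expected files in the scanned directory would cause its path to arrive on the control port, at which point the operator reloads that table.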

Output Ports

Assignments
This operator allows any SPL expression of the correct type to be assigned to output attributes.
Ports (0...)

The IPAddressLocation operator requires one or more output ports.

Each output port produces one output tuple for each input tuple if the corresponding expression in the outputFilters parameter evaluates to true, or if no outputFilters parameter is specified.

Output attributes can be assigned values with any SPL expression that evaluates to the proper type, and the expressions may include any of the location result functions. Output attributes that match input attributes in name and type are copied automatically.

Parameters

Required: geographyDirectory

Optional: initOnTuple, outputFilters

geographyDirectory

This required parameter specifies a directory containing GeoIP2 or GeoLite2 files downloaded from MaxMind, Inc.

initOnTuple

This optional parameter takes an expression of type 'boolean' that specifies whether or not the operator should load the files specified with the geographyDirectory parameter on the first tuple.

The default value is 'false', in which case the operator loads the files during operator startup. If loading the files during startup causes timeouts, setting this parameter to 'true' defers loading until the first tuple arrives.
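
A minimal sketch of an invocation with deferred loading follows. The output stream name, its schema, and the directory path are hypothetical. PacketStream is the stream from the example above.

```spl
stream<uint32 ipSourceAddress, rstring country> CountryStream as Out = IPAddressLocation(PacketStream) {
    param
        geographyDirectory: "/data/geography/current";    // illustrative path
        initOnTuple: true;    // load the geography files on the first tuple instead of at startup
    output Out:
        country = locationCountryName(ipSourceAddress);
}
```

The likely trade-off is that the first tuple waits while the files are loaded, so this setting suits applications that can tolerate a delay on the first result.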

outputFilters

This optional parameter takes a list of SPL expressions that specify which input tuples should be emitted by the corresponding output port. The number of expressions in the list must match the number of output ports, and each expression must evaluate to a boolean value. The output filter expressions may include any of the location result functions.

The default value of the outputFilters parameter is an empty list, which causes each input tuple to be emitted by all output ports.
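
The sketch below shows an invocation with two output ports and therefore two filter expressions, listed in port order. The stream names, schemas, and directory path are hypothetical. It assumes, as the filter in the example above suggests, that locationCityName() returns an empty string when no city is known for an address.

```spl
// Two output ports, so outputFilters lists exactly two boolean expressions.
(stream<uint32 ipSourceAddress, rstring city> LocatedStream as Located;
 stream<uint32 ipSourceAddress> UnlocatedStream as Unlocated) = IPAddressLocation(PacketStream) {
    param
        geographyDirectory: "/data/geography/current";    // illustrative path
        outputFilters: locationCityName(ipSourceAddress) != "",    // first port: a city was found
                       locationCityName(ipSourceAddress) == "";    // second port: no city was found
    output Located:
        city = locationCityName(ipSourceAddress);
}
```

The second port needs no output clause here because its only attribute matches an input attribute in name and type and is copied automatically.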

Code Templates

IPAddressLocation

stream<${outputSchema}> ${outputStream} = com.teracloud.streams.network.location::IPAddressLocation(${inputStream}) {
  param
      geographyDirectory: ${inputStream-attribute};
      ${parameter}: ${parameterExpression};
  output
      ${outputStream}: ${attributeAssignments}
}
      

Libraries

Library Name: boost_filesystem
Library Name: boost_system
Library Name: boost_regex
Include Path: ../../impl/include