Shared Map Mode
Please note the following information for the shared map mode.
Shared Memory Segments
The shared map store consists of the following shared memory segments:
- <storeName>
Holds the map data.
Multiple PointMapMatcher operator instances running on the same host can access the same map data, which reduces the application's memory footprint.
- <storeName>.control
Holds control data, such as the mutex and the status information used to detect malfunctions.
Access permissions for shared memory segments are handled by the operating system in the same way as file permissions. To ensure that the MapStore and the PointMapMatcher operators can access the segments, it is recommended that you run the Streams jobs that contain the operators as the same operating system user.
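Because the segments appear as files, their owner and mode can be inspected like any other file. The following is a minimal sketch, not part of the toolkit: it assumes that the segments are located under /dev/shm, as in the troubleshooting command further down, and the helper names segment_paths and describe_segments as well as the store name demoStore are made up for illustration.

```python
import os
import stat
import tempfile


def segment_paths(store_name, shm_dir="/dev/shm"):
    """Return the paths of the data segment and the control segment."""
    return [
        os.path.join(shm_dir, store_name),
        os.path.join(shm_dir, store_name + ".control"),
    ]


def describe_segments(store_name, shm_dir="/dev/shm"):
    """Report owner, mode and access rights of each segment file.

    A segment that does not exist is reported as None.
    """
    report = {}
    for path in segment_paths(store_name, shm_dir):
        if not os.path.exists(path):
            report[path] = None
            continue
        info = os.stat(path)
        report[path] = {
            "owner_uid": info.st_uid,
            "mode": stat.filemode(info.st_mode),
            # Access rights of the user who runs this script.
            "readable": os.access(path, os.R_OK),
            "writable": os.access(path, os.W_OK),
        }
    return report


if __name__ == "__main__":
    # Demonstration on ordinary files in a temporary directory, so that no
    # real shared memory segment is touched. "demoStore" is a made-up name.
    with tempfile.TemporaryDirectory() as demo_dir:
        for path in segment_paths("demoStore", demo_dir):
            with open(path, "wb") as handle:
                handle.write(b"\0")
            os.chmod(path, 0o600)
        for path, details in describe_segments("demoStore", demo_dir).items():
            print(path, details)
```

To check a real store, call describe_segments with the storeName parameter value as the user who runs the Streams job. A segment that is reported as not readable for that user points to a permission problem.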
Fault Handling
The PointMapMatcher operator applies the following fault handling procedures for the shared map mode.
If the shared map data (either the data segment or the control segment) cannot be opened on receipt of a query tuple, because it was modified or deleted by another application or because the MapStore operator has not been started yet, the following error handling applies:
- If the fault handling is set to abort, an error trace message is written and the operator aborts. Use this setting only if the PointMapMatcher and the MapStore operators run in the same Streams job. Otherwise, there is no assurance that the MapStore operator can create the shared data before the PointMapMatcher operators try to access it.
- If the fault handling is set to drop, a warning trace message is written and the current tuple is dropped. The operator tries to open the shared data segments again on receipt of the next query tuple. Use this setting if you can afford to lose query tuples and want to prevent the PointMapMatcher operator from causing back-pressure on upstream operators.
- If the fault handling is set to wait, the operator waits until the shared data can be accessed. Use this setting if you do not want to lose query tuples. In this mode, the operator can cause back-pressure on upstream operators, because it blocks until the shared data can be opened.
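The three settings can be pictured as one dispatch on the outcome of an open attempt. The following sketch is an illustration only and is not the operator's implementation: try_open, OperatorAborted, and the polling delay in the wait branch are assumptions made for this example.

```python
import time


class OperatorAborted(RuntimeError):
    """Stands in for the operator terminating (illustration only)."""


def open_store_for_query(try_open, fault_handling, trace, retry_delay=0.05):
    """Return the opened store, or None if the current tuple is dropped.

    try_open is a hypothetical callable that returns a store handle, or None
    while the shared segments cannot be opened. trace collects
    (level, message) pairs in place of real trace output.
    """
    while True:
        store = try_open()
        if store is not None:
            return store
        if fault_handling == "abort":
            trace.append(("error", "shared map data cannot be opened"))
            raise OperatorAborted("shared map data cannot be opened")
        if fault_handling == "drop":
            trace.append(("warn", "shared map data cannot be opened, tuple dropped"))
            return None  # the next query tuple triggers a new attempt
        if fault_handling == "wait":
            # Assumption: waiting is modeled as polling. The caller blocks
            # here, which is what causes back-pressure upstream.
            time.sleep(retry_delay)
            continue
        raise ValueError("unknown fault handling: %r" % (fault_handling,))


if __name__ == "__main__":
    attempts = iter([None, None, "store handle"])
    trace = []
    print(open_store_for_query(lambda: next(attempts), "wait", trace, 0.0))
```

With drop, the function returns immediately and the caller moves on to the next tuple. With wait, no tuple is lost, but the caller does not return until the store is available.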
If the read lock cannot be acquired on receipt of a query tuple because of a failure in the lock function, the following error handling applies:
- If the faultHandling is set to abort, an error message is logged and the operator aborts.
- If the faultHandling is set to wait, an error message is logged and the operator tries to acquire the lock again in a loop until it succeeds. It can therefore wait indefinitely.
- If the faultHandling is set to drop, an error message is logged and the operator drops the query tuple. An error tuple is sent on the first output port.
Acquiring the lock can take a long time, because an update of the map data by the MapStore operator can take several hours. The PointMapMatcher operator continues to wait until the lock can be acquired. This can slow down the shutdown of the application: if the operator is waiting for the lock when the request to cancel the Streams job is issued, the cancel job operation takes longer.
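The effect on shutdown can be reproduced with a small simulation. This is a sketch under stated assumptions, not the operator's code: a threading.Lock stands in for the shared mutex, one thread plays the PointMapMatcher waiting for the read lock, and the main thread plays the MapStore update and the cancel request.

```python
import threading


def simulate_cancel_during_update():
    """Show that a reader blocked on the lock cannot react to a cancel request.

    Returns (blocked_after_cancel, timeline).
    """
    lock = threading.Lock()                 # stands in for the shared mutex
    cancel_requested = threading.Event()    # stands in for the job cancel
    reader_waiting = threading.Event()
    timeline = []

    def reader():
        timeline.append("reader: waiting for read lock")
        reader_waiting.set()
        with lock:                          # blocks while the writer holds it
            timeline.append("reader: read lock acquired")
        if cancel_requested.is_set():
            timeline.append("reader: notices cancel request")

    lock.acquire()                          # the map update starts
    timeline.append("writer: map update started")
    thread = threading.Thread(target=reader)
    thread.start()

    reader_waiting.wait()
    cancel_requested.set()
    timeline.append("cancel requested")

    # Give the reader a moment; it is still blocked on the lock.
    thread.join(timeout=0.2)
    blocked_after_cancel = thread.is_alive()

    timeline.append("writer: map update finished")
    lock.release()
    thread.join()
    return blocked_after_cancel, timeline


if __name__ == "__main__":
    blocked, events = simulate_cancel_during_update()
    print("reader still blocked after cancel:", blocked)
    for event in events:
        print(event)
```

In the simulation, the reader notices the cancel request only after the writer releases the lock, which mirrors the delayed cancel job operation described above.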
Troubleshooting
In the unlikely case that the mutex is locked and the locking operator aborts before it can unlock the mutex again, either write or read operations can no longer run.
If the MapStore operator locked the mutex and aborted, you will notice that the PointMapMatcher operator no longer processes tuples, because it cannot acquire the read lock. If the fault handling is set to abort, you will see warning trace messages that appear periodically. The MapStore operator tries to recover from this situation after it has been restarted.
If a PointMapMatcher operator locked the mutex and aborted, other PointMapMatcher operators are not affected; they continue to process tuples. However, the MapStore operator cannot get the write lock when the next modification is started. The MapStore operator detects this situation and tries to recover from it.
If the application cannot recover from this sort of failure, you need to resolve the issue manually. Follow this procedure to cleanly restart the application:
- Stop the applications that run the MapStore and PointMapMatcher operators.
- Manually delete the shared memory files with the following command (replace <storeName> with the storeName parameter value):
rm /dev/shm/<storeName> /dev/shm/<storeName>.control
- Submit the applications again.
The MapStore operator will create the shared memory segments and load the map data after the restart.
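If the deletion step is scripted, it helps to tolerate a segment that is already gone, so that the script can be rerun safely. The following is a minimal sketch of such a helper. It assumes the /dev/shm location from the command above. The function name remove_segments and the store name demoStore are made up for illustration. Run it only after the applications have been stopped.

```python
import os
import tempfile


def remove_segments(store_name, shm_dir="/dev/shm"):
    """Delete the data and control segment files; return the paths removed."""
    removed = []
    for name in (store_name, store_name + ".control"):
        path = os.path.join(shm_dir, name)
        try:
            os.remove(path)
        except FileNotFoundError:
            continue  # already deleted, nothing to do
        removed.append(path)
    return removed


if __name__ == "__main__":
    # Demonstration in a temporary directory instead of /dev/shm, so that no
    # real shared memory segment is deleted. "demoStore" is a made-up name.
    with tempfile.TemporaryDirectory() as demo_dir:
        for name in ("demoStore", "demoStore.control"):
            with open(os.path.join(demo_dir, name), "wb") as handle:
                handle.write(b"\0")
        print(remove_segments("demoStore", demo_dir))  # both files
        print(remove_segments("demoStore", demo_dir))  # nothing left to remove
```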
Removed Shared Memory Segments
A running process that has opened a shared memory segment can still access the data in that segment even after the segment is removed.
Even if the segment is recreated with entirely different data, the process continues to see the old data.
The process must close and reopen the shared memory segment to get access to the newly created one.
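This behavior follows from how POSIX systems treat removed files that are still mapped. The following sketch reproduces it with an ordinary memory-mapped file in a temporary directory. The file name demo_store and the byte strings are made up for illustration, and the example assumes a Linux or other POSIX system.

```python
import mmap
import os
import tempfile


def demo_stale_mapping(directory):
    """Map a file, remove and recreate it, then compare what is visible.

    Returns (seen_before_reopen, seen_after_reopen).
    """
    path = os.path.join(directory, "demo_store")
    with open(path, "wb") as handle:
        handle.write(b"old map data")

    # Attach to the segment, as a running process would.
    with open(path, "rb") as handle:
        mapping = mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ)

    # Another process removes the segment and creates it again.
    os.remove(path)
    with open(path, "wb") as handle:
        handle.write(b"new map data")

    # The existing mapping still refers to the removed segment.
    seen_before_reopen = mapping[:]
    mapping.close()

    # Only after closing and reopening is the new segment visible.
    with open(path, "rb") as handle:
        reopened = mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ)
        seen_after_reopen = reopened[:]
        reopened.close()
    return seen_before_reopen, seen_after_reopen


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as demo_dir:
        print(demo_stale_mapping(demo_dir))
```

On Linux, the first value is the old content and the second value is the new content, even though both were read through the same path.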
The PointMapMatcher operator reopens the shared memory segments under the following conditions:
- A window punctuation is received on the query input port.
- A tuple is received on the query input port, but the operator has not yet attached to the shared map store.
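The two conditions can be modeled as a small state holder. This is an illustrative sketch and not the operator's implementation: the class name ReopenTracker and the open_segments callable are assumptions made for this example.

```python
class ReopenTracker:
    """Models when the shared memory segments are (re)opened."""

    def __init__(self, open_segments):
        # open_segments is a hypothetical callable that returns a handle,
        # or None if the segments cannot be opened yet.
        self._open_segments = open_segments
        self.handle = None
        self.open_calls = 0

    def _reopen(self):
        self.open_calls += 1
        self.handle = self._open_segments()

    def on_window_punctuation(self):
        # Condition 1: a window punctuation always triggers a reopen.
        self._reopen()

    def on_query_tuple(self):
        # Condition 2: a tuple triggers an open only while not attached.
        if self.handle is None:
            self._reopen()
        return self.handle


if __name__ == "__main__":
    generations = iter([None, "first store", "recreated store"])
    tracker = ReopenTracker(lambda: next(generations))
    print(tracker.on_query_tuple())   # not attached yet, store not available
    print(tracker.on_query_tuple())   # attaches to the first store
    tracker.on_window_punctuation()   # forces a reopen
    print(tracker.on_query_tuple())   # now sees the recreated store
```

The model shows why a window punctuation on the query input port is needed for an attached operator to pick up a recreated store: once attached, tuples alone do not trigger a reopen.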
Problems with the Boost library
If you use a Boost version older than 1.45 and the PointMapMatcher operator crashes, you may have run into Boost bug 3951. If you cannot upgrade to the recommended Boost version, you can try the following workaround: pass this compiler option to the Streams compiler (sc):
--cxx-flags=-fno-strict-aliasing