Multi-threading considerations
This topic describes the situations in which operators can run in a multi-threaded context, and the related performance considerations.
Single threaded context in the operator model
The providesSingleThreadedContext element lets the Teracloud® Streams instance avoid unnecessary thread synchronization. Set providesSingleThreadedContext to Always when the operator meets the following conditions:
- It does not perform concurrent submit calls unless its process methods are called concurrently.
- Its submit calls complete before the process call that triggered the submission completes.
The SPL compiler ensures that the shared state
variables declared and modified within the logic clause of an operator
invocation are safely accessed during concurrent process methods.
When you set providesSingleThreadedContext to Always, the operator code
determines whether locking is required to serialize access to the state
variables.
An example of an operator with
a single threaded context is the Filter operator.
- Unless its process method is being called concurrently, the Filter operator does not make concurrent submit calls. Its submit calls are triggered by incoming tuples.
- When it receives a tuple from a process call, it makes a submit call if the received tuple passes the filter condition, and that submit call completes before the process call that triggered it is complete.
- As a result, all instances of a Filter operator provide a single threaded context and the setting Always is appropriate.
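The call pattern described for Filter can be sketched in plain C++. This is only an illustration of the two conditions, not the Streams operator API: Tuple, MiniFilter, and run_filter_demo are hypothetical names, and the downstream callback stands in for whatever receives submitted tuples.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tuple; not the Streams tuple type.
struct Tuple {
    int value;
};

// Hypothetical filter-like operator. It does not use the Streams runtime;
// it only mirrors the process/submit call pattern described above.
class MiniFilter {
public:
    using Predicate = std::function<bool(const Tuple&)>;
    using Downstream = std::function<void(const Tuple&)>;

    MiniFilter(Predicate keep, Downstream downstream)
        : keep_(std::move(keep)), downstream_(std::move(downstream)) {
        assert(keep_ && downstream_);  // both callbacks must be set
    }

    // Called once per incoming tuple. It starts no threads and queues no
    // work, so submit() can only run concurrently if process() itself does.
    void process(const Tuple& t) {
        if (keep_(t)) {
            submit(t);  // returns before process() returns
        }
    }

private:
    void submit(const Tuple& t) { downstream_(t); }

    Predicate keep_;
    Downstream downstream_;
};

// Feeds the values 1..6 through an "even values only" filter and returns
// the values that were submitted.
std::vector<int> run_filter_demo() {
    std::vector<int> out;
    MiniFilter filter(
        [](const Tuple& t) { return t.value % 2 == 0; },
        [&out](const Tuple& t) { out.push_back(t.value); });
    for (int v = 1; v <= 6; ++v) {
        filter.process(Tuple{v});
    }
    return out;
}
```

An operator that handed tuples to a background thread and submitted them later would break the second condition, because its submit calls could outlive the process call that triggered them.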
Managing thread concurrency in operator defined shared states
- Minimize sharing as much as possible.
- Consider thread-local or stack-local variables where they help.
- If operators must share state information, consider the trade-offs between the following methods:
  - Volatile variables
  - Atomic operations (by compiler intrinsics)
  - Spin-locks
  - Mutex
- Be careful about false sharing.
  - Use padding so that data accessed by different threads does not land on the same cache line.
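As a rough illustration of these options, the sketch below counts to the same total in three ways: under a mutex, under a spin-lock, and with per-thread atomic counters padded to separate cache lines. All names (SpinLock, PaddedCounter, count_with_lock, count_with_padded_atomics) are made up for this example, and the 64-byte cache line is an assumption that does not hold on every architecture. The sketch uses std::atomic rather than compiler intrinsics, and it leaves out volatile because standard C++ volatile provides neither atomicity nor inter-thread ordering.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

constexpr int kThreads = 4;
// Assumed cache-line size; 64 bytes is common but not universal.
constexpr std::size_t kCacheLine = 64;

// Minimal spin-lock: cheap when critical sections are tiny and contention
// is low, wasteful when a waiter has to spin for a long time.
class SpinLock {
public:
    void lock() {
        while (locked_.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Each counter is aligned to its own cache line, so threads updating
// neighbouring counters do not invalidate each other's lines.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<long> value{0};
};
static_assert(sizeof(PaddedCounter) == kCacheLine, "one counter per line");

// kThreads threads each add `iterations` to one shared total, serialized
// by any lock type that offers lock() and unlock().
template <typename Lock>
long count_with_lock(int iterations) {
    assert(iterations >= 0);
    Lock lock;
    long total = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < kThreads; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<Lock> guard(lock);
                ++total;
            }
        });
    }
    for (auto& th : threads) th.join();
    return total;
}

// Each thread updates only its own padded counter; the counters are summed
// after all threads have been joined.
long count_with_padded_atomics(int iterations) {
    assert(iterations >= 0);
    PaddedCounter counters[kThreads];
    std::vector<std::thread> threads;
    for (int t = 0; t < kThreads; ++t) {
        threads.emplace_back([&counters, t, iterations] {
            for (int i = 0; i < iterations; ++i) {
                counters[t].value.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : threads) th.join();
    long total = 0;
    for (const auto& c : counters) total += c.value.load();
    return total;
}
```

Calling count_with_lock<std::mutex>(n), count_with_lock<SpinLock>(n), and count_with_padded_atomics(n) gives the same total in each case; which one is fastest depends on contention and hardware, so it is worth measuring rather than assuming.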
Queues shared across threads
- A common use case is a producer/consumer queue.
- Consider the use of lock free queues. These queues are almost always architecture-specific.
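A minimal sketch of one such queue is a bounded single-producer/single-consumer ring buffer built on std::atomic. SpscQueue and transfer_sum are hypothetical names, the 64-byte alignment is an assumed cache-line size, and whether std::atomic<std::size_t> is actually lock-free depends on the platform. The design is valid only with exactly one producer thread and one consumer thread.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>

// Bounded single-producer/single-consumer ring buffer.
// Safe only when exactly one thread calls push() and exactly one calls pop().
template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity >= 2, "one slot stays empty to tell full from empty");

public:
    // Producer side. Returns false if the queue is full.
    bool push(const T& item) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % Capacity;
        if (next == head_.load(std::memory_order_acquire)) {
            return false;  // full
        }
        slots_[tail] = item;
        tail_.store(next, std::memory_order_release);  // publish the slot
        return true;
    }

    // Consumer side. Returns false if the queue is empty.
    bool pop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) {
            return false;  // empty
        }
        out = slots_[head];
        head_.store((head + 1) % Capacity, std::memory_order_release);
        return true;
    }

private:
    std::array<T, Capacity> slots_{};
    // The two indices sit on separate (assumed 64-byte) cache lines so the
    // producer and consumer do not falsely share a line.
    alignas(64) std::atomic<std::size_t> head_{0};  // written by the consumer
    alignas(64) std::atomic<std::size_t> tail_{0};  // written by the producer
};

// Sends 1..count through the queue using two threads and returns the sum
// of the values the consumer received.
long transfer_sum(int count) {
    assert(count >= 0);
    SpscQueue<int, 64> queue;
    long sum = 0;
    std::thread producer([&] {
        for (int i = 1; i <= count; ++i) {
            while (!queue.push(i)) std::this_thread::yield();
        }
    });
    std::thread consumer([&] {
        int v = 0;
        for (int received = 0; received < count; ++received) {
            while (!queue.pop(v)) std::this_thread::yield();
            sum += v;
        }
    });
    producer.join();
    consumer.join();
    return sum;
}
```

With more than one producer or consumer this design is no longer correct; a multi-producer or multi-consumer queue needs a different algorithm or a lock.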