* MCOL-4846 dev-6 Handle large join results
Use a loop to shrink the number of results reported per message to something manageable.
* MCOL-4841 small changes requested by review
* Add EXTRA threads to prioritythreadpool
prioritythreadpool is configured at startup with a fixed number of threads available. This is to prevent thread thrashing. Since BPP job steps are usually short-lived, and a rescheduling mechanism exists for when no threads are available, this keeps CPU waste to a minimum.
However, if a query or queries consume all the threads in prioritythreadpool and then block (due to the consumer not consuming fast enough) we can run out of threads and no work will be done until some threads unblock. A new mechanism allows for EXTRA threads to be generated for the duration of the blocking action. These threads can act on new queries. When all blocking is completed, these threads will be released when idle.
* MCOL-4841 dev6 Reconcile with changes in develop-6
* MCOL-4841 Some format corrections
* MCOL-4841 dev clean up some things based on review
* MCOL-4841 dev 6 ExeMgr Crashes after large join
This commit fixes up memory accounting issues in ExeMgr
* MCOL-4841 remove LDI change
Opened MCOL-4968 to address the issue
* MCOL-4841 Add fMaxBPPSendQueue to ResourceManager
This causes the setting to be loaded at run time (a restart is required to pick up a change). BPPSendthread gets this in its ctor.
Also rolled back changes to TupleHashJoinStep::smallRunnerFcn() that used a local variable to count locally allocated memory, then added it into the global counter at the function's end. Not counting the memory globally caused the conversion to a UM-only join to happen much later than it should. This resulted in MCOL-4971.
* MCOL-4841 make blockedThreads and extraThreads atomic
Also restore previous scope of locks in bppsendthread. There is some small chance the new scope could be incorrect, and the performance boost is negligible. Better safe than sorry.
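The EXTRA-thread bookkeeping described above can be sketched with two atomic counters. This is only an illustration under assumptions, not the code in this change: the struct and its method names are hypothetical, thread creation itself is left to the caller, and the policy of keeping at most one EXTRA thread per blocked thread is a guess.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical bookkeeping for EXTRA threads; names are illustrative only.
struct ExtraThreadCounters
{
    std::atomic<int> blockedThreads{0};
    std::atomic<int> extraThreads{0};

    // Called by a worker just before it blocks on a slow consumer.
    // Returns true if the caller should start one EXTRA thread.
    bool beginBlocking()
    {
        const int blocked = blockedThreads.fetch_add(1) + 1;
        int extra = extraThreads.load();
        while (extra < blocked)
        {
            // On failure, `extra` is refreshed with the current value.
            if (extraThreads.compare_exchange_weak(extra, extra + 1))
                return true;
        }
        return false;  // enough EXTRA threads already exist
    }

    // Called by the worker once the blocking action has finished.
    void endBlocking()
    {
        const int previous = blockedThreads.fetch_sub(1);
        assert(previous > 0);
        (void)previous;
    }

    // Called by an idle EXTRA thread. Returns true if the thread should
    // exit, which is only allowed once nothing is blocked any more.
    bool retireIfIdle()
    {
        if (blockedThreads.load() != 0)
            return false;
        int extra = extraThreads.load();
        while (extra > 0)
        {
            if (extraThreads.compare_exchange_weak(extra, extra - 1))
                return true;
        }
        return false;
    }
};
```

Using compare-and-swap loops rather than a mutex keeps the counters lock-free, which fits the intent of making blockedThreads and extraThreads atomic.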
Fixes:
* Irrelevant where conditions
* Irrelevant const
* A potential infinite loop in treenode
* Bad implicit case fallthroughs
* Explicit markings for required case fallthroughs
* Unused variables
* Unused function
Also disabled some warnings for now which we should fix later.
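As an illustration of the two fallthrough fixes, the sketch below marks intentional fallthroughs explicitly and ends every other case with a break. It assumes the C++17 `[[fallthrough]]` attribute; the marking actually used in this change may differ (a comment or a compiler-specific attribute, for example), and the function is hypothetical.

```cpp
#include <cassert>

// Hypothetical example: counts how many stages run when starting at
// `startStage`. Stages 0 and 1 deliberately continue into the next one.
int stagesFrom(int startStage)
{
    int ran = 0;
    switch (startStage)
    {
        case 0:
            ++ran;
            [[fallthrough]];  // required: stage 0 continues into stage 1
        case 1:
            ++ran;
            [[fallthrough]];  // required: stage 1 continues into stage 2
        case 2:
            ++ran;
            break;  // omitting this would be a bad implicit fallthrough
        default:
            ran = -1;  // unknown stage
            break;
    }
    return ran;
}
```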
When a thread has been idle for 10 minutes and we have too many threads
in the threadpool the thread will be pruned. This is done by the
thread's main function just returning. Unfortunately this does not free
up the memory; the thread needs to be either joined or detached.
We cannot use detached threads since there are mutexes and condition
variables between the main thread and the threadpool threads. If the
main thread finishes before the threadpool threads (as would happen in
cpimport) then crashes occur. The parent needs to wait on the child
threads which is the whole point in joining.
So this fix spawns a new thread which every minute will check the list
of threads to be joined due to timeout and join them.
We have had to use an adapted version of boost::thread_group so that we
can join a single thread based off its thread ID.
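A minimal sketch of the join-by-ID idea, using std::thread instead of the adapted boost::thread_group. The class and method names are hypothetical, and the periodic reaper thread is reduced to a single `reapOnce()` pass that such a thread would call once a minute.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a thread group that can join one thread by ID.
class JoinableThreads
{
 public:
    ~JoinableThreads()
    {
        // Join whatever is left so no joinable std::thread is destroyed.
        for (auto& entry : threads_)
            if (entry.second.joinable())
                entry.second.join();
    }

    // Start a worker and remember its handle under its thread ID. The lock
    // is taken first so the worker cannot report itself before it is stored.
    template <typename Fn>
    std::thread::id start(Fn fn)
    {
        std::lock_guard<std::mutex> lk(mutex_);
        std::thread t(std::move(fn));
        const std::thread::id id = t.get_id();
        threads_.emplace(id, std::move(t));
        return id;
    }

    // Called by a worker that timed out, just before its function returns.
    void markFinished(std::thread::id id)
    {
        std::lock_guard<std::mutex> lk(mutex_);
        finished_.push_back(id);
    }

    // One pass of the reaper: join every thread that reported itself.
    std::size_t reapOnce()
    {
        std::vector<std::thread> toJoin;
        {
            std::lock_guard<std::mutex> lk(mutex_);
            for (const std::thread::id& id : finished_)
            {
                auto it = threads_.find(id);
                if (it != threads_.end())
                {
                    toJoin.push_back(std::move(it->second));
                    threads_.erase(it);
                }
            }
            finished_.clear();
        }
        for (std::thread& t : toJoin)
        {
            assert(t.joinable());
            t.join();  // done outside the lock so workers are never stalled
        }
        return toJoin.size();
    }

    std::size_t size() const
    {
        std::lock_guard<std::mutex> lk(mutex_);
        return threads_.size();
    }

 private:
    mutable std::mutex mutex_;
    std::map<std::thread::id, std::thread> threads_;
    std::vector<std::thread::id> finished_;
};
```

A worker would call `markFinished(std::this_thread::get_id())` as its last action; the join then releases the thread's resources without the main thread having to track it.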
In addition we have modified PriorityThreadPool to use detached
threads since this does not need to signal the child threads at the end.
PriorityThreadPool didn't have very good error handling. If something
failed it would just ignore whatever was being processed. This could
lead to a query continuing without retrieving all of the required data.
This patch adds error handling, sending a message back to the client
and a log message. It also destroys and recreates the pool thread.
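The error path can be sketched roughly as follows. This is an assumption-laden illustration rather than the PriorityThreadPool code: `ErrorSinks` and `runJob` are hypothetical names, and the callbacks stand in for the client connection and the logger.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical callbacks standing in for the client connection and logger.
struct ErrorSinks
{
    std::function<void(const std::string&)> sendToClient;
    std::function<void(const std::string&)> log;
};

// Runs one job. Returns true on success. On failure the error is reported
// to both sinks and false is returned, so the caller can destroy this pool
// thread and create a replacement instead of silently dropping the work.
bool runJob(const std::function<void()>& job, const ErrorSinks& sinks)
{
    assert(job && sinks.sendToClient && sinks.log);
    std::string message;
    try
    {
        job();
        return true;
    }
    catch (const std::exception& e)
    {
        message = e.what();
    }
    catch (...)
    {
        message = "unknown exception";
    }
    sinks.sendToClient(message);
    sinks.log(message);
    return false;
}
```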