Track polling tasks across cluster in the openMIC database #46
Open
StephenCWills wants to merge 1 commit into master
Adds the `PollingTask` table to the openMIC database to track tasks across the entire cluster. This allows one node to clear all matching tasks for a given downloader when the task executes, reducing churn on the cluster when processing times are high.

Adds the `DownloaderGroup` table to prevent parallel processing, even between separate nodes in the cluster. Many devices don't support parallel access, and even those that do would experience performance degradation due to the additional resources required to handle simultaneous connections. The cluster likely has enough to do in parallel already, so this helps spread those resources out across different downloaders.

`PollingTask` records are added when a task is queued on a logical thread. They are removed at the start of execution when that task executes on the logical thread. They may also be removed by another node if that node happens to execute its task first.

`DownloaderGroup` records are created at the start of execution when the first task for a downloader group is executed somewhere on the cluster. These records never get deleted. Also at the start of execution, the downloader will attempt to obtain a cluster-wide lock on the downloader group by entering its node identifier and the current timestamp into the `DownloaderGroup` table. When the task has finished executing, the node enters `NULL` into those fields to release the lock. If it fails to obtain the lock, the downloader will simply requeue the task in the hope that the lock will have been released when the task comes up again.

As a safeguard, any lock only lasts one hour, at which point other nodes are allowed to take the lock. If a node crashes and is able to restart, it can expire the lock early during initialization of the downloader. If a node crashes and fails to restart, other nodes in the cluster will have to wait the full hour before they can obtain the lock. If a node somehow takes more than an hour to poll a device, the lock will still expire while it is polling and may end up getting taken by another node.
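For reviewers who want to reason about the table lifecycle, here is a minimal sketch of the behavior described above, written in Python against an in-memory SQLite database. It is not the code in this PR. The column names (`DownloaderName`, `NodeID`, `LockNodeID`, `LockTimestamp`), the function names, and the use of epoch seconds for the timestamp are all assumptions made for illustration, and the real openMIC schema may differ.

```python
import sqlite3

LOCK_LIFETIME_SECONDS = 3600  # the description says a lock lasts one hour


def create_schema(conn):
    # Hypothetical columns, chosen only for this sketch.
    conn.executescript("""
        CREATE TABLE PollingTask (
            ID INTEGER PRIMARY KEY,
            DownloaderName TEXT NOT NULL,
            NodeID TEXT NOT NULL
        );
        CREATE TABLE DownloaderGroup (
            Name TEXT PRIMARY KEY,
            LockNodeID TEXT,
            LockTimestamp REAL
        );
    """)


def queue_task(conn, downloader, node_id):
    """Record a task when it is queued on a logical thread."""
    conn.execute(
        "INSERT INTO PollingTask (DownloaderName, NodeID) VALUES (?, ?)",
        (downloader, node_id),
    )
    conn.commit()


def clear_tasks(conn, downloader):
    """At the start of execution, remove every matching task, cluster-wide."""
    cur = conn.execute(
        "DELETE FROM PollingTask WHERE DownloaderName = ?", (downloader,)
    )
    conn.commit()
    return cur.rowcount


def try_lock(conn, group, node_id, now):
    """Try to take the group lock; returns True only if this node now holds it."""
    # The group record is created on first use and never deleted.
    conn.execute(
        "INSERT OR IGNORE INTO DownloaderGroup (Name) VALUES (?)", (group,)
    )
    # A single conditional UPDATE succeeds only if the lock is free or expired.
    cur = conn.execute(
        "UPDATE DownloaderGroup SET LockNodeID = ?, LockTimestamp = ? "
        "WHERE Name = ? AND (LockNodeID IS NULL OR LockTimestamp <= ?)",
        (node_id, now, group, now - LOCK_LIFETIME_SECONDS),
    )
    conn.commit()
    return cur.rowcount == 1


def release_lock(conn, group, node_id):
    """Release the lock by writing NULL into both fields.

    Checking LockNodeID is a choice made in this sketch so that a node whose
    lock expired cannot clear a lock that another node has since taken.
    """
    conn.execute(
        "UPDATE DownloaderGroup SET LockNodeID = NULL, LockTimestamp = NULL "
        "WHERE Name = ? AND LockNodeID = ?",
        (group, node_id),
    )
    conn.commit()
```

In this sketch, a caller that gets `False` from `try_lock` would requeue the task and try again later, which matches the retry behavior described above. A restarting node could expire its own stale lock early by calling `release_lock` with its own node identifier during initialization.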
All downloaders in a downloader group will now share a logical thread. This prevents some churn when the node is already processing a downloader in that group.
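The shared-thread idea can be illustrated with a small Python sketch in which each downloader group gets one single-worker executor, so tasks in the same group run one at a time and in queue order on a given node. The `GroupThreads` class and its method names are hypothetical, and a single-worker `ThreadPoolExecutor` is only a stand-in for the logical thread that openMIC actually uses.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class GroupThreads:
    """Maps each downloader group to one serial worker (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._executors = {}

    def submit(self, group, task, *args):
        # Create the group's worker on first use; reuse it afterwards.
        with self._lock:
            executor = self._executors.get(group)
            if executor is None:
                executor = ThreadPoolExecutor(max_workers=1)
                self._executors[group] = executor
        # With one worker, tasks in the same group never overlap.
        return executor.submit(task, *args)

    def shutdown(self):
        with self._lock:
            executors = list(self._executors.values())
        for executor in executors:
            executor.shutdown(wait=True)
```

Because two downloaders in the same group are never running at once on the same node, the second one does not reach the cluster-wide lock check while the first still holds it, which is the local churn this change avoids.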