
Conversation

@willardf (Owner) commented Nov 17, 2022

I'll describe this in more detail when I get a chance, but it's based on @Mukikaizoku's outline for message queues. The big refactor is actually combining the PeerData and DtlsServerConnection classes, which allows for a lot more certainty around adding and removing connections during the handshake. There are also a couple of new concurrent set structures that allow us to either cycle connections across multiple threads or assert thread affinity.
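For context on those set structures, here is a minimal sketch of what a dedupe-on-enqueue set queue could look like. The name SetQueueSketch and all internals are assumptions for illustration, not the PR's actual ConcurrentSetQueue:

using System.Collections.Concurrent;

// Hypothetical sketch: a queue that ignores duplicate enqueues, so a
// connection with pending work is scheduled at most once at a time.
public class SetQueueSketch<T>
{
    private readonly ConcurrentDictionary<T, byte> members = new ConcurrentDictionary<T, byte>();
    private readonly ConcurrentQueue<T> queue = new ConcurrentQueue<T>();

    // A second enqueue while the item is still queued is a no-op.
    public bool TryEnqueue(T item)
    {
        if (!this.members.TryAdd(item, 0))
        {
            return false;
        }

        this.queue.Enqueue(item);
        return true;
    }

    // Membership is cleared on take, so work that arrives while the item
    // is being processed re-enqueues it for another pass.
    public bool TryTake(out T item)
    {
        if (this.queue.TryDequeue(out item))
        {
            this.members.TryRemove(item, out _);
            return true;
        }

        return false;
    }
}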

WIP stuff:

  • I believe there may still be scenarios worth optimizing around connections that disconnect and then receive an errant packet afterward. There are possibly some holes around peers that drop during the handshake as well.
  • It feels a little bit unstable, failing to respond to pings under load.

@Mukikaizoku (Collaborator) left a comment

So far I've only looked at the work distribution, not the DTLS processing code (I presume it's just migrated code and only needs spot checking).

Really excited that we're moving forward with making things lockless! Overall I'd still recommend going with a thread-affinity approach, as I think it would best preserve message-processing "fairness". On a more detailed note, having a single receive/work queue per thread may then make it easier to manage thread workload and sleep cycling.
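As a rough illustration of the affinity idea, each connection could be pinned to one worker by hashing its endpoint, so every thread owns a private queue. LocklessDtlsServerConnection is the PR's type; everything else here (RemoteEndPoint, the dispatcher class) is assumed:

using System.Collections.Concurrent;

// Hedged sketch of affinity dispatch, not the PR's implementation.
class AffinityDispatcherSketch
{
    private readonly BlockingCollection<LocklessDtlsServerConnection>[] workerQueues;

    public AffinityDispatcherSketch(int workerCount)
    {
        this.workerQueues = new BlockingCollection<LocklessDtlsServerConnection>[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            this.workerQueues[i] = new BlockingCollection<LocklessDtlsServerConnection>();
        }
    }

    public void Dispatch(LocklessDtlsServerConnection connection)
    {
        // The same endpoint always hashes to the same worker, asserting
        // thread affinity and keeping per-connection processing in order.
        int worker = (connection.RemoteEndPoint.GetHashCode() & int.MaxValue)
                     % this.workerQueues.Length;
        this.workerQueues[worker].Add(connection);
    }
}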

Nonetheless, I think the non-affinity model here could be viable and I had some comments about how I think it could be improved.

Bottom line: whatever we do, the single most important thing here is to measure performance at various loads. In particular, performance could look great at certain loads but start to deteriorate past a certain CCU. Measuring the statistical distribution of latencies will give us confidence in the methodology.
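For example, the kind of measurement meant here might look like this sketch (not part of the PR): collect per-message processing times at a fixed load, then report the distribution rather than just an average.

using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Linq;

// Hypothetical latency-distribution collector for load testing.
class LatencyStatsSketch
{
    private readonly ConcurrentBag<double> samples = new ConcurrentBag<double>();

    public void Record(Stopwatch perMessageTimer)
        => this.samples.Add(perMessageTimer.Elapsed.TotalMilliseconds);

    public void Report()
    {
        var sorted = this.samples.OrderBy(x => x).ToArray();
        if (sorted.Length == 0) return;

        // Percentiles expose tail latency that an average would hide.
        double P(double p) => sorted[(int)(p * (sorted.Length - 1))];
        Console.WriteLine(
            $"n={sorted.Length} p50={P(0.50):F2}ms p99={P(0.99):F2}ms max={sorted[sorted.Length - 1]:F2}ms");
    }
}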

LocklessDtlsServerConnection connection;
if (!this.allConnections.TryGetValue(remoteEP, out connection))
{
    lock (this.allConnections)
@Mukikaizoku (Collaborator) commented on this code:

I don't think we need this lock and double TryGet anymore. Previously, the Add was happening from multiple threads, but if it all happens on a single receive thread, we should be fine.
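That is, something like this sketch, assuming every Add now happens on the single receive thread and allConnections is a ConcurrentDictionary (so reads from other threads stay safe); CreateConnection is a hypothetical factory:

// No lock and no second TryGetValue: nothing else can race this Add.
if (!this.allConnections.TryGetValue(remoteEP, out var connection))
{
    connection = CreateConnection(remoteEP);
    this.allConnections.TryAdd(remoteEP, connection);
}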

var myTid = Thread.CurrentThread.ManagedThreadId;
while (this.receiveQueue.TryTake(out var sender))
{
    if (Interlocked.CompareExchange(ref sender.LockedBy, myTid, 0) != 0)
@Mukikaizoku (Collaborator) commented on this code:

Hmmm - I'm trying to wrap my head around a few comments that are all connected:

  • this compare exchange
  • the concurrent dictionary within ConcurrentSetQueue
  • the monitor pulses (maybe)

I believe the goal here is thread-non-affinity, lockless handling of a connection, and the criterion we're looking for is even work distribution. Essentially, without affinity, we are creating a "check-out" system that temporarily locks a particular connection to a particular thread.

The ConcurrentSetQueue will be loaded when receives/sends have triggered for that connection, meaning that even while a thread is still working on a connection, the SetQueue may note that there's new work to be done for it. This is what the compare exchange guards against.

On one side, I'd recommend making the compare exchange a feature of ConcurrentSetQueue, so a worker thread is guaranteed to get a connection from TryTake (if a free one exists) or else block/sleep. Since receives/sends happen constantly, we can guess there may be a lot of exchange "misses" here, and the internal dictionary keeps filling with connections that are already being processed.

At the same time, AFAIK the concurrent dictionary isn't order-guaranteed, so I don't think we're guaranteed queue-like behavior here.

So stepping back, I think we can potentially avoid these quirks by switching up how the "connection check-out" works with ConcurrentSetQueue. Instead of queuing connections on receive/send, queue them on connection creation. Then have threads pull/return connections with FIFO behavior (maybe BlockingCollection?). For disconnected connections, don't re-queue them; if one is disconnected on a dequeue, just drop it and dequeue again (see the sketch below).

We then don't need the monitors to synchronize the threads either. If we do need signaling, I have a feeling it might be cleaner to use one of the signaling objects .NET provides (ManualResetEvent? I forget which one works for this exact case).
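A rough sketch of that check-out model, under the stated assumptions (BlockingCollection backed by a ConcurrentQueue for FIFO blocking takes; IsDisconnected and ProcessPendingWork are hypothetical members):

using System.Collections.Concurrent;
using System.Threading;

// Sketch: connections are queued once at creation; each worker takes one,
// drains its work, and returns it, which gives FIFO fairness.
class CheckoutPoolSketch
{
    private readonly BlockingCollection<LocklessDtlsServerConnection> queue =
        new BlockingCollection<LocklessDtlsServerConnection>(
            new ConcurrentQueue<LocklessDtlsServerConnection>());

    public void OnConnectionCreated(LocklessDtlsServerConnection c)
        => this.queue.Add(c);

    public void WorkerLoop(CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            // Take blocks when the queue is empty, so idle workers sleep
            // without monitors or manual pulses. Throws
            // OperationCanceledException on shutdown.
            var connection = this.queue.Take(token);

            if (connection.IsDisconnected)
            {
                continue; // dropped: disconnected connections are never re-queued
            }

            connection.ProcessPendingWork(); // drain recv/send work

            this.queue.Add(connection); // return it for the next worker
        }
    }
}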

}
}

while (sender.PacketsSent.TryDequeue(out var message))
@Mukikaizoku (Collaborator) commented on this code:

Just reiterating that I think these two receive and send loops will need maximum processing counts.

Also, while I think the monitors will take care of sleeping the thread occasionally, if the TryTake doesn't induce a Wait, a thread could spin quite a bit. Wondering if it'll be necessary to sleep the thread every 100 or 1000 total messages.
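Something like this sketch of a bounded drain (MaxMessagesPerPass and HandleMessage are hypothetical): cap the work done per checkout so one busy connection cannot starve the rest, and back off if the cap keeps being hit.

const int MaxMessagesPerPass = 100;

int processed = 0;
while (processed < MaxMessagesPerPass && sender.PacketsSent.TryDequeue(out var message))
{
    HandleMessage(message);
    processed++;
}

if (processed == MaxMessagesPerPass)
{
    Thread.Yield(); // or Thread.Sleep(1) to actually sleep under sustained load
}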

@willardf force-pushed the main branch 8 times, most recently from f30e956 to ff5f86a on February 29, 2024 at 21:01.