03 February 2017 @ 02:18 am
Learning Multithreading  
In the last couple of days I learned quite a bit about multithreading:
1) How to create a new thread in order to fix hangs in the crawler.

2) That creating new threads has a performance penalty (about 0.15 ms per new thread, not the 10 ms I originally estimated).

3) That Task has good performance (almost no performance penalty) because it reuses threads from the thread pool.

4) That if you use Task (thread pool) you cannot really kill the hanging thread, so using Task Factory does not really help in solving a hanging thread issue.

5) How to create our own thread factory that runs a single thread, and how to kill that thread if it runs for too long.
In particular:
- How to use two EventWaitHandle objects to communicate between the main thread and the background/worker thread.

- When it is the right time to exit from the infinite loop in the background thread if the service is shutting down or pausing. (Exit the background thread on pause/shutdown only if it is idle, to prevent confusion in the business logic of the client code.)

using System;
using System.Runtime.ExceptionServices;
using System.Threading;

namespace PostJobFree.Utilities
{
	public static class ThreadHelper
	{
		private static EventWaitHandle CompletionWait;
		private static EventWaitHandle InputWait;
		private static Thread CurrentThread;
		private static Exception ThreadException;
		private static Action ActionToExecute;
		private static readonly object LockObject = new object();

		public static bool ExecuteOnSeparateThreadWithTimeout(Action action, TimeSpan timeout)
		{
			lock (LockObject)
			{
				ThreadException = null;
				ActionToExecute = action;
				if (CurrentThread == null // First-time execution or previous thread was aborted due to timeout
					|| !CurrentThread.IsAlive) // Service was paused
				{
					InitializeCurrentThread();
				}
				InputWait.Set();
				if (!CompletionWait.WaitOne(timeout))
				{
					CurrentThread.Abort();
					CurrentThread = null;
					return false;
				}
				if (ThreadException != null)
					ExceptionDispatchInfo.Capture(ThreadException).Throw(); // To preserve stack trace
				return true;
			}
		}

		private static void InitializeCurrentThread()
		{
			CompletionWait = new EventWaitHandle(false, EventResetMode.AutoReset);
			InputWait = new EventWaitHandle(false, EventResetMode.AutoReset);
			CurrentThread = new Thread(WorkerThread);
			CurrentThread.Start();

		}
		private static void WorkerThread()
		{
			while (true)
			{
				if (InputWait.WaitOne(TimeSpan.FromSeconds(1)))
				{// Action is requested
					try
					{
						ActionToExecute.Invoke();
					}
					catch (ThreadAbortException)
					{
						return;
					}
					catch (Exception ex)
					{
						ThreadException = ex;
					}
					finally
					{
						CompletionWait.Set();
					}
				}
				else
				{// 1 second passed with no Action request
					if (ExecutionCore.NewExecutionAllowed) continue; // To allow exiting on Pause
					return; // Exit on (NewExecutionAllowed = false) only if background thread has nothing to do
				}
			}
		}
	}
}

using System;
using System.Threading;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using PostJobFree;
using PostJobFree.Utilities;

namespace TestIjSearch.Utilities
{
	[TestClass]
	public class ThreadHelperTests
	{
		[TestMethod]
		public void ThreadHelperExecuteOnSeparateThreadWithTimeoutTest()
		{
			Assert.IsFalse(ThreadHelper.ExecuteOnSeparateThreadWithTimeout(() => Thread.Sleep(10000), TimeSpan.FromTicks(1)));

			Assert.IsTrue(ThreadHelper.ExecuteOnSeparateThreadWithTimeout(() => { }, TimeSpan.FromSeconds(1)));

			bool exceptionHappened = false;
			try
			{
				ThreadHelper.ExecuteOnSeparateThreadWithTimeout(() => {throw new PostJobFreeException();}, TimeSpan.FromSeconds(1));
			}
			catch (PostJobFreeException)
			{
				exceptionHappened = true;
			}
			Assert.IsTrue(exceptionHappened);

			try
			{
				ExecutionCore.NewExecutionAllowed = false;
				int result = 0;
				Assert.IsTrue(ThreadHelper.ExecuteOnSeparateThreadWithTimeout(() => { result = 111; }, TimeSpan.FromSeconds(1)));
				Assert.AreEqual(111, result);
				//Thread.Sleep(TimeSpan.FromSeconds(2)); // It allows asynchronous thread to die/exit if (NewExecutionAllowed = false)
				Assert.IsTrue(ThreadHelper.ExecuteOnSeparateThreadWithTimeout(() => { result = 222; }, TimeSpan.FromSeconds(1)));
				Assert.AreEqual(222, result);
			}
			finally
			{
				ExecutionCore.NewExecutionAllowed = true;
			}
		}
	}
}
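
Point 4 above (a timed-out Task cannot be killed) can be seen in a small sketch. This is only an illustration under assumptions: the class and method names are hypothetical, and a blocked ManualResetEventSlim stands in for a hung download. Task.Wait(timeout) merely stops waiting; the thread-pool thread stays stuck inside the delegate.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class, not part of the ThreadHelper code above.
public static class TaskTimeoutDemo
{
    // Returns true if the work was still running after Wait() gave up on it.
    public static bool WorkSurvivesTimeout()
    {
        using (var release = new ManualResetEventSlim(false))
        {
            // Stand-in for a hung download: blocks until we explicitly release it.
            Task hanging = Task.Run(() => release.Wait());

            // Wait(timeout) returns false on timeout but does not stop the task.
            bool finishedInTime = hanging.Wait(TimeSpan.FromMilliseconds(100));
            bool stillRunning = !finishedInTime && !hanging.IsCompleted;

            release.Set();   // let the pool thread go so the demo can exit cleanly
            hanging.Wait();
            return stillRunning;
        }
    }

    public static void Main()
    {
        Console.WriteLine(WorkSurvivesTimeout());
    }
}
```

With a real hang there is nothing like release.Set() to call, which is why the code above uses a dedicated thread that it can abandon.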

Update: LJ discussion.

Originally posted at: http://dennisgorelik.dreamwidth.org/123635.html
 
 
 
Yaturkenzhensirhiv - a handheld spy (yatur) on February 1st, 2017 08:14 pm (UTC)
Multi-threading programming is hard. Manually fiddling with threads and events should be used as the option of last resort.

Consider using Monitor.Pulse() instead of events.

Also, you should avoid polling. ExecutionCore can use Monitor.PulseAll() to signal all outstanding threads to exit.

Your solution seems to execute one thread per requested action. As you have found yourself, this is usually not the most optimal approach: creating threads is expensive. A pool of worker threads that process several items one by one performs better. If you are going to organize a queue, you may want to consider BlockingCollection class.

Now, there are ready-to-use thread pool solutions out there, e.g. http://smartthreadpool.codeplex.com/ (disclaimer: I never used it).
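
The BlockingCollection suggestion in this comment might look roughly like the sketch below. It is only an illustration: SingleWorkerQueue and QueueDemo are hypothetical names, and the sketch deliberately leaves out the timeout/abort handling that the original post is about. The worker blocks in GetConsumingEnumerable() instead of polling and exits once CompleteAdding() has been called and the queue is empty.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper, not part of the post's ThreadHelper.
public sealed class SingleWorkerQueue : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _worker;

    public SingleWorkerQueue()
    {
        _worker = new Thread(Run) { IsBackground = true };
        _worker.Start();
    }

    public void Enqueue(Action action) => _queue.Add(action);

    private void Run()
    {
        // Blocks without polling; ends after CompleteAdding() once the queue is drained.
        foreach (Action action in _queue.GetConsumingEnumerable())
        {
            try { action(); }
            catch (Exception ex) { Console.Error.WriteLine(ex); }
        }
    }

    public void Dispose()
    {
        _queue.CompleteAdding(); // plays the role of the "exit" signal
        _worker.Join();
        _queue.Dispose();
    }
}

public static class QueueDemo
{
    public static int RunDemo()
    {
        int sum = 0;
        using (var queue = new SingleWorkerQueue())
        {
            for (int i = 1; i <= 3; i++)
            {
                int n = i;
                queue.Enqueue(() => sum += n); // only the single worker touches sum
            }
        } // Dispose drains the queue and joins the worker
        return sum;
    }

    public static void Main() => Console.WriteLine(RunDemo());
}
```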


Dennis Gorelik (dennisgorelik) on February 1st, 2017 10:45 pm (UTC)
> Multi-threading programming is hard.

I agree.

> Manually fiddling with threads and events should be used as the option of last resort.

I kind of agree here, but there are even worse options out there.

For example, creating and managing a separate process requires even more overhead.

> Consider using Monitor.Pulse() instead of events.

Did you use Monitor.Pulse() yourself?
What are the advantages of using Monitor.Pulse() vs EventWaitHandle's .Set() and .WaitOne()?

> use Monitor.PulseAll() to signal all outstanding threads to exit.

In this solution I posted - there is only one background thread. We are not looking for high performance here. Just want to be able to kill/restart the hanging thread and not to abuse hardware in the process.

> creating threads is expensive

I noticed it only theoretically.
In practice we had a solution in production that was creating a new thread for every URL crawl (about 2 per second) and we did not even notice it.

> If you are going to organize a queue

Not yet.
There is no need for now. Maybe later.
Yaturkenzhensirhiv - a handheld spy (yatur) on February 2nd, 2017 04:52 am (UTC)
> Did you use Monitor.Pulse() yourself?

Yes.

> What are the advantages of using Monitor.Pulse() vs EventWaitHandle's .Set() and .WaitOne()?

It more closely matches the semantic of "wake up one thread and have it acquire a lock" and does it atomically.

> In this solution I posted - there is only one background thread.

What happens if you have new action to execute before the previous one has completed?
Dennis Gorelik (dennisgorelik) on February 2nd, 2017 05:12 am (UTC)
> more closely matches the semantic of "wake up one thread and have it acquire a lock"

How well does Monitor.Pulse() handle the goal of exiting the thread when our whole service is paused/shut down?
Would Monitor.Pulse() help to gracefully exit the background thread when it's time to do service re-deployment?

> and does it atomically.

Why is atomicity important in that case?

> What happens if you have new action to execute before the previous one has completed?

We just wait (see "lock (LockObject)") until previous request finishes.
But that should not happen anyway, because we are crawling from the single thread.
Yaturkenzhensirhiv - a handheld spy (yatur) on February 3rd, 2017 04:37 am (UTC)
> How well does Monitor.Pulse() handle the goal of exiting the thread

It doesn't. And nor does Event. Its goal is to tell the thread to wake up. Then the thread can read the next command. The command could be "run this function" or "exit".

> Why is atomicity important in that case?

Atomicity is always important. Otherwise you risk running into rare, but destructive corner cases when your threads get preempted in an inconsistent state. You need to do an extra careful analysis to ensure that such cases are either taken care of, or are guaranteed not to occur.

The fact that you ask this question means that you probably should try to use higher-level solutions where possible. It's akin to a surgeon asking to explain why anesthesia is important. The question is, of course, valid, but I would not want to be operated by such a surgeon.
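
The "wake the thread up, then let it read the next command" idea from this comment could be sketched with Monitor.Wait()/Monitor.Pulse() roughly as follows. This is a sketch under assumptions, not the code discussed in the thread: PulseWorker and PulseDemo are hypothetical names, a single caller is assumed (Post overwrites an action that has not been picked up yet), and there is no timeout handling. The commands are "run this function" (_pending) and "exit" (_exit), and the worker exits only when it is idle.

```csharp
using System;
using System.Threading;

// Hypothetical sketch; names are illustrative only.
public sealed class PulseWorker
{
    private readonly object _gate = new object();
    private Action _pending;   // "run this function" command, if any
    private bool _exit;        // "exit" command
    private readonly Thread _thread;

    public PulseWorker()
    {
        _thread = new Thread(Loop) { IsBackground = true };
        _thread.Start();
    }

    public void Post(Action action)
    {
        lock (_gate)
        {
            _pending = action;
            Monitor.Pulse(_gate); // wake the worker; state change and signal share one lock
        }
    }

    public void Stop()
    {
        lock (_gate)
        {
            _exit = true;
            Monitor.Pulse(_gate);
        }
        _thread.Join();
    }

    private void Loop()
    {
        while (true)
        {
            Action next;
            lock (_gate)
            {
                // Wait releases the lock while sleeping and re-acquires it on wake-up.
                while (_pending == null && !_exit)
                    Monitor.Wait(_gate);
                if (_pending == null) return; // exit requested and nothing left to do
                next = _pending;
                _pending = null;
            }
            try { next(); }
            catch (Exception ex) { Console.Error.WriteLine(ex); }
        }
    }
}

public static class PulseDemo
{
    public static int RunDemo()
    {
        int result = 0;
        var worker = new PulseWorker();
        worker.Post(() => result = 111);
        worker.Stop(); // returns after the pending action has run and the thread has exited
        return result;
    }

    public static void Main() => Console.WriteLine(RunDemo());
}
```

Because the flag is checked and the thread is put to sleep under the same lock, a Pulse sent just before the worker starts waiting is not lost: the worker sees the updated state in its while condition instead.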
Dennis Gorelik (dennisgorelik) on February 3rd, 2017 07:15 am (UTC)
Avoid polling
> you should avoid polling.

What do you call "polling" in this case? Is it this line that is executed every second:
if (ExecutionCore.NewExecutionAllowed) continue; // To allow exiting on Pause
?

Edited at 2017-02-03 07:17 am (UTC)
veremeenko_alex on February 3rd, 2017 02:10 pm (UTC)
Writing new code instead of fixing the old code is kind of bad form.
Dennis Gorelik (dennisgorelik) on February 3rd, 2017 04:52 pm (UTC)
What is this in reference to?
rezkiy on February 4th, 2017 12:20 am (UTC)
Let me better understand what you are trying to achieve...

You are calling into third party code
third party code may hang
you are investigating an approach where you will terminate the thread that executes third party code
to do that, you pretty much have to implement a custom thread pool and you are discussing the technical challenges of that.

Correct?
Dennis Gorelik (dennisgorelik) on February 4th, 2017 01:04 am (UTC)
Overview
1) That is almost correct.

2) That "third party code" is HttpWebRequest from the .NET Framework, which unfortunately hangs about once per 1M web page downloads.

3) The thread pool is not a business requirement, just one option that lets us deal with hangs.

4) The thread pool consists of only a single active thread at a time, because we are not really pushing performance limits here.

5) Everything is already implemented and works well in production.

6) The question is - can it be made better? In particular, yatur claims that we can use Monitor.Pulse()/Monitor.Wait() in order to simplify the code and make that code more reliable in case of adding more features.


Edited at 2017-02-04 01:04 am (UTC)
journal closed (juan_gandhi) on February 4th, 2017 02:24 am (UTC)
Re: Overview
See, all this is happening not because it is a good solution, but because it is a popular solution. From FP point of view, it's ridiculously bad to do all this, manual handling of threads. Just as a cheap hack, yes.

Here's how we do it cheaper, in Scala:
    for ((r,i) <- reads.data.zipWithIndex) {
      log(s"Testing #$i/: $r")
      for (ref <- refs.values) {
        val foundFuture = ref.listAlignments(r, length, max)
        foundFuture.foreach { _.report() }
      }
    }


where
  def listAlignments(r: Read, kmer: Int, max: Int = Integer.MAX_VALUE): Future[MatchResult] = Future {
    MatchResult(r, this, alignments(r.value, kmer, max), max)
  }


In short, the simpler, the better. Complicated multi-threaded code is always a smell.

Yes, I think the whole "Java Concurrency in Practice" is a bunch of BS.

Edited at 2017-02-04 02:25 am (UTC)
Dennis Gorelik (dennisgorelik) on February 4th, 2017 05:13 pm (UTC)
Terminate hanging thread in Scala?
1) Does that Scala code create new threads in order to execute ".report()"?
2) What would happen, if one of these .report() calls would hang (due to crawling)?
Would that Scala code be able to terminate that thread and continue?
journal closed (juan_gandhi) on February 4th, 2017 06:52 pm (UTC)
Re: Terminate hanging thread in Scala?
Threads are created, yes.
If it hangs, it hangs. I did not bother to kill them after certain time.
It's doable, but I just ignore the problem here.
Dennis Gorelik (dennisgorelik) on February 4th, 2017 08:00 pm (UTC)
Re: Terminate hanging thread in Scala?
Are threads created for every .report() call?

There is an overhead in creating new threads (~10 ms per thread if I am not mistaken).

We did not have to create separate threads at all and actually ran a simple loop for our crawler. But unfortunately crawler hangs occasionally, so we have to address it.

So we initially created a solution that always creates a thread for every download (and aborts the thread if it hangs).
But then we decided to avoid the thread creation overhead, which resulted in the version of the code that you can see in this posting.


Edited at 2017-02-04 08:07 pm (UTC)
rezkiy on February 6th, 2017 06:00 am (UTC)
Re: Overview
What would happen if the worker is about to call CompletionWait.Set(); while the call to CompletionWait.WaitOne(timeout) just timed out?
Dennis Gorelik (dennisgorelik) on February 6th, 2017 07:36 am (UTC)
What if CompletionWait.WaitOne(timeout) just timed out?
If CompletionWait.WaitOne(timeout) just timed out, then WorkerThread is going to be aborted and destroyed anyway:
CurrentThread.Abort();
CurrentThread = null;
return false;
It does not really matter if WorkerThread() would be able to complete CompletionWait.Set() call or not: nothing depends on the state of CompletionWait anymore.

Do you see any potential problems here?
rezkiy on February 6th, 2017 07:41 am (UTC)
Re: What if CompletionWait.WaitOne(timeout) just timed out?
well, CurrentThread.Abort() will cause an async exception to be thrown in the context of the worker thread, and at that moment the worker is outside of the try block.
rezkiy on February 6th, 2017 06:48 pm (UTC)
As for 10ms per thread, I find it hard to believe.

Please run this _without_ debugger attached:

using System;

namespace threads
{
    class Program
    {
        const int kCount = 10000;

        static int counter = 0;

        static void Main(string[] args)
        {
            Console.WriteLine("Started");
            var timer = new System.Diagnostics.Stopwatch();
            timer.Start();


            for (int i = 0; i < kCount; ++i)
            {
                new System.Threading.Thread(() => { System.Threading.Interlocked.Increment(ref counter); }).Start();
            }


            while (counter != kCount)
            {
                Console.WriteLine("Draining...");
                System.Threading.Thread.Sleep(10);

            }

            timer.Stop();

            var perThread = Convert.ToString(timer.Elapsed.TotalMilliseconds / kCount);
            var total = Convert.ToString(timer.Elapsed.TotalMilliseconds);

            Console.WriteLine(perThread + "ms per thread; " + total + "ms total");
            Console.ReadLine();
        }
    }
}
Dennis Gorelik (dennisgorelik) on February 6th, 2017 07:35 pm (UTC)
How long does it take to create a new thread in C#?
> As for 10ms per thread, I find it hard to believe.

You are right: it's only 0.15ms per new thread:
------
2017-02-06 19:17:39.259 UTC Started
2017-02-06 19:17:40.721 UTC Draining...
2017-02-06 19:17:40.731 UTC 0.1471557ms per thread; 1471.557ms total
------
(I used logging that is different from your code).

How fast are threads created on your machine?

I got my original "10ms per new thread" estimate from this example:
==========
http://stackoverflow.com/questions/13125105/why-so-much-difference-in-performance-between-thread-and-task
for (int i = 0; i < 0xFFF; ++i)
{
	new Thread(() => { }).Start();
}
if I use Thread the output is always greater than 40 seconds
==========

0xFFF == 4095, so it's about 40,000ms+/4095 ~= 10ms

Do you think that Nick (from StackOverflow post) mis-measured something?

Edited at 2017-02-06 07:36 pm (UTC)
rezkiy on February 6th, 2017 07:41 pm (UTC)
Re: How long does it take to create a new thread in C#?
Within the ballpark for me, 0.1169812ms per thread; 1169.812ms total.

>> mis-measured something?

Absolutely. My psychic debugging powers tell me that Nick had Visual Studio debugger attached.
rezkiy on February 6th, 2017 07:47 pm (UTC)
Re: How long does it take to create a new thread in C#?
0.08 -- 0.09 ms per thread on x64
0.11 -- 0.12 ms per thread on AnyCPU or x86

no difference with debug/release.
Dennis Gorelik (dennisgorelik) on February 6th, 2017 08:12 pm (UTC)
Re: How long does it take to create a new thread in C#?
Your psychic powers are consistent with my observations:
I am getting ~3.9ms per new thread under Visual Studio debugger.

Nick was probably running on a slower machine too (back in 2012)
rezkiy on February 6th, 2017 07:43 pm (UTC)
Re: How long does it take to create a new thread in C#?
also please notice that unlike Nick's, this snippet actually ensures that the meaningful part of each thread's work completes before the stopwatch is stopped.
Dennis Gorelik (dennisgorelik) on February 6th, 2017 08:17 pm (UTC)
Re: How long does it take to create a new thread in C#?
Yes, I noticed.
Why do you call it "Draining..." though?
It's more like "Waiting for all threads to increment the counter..."

Did you just create this code from scratch, or do you frequently use code like that in your troubleshooting sessions?

Edited at 2017-02-06 08:23 pm (UTC)
rezkiy on February 6th, 2017 08:29 pm (UTC)
Re: How long does it take to create a new thread in C#?
'Draining' is standard terminology. You have some N of outstanding async things, no new things are going to come, 'draining' it would be either waiting for or making sure of all N of it to finish. Sure, here 'draining' is 'waiting for all threads to increment the counter'.

example https://msdn.microsoft.com/en-us/library/windows/hardware/ff544620(v=vs.85).aspx look for 'draining'

I copied Nick's example, deleted what I did not need, thought a bit re how to make sure threads do run, and wrote the simplest thing that could possibly work.

I don't write C# much. Probably I wrote more C# today vs in the preceding 2 months. Cpp, 50% reuse, 50% from scratch.