03 February 2017 @ 02:18 am
Learning Multithreading  
In the last couple of days I learned quite a bit about multithreading:
1) How to create a new thread in order to fix hangs in the crawler.

2) That creating new threads has a performance penalty (about 0.15 ms per new thread, not the 10 ms I originally estimated).

3) That Task has good performance (almost no performance penalty) because it reuses thread pool threads.

4) That if you use Task (thread pool), you cannot really kill a hanging thread, so using Task Factory does not really help in solving a hanging-thread issue.

5) How to create our own thread factory that runs a single thread, and how to kill that thread if it runs for too long.
In particular:
- How to use two EventWaitHandle objects in order to communicate appropriately between the main thread and the background/worker thread.

- When to exit the infinite loop in the background thread if the service is shutting down or pausing. (Exit the background thread on pause/shutdown only if it is idle, to prevent confusion in the business logic of the client code.)
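
Point 4 can be illustrated with a minimal sketch. It is not part of the production code below; `TaskTimeoutDemo` and `WorkOutlivesTimeout` are hypothetical names, and `Thread.Sleep` stands in for a call that hangs. The sketch only shows that a timed-out `Task.Wait` leaves the work running on its pool thread.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskTimeoutDemo
{
    // Returns true if the work kept running after the wait on it timed out.
    public static bool WorkOutlivesTimeout()
    {
        var finished = new ManualResetEventSlim(false);
        Task task = Task.Run(() =>
        {
            Thread.Sleep(500); // stands in for a hanging call
            finished.Set();
        });

        // Wait gives up after 50 ms, but it does nothing to stop the pool thread.
        bool completedInTime = task.Wait(TimeSpan.FromMilliseconds(50));

        // The work still completes later on its own.
        bool finishedLater = finished.Wait(TimeSpan.FromSeconds(5));
        return !completedInTime && finishedLater;
    }

    public static void Main()
    {
        Console.WriteLine(WorkOutlivesTimeout());
    }
}
```

With a real hang the pool thread would never come back, which is why the code below owns its thread instead.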

using System;
using System.Runtime.ExceptionServices;
using System.Threading;

namespace PostJobFree.Utilities
{
	public static class ThreadHelper
	{
		private static EventWaitHandle CompletionWait;
		private static EventWaitHandle InputWait;
		private static Thread CurrentThread;
		private static Exception ThreadException;
		private static Action ActionToExecute;
		private static readonly object LockObject = new object();

		public static bool ExecuteOnSeparateThreadWithTimeout(Action action, TimeSpan timeout)
		{
			lock (LockObject)
			{
				ThreadException = null;
				ActionToExecute = action;
				if (CurrentThread == null // First-time execution or previous thread was aborted due to timeout
					|| !CurrentThread.IsAlive) // Service was paused
				{
					InitializeCurrentThread();
				}
				InputWait.Set();
				if (!CompletionWait.WaitOne(timeout))
				{
					CurrentThread.Abort();
					CurrentThread = null;
					return false;
				}
				if (ThreadException != null)
					ExceptionDispatchInfo.Capture(ThreadException).Throw(); // To preserve stack trace
				return true;
			}
		}

		private static void InitializeCurrentThread()
		{
			CompletionWait = new EventWaitHandle(false, EventResetMode.AutoReset);
			InputWait = new EventWaitHandle(false, EventResetMode.AutoReset);
			CurrentThread = new Thread(WorkerThread);
			CurrentThread.Start();
		}

		private static void WorkerThread()
		{
			while (true)
			{
				if (InputWait.WaitOne(TimeSpan.FromSeconds(1)))
				{// Action is requested
					try
					{
						ActionToExecute.Invoke();
					}
					catch (ThreadAbortException)
					{
						return;
					}
					catch (Exception ex)
					{
						ThreadException = ex;
					}
					finally
					{
						CompletionWait.Set();
					}
				}
				else
				{// 1 second passed with no Action request
					if (ExecutionCore.NewExecutionAllowed) continue; // To allow exiting on Pause
					return; // Exit on (NewExecutionAllowed = false) only if background thread has nothing to do
				}
			}
		}
	}
}

using System;
using System.Threading;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using PostJobFree;
using PostJobFree.Utilities;

namespace TestIjSearch.Utilities
{
	[TestClass]
	public class ThreadHelperTests
	{
		[TestMethod]
		public void ThreadHelperExecuteOnSeparateThreadWithTimeoutTest()
		{
			Assert.IsFalse(ThreadHelper.ExecuteOnSeparateThreadWithTimeout(() => Thread.Sleep(10000), TimeSpan.FromTicks(1)));

			Assert.IsTrue(ThreadHelper.ExecuteOnSeparateThreadWithTimeout(() => { }, TimeSpan.FromSeconds(1)));

			bool exceptionHappened = false;
			try
			{
				ThreadHelper.ExecuteOnSeparateThreadWithTimeout(() => {throw new PostJobFreeException();}, TimeSpan.FromSeconds(1));
			}
			catch (PostJobFreeException)
			{
				exceptionHappened = true;
			}
			Assert.IsTrue(exceptionHappened);

			try
			{
				ExecutionCore.NewExecutionAllowed = false;
				int result = 0;
				Assert.IsTrue(ThreadHelper.ExecuteOnSeparateThreadWithTimeout(() => { result = 111; }, TimeSpan.FromSeconds(1)));
				Assert.AreEqual(111, result);
				//Thread.Sleep(TimeSpan.FromSeconds(2)); // It allows asynchronous thread to die/exit if (NewExecutionAllowed = false)
				Assert.IsTrue(ThreadHelper.ExecuteOnSeparateThreadWithTimeout(() => { result = 222; }, TimeSpan.FromSeconds(1)));
				Assert.AreEqual(222, result);
			}
			finally
			{
				ExecutionCore.NewExecutionAllowed = true;
			}
		}
	}
}

Update: LJ discussion.

Originally posted at: http://dennisgorelik.dreamwidth.org/123635.html
 
 
 
yatur on February 1st, 2017 08:14 pm (UTC)
Multi-threading programming is hard. Manually fiddling with threads and events should be used as the option of last resort.

Consider using Monitor.Pulse() instead of events.

Also, you should avoid polling. ExecutionCore can use Monitor.PulseAll() to signal all outstanding threads to exit.

Your solution seems to execute one thread per requested action. As you have found yourself, this is usually not the optimal approach: creating threads is expensive. A pool of worker threads that process several items one by one performs better. If you are going to organize a queue, you may want to consider the BlockingCollection class.

Now, there are ready-to-use thread pool solutions out there, e.g. http://smartthreadpool.codeplex.com/ (disclaimer: I never used it).
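
For illustration, a minimal sketch of the BlockingCollection idea: one long-lived worker thread that blocks on a queue instead of polling. `QueueWorkerSketch` and `RunAll` are hypothetical names, not code from the post, and the sketch has no timeout or abort logic.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class QueueWorkerSketch
{
    // Runs each action on one long-lived worker thread and returns how many were executed.
    public static int RunAll(params Action[] actions)
    {
        int executed = 0;
        using (var queue = new BlockingCollection<Action>())
        {
            var worker = new Thread(() =>
            {
                // Blocks without polling until an item arrives; the loop ends after CompleteAdding().
                foreach (Action action in queue.GetConsumingEnumerable())
                {
                    action();
                    executed++;
                }
            });
            worker.Start();

            foreach (Action action in actions)
                queue.Add(action);

            queue.CompleteAdding(); // lets the worker exit once the queue drains
            worker.Join();
        }
        return executed;
    }
}
```

Calling `QueueWorkerSketch.RunAll(a, b)` runs `a` and then `b` on the same worker thread, in order.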


dennisgorelik on February 1st, 2017 10:45 pm (UTC)
> Multi-threading programming is hard.

I agree.

> Manually fiddling with threads and events should be used as the option of last resort.

I kind of agree here, but there are even worse options out there.

For example, creating and managing a separate process requires even more overhead.

> Consider using Monitor.Pulse() instead of events.

Did you use Monitor.Pulse() yourself?
What are the advantages of using Monitor.Pulse() vs EventWaitHandle's .Set() and .WaitOne()?

> use Monitor.PulseAll() to signal all outstanding threads to exit.

In the solution I posted, there is only one background thread. We are not looking for high performance here. We just want to be able to kill and restart a hanging thread without abusing hardware in the process.

> creating threads is expensive

I noticed it only theoretically.
In practice, we had a solution in production that created a new thread for every URL crawl (about 2 per second), and we did not even notice it.

> If you are going to organize a queue

Not yet.
There is no need for now. Maybe later.
yatur on February 2nd, 2017 04:52 am (UTC)
> Did you use Monitor.Pulse() yourself?

Yes.

> What are the advantages of using Monitor.Pulse() vs EventWaitHandle's .Set() and .WaitOne()?

It more closely matches the semantics of "wake up one thread and have it acquire a lock" and does it atomically.
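
To make the comparison concrete, here is a minimal sketch of a Monitor-based handoff. `PulseHandoff` and its members are hypothetical names, not code from the post. The sketch assumes one caller at a time, does no exception handling, and cannot abort a hung action; after a timeout the instance should be discarded.

```csharp
using System;
using System.Threading;

public sealed class PulseHandoff
{
    private readonly object _gate = new object();
    private Action _pending;   // work waiting for the worker, or null
    private bool _completed;   // set by the worker after the action has run

    public PulseHandoff()
    {
        new Thread(WorkerLoop) { IsBackground = true }.Start();
    }

    // Returns true if the action finished within the timeout.
    public bool Execute(Action action, TimeSpan timeout)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        lock (_gate)
        {
            _completed = false;
            _pending = action;
            Monitor.PulseAll(_gate); // wake the worker; it proceeds once Wait releases the lock

            while (!_completed)
            {
                TimeSpan remaining = deadline - DateTime.UtcNow;
                if (remaining <= TimeSpan.Zero || !Monitor.Wait(_gate, remaining))
                    return _completed; // timed out; completion may have raced in
            }
            return true;
        }
    }

    private void WorkerLoop()
    {
        while (true)
        {
            Action action;
            lock (_gate)
            {
                // Wait releases the lock while blocked and reacquires it before returning.
                while (_pending == null)
                    Monitor.Wait(_gate);
                action = _pending;
                _pending = null;
            }

            action(); // run outside the lock

            lock (_gate)
            {
                _completed = true;
                Monitor.PulseAll(_gate); // wake the caller blocked in Execute
            }
        }
    }
}
```

The flag check and the wait happen under the same lock, so a signal sent between the two cannot be lost.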

> In this solution I posted - there is only one background thread.

What happens if you have a new action to execute before the previous one has completed?
dennisgorelik on February 2nd, 2017 05:12 am (UTC)
> more closely matches the semantic of "wake up one thread and have it acquire a lock"

How well does Monitor.Pulse() handle the goal of exiting the thread when our whole service is paused/shut down?
Would Monitor.Pulse() help to gracefully exit the background thread when it's time to do service re-deployment?

> and does it atomically.

Why is atomicity important in that case?

> What happens if you have new action to execute before the previous one has completed?

We just wait (see "lock (LockObject)") until previous request finishes.
But that should not happen anyway, because we are crawling from a single thread.
dennisgorelik on February 3rd, 2017 07:15 am (UTC)
Avoid polling
> you should avoid polling.

What do you call "polling" in this case? Is it this line that is executed every second:
if (ExecutionCore.NewExecutionAllowed) continue; // To allow exiting on Pause
?

Edited at 2017-02-03 07:17 am (UTC)
veremeenko_alex on February 3rd, 2017 02:10 pm (UTC)
Writing new code instead of fixing the old code is kind of bad form.
dennisgorelik on February 3rd, 2017 04:52 pm (UTC)
What is that in reference to?
rezkiy on February 4th, 2017 12:20 am (UTC)
Let me better understand what you are trying to achieve...

You are calling into third party code
third party code may hang
you are investigating an approach where you will terminate the thread that executes third party code
to do that, you pretty much have to implement a custom thread pool and you are discussing the technical challenges of that.

Correct?
dennisgorelik on February 4th, 2017 01:04 am (UTC)
Overview
1) That is almost correct.

2) That "third party code" is HttpWebRequest from the .NET Framework, which unfortunately hangs roughly once per 1M web page downloads.

3) The thread pool is not a business requirement, but just one option that allows us to deal with hangs.

4) The thread pool consists of only a single active thread at a time, because we are not really pushing performance limits here.

5) Everything is already implemented and works well in production.

6) The question is - can it be made better? In particular, yatur claims that we can use Monitor.Pulse()/Monitor.Wait() in order to simplify the code and make that code more reliable in case of adding more features.


Edited at 2017-02-04 01:04 am (UTC)
ex_juan_gan on February 4th, 2017 02:24 am (UTC)
Re: Overview
See, all this is happening not because it is a good solution, but because it is a popular solution. From an FP point of view, it's ridiculously bad to do all this manual handling of threads. Just as a cheap hack, yes.

Here's how we do it cheaper, in Scala:
    for ((r,i) <- reads.data.zipWithIndex) {
      log(s"Testing #$i/: $r")
      for (ref <- refs.values) {
        val foundFuture = ref.listAlignments(r, length, max)
        foundFuture.foreach { _.report() }
      }
    }


where
  def listAlignments(r: Read, kmer: Int, max: Int = Integer.MAX_VALUE): Future[MatchResult] = Future {
    MatchResult(r, this, alignments(r.value, kmer, max), max)
  }


In short, the simpler, the better. Complicated multi-threaded code is always a smell.

Yes, I think the whole "Java Concurrency in Practice" is a bunch of BS.

Edited at 2017-02-04 02:25 am (UTC)
rezkiy on February 6th, 2017 06:00 am (UTC)
Re: Overview
What would happen if the worker is about to call CompletionWait.Set(); while the call to CompletionWait.WaitOne(timeout) just timed out?
rezkiy on February 6th, 2017 06:48 pm (UTC)
As for 10ms per thread, I find it hard to believe.

Please run this _without_ debugger attached:

using System;

namespace threads
{
    class Program
    {
        const int kCount = 10000;

        static int counter = 0;

        static void Main(string[] args)
        {
            Console.WriteLine("Started");
            var timer = new System.Diagnostics.Stopwatch();
            timer.Start();


            for (int i = 0; i < kCount; ++i)
            {
                new System.Threading.Thread(() => { System.Threading.Interlocked.Increment(ref counter); }).Start();
            }


            while (counter != kCount)
            {
                Console.WriteLine("Draining...");
                System.Threading.Thread.Sleep(10);

            }

            timer.Stop();

            var perThread = Convert.ToString(timer.Elapsed.TotalMilliseconds / kCount);
            var total = Convert.ToString(timer.Elapsed.TotalMilliseconds);

            Console.WriteLine(perThread + "ms per thread; " + total + "ms total");
            Console.ReadLine();
        }
    }
}
dennisgorelik on February 6th, 2017 07:35 pm (UTC)
How long does it take to create a new thread in C#?
> As for 10ms per thread, I find it hard to believe.

You are right: it's only 0.15ms per new thread:
------
2017-02-06 19:17:39.259 UTC Started
2017-02-06 19:17:40.721 UTC Draining...
2017-02-06 19:17:40.731 UTC 0.1471557ms per thread; 1471.557ms total
------
(I used logging that is different from your code).

How fast are threads created on your machine?

I got my original "10ms per new thread" estimate from this example:
==========
http://stackoverflow.com/questions/13125105/why-so-much-difference-in-performance-between-thread-and-task
for (int i = 0; i < 0xFFF; ++i)
{
	new Thread(() => { }).Start();
}
if I use Thread the output is always greater than 40 seconds
==========

0xFFF == 4095, so it's about 40,000ms+/4095 ~= 10ms

Do you think that Nick (from StackOverflow post) mis-measured something?

Edited at 2017-02-06 07:36 pm (UTC)
rezkiy on February 6th, 2017 07:41 pm (UTC)
Re: How long does it take to create a new thread in C#?
Within the ballpark for me, 0.1169812ms per thread; 1169.812ms total.

>> mis-measured something?

Absolutely. My psychic debugging powers tell me that Nick had the Visual Studio debugger attached.
rezkiy on February 6th, 2017 07:43 pm (UTC)
Re: How long does it take to create a new thread in C#?
Also, please notice that unlike Nick's, this snippet actually ensures that the meaningful part of each thread's work completes before the stopwatch is stopped.
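
As a variation on the same measurement, here is a minimal sketch that joins every thread before stopping the stopwatch, instead of polling a counter. `ThreadStartTiming` and `MeasureMsPerThread` are hypothetical names, and the numbers it reports will depend on the machine and on whether a debugger is attached.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class ThreadStartTiming
{
    // Starts `count` threads and returns the average milliseconds per thread.
    // The clock stops only after every thread body has finished.
    public static double MeasureMsPerThread(int count)
    {
        int completed = 0;
        var threads = new Thread[count];
        var timer = Stopwatch.StartNew();

        for (int i = 0; i < count; i++)
        {
            threads[i] = new Thread(() => Interlocked.Increment(ref completed));
            threads[i].Start();
        }

        // Join blocks until each thread has exited, so no work is left unmeasured.
        foreach (Thread thread in threads)
            thread.Join();

        timer.Stop();
        if (completed != count)
            throw new InvalidOperationException("Not every thread ran.");
        return timer.Elapsed.TotalMilliseconds / count;
    }

    public static void Main()
    {
        Console.WriteLine(MeasureMsPerThread(1000) + " ms per thread");
    }
}
```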