Synchronization issue with usage of pthread_kill() to terminate thread blocked for I/O












I previously asked a question about how to terminate a thread that is blocked on I/O. I decided to use pthread_kill() rather than pthread_cancel() or writing to a pipe, because it has a few advantages in my case.



I have implemented code that sends a signal (SIGUSR2) to the target thread using pthread_kill(). The skeleton is below. Most of the time getTimeRemainedForNextEvent() returns a value that makes poll() block for several hours. Because of this large timeout, even after Thread2 sets terminateFlag (to stop Thread1), Thread2 stays blocked in pthread_join() until Thread1's poll() returns, which may take several hours if there are no events on the sockets. So I send a signal to Thread1 with pthread_kill() to interrupt the poll() system call if it is blocked.



    static void signalHandler(int signum) {
        // Does nothing; it only has to exist so that SIGUSR2 interrupts poll().
    }

    // Thread 1 (does I/O operations and handles scheduler events).

    void* Thread1(void* args) {
        terminateFlag = 0;
        while (!terminateFlag) {
            // Calculate the maximum number of milliseconds poll() can block.
            int millis = getTimeRemainedForNextEvent();

            int ret = poll(fds, numOfFDs, millis);
            if (ret > 0) {
                // Handle socket events.
            } else if (ret < 0) {
                // EINTR just means a signal interrupted poll(); report other errors.
                if (errno != EINTR)
                    perror("Poll Error");
                break;
            }

            handleEvent();
        }
        return NULL;
    }

    // Thread 2 (terminates Thread 1 when Thread 1 needs to be terminated).

    void* Thread2(void* args) {
        while (1) {

            /* Do other stuff */

            if (terminateThread1) {
                terminateFlag = 1;
                pthread_kill(ftid, SIGUSR2); // ftid is the pthread_t of Thread1
                pthread_join(ftid, NULL);
            }
        }

        /* Do other stuff */
    }


The code above works fine if Thread2 sets terminateFlag and sends the signal while Thread1 is blocked in the poll() system call. But if a context switch happens after Thread1 returns from getTimeRemainedForNextEvent() and before it enters poll(), and Thread2 sets terminateFlag and sends the signal in that window, the signal is handled before poll() starts. It therefore interrupts nothing, and Thread1's poll() then blocks for several hours.



It seems I cannot use a mutex for synchronization, because Thread1 would hold the lock for as long as poll() stays blocked. Is there a synchronization mechanism I can apply to avoid this issue?










  • You might look into using ppoll() with an appropriate signal mask instead of poll().

    – Shawn
    Nov 13 '18 at 16:00


































c++ c linux multithreading synchronization






asked Nov 13 '18 at 13:53









Durgesh













3 Answers














First of all, access to the shared variable terminateFlag by multiple threads must be protected by a mutex or a similar synchronization mechanism; otherwise the program contains a data race, does not conform, and all bets are off. That might, for instance, look like this:



    void *Thread1(void *args) {
        pthread_mutex_lock(&a_mutex);
        terminateFlag = 0;
        while (!terminateFlag) {
            pthread_mutex_unlock(&a_mutex);

            // ...

            pthread_mutex_lock(&a_mutex);
        }
        pthread_mutex_unlock(&a_mutex);
        return NULL;
    }

    void *Thread2(void *args) {
        // ...

        if (terminateThread1) {
            pthread_mutex_lock(&a_mutex);
            terminateFlag = 1;
            pthread_mutex_unlock(&a_mutex);
            pthread_kill(ftid, SIGUSR2); // ftid is the pthread_t of Thread1
            pthread_join(ftid, NULL);
        }

        // ...
    }


But that does not solve the main problem, that a signal sent by thread 2 may be delivered to thread 1 after it tests terminateFlag but before it calls poll(), though it does narrow the window in which that could happen.



The cleanest solution is the one already suggested in @PaulSanders' answer: have thread 2 wake thread 1 via a file descriptor that thread 1 is polling (e.g. by means of a pipe). Since you seem to have a plausible reason to seek an alternative, however, it should also be possible to make your signaling approach work by appropriate use of signal masking. Expanding on @Shawn's comment, here's how it would work:




  1. The parent thread blocks SIGUSR2 before starting thread 1, so that the latter, which inherits its signal mask from its parent, starts with that signal blocked.


  2. Thread 1 uses ppoll() instead of poll(), so as to be able to specify that SIGUSR2 will be unblocked for the duration of that call. ppoll() does signal mask handling atomically, so that there is no opportunity for a signal to be lost when it is blocked before the call and unblocked within.


  3. Thread 2 uses pthread_kill() to send SIGUSR2 to thread 1 to make it stop. Because that signal is only unblocked for that thread when it is performing a ppoll() call, it will not be lost (blocked signals remain pending until unblocked). This is precisely the kind of usage scenario for which ppoll() is designed.


  4. You should even be able to do away with the terminateFlag variable and its associated synchronization, because you can rely on the signal being delivered during a ppoll() call and therefore on the EINTR code path being exercised. That path does not depend on terminateFlag to make the thread stop.
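The steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the asker's actual code: start_worker(), stop_worker(), worker() and wake_handler() are hypothetical names, the worker polls no descriptors (a real one would pass its sockets), and the one-hour timeout stands in for getTimeRemainedForNextEvent(). It assumes Linux with glibc, where ppoll() requires _GNU_SOURCE.

```c
#define _GNU_SOURCE             /* ppoll() is a Linux/glibc extension */
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <signal.h>
#include <stdint.h>
#include <time.h>

/* The handler does nothing; it only has to exist so that SIGUSR2
 * interrupts ppoll() instead of terminating the process. */
static void wake_handler(int signum) { (void)signum; }

/* Worker thread: SIGUSR2 stays blocked except while inside ppoll(). */
static void *worker(void *arg)
{
    (void)arg;
    sigset_t during_poll;

    /* Start from the inherited mask (SIGUSR2 blocked), then unblock SIGUSR2. */
    pthread_sigmask(SIG_SETMASK, NULL, &during_poll);
    sigdelset(&during_poll, SIGUSR2);

    for (;;) {
        struct timespec timeout = { .tv_sec = 3600, .tv_nsec = 0 };
        /* The mask swap and the wait happen atomically inside ppoll(). */
        int ret = ppoll(NULL, 0, &timeout, &during_poll);
        if (ret < 0 && errno == EINTR)
            return (void *)(intptr_t)1;     /* woken by the signal: stop */
        if (ret < 0)
            return (void *)(intptr_t)-1;    /* unexpected error */
        /* ret >= 0: handle socket or timer events here, then loop. */
    }
}

/* Step 1: install the handler and block SIGUSR2 before creating the thread. */
int start_worker(pthread_t *tid)
{
    struct sigaction sa = {0};
    sa.sa_handler = wake_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR2, &sa, NULL) != 0)
        return -1;

    sigset_t block;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR2);
    if (pthread_sigmask(SIG_BLOCK, &block, NULL) != 0)
        return -1;

    return pthread_create(tid, NULL, worker, NULL);
}

/* Step 3: signal the worker and wait for it. Returns the worker's result. */
int stop_worker(pthread_t tid)
{
    void *result = NULL;
    if (pthread_kill(tid, SIGUSR2) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    return (int)(intptr_t)result;
}
```

Calling start_worker(&tid) and later stop_worker(tid) should return promptly even when the signal is sent before the worker first reaches ppoll(), because a blocked signal stays pending until ppoll() unblocks it.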







  • You don't need a mutex to protect the shared variable. std::atomic is sufficient. But +1 for ppoll.

    – Paul Sanders
    Nov 13 '18 at 19:38













  • I'm implementing ppoll(). If Thread2 sets terminateFlag and sends the signal to Thread1 while execution is between ppoll() and the end of the while() loop, Thread1 exits without handling the signal (as it is blocked). In this case, would the signal sent by Thread2 stay pending or be discarded?

    – Durgesh
    Nov 16 '18 at 9:43











  • @Durgesh, I'm having trouble finding explicit documentation to this effect, but as far as I am aware, if a thread terminates while it has a signal pending then that signal is effectively ignored. However, I do recommend that you drop the terminateFlag variable, as you have to do the signaling regardless, and that does not need the variable. At most, make the variable a local one, and have the thread set it itself on the EINTR code path instead of expecting some other thread to set it.

    – John Bollinger
    Nov 16 '18 at 14:58













  • @JohnBollinger I agree with you about setting the flag locally within the thread on EINTR. But is there any way to set this flag only for the SIGUSR2 signal? I mean, is there any way to know which signal interrupted the ppoll() system call?

    – Durgesh
    Nov 22 '18 at 5:55













  • @Durgesh, in a single-threaded scenario, you could have the signal handler record the signal number in a variable of type volatile sig_atomic_t, but for a multithreaded scenario such as yours, I don't see how you bind that to the right thread. Your best bet is probably to block all signals that can be blocked and whose disposition is something other than termination, and have ppoll() unblock only SIGUSR2.

    – John Bollinger
    Nov 22 '18 at 15:59



















Consider having an additional file descriptor in the set of fds passed to poll whose sole job is to make poll return when you want to terminate the thread.



Thus, in thread 2 we would have something like:



    if (terminateThread1) {
        terminateFlag = 1;
        send(terminate_fd, " ", 1, 0);
        pthread_join(ftid, NULL);
    }


And terminate_fd would be in the set of fds passed to poll by thread 1.
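A rough sketch of this idea follows, assuming a pipe is used as the extra descriptor. wait_for_wakeup() and request_wakeup() are hypothetical names. Note that send() as written above needs a socket (for instance one end of a socketpair()); with a pipe, write() is used instead.

```c
#include <poll.h>
#include <unistd.h>

/* Wait up to timeout_ms for a wake-up on wake_fd (the read end of a pipe).
 * Returns 1 if a wake-up byte is available, 0 on timeout, -1 on error. */
int wait_for_wakeup(int wake_fd, int timeout_ms)
{
    struct pollfd fds[1];
    fds[0].fd = wake_fd;
    fds[0].events = POLLIN;
    fds[0].revents = 0;

    /* A real worker would list its sockets in the same array. */
    int ret = poll(fds, 1, timeout_ms);
    if (ret < 0)
        return -1;
    return (ret > 0 && (fds[0].revents & POLLIN)) ? 1 : 0;
}

/* Called by the terminating thread: one byte is enough to wake poll().
 * Returns 0 on success, -1 on failure. */
int request_wakeup(int wake_write_fd)
{
    char byte = ' ';
    return write(wake_write_fd, &byte, 1) == 1 ? 0 : -1;
}
```

Unlike a signal, the byte stays in the pipe until it is read, so it does not matter whether request_wakeup() runs before or after the worker enters poll().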



-- OR --



If the overhead of an extra fd per thread is too much (as discussed in the comments), then send something that thread 1 will ignore to one of its existing fds. This makes poll return, after which thread 1 can check the flag and terminate. You can even have this 'special' value act as the terminate flag itself, which makes the logic a little tidier.






  • There can be more than 5000 threads which are similar to Thread1. I can't use dedicated FD for each and every thread.

    – Durgesh
    Nov 13 '18 at 14:57











  • Why not? A file descriptor is not particularly heavyweight. How many fds is each thread waiting on? That would give a better estimate of the overhead of this approach.

    – Paul Sanders
    Nov 13 '18 at 15:00











  • Ours is a live streaming media server. A dedicated thread is spawned for every connected camera, and each thread waits on 2 sockets. So, if 20000 cameras are connected, the process needs 60K FDs once a dedicated termination FD per thread is counted. This limits the maximum number of cameras per server, as our per-process FD limit is set to 60K.

    – Durgesh
    Nov 13 '18 at 15:10











  • Perhaps you can send something (which thread 1 would ignore) to one of your existing fds. Waking up the poll by sending something to it is the key to this.

    – Paul Sanders
    Nov 13 '18 at 15:14













  • Ok. I will check the possibility of implementing such logic, if I don't get resolution to the problem that I had mentioned in my question.

    – Durgesh
    Nov 13 '18 at 15:18





















As you say yourself, you could use thread cancellation to solve this.
Outside of thread cancellation, I don't think there's a "right" way to solve this within POSIX (waking up the poll call with a write isn't exactly a generic method that would work for all situations in which a thread might get blocked), because POSIX's paradigm for making syscalls
and handling signals simply doesn't allow you to close the gap between a flag check and a potentially long blocking call.



    void handler(int sig) { dont_enter_a_long_blocking_call_flg = 1; }
    int main()
    {   //...
        if (!dont_enter_a_long_blocking_call_flg)
            //THE GAP; what if the signal arrives here?
            potentially_long_blocking_call();
        //....
    }


The musl libc library uses signals for thread cancellation (because signals can break long-blocking calls that are in kernel mode)
and it uses them in conjunction with global assembly labels so that from the flag setting SIGCANCEL handler, it can do
(conceptually, I'm not pasting their actual code):



    void sigcancel_handler(int Sig, siginfo_t *Info, void *Uctx)
    {
        thread_local_cancellation_flag = 1;
        if_interrupted_the_gap_move_Program_Counter_to_start_cancellation(Uctx);
    }


Now suppose you changed if_interrupted_the_gap_move_Program_Counter_to_start_cancellation(Uctx); to if_interrupted_the_gap_move_Program_Counter_to_make_the_syscall_fail(Uctx); and exported that function along with thread_local_cancellation_flag.

Then you could use it to:




  • solve your problem robustly

  • implement robust signal-based cancellation with any signal, without having to put any of that pthread_cleanup_{push,pop} stuff into your already working, thread-safe, single-threaded code

  • ensure a normal-context reaction to a signal delivery in your target thread even if the signal is caught.


Basically, without a libc extension like this, once you kill()/pthread_kill() a process/thread with a signal it handles, or put a function on a signal-sending timer, you cannot be sure of a reaction to the signal delivery: the target may well receive the signal in a gap like the one above and hang indefinitely instead of responding to it.



I've implemented such a libc extension on top of musl libc and published it at https://github.com/pskocik/musl. The SIGNAL_EXAMPLES directory also shows some kill(), pthread_kill(), and setitimer() examples that, under a demonstrated race condition, hang with classical libcs but don't with my extended musl. You can use that extended musl to solve your problem cleanly, and I also use it in my personal project to do robust thread cancellation without having to litter my code with pthread_cleanup_{push,pop}.



The obvious downside of this approach is that it's unportable and I only have it implemented for x86_64 musl. I've published it today in the hope that somebody (Cygwin, MacOSX?) copies it, because I think it's the right way to do cancellation in C.



In C++ and with glibc, you could utilize the fact that glibc uses exceptions to implement thread cancellation and simply use pthread_cancel (which uses a signal (SIGCANCEL) underneath) but catch it instead of letting it kill the thread.






share|improve this answer

























    Your Answer






    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "1"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53282566%2fsynchronization-issue-with-usage-of-pthread-kill-to-terminate-thread-blocked-f%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    3 Answers
    3






    active

    oldest

    votes








    3 Answers
    3






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    2














    In the first place, access to shared variable terminateFlag by multiple threads must be protected by a mutex or similar synchronization mechanism, else your program does not conform and all bets are off. That might, for instance, look like this:



    void *Thread1(void *args) {
    pthread_mutex_lock(&a_mutex);
    terminateFlag = 0;
    while(!terminateFlag) {
    pthread_mutex_unlock(&a_mutex);

    // ...

    pthread_mutex_lock(&a_mutex);
    }
    pthread_mutex_unlock(&a_mutex);
    }

    void* Thread2(void* args) {
    // ...

    if (terminateThread1) {
    pthread_mutex_lock(&a_mutex);
    terminateFlag = 1;
    pthread_mutex_unlock(&a_mutex);
    pthread_kill(ftid,SIGUSR2); //ftid is pthread_t variable of Thread1
    pthread_join( ftid, NULL );
    }

    // ...
    }


    But that does not solve the main problem, that a signal sent by thread 2 may be delivered to thread 1 after it tests terminateFlag but before it calls poll(), though it does narrow the window in which that could happen.



    The cleanest solution is that suggested already by @PaulSanders' answer: have thread 2 wake thread 1 via a file descriptor that thread 1 is polling (i.e. by means of a pipe). Inasmuch as you seem to have a plausible reason to seek an alternative approach, however, it should also be possible to make your signaling approach work by appropriate use of signal masking. Expanding on @Shawn's comment, here's how it would work:




    1. The parent thread blocks SIGUSR2 before starting thread 1, so that the latter, which inherits its signal mask from its parent, starts with that signal blocked.


    2. Thread 1 uses ppoll() instead of poll(), so as to be able to specify that SIGUSR2 will be unblocked for the duration of that call. ppoll() does signal mask handling atomically, so that there is no opportunity for a signal to be lost when it is blocked before the call and unblocked within.


    3. Thread 2 uses pthread_kill() to send SIGUSR2 to thread 1 to make it stop. Because that signal is only unblocked for that thread when it is performing a ppoll() call, it will not be lost (blocked signals remain pending until unblocked). This is precisely the kind of usage scenario for which ppoll() is designed.


    4. You should even be able to do away with the terminateThread variable and associated synchronization, because you should be able to rely upon the signal being delivered during a ppoll() call and therefore causing the EINTR code path to be exercised. That path does not rely on terminateThread to make the thread stop.







    share|improve this answer
























    • You don't need a mutex to protect the shared variable. std::atomic is sufficient. But +1 for ppoll.

      – Paul Sanders
      Nov 13 '18 at 19:38













    • I'm implementing ppoll(). If Thread2 set terminateFlag and sent signal to Thread1 when execution is between ppoll() and end of while() loop, Thread1 exits without handling the signal (as it is blocked). In this case, Does the signal sent by Thread2 would be in pending state or discarded ?

      – Durgesh
      Nov 16 '18 at 9:43











    • @Durgesh, I'm having trouble finding explicit documentation to this effect, but as far as I am aware, if a thread terminates while it has a signal pending then that signal is effectively ignored. However, I do recommend that you drop the terminateThread variables, as you have to do the signaling regardless, and that does not need the variable. At most, make the variable a local one, and have the thread set it itself on the EINTR code path instead of expecting some other thread to set it.

      – John Bollinger
      Nov 16 '18 at 14:58













    • @JohnBollinger I agree with you regarding setting flag locally within the thread on EINTR. But is there any way to set this flag only for SIGUSR2 signal. I mean, is there any way to know which signal interrupted ppoll() system call ?

      – Durgesh
      Nov 22 '18 at 5:55













    • @Durgesh, in a single-threaded-scenario, you could have the signal handler record the signal number in a variable of type volatile sigatomic_t, but for a multithreaded scenario such as yours, I don't see how you bind that to the right thread. Your best bet is probably to block all signals that can be blocked and whose disposition is something other than termination, and have ppoll unblock only SIGUSR2.

      – John Bollinger
      Nov 22 '18 at 15:59
















    2














    In the first place, access to shared variable terminateFlag by multiple threads must be protected by a mutex or similar synchronization mechanism, else your program does not conform and all bets are off. That might, for instance, look like this:



    void *Thread1(void *args) {
    pthread_mutex_lock(&a_mutex);
    terminateFlag = 0;
    while(!terminateFlag) {
    pthread_mutex_unlock(&a_mutex);

    // ...

    pthread_mutex_lock(&a_mutex);
    }
    pthread_mutex_unlock(&a_mutex);
    }

    void* Thread2(void* args) {
    // ...

    if (terminateThread1) {
    pthread_mutex_lock(&a_mutex);
    terminateFlag = 1;
    pthread_mutex_unlock(&a_mutex);
    pthread_kill(ftid,SIGUSR2); //ftid is pthread_t variable of Thread1
    pthread_join( ftid, NULL );
    }

    // ...
    }


    But that does not solve the main problem, that a signal sent by thread 2 may be delivered to thread 1 after it tests terminateFlag but before it calls poll(), though it does narrow the window in which that could happen.



    The cleanest solution is that suggested already by @PaulSanders' answer: have thread 2 wake thread 1 via a file descriptor that thread 1 is polling (i.e. by means of a pipe). Inasmuch as you seem to have a plausible reason to seek an alternative approach, however, it should also be possible to make your signaling approach work by appropriate use of signal masking. Expanding on @Shawn's comment, here's how it would work:




    1. The parent thread blocks SIGUSR2 before starting thread 1, so that the latter, which inherits its signal mask from its parent, starts with that signal blocked.


    2. Thread 1 uses ppoll() instead of poll(), so as to be able to specify that SIGUSR2 will be unblocked for the duration of that call. ppoll() does signal mask handling atomically, so that there is no opportunity for a signal to be lost when it is blocked before the call and unblocked within.


    3. Thread 2 uses pthread_kill() to send SIGUSR2 to thread 1 to make it stop. Because that signal is only unblocked for that thread when it is performing a ppoll() call, it will not be lost (blocked signals remain pending until unblocked). This is precisely the kind of usage scenario for which ppoll() is designed.


    4. You should even be able to do away with the terminateThread variable and associated synchronization, because you should be able to rely upon the signal being delivered during a ppoll() call and therefore causing the EINTR code path to be exercised. That path does not rely on terminateThread to make the thread stop.







    share|improve this answer
























    • You don't need a mutex to protect the shared variable. std::atomic is sufficient. But +1 for ppoll.

      – Paul Sanders
      Nov 13 '18 at 19:38













    • I'm implementing ppoll(). If Thread2 set terminateFlag and sent signal to Thread1 when execution is between ppoll() and end of while() loop, Thread1 exits without handling the signal (as it is blocked). In this case, Does the signal sent by Thread2 would be in pending state or discarded ?

      – Durgesh
      Nov 16 '18 at 9:43











    • @Durgesh, I'm having trouble finding explicit documentation to this effect, but as far as I am aware, if a thread terminates while it has a signal pending then that signal is effectively ignored. However, I do recommend that you drop the terminateThread variables, as you have to do the signaling regardless, and that does not need the variable. At most, make the variable a local one, and have the thread set it itself on the EINTR code path instead of expecting some other thread to set it.

      – John Bollinger
      Nov 16 '18 at 14:58













    • @JohnBollinger I agree with you regarding setting flag locally within the thread on EINTR. But is there any way to set this flag only for SIGUSR2 signal. I mean, is there any way to know which signal interrupted ppoll() system call ?

      – Durgesh
      Nov 22 '18 at 5:55













    • @Durgesh, in a single-threaded-scenario, you could have the signal handler record the signal number in a variable of type volatile sigatomic_t, but for a multithreaded scenario such as yours, I don't see how you bind that to the right thread. Your best bet is probably to block all signals that can be blocked and whose disposition is something other than termination, and have ppoll unblock only SIGUSR2.

      – John Bollinger
      Nov 22 '18 at 15:59














    2












    2








    2







    In the first place, access to shared variable terminateFlag by multiple threads must be protected by a mutex or similar synchronization mechanism, else your program does not conform and all bets are off. That might, for instance, look like this:



    void *Thread1(void *args) {
    pthread_mutex_lock(&a_mutex);
    terminateFlag = 0;
    while(!terminateFlag) {
    pthread_mutex_unlock(&a_mutex);

    // ...

    pthread_mutex_lock(&a_mutex);
    }
    pthread_mutex_unlock(&a_mutex);
    }

    void* Thread2(void* args) {
    // ...

    if (terminateThread1) {
    pthread_mutex_lock(&a_mutex);
    terminateFlag = 1;
    pthread_mutex_unlock(&a_mutex);
    pthread_kill(ftid,SIGUSR2); //ftid is pthread_t variable of Thread1
    pthread_join( ftid, NULL );
    }

    // ...
    }


    But that does not solve the main problem, that a signal sent by thread 2 may be delivered to thread 1 after it tests terminateFlag but before it calls poll(), though it does narrow the window in which that could happen.



    The cleanest solution is that suggested already by @PaulSanders' answer: have thread 2 wake thread 1 via a file descriptor that thread 1 is polling (i.e. by means of a pipe). Inasmuch as you seem to have a plausible reason to seek an alternative approach, however, it should also be possible to make your signaling approach work by appropriate use of signal masking. Expanding on @Shawn's comment, here's how it would work:




    1. The parent thread blocks SIGUSR2 before starting thread 1, so that the latter, which inherits its signal mask from its parent, starts with that signal blocked.


    2. Thread 1 uses ppoll() instead of poll(), so as to be able to specify that SIGUSR2 will be unblocked for the duration of that call. ppoll() does signal mask handling atomically, so that there is no opportunity for a signal to be lost when it is blocked before the call and unblocked within.


    3. Thread 2 uses pthread_kill() to send SIGUSR2 to thread 1 to make it stop. Because that signal is only unblocked for that thread when it is performing a ppoll() call, it will not be lost (blocked signals remain pending until unblocked). This is precisely the kind of usage scenario for which ppoll() is designed.


    4. You should even be able to do away with the terminateFlag variable and associated synchronization, because you should be able to rely upon the signal being delivered during a ppoll() call and therefore causing the EINTR code path to be exercised. That path does not rely on terminateFlag to make the thread stop.
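
    The four steps above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the asker's actual code: the names on_sigusr2, worker and run_ppoll_demo are hypothetical, the worker polls no descriptors at all (a real thread would pass its sockets), and ppoll() is a Linux-specific call that needs _GNU_SOURCE with glibc. The signal is sent immediately after the thread is created, so it may well arrive before the worker reaches ppoll(); because SIGUSR2 is blocked at that point, it stays pending and interrupts the call as soon as ppoll() installs its temporary mask.

```c
#define _GNU_SOURCE            /* ppoll() is a Linux extension */
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <signal.h>
#include <time.h>

/* No-op handler: it exists only so that SIGUSR2 interrupts ppoll() with EINTR
 * instead of triggering the default action (process termination). */
static void on_sigusr2(int signum) { (void)signum; }

/* Hypothetical worker: SIGUSR2 is blocked everywhere except inside ppoll(). */
static void *worker(void *arg)
{
    int *result = arg;
    sigset_t during_poll;

    /* Copy the current mask (inherited with SIGUSR2 blocked), then remove SIGUSR2. */
    pthread_sigmask(SIG_SETMASK, NULL, &during_poll);
    sigdelset(&during_poll, SIGUSR2);

    for (;;) {
        struct timespec timeout = { .tv_sec = 3600, .tv_nsec = 0 };
        /* Assumption: no descriptors here; a real thread would pass its sockets. */
        int ret = ppoll(NULL, 0, &timeout, &during_poll);
        if (ret < 0) {
            *result = errno;   /* EINTR when SIGUSR2 was delivered during ppoll() */
            break;
        }
        /* ret >= 0: handle socket and scheduler events here, then loop again. */
    }
    return NULL;
}

/* Returns the errno that ended the worker (EINTR expected), or -1 on setup failure. */
int run_ppoll_demo(void)
{
    struct sigaction sa = { 0 };
    sigset_t block;
    pthread_t tid;
    int result = 0;

    sa.sa_handler = on_sigusr2;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR2, &sa, NULL) != 0)
        return -1;

    /* Step 1: block SIGUSR2 before creating the thread, so the mask is inherited. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR2);
    if (pthread_sigmask(SIG_BLOCK, &block, NULL) != 0)
        return -1;

    if (pthread_create(&tid, NULL, worker, &result) != 0)
        return -1;

    /* Step 3: may arrive before the worker reaches ppoll(); it then stays pending. */
    pthread_kill(tid, SIGUSR2);
    pthread_join(tid, NULL);
    return result;
}
```

    Calling run_ppoll_demo() from main() should return EINTR almost immediately rather than after the one-hour timeout, whichever side of the ppoll() call the signal lands on.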

















    answered Nov 13 '18 at 17:53









    John Bollinger













    • You don't need a mutex to protect the shared variable. std::atomic is sufficient. But +1 for ppoll.

      – Paul Sanders
      Nov 13 '18 at 19:38













    • I'm implementing ppoll(). If Thread2 sets terminateFlag and sends the signal to Thread1 while execution is between ppoll() and the end of the while() loop, Thread1 exits without handling the signal (as it is blocked). In this case, would the signal sent by Thread2 remain pending or be discarded?

      – Durgesh
      Nov 16 '18 at 9:43











    • @Durgesh, I'm having trouble finding explicit documentation to this effect, but as far as I am aware, if a thread terminates while it has a signal pending then that signal is effectively ignored. However, I do recommend that you drop the terminateFlag variable, as you have to do the signaling regardless, and that does not need the variable. At most, make the variable a local one, and have the thread set it itself on the EINTR code path instead of expecting some other thread to set it.

      – John Bollinger
      Nov 16 '18 at 14:58













    • @JohnBollinger I agree with you about setting the flag locally within the thread on EINTR. But is there any way to set this flag only for the SIGUSR2 signal? I mean, is there any way to know which signal interrupted the ppoll() system call?

      – Durgesh
      Nov 22 '18 at 5:55













    • @Durgesh, in a single-threaded scenario, you could have the signal handler record the signal number in a variable of type volatile sig_atomic_t, but for a multithreaded scenario such as yours, I don't see how you would bind that to the right thread. Your best bet is probably to block all signals that can be blocked and whose disposition is something other than termination, and have ppoll() unblock only SIGUSR2.

      – John Bollinger
      Nov 22 '18 at 15:59
































    2














    Consider having an additional file descriptor in the set of fds passed to poll whose sole job is to make poll return when you want to terminate the thread.



    Thus, in thread 2 we would have something like:



    if (terminateThread1) {
        terminateFlag = 1;
        send(terminate_fd, " ", 1, 0);  // wake up poll() in thread 1
        pthread_join(ftid, NULL);
    }


    And terminate_fd would be in the set of fds passed to poll by thread 1.



    -- OR --



    If the overhead of having an extra fd per thread is too much (as discussed in the comments) then send something to one of the existing fds that thread 1 ignores. This will cause poll to return and then thread 1 will terminate. You can even have this 'special' value act as the terminate flag, which makes the logic a little tidier.
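
    A minimal sketch of the dedicated-descriptor variant might look like the following. It is an illustration under assumptions, not code from the question: it uses a pipe and write() rather than a socket and send(), the names worker and run_pipe_demo are hypothetical, and the worker polls only the wake-up descriptor, whereas a real thread would list its sockets in the same array.

```c
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical worker: blocks in poll() until the wake-up descriptor is readable. */
static void *worker(void *arg)
{
    int wake_fd = *(int *)arg;
    /* Assumption: a real thread would add its sockets to this array. */
    struct pollfd fds[1] = { { .fd = wake_fd, .events = POLLIN } };

    for (;;) {
        int ret = poll(fds, 1, -1);            /* -1: no timeout */
        if (ret > 0 && (fds[0].revents & POLLIN))
            break;                             /* wake-up byte arrived: terminate */
        if (ret < 0 && errno != EINTR)
            break;                             /* unexpected poll() failure */
    }
    return NULL;
}

/* Returns 0 once the worker has been woken and joined, -1 on failure. */
int run_pipe_demo(void)
{
    int pipefd[2];
    pthread_t tid;

    if (pipe(pipefd) != 0)
        return -1;
    if (pthread_create(&tid, NULL, worker, &pipefd[0]) != 0)
        return -1;

    /* The byte stays in the pipe, so it cannot be lost even if it is written
     * before the worker reaches poll(). */
    if (write(pipefd[1], " ", 1) != 1)
        return -1;

    pthread_join(tid, NULL);
    close(pipefd[0]);
    close(pipefd[1]);
    return 0;
}
```

    Unlike a signal, the written byte persists until it is read, which is why this approach has no gap between the flag check and the blocking call. A pipe costs two descriptors per thread; on Linux, eventfd() would need only one.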
































    • There can be more than 5000 threads similar to Thread1. I can't use a dedicated FD for each and every thread.

      – Durgesh
      Nov 13 '18 at 14:57











    • Why not? A file descriptor is not particularly heavyweight. How many fds is each thread waiting on? That would give a better estimate of the overhead of this approach.

      – Paul Sanders
      Nov 13 '18 at 15:00











    • Ours is a live streaming media server. For each connected camera, a dedicated thread is spawned, and each thread waits on 2 sockets. So, if 20000 cameras are connected, the process needs 60K FDs once a dedicated termination FD per thread is included. This limits the maximum number of cameras per server, as our per-process FD limit is set to 60K.

      – Durgesh
      Nov 13 '18 at 15:10











    • Perhaps you can send something (which thread 1 would ignore) to one of your existing fds. Waking up the poll by sending something to it is the key to this.

      – Paul Sanders
      Nov 13 '18 at 15:14













    • Ok. I will check the possibility of implementing such logic, if I don't get resolution to the problem that I had mentioned in my question.

      – Durgesh
      Nov 13 '18 at 15:18


























    edited Nov 13 '18 at 19:36

























    answered Nov 13 '18 at 14:51









    Paul Sanders


























    1














    As you say yourself, you could use thread cancellation to solve this. Outside of thread cancellation, I don't think there's a "right" way to solve this within POSIX (waking up the poll call with a write isn't exactly a generic method that would work for all situations in which a thread might get blocked), because POSIX's paradigm for making syscalls and handling signals simply doesn't allow you to close the gap between a flag check and a potentially long blocking call.



    volatile sig_atomic_t dont_enter_a_long_blocking_call_flg;
    void handler(int sig) { (void)sig; dont_enter_a_long_blocking_call_flg = 1; }
    int main()
    {   //...
        if (!dont_enter_a_long_blocking_call_flg)
            //THE GAP; what if the signal arrives here?
            potentially_long_blocking_call();
        //....
    }


    The musl libc library uses signals for thread cancellation (because signals can break long-blocking calls that are in kernel mode), and it uses them in conjunction with global assembly labels so that from the flag-setting SIGCANCEL handler, it can do (conceptually, I'm not pasting their actual code):



    void sigcancel_handler(int Sig, siginfo_t *Info, void *Uctx)
    {
        thread_local_cancellation_flag = 1;
        if_interrupted_the_gap_move_Program_Counter_to_start_cancellation(Uctx);
    }


    Now if you changed if_interrupted_the_gap_move_Program_Counter_to_start_cancellation(Uctx); to if_interrupted_the_gap_move_Program_Counter_to_make_the_syscall_fail(Uctx); and exported that function along with the thread_local_cancellation_flag, then you could use it to:

    • solve your problem robustly

    • implement robust signal cancellation with any signal, without having to put any of that pthread_cleanup_{push,pop} stuff into your already working thread-safe, single-threaded code

    • ensure an assured normal-context reaction to a signal delivery in your target thread even if the signal is caught.


    Basically, without a libc extension like this, once you kill()/pthread_kill() a process/thread with a signal it handles, or put a function on a signal-sending timer, you cannot be sure of a reaction to the signal delivery, because the target may well receive the signal in a gap like the one above and hang indefinitely instead of responding to it.



    I've implemented such a libc extension on top of musl libc and published it at https://github.com/pskocik/musl. The SIGNAL_EXAMPLES directory also shows some kill(), pthread_kill(), and setitimer() examples that, under a demonstrated race condition, hang with classical libcs but don't with my extended musl. You can use that extended musl to solve your problem cleanly, and I also use it in my personal project to do robust thread cancellation without having to litter my code with pthread_cleanup_{push,pop}.



    The obvious downside of this approach is that it's unportable and I only have it implemented for x86_64 musl. I've published it today in the hope that somebody (Cygwin, MacOSX?) copies it, because I think it's the right way to do cancellation in C.



    In C++ and with glibc, you could utilize the fact that glibc uses exceptions to implement thread cancellation and simply use pthread_cancel (which uses a signal (SIGCANCEL) underneath) but catch it instead of letting it kill the thread.






    share|improve this answer






























      1














      As you say yourself, you could use thread cancellation to solve this.
      Outside of thread cancellation, I don't think there's a "right" way to solve this within POSIX (waking up the poll call with a write isn't exactly a generic method that would work for all situations in which a thread might get blocked), because POSIX's paradigm for making syscalls
      and handling signals simply doesn't allow you to close the gap between a flag check and a potentially long blocking call.



      void handler() { dont_enter_a_long_blocking_call_flg=1; }
      int main()
      { //...
      if(dont_enter_a_long_blocking_call_flg)
      //THE GAP; what if the signal arrives here ?
      potentially_long_blocking_call();
      //....
      }


      The musl libc library uses signals for thread cancellation (because signals can break long-blocking calls that are in kernel mode)
      and it uses them in conjunction with global assembly labels so that from the flag setting SIGCANCEL handler, it can do
      (conceptually, I'm not pasting their actual code):



      void sigcancel_handler(int Sig, siginfo_t *Info, void *Uctx)
      {
      thread_local_cancellation_flag=1;
      if_interrupted_the_gap_move_Program_Counter_to_start_cancellation(Uctx);
      }


      Now if you changed if_interrupted_the_gap_move_Program_Counter_to_start_cancellation(Uctx);
      to if_interrupted_the_gap_move_Program_Counter_to_make_the_syscall_fail(Uctx); and exported the if_interrupted_the_gap_move_Program_Counter_to_make_the_syscall_fail function along with the thread_local_cancellation_flag.



      then you can use it to:




      • solve your problem robustly
        implement robust signal cancelation with any signal without having to put any of that pthread_cleanup_{push,pop} stuff into your already working thread-safe singel threaded code

      • ensure assured normal-context reaction to a signal delivery in your target thread even if the signal is caught.


      Basically without a libc extension like this, if you once kill()/pthread_kill() a process/thread with a signal it handles or if put a function on a signal-sending timer, you cannot be sure of an assured reaction to the signal delivery, as the target may well receive the signal in a gap like above and hang indefinitely instead of responding to it.



      I've implemented such a libc extension on top of musl libc and published it now https://github.com/pskocik/musl. The SIGNAL_EXAMPLES directory also shows some kill(), pthread_kill, and setitimer() examples that under a demonstrated race condition hang with classical libcs but don't wit my extended musl. You can use that extended musl to solve your problem cleanly and I also use it in my personal project to do robust thread cancellation without having to litter my code with pthread_cleanup_{push,pop}.



      The obvious downside of this approach is that it's unportable and I only have it implemented for x86_64 musl. I've published it today in the hope that somebody (Cygwin, MacOSX?) copies it, because I think it's the right way to do cancellation in C.



      In C++ and with glibc, you could utilize the fact that glibc uses exceptions to implement thread cancellation and simply use pthread_cancel (which uses a signal (SIGCANCEL) underneath) but catch it instead of letting it kill the thread.






      share|improve this answer




























        1












        1








        1







        As you say yourself, you could use thread cancellation to solve this.
        Outside of thread cancellation, I don't think there's a "right" way to solve this within POSIX (waking up the poll call with a write isn't exactly a generic method that would work for all situations in which a thread might get blocked), because POSIX's paradigm for making syscalls
        and handling signals simply doesn't allow you to close the gap between a flag check and a potentially long blocking call.



        void handler() { dont_enter_a_long_blocking_call_flg=1; }
        int main()
        { //...
        if(dont_enter_a_long_blocking_call_flg)
        //THE GAP; what if the signal arrives here ?
        potentially_long_blocking_call();
        //....
        }


        The musl libc library uses signals for thread cancellation (because signals can break long-blocking calls that are in kernel mode)
        and it uses them in conjunction with global assembly labels so that from the flag setting SIGCANCEL handler, it can do
        (conceptually, I'm not pasting their actual code):



        void sigcancel_handler(int Sig, siginfo_t *Info, void *Uctx)
        {
        thread_local_cancellation_flag=1;
        if_interrupted_the_gap_move_Program_Counter_to_start_cancellation(Uctx);
        }


        Now if you changed if_interrupted_the_gap_move_Program_Counter_to_start_cancellation(Uctx);
        to if_interrupted_the_gap_move_Program_Counter_to_make_the_syscall_fail(Uctx); and exported the if_interrupted_the_gap_move_Program_Counter_to_make_the_syscall_fail function along with the thread_local_cancellation_flag.



        then you can use it to:




        • solve your problem robustly
          implement robust signal cancelation with any signal without having to put any of that pthread_cleanup_{push,pop} stuff into your already working thread-safe singel threaded code

        • ensure assured normal-context reaction to a signal delivery in your target thread even if the signal is caught.


        Basically without a libc extension like this, if you once kill()/pthread_kill() a process/thread with a signal it handles or if put a function on a signal-sending timer, you cannot be sure of an assured reaction to the signal delivery, as the target may well receive the signal in a gap like above and hang indefinitely instead of responding to it.



        I've implemented such a libc extension on top of musl libc and published it now https://github.com/pskocik/musl. The SIGNAL_EXAMPLES directory also shows some kill(), pthread_kill, and setitimer() examples that under a demonstrated race condition hang with classical libcs but don't wit my extended musl. You can use that extended musl to solve your problem cleanly and I also use it in my personal project to do robust thread cancellation without having to litter my code with pthread_cleanup_{push,pop}.



        The obvious downside of this approach is that it's not portable, and I have only implemented it for x86_64 musl. I've published it today in the hope that somebody (Cygwin, MacOSX?) copies it, because I think it's the right way to do cancellation in C.
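For comparison, here is a portable technique that is different from the libc extension described above. It closes the same gap only for calls that accept a signal mask (pselect(), ppoll(), sigsuspend()), not for arbitrary blocking calls. The signal stays blocked around the flag check, and the blocking call unblocks it atomically on entry. This is a minimal single-threaded sketch; the helper name pselect_with_signal_in_gap and the flag masked_stop_flag are hypothetical, and a multithreaded program would use pthread_sigmask() in place of sigprocmask().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t masked_stop_flag = 0;

static void masked_on_sigusr2(int signum)
{
    (void)signum;
    masked_stop_flag = 1;
}

/*
 * Hypothetical demo helper. Returns 1 if pselect() failed with EINTR because
 * of the signal that arrived in the gap, 0 if it did not, -1 on setup error.
 */
int pselect_with_signal_in_gap(long timeout_sec)
{
    struct sigaction sa;
    sigset_t block_set, old_mask, wait_mask;
    struct timespec ts;
    int ret = 0, saved_errno = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = masked_on_sigusr2;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR2, &sa, NULL) != 0)
        return -1;

    /* Keep SIGUSR2 blocked everywhere except inside pselect(). */
    sigemptyset(&block_set);
    sigaddset(&block_set, SIGUSR2);
    if (sigprocmask(SIG_BLOCK, &block_set, &old_mask) != 0)
        return -1;
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGUSR2);   /* mask in effect while waiting */

    masked_stop_flag = 0;
    if (!masked_stop_flag) {
        /* THE GAP: the signal arrives but only becomes pending. */
        raise(SIGUSR2);

        ts.tv_sec = timeout_sec;
        ts.tv_nsec = 0;
        /* pselect() installs wait_mask and waits as one atomic step, so the
         * pending signal is delivered and the call fails with EINTR. */
        ret = pselect(0, NULL, NULL, NULL, &ts, &wait_mask);
        saved_errno = errno;
    }

    sigprocmask(SIG_SETMASK, &old_mask, NULL);   /* restore the caller's mask */
    return (ret == -1 && saved_errno == EINTR) ? 1 : 0;
}
```

With this pattern, the loop in the question would check terminateFlag with the signal blocked and then call ppoll() (Linux-specific) or pselect() with the unblocking mask, so a signal sent at any point after the check still interrupts the wait.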



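        In C++ with glibc, you could use the fact that glibc implements thread cancellation with exception-style stack unwinding: call pthread_cancel (which uses a signal, SIGCANCEL, underneath) and catch the unwind instead of letting it silently kill the thread. Be aware that glibc expects a caught cancellation exception to be rethrown, so the catch block is a place to react and clean up rather than a way to keep the thread running.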














        edited Nov 16 '18 at 10:03

























        answered Nov 14 '18 at 14:54









        PSkocik





























