Synchronization issue with usage of pthread_kill() to terminate thread blocked for I/O
Previously I asked a question about how to terminate a thread blocked on I/O. I chose pthread_kill() over pthread_cancel() or writing to pipes because of a few advantages.
I have implemented code that sends a signal (SIGUSR2) to the target thread using pthread_kill(). Below is the skeleton code. Most of the time getTimeRemainedForNextEvent() returns a value that makes poll() block for several hours. Because of this large timeout, even if Thread2 sets terminateFlag (to stop Thread1), Thread2 stays blocked in pthread_join() until Thread1's poll() returns (which might take several hours if there are no events on the sockets). So I send a signal to Thread1 using pthread_kill() to interrupt the poll() system call (if it is blocked).
    static void signalHandler(int signum) {
        //Does nothing
    }

    // Thread 1 (Does I/O operations and handles scheduler events).
    void* Thread1(void* args) {
        terminateFlag = 0;
        while(!terminateFlag) {
            int millis = getTimeRemainedForNextEvent(); //calculate maximum number of milliseconds poll() can block.
            int ret = poll(fds,numOfFDs,millis);
            if(ret > 0) {
                //handle socket events.
            } else if (ret < 0) {
                if(errno == EINTR)
                    perror("Poll Error");
                break;
            }
            handleEvent();
        }
    }

    // Thread 2 (Terminates Thread 1 when Thread 1 needs to be terminated)
    void* Thread2(void* args) {
        while(1) {
            /* Do other stuff */
            if(terminateThread1) {
                terminateFlag = 1;
                pthread_kill(ftid,SIGUSR2); //ftid is pthread_t variable of Thread1
                pthread_join( ftid, NULL );
            }
        }
        /* Do other stuff */
    }
The above code works fine if Thread2 sets terminateFlag and sends the signal while Thread1 is blocked in the poll() system call. But if a context switch happens in Thread1 after getTimeRemainedForNextEvent() returns and before poll() is entered, and Thread2 sets terminateFlag and sends the signal in that window, Thread1's poll() blocks for several hours because the signal that should have interrupted the system call has already been delivered and lost.
It seems I cannot use a mutex for synchronization, as the lock would have to be held across poll() until it unblocks. Is there any synchronization mechanism I can apply to avoid this issue?
c++ c linux multithreading synchronization
asked Nov 13 '18 at 13:53 by Durgesh
You might look into using ppoll() with an appropriate signal mask instead of poll().
– Shawn
Nov 13 '18 at 16:00
3 Answers
In the first place, access to the shared variable terminateFlag by multiple threads must be protected by a mutex or a similar synchronization mechanism; otherwise your program contains a data race, does not conform, and all bets are off. That might, for instance, look like this:
    void *Thread1(void *args) {
        pthread_mutex_lock(&a_mutex);
        terminateFlag = 0;
        while(!terminateFlag) {
            pthread_mutex_unlock(&a_mutex);
            // ...
            pthread_mutex_lock(&a_mutex);
        }
        pthread_mutex_unlock(&a_mutex);
    }

    void* Thread2(void* args) {
        // ...
        if (terminateThread1) {
            pthread_mutex_lock(&a_mutex);
            terminateFlag = 1;
            pthread_mutex_unlock(&a_mutex);
            pthread_kill(ftid,SIGUSR2); //ftid is pthread_t variable of Thread1
            pthread_join( ftid, NULL );
        }
        // ...
    }
But that does not solve the main problem: a signal sent by thread 2 may be delivered to thread 1 after it tests terminateFlag but before it calls poll(), though it does narrow the window in which that could happen.
The cleanest solution is that suggested already by @PaulSanders' answer: have thread 2 wake thread 1 via a file descriptor that thread 1 is polling (i.e. by means of a pipe). Inasmuch as you seem to have a plausible reason to seek an alternative approach, however, it should also be possible to make your signaling approach work by appropriate use of signal masking. Expanding on @Shawn's comment, here's how it would work:
- The parent thread blocks SIGUSR2 before starting thread 1, so that the latter, which inherits its signal mask from its parent, starts with that signal blocked.
- Thread 1 uses ppoll() instead of poll(), so as to be able to specify that SIGUSR2 will be unblocked for the duration of that call. ppoll() does signal mask handling atomically, so that there is no opportunity for a signal to be lost when it is blocked before the call and unblocked within.
- Thread 2 uses pthread_kill() to send SIGUSR2 to thread 1 to make it stop. Because that signal is only unblocked for that thread when it is performing a ppoll() call, it will not be lost (blocked signals remain pending until unblocked). This is precisely the kind of usage scenario for which ppoll() is designed.
- You should even be able to do away with the terminateThread variable and associated synchronization, because you should be able to rely upon the signal being delivered during a ppoll() call and therefore causing the EINTR code path to be exercised. That path does not rely on terminateThread to make the thread stop.
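The steps above can be sketched roughly as follows. This is only a minimal illustration under assumptions, not the asker's real code: ppoll_worker() and run_ppoll_demo() are hypothetical names, the worker polls no descriptors at all, and the fixed one-hour timeout stands in for getTimeRemainedForNextEvent(). The signal is deliberately sent before the worker is likely to have reached ppoll(), to show that it stays pending instead of being lost.

```c
#define _GNU_SOURCE             /* ppoll() is a Linux extension */
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Empty handler: SIGUSR2 must be caught (not ignored) so that ppoll() fails with EINTR. */
static void on_sigusr2(int signum) { (void)signum; }

/* Hypothetical stand-in for Thread1. It starts with SIGUSR2 blocked (inherited mask). */
static void *ppoll_worker(void *arg) {
    sigset_t during_poll;
    (void)arg;

    /* Copy the current mask, then remove SIGUSR2: this mask applies only inside ppoll(). */
    pthread_sigmask(SIG_SETMASK, NULL, &during_poll);
    sigdelset(&during_poll, SIGUSR2);

    for (;;) {
        /* Assumed timeout; the real code would compute it per iteration. */
        struct timespec timeout = { .tv_sec = 3600, .tv_nsec = 0 };
        int ret = ppoll(NULL, 0, &timeout, &during_poll);
        if (ret < 0) {
            if (errno != EINTR)
                perror("ppoll");
            break;              /* EINTR means SIGUSR2 arrived: stop the thread */
        }
        /* ret >= 0: handle socket and scheduler events here */
    }
    return NULL;
}

/* Hypothetical driver playing the roles of the parent thread and Thread2. */
int run_ppoll_demo(void) {
    struct sigaction sa;
    sigset_t block;
    pthread_t tid;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr2;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR2, &sa, NULL) != 0) return -1;

    /* Block SIGUSR2 before creating the worker so that it inherits the blocked mask. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR2);
    if (pthread_sigmask(SIG_BLOCK, &block, NULL) != 0) return -1;

    if (pthread_create(&tid, NULL, ppoll_worker, NULL) != 0) return -1;

    /* Even if this arrives before the worker enters ppoll(), it remains pending. */
    if (pthread_kill(tid, SIGUSR2) != 0) return -1;
    return pthread_join(tid, NULL);
}
```

Calling run_ppoll_demo() from main() should return 0 almost immediately rather than after the one-hour timeout, since the pending signal is delivered as soon as ppoll() swaps in the mask.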
You don't need a mutex to protect the shared variable. std::atomic is sufficient. But +1 for ppoll.
– Paul Sanders
Nov 13 '18 at 19:38
I'm implementing ppoll(). If Thread2 sets terminateFlag and sends the signal to Thread1 while execution is between ppoll() and the end of the while() loop, Thread1 exits without handling the signal (as it is blocked). In this case, would the signal sent by Thread2 remain pending or be discarded?
– Durgesh
Nov 16 '18 at 9:43
@Durgesh, I'm having trouble finding explicit documentation to this effect, but as far as I am aware, if a thread terminates while it has a signal pending then that signal is effectively ignored. However, I do recommend that you drop the terminateThread variables, as you have to do the signaling regardless, and that does not need the variable. At most, make the variable a local one, and have the thread set it itself on the EINTR code path instead of expecting some other thread to set it.
– John Bollinger
Nov 16 '18 at 14:58
@JohnBollinger I agree with you about setting the flag locally within the thread on EINTR. But is there any way to set this flag only for the SIGUSR2 signal? I mean, is there any way to know which signal interrupted the ppoll() system call?
– Durgesh
Nov 22 '18 at 5:55
@Durgesh, in a single-threaded scenario, you could have the signal handler record the signal number in a variable of type volatile sig_atomic_t, but for a multithreaded scenario such as yours, I don't see how you bind that to the right thread. Your best bet is probably to block all signals that can be blocked and whose disposition is something other than termination, and have ppoll unblock only SIGUSR2.
– John Bollinger
Nov 22 '18 at 15:59
Consider having an additional file descriptor in the set of fds passed to poll whose sole job is to make poll return when you want to terminate the thread.
Thus, in thread 2 we would have something like:
    if (terminateThread1) {
        terminateFlag = 1;
        send (terminate_fd, " ", 1, 0);
        pthread_join (ftid, NULL);
    }
And terminate_fd would be in the set of fds passed to poll by thread 1.
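To make the dedicated wake-up descriptor concrete, here is a small sketch under assumptions. The names wake_fds, fd_worker() and run_wakeup_demo() are hypothetical, a socketpair() is assumed as the source of terminate_fd, and the real sockets that thread 1 polls are omitted. Because the byte stays readable until someone reads it, the wake-up is not lost even if it is sent before thread 1 enters poll().

```c
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

/* wake_fds[0] is polled by the worker; wake_fds[1] plays the role of terminate_fd. */
static int wake_fds[2];

/* Hypothetical stand-in for thread 1. */
static void *fd_worker(void *arg) {
    struct pollfd fds[1];
    (void)arg;

    fds[0].fd = wake_fds[0];
    fds[0].events = POLLIN;
    /* The thread's real sockets would occupy further slots of this array. */

    for (;;) {
        int ret = poll(fds, 1, 3600 * 1000);    /* assumed one-hour timeout */
        if (ret < 0) {
            if (errno == EINTR)
                continue;       /* unrelated signal: just poll again */
            break;
        }
        if (ret > 0 && (fds[0].revents & (POLLIN | POLLHUP)))
            break;              /* wake-up byte arrived: terminate */
        /* otherwise: handle socket and scheduler events here */
    }
    return NULL;
}

/* Hypothetical driver playing the role of thread 2. */
int run_wakeup_demo(void) {
    pthread_t tid;
    int rc;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, wake_fds) != 0) return -1;
    if (pthread_create(&tid, NULL, fd_worker, NULL) != 0) return -1;

    /* Same call as in the snippet above: one byte makes poll() return. */
    if (send(wake_fds[1], " ", 1, 0) != 1) return -1;

    rc = pthread_join(tid, NULL);
    close(wake_fds[0]);
    close(wake_fds[1]);
    return rc;
}
```

No shared flag is needed in this variant, since readability of the wake-up descriptor is itself the termination request.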
-- OR --
If the overhead of having an extra fd per thread is too much (as discussed in the comments) then send something to one of the existing fds that thread 1 ignores. This will cause poll to return and then thread 1 will terminate. You can even have this 'special' value act as the terminate flag, which makes the logic a little tidier.
There can be more than 5000 threads which are similar to Thread1. I can't use dedicated FD for each and every thread.
– Durgesh
Nov 13 '18 at 14:57
Why not? A file descriptor is not particularly heavyweight. How many fds is each thread waiting on? That would give a better estimate of the overhead of this approach.
– Paul Sanders
Nov 13 '18 at 15:00
Ours is a live streaming media server. For each connected camera, a dedicated thread is spawned. Each thread waits on 2 sockets. So, if 20000 cameras are connected, the process needs 60K FDs once a dedicated termination FD per thread is added. This limits the maximum number of cameras per server, as our per-process FD limit is set to 60K.
– Durgesh
Nov 13 '18 at 15:10
Perhaps you can send something (which thread 1 would ignore) to one of your existing fds. Waking up the poll by sending something to it is the key to this.
– Paul Sanders
Nov 13 '18 at 15:14
Ok. I will check the possibility of implementing such logic, if I don't get resolution to the problem that I had mentioned in my question.
– Durgesh
Nov 13 '18 at 15:18
As you say yourself, you could use thread cancellation to solve this.
Outside of thread cancellation, I don't think there's a "right" way to solve this within POSIX (waking up the poll call with a write isn't exactly a generic method that would work for all situations in which a thread might get blocked), because POSIX's paradigm for making syscalls and handling signals simply doesn't allow you to close the gap between a flag check and a potentially long blocking call.
    void handler(int sig) { dont_enter_a_long_blocking_call_flg=1; }
    int main()
    {   //...
        if(!dont_enter_a_long_blocking_call_flg)
            //THE GAP; what if the signal arrives here ?
            potentially_long_blocking_call();
        //....
    }
The musl libc library uses signals for thread cancellation (because signals can break long-blocking calls that are in kernel mode) and it uses them in conjunction with global assembly labels so that from the flag-setting SIGCANCEL handler, it can do (conceptually, I'm not pasting their actual code):
    void sigcancel_handler(int Sig, siginfo_t *Info, void *Uctx)
    {
        thread_local_cancellation_flag=1;
        if_interrupted_the_gap_move_Program_Counter_to_start_cancellation(Uctx);
    }
Now if you changed if_interrupted_the_gap_move_Program_Counter_to_start_cancellation(Uctx); to if_interrupted_the_gap_move_Program_Counter_to_make_the_syscall_fail(Uctx); and exported the if_interrupted_the_gap_move_Program_Counter_to_make_the_syscall_fail function along with the thread_local_cancellation_flag, then you could use it to:
- solve your problem robustly
- implement robust signal cancellation with any signal without having to put any of that pthread_cleanup_{push,pop} stuff into your already working thread-safe single-threaded code
- guarantee a normal-context reaction to a signal delivery in your target thread even if the signal is caught.
Basically, without a libc extension like this, once you kill()/pthread_kill() a process/thread with a signal it handles, or put a function on a signal-sending timer, you cannot be sure of a reaction to the signal delivery, as the target may well receive the signal in a gap like the one above and hang indefinitely instead of responding to it.
I've implemented such a libc extension on top of musl libc and have now published it at https://github.com/pskocik/musl. The SIGNAL_EXAMPLES directory also shows some kill(), pthread_kill(), and setitimer() examples that, under a demonstrated race condition, hang with classical libcs but don't with my extended musl. You can use that extended musl to solve your problem cleanly, and I also use it in my personal project to do robust thread cancellation without having to litter my code with pthread_cleanup_{push,pop}.
The obvious downside of this approach is that it's unportable and I only have it implemented for x86_64 musl. I've published it today in the hope that somebody (Cygwin, MacOSX?) copies it, because I think it's the right way to do cancellation in C.
In C++ and with glibc, you could utilize the fact that glibc uses exceptions to implement thread cancellation and simply use pthread_cancel (which uses a signal (SIGCANCEL) underneath) but catch it instead of letting it kill the thread.
In the first place, access to shared variable terminateFlag
by multiple threads must be protected by a mutex or similar synchronization mechanism, else your program does not conform and all bets are off. That might, for instance, look like this:
void *Thread1(void *args) {
pthread_mutex_lock(&a_mutex);
terminateFlag = 0;
while(!terminateFlag) {
pthread_mutex_unlock(&a_mutex);
// ...
pthread_mutex_lock(&a_mutex);
}
pthread_mutex_unlock(&a_mutex);
}
void* Thread2(void* args) {
// ...
if (terminateThread1) {
pthread_mutex_lock(&a_mutex);
terminateFlag = 1;
pthread_mutex_unlock(&a_mutex);
pthread_kill(ftid,SIGUSR2); //ftid is pthread_t variable of Thread1
pthread_join( ftid, NULL );
}
// ...
}
But that does not solve the main problem, that a signal sent by thread 2 may be delivered to thread 1 after it tests terminateFlag
but before it calls poll()
, though it does narrow the window in which that could happen.
The cleanest solution is that suggested already by @PaulSanders' answer: have thread 2 wake thread 1 via a file descriptor that thread 1 is polling (i.e. by means of a pipe). Inasmuch as you seem to have a plausible reason to seek an alternative approach, however, it should also be possible to make your signaling approach work by appropriate use of signal masking. Expanding on @Shawn's comment, here's how it would work:
The parent thread blocks
SIGUSR2
before starting thread 1, so that the latter, which inherits its signal mask from its parent, starts with that signal blocked.Thread 1 uses
ppoll()
instead ofpoll()
, so as to be able to specify thatSIGUSR2
will be unblocked for the duration of that call.ppoll()
does signal mask handling atomically, so that there is no opportunity for a signal to be lost when it is blocked before the call and unblocked within.Thread 2 uses
pthread_kill()
to sendSIGUSR2
to thread 1 to make it stop. Because that signal is only unblocked for that thread when it is performing appoll()
call, it will not be lost (blocked signals remain pending until unblocked). This is precisely the kind of usage scenario for whichppoll()
is designed.You should even be able to do away with the
terminateThread
variable and associated synchronization, because you should be able to rely upon the signal being delivered during appoll()
call and therefore causing theEINTR
code path to be exercised. That path does not rely onterminateThread
to make the thread stop.
You don't need a mutex to protect the shared variable.std::atomic
is sufficient. But +1 for ppoll.
– Paul Sanders
Nov 13 '18 at 19:38
I'm implementing ppoll(). If Thread2 set terminateFlag and sent signal to Thread1 when execution is between ppoll() and end of while() loop, Thread1 exits without handling the signal (as it is blocked). In this case, Does the signal sent by Thread2 would be in pending state or discarded ?
– Durgesh
Nov 16 '18 at 9:43
@Durgesh, I'm having trouble finding explicit documentation to this effect, but as far as I am aware, if a thread terminates while it has a signal pending then that signal is effectively ignored. However, I do recommend that you drop theterminateThread
variables, as you have to do the signaling regardless, and that does not need the variable. At most, make the variable a local one, and have the thread set it itself on theEINTR
code path instead of expecting some other thread to set it.
– John Bollinger
Nov 16 '18 at 14:58
@JohnBollinger I agree with you regarding setting flag locally within the thread onEINTR
. But is there any way to set this flag only for SIGUSR2 signal. I mean, is there any way to know which signal interrupted ppoll() system call ?
– Durgesh
Nov 22 '18 at 5:55
@Durgesh, in a single-threaded-scenario, you could have the signal handler record the signal number in a variable of typevolatile sigatomic_t
, but for a multithreaded scenario such as yours, I don't see how you bind that to the right thread. Your best bet is probably to block all signals that can be blocked and whose disposition is something other than termination, and haveppoll
unblock onlySIGUSR2
.
– John Bollinger
Nov 22 '18 at 15:59
add a comment |
In the first place, access to shared variable terminateFlag
by multiple threads must be protected by a mutex or similar synchronization mechanism, else your program does not conform and all bets are off. That might, for instance, look like this:
void *Thread1(void *args) {
pthread_mutex_lock(&a_mutex);
terminateFlag = 0;
while(!terminateFlag) {
pthread_mutex_unlock(&a_mutex);
// ...
pthread_mutex_lock(&a_mutex);
}
pthread_mutex_unlock(&a_mutex);
}
void* Thread2(void* args) {
// ...
if (terminateThread1) {
pthread_mutex_lock(&a_mutex);
terminateFlag = 1;
pthread_mutex_unlock(&a_mutex);
pthread_kill(ftid,SIGUSR2); //ftid is pthread_t variable of Thread1
pthread_join( ftid, NULL );
}
// ...
}
But that does not solve the main problem, that a signal sent by thread 2 may be delivered to thread 1 after it tests terminateFlag
but before it calls poll()
, though it does narrow the window in which that could happen.
The cleanest solution is that suggested already by @PaulSanders' answer: have thread 2 wake thread 1 via a file descriptor that thread 1 is polling (i.e. by means of a pipe). Inasmuch as you seem to have a plausible reason to seek an alternative approach, however, it should also be possible to make your signaling approach work by appropriate use of signal masking. Expanding on @Shawn's comment, here's how it would work:
The parent thread blocks
SIGUSR2
before starting thread 1, so that the latter, which inherits its signal mask from its parent, starts with that signal blocked.Thread 1 uses
ppoll()
instead ofpoll()
, so as to be able to specify thatSIGUSR2
will be unblocked for the duration of that call.ppoll()
does signal mask handling atomically, so that there is no opportunity for a signal to be lost when it is blocked before the call and unblocked within.Thread 2 uses
pthread_kill()
to sendSIGUSR2
to thread 1 to make it stop. Because that signal is only unblocked for that thread when it is performing appoll()
call, it will not be lost (blocked signals remain pending until unblocked). This is precisely the kind of usage scenario for whichppoll()
is designed.You should even be able to do away with the
terminateThread
variable and associated synchronization, because you should be able to rely upon the signal being delivered during appoll()
call and therefore causing theEINTR
code path to be exercised. That path does not rely onterminateThread
to make the thread stop.
You don't need a mutex to protect the shared variable.std::atomic
is sufficient. But +1 for ppoll.
– Paul Sanders
Nov 13 '18 at 19:38
I'm implementing ppoll(). If Thread2 set terminateFlag and sent signal to Thread1 when execution is between ppoll() and end of while() loop, Thread1 exits without handling the signal (as it is blocked). In this case, Does the signal sent by Thread2 would be in pending state or discarded ?
– Durgesh
Nov 16 '18 at 9:43
@Durgesh, I'm having trouble finding explicit documentation to this effect, but as far as I am aware, if a thread terminates while it has a signal pending then that signal is effectively ignored. However, I do recommend that you drop theterminateThread
variables, as you have to do the signaling regardless, and that does not need the variable. At most, make the variable a local one, and have the thread set it itself on theEINTR
code path instead of expecting some other thread to set it.
– John Bollinger
Nov 16 '18 at 14:58
@JohnBollinger I agree with you regarding setting flag locally within the thread onEINTR
. But is there any way to set this flag only for SIGUSR2 signal. I mean, is there any way to know which signal interrupted ppoll() system call ?
– Durgesh
Nov 22 '18 at 5:55
@Durgesh, in a single-threaded-scenario, you could have the signal handler record the signal number in a variable of typevolatile sigatomic_t
, but for a multithreaded scenario such as yours, I don't see how you bind that to the right thread. Your best bet is probably to block all signals that can be blocked and whose disposition is something other than termination, and haveppoll
unblock onlySIGUSR2
.
– John Bollinger
Nov 22 '18 at 15:59
add a comment |
In the first place, access to shared variable terminateFlag
by multiple threads must be protected by a mutex or similar synchronization mechanism, else your program does not conform and all bets are off. That might, for instance, look like this:
void *Thread1(void *args) {
pthread_mutex_lock(&a_mutex);
terminateFlag = 0;
while(!terminateFlag) {
pthread_mutex_unlock(&a_mutex);
// ...
pthread_mutex_lock(&a_mutex);
}
pthread_mutex_unlock(&a_mutex);
}
void* Thread2(void* args) {
// ...
if (terminateThread1) {
pthread_mutex_lock(&a_mutex);
terminateFlag = 1;
pthread_mutex_unlock(&a_mutex);
pthread_kill(ftid,SIGUSR2); //ftid is pthread_t variable of Thread1
pthread_join( ftid, NULL );
}
// ...
}
But that does not solve the main problem, that a signal sent by thread 2 may be delivered to thread 1 after it tests terminateFlag
but before it calls poll()
, though it does narrow the window in which that could happen.
The cleanest solution is that suggested already by @PaulSanders' answer: have thread 2 wake thread 1 via a file descriptor that thread 1 is polling (i.e. by means of a pipe). Inasmuch as you seem to have a plausible reason to seek an alternative approach, however, it should also be possible to make your signaling approach work by appropriate use of signal masking. Expanding on @Shawn's comment, here's how it would work:
The parent thread blocks
SIGUSR2
before starting thread 1, so that the latter, which inherits its signal mask from its parent, starts with that signal blocked.Thread 1 uses
ppoll()
instead ofpoll()
, so as to be able to specify thatSIGUSR2
will be unblocked for the duration of that call.ppoll()
does signal mask handling atomically, so that there is no opportunity for a signal to be lost when it is blocked before the call and unblocked within.Thread 2 uses
pthread_kill()
to sendSIGUSR2
to thread 1 to make it stop. Because that signal is only unblocked for that thread when it is performing appoll()
call, it will not be lost (blocked signals remain pending until unblocked). This is precisely the kind of usage scenario for whichppoll()
is designed.You should even be able to do away with the
terminateThread
variable and associated synchronization, because you should be able to rely upon the signal being delivered during appoll()
call and therefore causing theEINTR
code path to be exercised. That path does not rely onterminateThread
to make the thread stop.
In the first place, access to shared variable terminateFlag
by multiple threads must be protected by a mutex or similar synchronization mechanism, else your program does not conform and all bets are off. That might, for instance, look like this:
void *Thread1(void *args) {
pthread_mutex_lock(&a_mutex);
terminateFlag = 0;
while(!terminateFlag) {
pthread_mutex_unlock(&a_mutex);
// ...
pthread_mutex_lock(&a_mutex);
}
pthread_mutex_unlock(&a_mutex);
}
void* Thread2(void* args) {
// ...
if (terminateThread1) {
pthread_mutex_lock(&a_mutex);
terminateFlag = 1;
pthread_mutex_unlock(&a_mutex);
pthread_kill(ftid,SIGUSR2); //ftid is pthread_t variable of Thread1
pthread_join( ftid, NULL );
}
// ...
}
But that does not solve the main problem, that a signal sent by thread 2 may be delivered to thread 1 after it tests terminateFlag
but before it calls poll()
, though it does narrow the window in which that could happen.
The cleanest solution is that suggested already by @PaulSanders' answer: have thread 2 wake thread 1 via a file descriptor that thread 1 is polling (i.e. by means of a pipe). Inasmuch as you seem to have a plausible reason to seek an alternative approach, however, it should also be possible to make your signaling approach work by appropriate use of signal masking. Expanding on @Shawn's comment, here's how it would work:
- The parent thread blocks SIGUSR2 before starting thread 1, so that the latter, which inherits its signal mask from its parent, starts with that signal blocked.
- Thread 1 uses ppoll() instead of poll(), so as to be able to specify that SIGUSR2 will be unblocked for the duration of that call. ppoll() does signal mask handling atomically, so that there is no opportunity for a signal to be lost when it is blocked before the call and unblocked within.
- Thread 2 uses pthread_kill() to send SIGUSR2 to thread 1 to make it stop. Because that signal is only unblocked for that thread when it is performing a ppoll() call, it will not be lost (blocked signals remain pending until unblocked). This is precisely the kind of usage scenario for which ppoll() is designed.
You should even be able to do away with the terminateFlag variable and associated synchronization, because you should be able to rely upon the signal being delivered during a ppoll() call and therefore causing the EINTR code path to be exercised. That path does not rely on terminateFlag to make the thread stop.
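The masking scheme described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the asker's actual code: the names on_sigusr2, poll_worker and run_ppoll_demo are made up for the example, the worker watches no descriptors at all (ppoll() with an empty set simply waits), and the one-hour timeout stands in for whatever getTimeRemainedForNextEvent() would return. ppoll() is a Linux extension, hence _GNU_SOURCE.

```c
#define _GNU_SOURCE              /* ppoll() is a Linux/glibc extension */
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>

/* The handler does nothing; it only has to exist so that SIGUSR2
   interrupts ppoll() instead of terminating the process. */
static void on_sigusr2(int signum) { (void)signum; }

/* Worker (hypothetical stand-in for Thread1). SIGUSR2 is blocked on
   entry because the mask is inherited from the creating thread. */
static void *poll_worker(void *arg)
{
    int *stopped_by_signal = arg;
    sigset_t during_poll;

    /* Query the current mask, then allow SIGUSR2 only inside ppoll(). */
    pthread_sigmask(SIG_BLOCK, NULL, &during_poll);
    sigdelset(&during_poll, SIGUSR2);

    for (;;) {
        /* Assumed stand-in for a timeout of several hours. */
        struct timespec timeout = { .tv_sec = 3600, .tv_nsec = 0 };
        int ret = ppoll(NULL, 0, &timeout, &during_poll);

        if (ret < 0 && errno == EINTR) {
            *stopped_by_signal = 1;   /* interrupted: treat as "stop" */
            break;
        }
        if (ret < 0)
            break;                    /* some other error */
        /* ret >= 0: handle socket and scheduler events here */
    }
    return NULL;
}

/* Returns 1 if the worker left through the EINTR path, 0 if it left
   some other way, -1 if setup failed. */
int run_ppoll_demo(void)
{
    struct sigaction sa = { 0 };
    sigset_t block;
    pthread_t tid;
    int stopped_by_signal = 0;

    sa.sa_handler = on_sigusr2;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR2, &sa, NULL) != 0)
        return -1;

    /* Block SIGUSR2 before creating the worker so it starts blocked. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR2);
    if (pthread_sigmask(SIG_BLOCK, &block, NULL) != 0)
        return -1;

    if (pthread_create(&tid, NULL, poll_worker, &stopped_by_signal) != 0)
        return -1;

    /* Sent immediately, quite possibly before the worker reaches
       ppoll(); the signal then stays pending instead of being lost. */
    if (pthread_kill(tid, SIGUSR2) != 0)
        return -1;
    pthread_join(tid, NULL);
    return stopped_by_signal;
}
```

Because the signal is sent while it is still blocked in the worker, it should remain pending and be delivered as soon as ppoll() installs the temporary mask, so run_ppoll_demo() is expected to return 1 promptly rather than after the timeout.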
answered Nov 13 '18 at 17:53 by John Bollinger
You don't need a mutex to protect the shared variable. std::atomic is sufficient. But +1 for ppoll.
– Paul Sanders
Nov 13 '18 at 19:38
I'm implementing ppoll(). If Thread2 sets terminateFlag and sends the signal to Thread1 while execution is between ppoll() and the end of the while() loop, Thread1 exits without handling the signal (as it is blocked). In this case, would the signal sent by Thread2 remain pending or be discarded?
– Durgesh
Nov 16 '18 at 9:43
@Durgesh, I'm having trouble finding explicit documentation to this effect, but as far as I am aware, if a thread terminates while it has a signal pending then that signal is effectively ignored. However, I do recommend that you drop the terminateThread variables, as you have to do the signaling regardless, and that does not need the variable. At most, make the variable a local one, and have the thread set it itself on the EINTR code path instead of expecting some other thread to set it.
– John Bollinger
Nov 16 '18 at 14:58
@JohnBollinger I agree with you regarding setting the flag locally within the thread on EINTR. But is there any way to set this flag only for the SIGUSR2 signal? I mean, is there any way to know which signal interrupted the ppoll() system call?
– Durgesh
Nov 22 '18 at 5:55
@Durgesh, in a single-threaded scenario, you could have the signal handler record the signal number in a variable of type volatile sig_atomic_t, but for a multithreaded scenario such as yours, I don't see how you bind that to the right thread. Your best bet is probably to block all signals that can be blocked and whose disposition is something other than termination, and have ppoll unblock only SIGUSR2.
– John Bollinger
Nov 22 '18 at 15:59
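The suggestion in the last comment, letting ppoll() unblock only SIGUSR2, might be sketched with a small mask-building helper. The helper name make_ppoll_mask is hypothetical, and the sketch simplifies the advice: it blocks every blockable signal for the duration of the call, including those whose default disposition is termination, which may or may not be what a real server wants.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>

/* Hypothetical helper: fill *mask with every signal, then remove
   wake_signal so that it is the only signal let through while the
   mask is in effect. Returns 0 on success, -1 on error. */
int make_ppoll_mask(sigset_t *mask, int wake_signal)
{
    if (sigfillset(mask) != 0)
        return -1;
    if (sigdelset(mask, wake_signal) != 0)
        return -1;
    return 0;
}
```

The resulting set would be passed as the last argument of ppoll(). With such a mask, an EINTR return from ppoll() in that thread can reasonably be attributed to SIGUSR2; SIGKILL and SIGSTOP cannot be blocked, and attempts to block them are silently ignored.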
Consider having an additional file descriptor in the set of fds passed to poll whose sole job is to make poll return when you want to terminate the thread.
Thus, in thread 2 we would have something like:
    if (terminateThread1) {
        terminateFlag = 1;
        send(terminate_fd, " ", 1, 0);
        pthread_join(ftid, NULL);
    }
And terminate_fd would be in the set of fds passed to poll by thread 1.
-- OR --
If the overhead of having an extra fd per thread is too much (as discussed in the comments) then send something to one of the existing fds that thread 1 ignores. This will cause poll to return and then thread 1 will terminate. You can even have this 'special' value act as the terminate flag, which makes the logic a little tidier.
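The first option above (a dedicated wake-up descriptor) might look roughly like this. It is only a sketch under assumptions: it uses a pipe and write() rather than the send() on a socket shown in the answer, the names wake_ctx, fd_worker and run_wake_fd_demo are invented for the example, and the worker polls nothing but the wake-up descriptor, whereas real code would list its sockets alongside it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

struct wake_ctx {
    int wake_fd;     /* read end of the wake-up pipe */
    int woke_by_fd;  /* set to 1 once the wake-up byte is seen */
};

/* Worker (hypothetical stand-in for Thread1): blocks in poll() with
   no timeout until the wake-up descriptor reports an event. */
static void *fd_worker(void *arg)
{
    struct wake_ctx *ctx = arg;
    struct pollfd fds[1];

    fds[0].fd = ctx->wake_fd;
    fds[0].events = POLLIN;
    fds[0].revents = 0;

    for (;;) {
        int ret = poll(fds, 1, -1);
        if (ret < 0) {
            if (errno == EINTR)
                continue;             /* unrelated signal: poll again */
            break;
        }
        if (fds[0].revents != 0) {    /* readable, hung up, or error */
            ctx->woke_by_fd = (fds[0].revents & POLLIN) != 0;
            break;
        }
    }
    return NULL;
}

/* Returns 1 if the worker was woken by the byte written to the pipe,
   0 if it stopped for another reason, -1 if setup failed. */
int run_wake_fd_demo(void)
{
    int pipefd[2];
    pthread_t tid;
    struct wake_ctx ctx;

    if (pipe(pipefd) != 0)
        return -1;
    ctx.wake_fd = pipefd[0];
    ctx.woke_by_fd = 0;

    if (pthread_create(&tid, NULL, fd_worker, &ctx) != 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    /* The byte stays in the pipe until it is read, so it does not
       matter whether the worker has reached poll() yet. */
    if (write(pipefd[1], " ", 1) != 1) {
        close(pipefd[1]);             /* hang-up also wakes the worker */
        pipefd[1] = -1;
    }
    pthread_join(tid, NULL);

    close(pipefd[0]);
    if (pipefd[1] >= 0)
        close(pipefd[1]);
    return ctx.woke_by_fd;
}
```

Unlike a signal, the written byte is level-triggered state: poll() reports it whenever it is called afterwards, which is why this approach has no window between the flag check and the blocking call.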
There can be more than 5000 threads which are similar to Thread1. I can't use dedicated FD for each and every thread.
– Durgesh
Nov 13 '18 at 14:57
Why not? A file descriptor is not particularly heavyweight. How many fds is each thread waiting on? That would give a better estimate of the overhead of this approach.
– Paul Sanders
Nov 13 '18 at 15:00
Ours is a live streaming media server. For each and every connected camera, a dedicated thread will be spawned. Each thread waits on 2 sockets. So, if 20000 cameras are connected, the process needs to have 60K FDs, counting a dedicated FD per thread to terminate it. This limits the maximum number of cameras per server, as we have the FD limit of the process set to 60K.
– Durgesh
Nov 13 '18 at 15:10
Perhaps you can send something (which thread 1 would ignore) to one of your existing fds. Waking up the poll by sending something to it is the key to this.
– Paul Sanders
Nov 13 '18 at 15:14
Ok. I will check the possibility of implementing such logic, if I don't get resolution to the problem that I had mentioned in my question.
– Durgesh
Nov 13 '18 at 15:18
edited Nov 13 '18 at 19:36
answered Nov 13 '18 at 14:51 by Paul Sanders
As you say yourself, you could use thread cancellation to solve this.
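For reference, plain deferred cancellation of a thread blocked in poll() could be sketched as below. The names note_cleanup, cancellable_worker and run_cancel_demo are hypothetical, and the worker polls no descriptors; the sketch relies only on poll() being a cancellation point, so a cancellation request that arrives before the call is acted on when the call is entered.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <pthread.h>

/* Cleanup handler: records that it ran (stands in for releasing
   whatever resources the worker holds). */
static void note_cleanup(void *arg) { *(int *)arg = 1; }

/* Worker: waits forever in poll(), which is a cancellation point. */
static void *cancellable_worker(void *arg)
{
    pthread_cleanup_push(note_cleanup, arg);
    for (;;) {
        poll(NULL, 0, -1);   /* a pending cancel is acted on here */
    }
    pthread_cleanup_pop(0);  /* never reached; required to pair the push */
    return NULL;
}

/* Returns 1 if the worker was cancelled and its cleanup handler ran,
   0 otherwise, -1 if a pthread call failed. */
int run_cancel_demo(void)
{
    pthread_t tid;
    void *result = NULL;
    int cleaned_up = 0;

    if (pthread_create(&tid, NULL, cancellable_worker, &cleaned_up) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    return (result == PTHREAD_CANCELED && cleaned_up == 1) ? 1 : 0;
}
```

The cost of this route is the one the rest of this answer complains about: every resource the thread holds has to be released through pthread_cleanup_push()/pthread_cleanup_pop() handlers.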
Outside of thread cancellation, I don't think there's a "right" way to solve this within POSIX (waking up the poll call with a write isn't exactly a generic method that would work for all situations in which a thread might get blocked), because POSIX's paradigm for making syscalls and handling signals simply doesn't allow you to close the gap between a flag check and a potentially long blocking call.
    static volatile sig_atomic_t dont_enter_a_long_blocking_call_flg;
    void handler(int sig) { (void)sig; dont_enter_a_long_blocking_call_flg = 1; }
    int main()
    {   //...
        if (!dont_enter_a_long_blocking_call_flg)
            //THE GAP; what if the signal arrives here?
            potentially_long_blocking_call();
        //....
    }
The musl libc library uses signals for thread cancellation (because signals can break long-blocking calls that are in kernel mode), and it uses them in conjunction with global assembly labels so that from the flag-setting SIGCANCEL handler, it can do (conceptually, I'm not pasting their actual code):
    void sigcancel_handler(int Sig, siginfo_t *Info, void *Uctx)
    {
        thread_local_cancellation_flag = 1;
        if_interrupted_the_gap_move_Program_Counter_to_start_cancellation(Uctx);
    }
Now if you changed if_interrupted_the_gap_move_Program_Counter_to_start_cancellation(Uctx); to if_interrupted_the_gap_move_Program_Counter_to_make_the_syscall_fail(Uctx); and exported the if_interrupted_the_gap_move_Program_Counter_to_make_the_syscall_fail function along with the thread_local_cancellation_flag, then you could use it to:
- solve your problem robustly
- implement robust signal cancellation with any signal without having to put any of that pthread_cleanup_{push,pop} stuff into your already working thread-safe single-threaded code
- ensure an assured normal-context reaction to a signal delivery in your target thread even if the signal is caught.
Basically, without a libc extension like this, once you kill()/pthread_kill() a process/thread with a signal it handles, or put a function on a signal-sending timer, you cannot be sure of a reaction to the signal delivery, as the target may well receive the signal in a gap like the one above and hang indefinitely instead of responding to it.
I've implemented such a libc extension on top of musl libc and published it at https://github.com/pskocik/musl. The SIGNAL_EXAMPLES directory also shows some kill(), pthread_kill(), and setitimer() examples that, under a demonstrated race condition, hang with classical libcs but don't with my extended musl. You can use that extended musl to solve your problem cleanly, and I also use it in my personal project to do robust thread cancellation without having to litter my code with pthread_cleanup_{push,pop}.
The obvious downside of this approach is that it's unportable, and I only have it implemented for x86_64 musl. I've published it today in the hope that somebody (Cygwin, MacOSX?) copies it, because I think it's the right way to do cancellation in C.
In C++ and with glibc, you could utilize the fact that glibc uses exceptions to implement thread cancellation and simply use pthread_cancel (which uses a signal (SIGCANCEL) underneath) but catch it instead of letting it kill the thread.
edited Nov 16 '18 at 10:03
answered Nov 14 '18 at 14:54 by PSkocik
You might look into using ppoll() with an appropriate signal mask instead of poll().
– Shawn
Nov 13 '18 at 16:00