librelist archives

What happens to Resque when Ruby crashes when running a job

From:
Vincent Paca
Date:
2014-08-27 @ 05:56
Hello everyone,

How does Resque handle jobs that fail in flight? For example, when a job has
already been popped off the queue and Ruby hits a segfault while running it,
so the job never completes, what happens to the job?

Does Resque handle this gracefully?

If not, would that job be lost? What would be the best way to handle that?

Thank you for your time. :)

Re: [resque] What happens to Resque when Ruby crashes when running a job

From:
Emil Kampp
Date:
2014-08-27 @ 06:06
Very nice question :)

I'd like to know as well.

Cheers 
--

Best regards 
Emil Kampp 
IT entrepreneur and founder

On Wed, Aug 27, 2014 at 7:57 AM, Vincent Paca <vpaca@payrollhero.com>
wrote:

> Hello everyone,
> How does Resque handle failed jobs in flight? For example, when job was
> already popped out of the queue, and Ruby encounters a segfault while
> running the job and it doesn't complete it, what happens to the job?
> Does Resque handle this gracefully?
> If not, would that job be lost? What would be the best way to handle that?
> Thank you for your time. :)

Re: [resque] What happens to Resque when Ruby crashes when running a job

From:
Dave Copeland
Date:
2014-08-27 @ 13:22
If you are using the default worker, the answer is that it depends on how
Ruby is killed:

https://github.com/resque/resque/blob/1-x-stable/lib/resque/worker.rb#L371

If you follow that code to shutdown! you'll see that it does a reasonable
job of trying to exit gracefully.  Assuming your worker code isn't doing
anything odd, it should experience an exception, which will send it through
the normal error handling (by default: failed queue).
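
To make "it depends on how Ruby is killed" concrete, here is a minimal sketch in plain Ruby (standard library only, not Resque's actual code) of a parent process that forks a child to run a job and then inspects how the child ended. `run_in_child` and the symbols it returns are hypothetical names for illustration, and the sketch assumes a Unix-like system where `fork` is available.

```ruby
# Run a block in a forked child and classify how the child ended.
# Hypothetical helper for illustration; not part of Resque.
def run_in_child
  pid = fork do
    begin
      yield
      exit!(0)         # job finished normally
    rescue Exception
      # An ordinary Ruby exception is still catchable here; a real worker
      # could record the failure before exiting.
      exit!(1)
    end
  end

  _, status = Process.wait2(pid)

  if status.signaled?
    # The child was terminated by a signal and never reached its rescue
    # block, so only the parent can notice what happened.
    [:killed_by_signal, status.termsig]
  elsif status.success?
    [:completed, nil]
  else
    [:raised, status.exitstatus]
  end
end

p run_in_child { :ok }                                # [:completed, nil]
p run_in_child { raise "boom" }                       # [:raised, 1]
p run_in_child { Process.kill("KILL", Process.pid) }  # [:killed_by_signal, 9]
```

The third case is the interesting one for this thread: code inside the job cannot react to an abrupt kill, so any record of the failure has to come from a process that outlives it.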

Of course, if the process is killed more aggressively (e.g. with SIGKILL), or
you are somehow ignoring signals, you will lose the job, and some of the
bookkeeping bits in Redis will no longer reflect reality (e.g. the worker will
still be listed as working on the job even though the process is gone).
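
The stale bookkeeping described above can be illustrated with a small sketch. It assumes worker records are identified by strings of the form `hostname:pid:queues` and uses a plain Ruby array in place of Redis; `pid_alive?` and `stale_workers` are hypothetical helpers, not Resque API. The idea is to find records for this host whose process no longer exists, so they can be cleaned up.

```ruby
require "socket"

# True if a process with this pid currently exists on the local machine.
def pid_alive?(pid)
  Process.kill(0, pid)   # signal 0 only checks for existence, nothing is sent
  true
rescue Errno::ESRCH
  false                  # no such process
rescue Errno::EPERM
  true                   # process exists but belongs to another user
end

# Return the worker ids registered for `hostname` whose process is gone.
# Records from other hosts are skipped because their pids cannot be
# checked from here.
def stale_workers(worker_ids, hostname = Socket.gethostname)
  worker_ids.select do |id|
    host, pid, _queues = id.split(":", 3)
    host == hostname && !pid_alive?(Integer(pid))
  end
end
```

A cleanup task could run something like this periodically and remove the stale records, which is roughly the kind of repair needed after a worker has been killed without a chance to unregister itself.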

Your defense against this:

   - Use database transactions so if the job is killed, you don't have a
   half-updated database
   - Make your jobs idempotent, so you can blindly re-try from the failed
   queue without worrying about "double-work"
   - Design jobs to not require any context, such that you can queue them
   regularly and individual job failures don't matter in the grand scheme of
   things (not necessarily possible in many cases)
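
As a rough illustration of the idempotency point in the list above, here is a minimal sketch of a job that can be retried blindly. `Ledger`, `CreditAccount`, and the payment ids are hypothetical; the ledger is an in-memory stand-in for a database table with a unique key, and only the job shape (a class with `@queue` and `self.perform`) follows the Resque convention.

```ruby
require "set"

# Hypothetical in-memory stand-in for a database table with a unique key.
class Ledger
  attr_reader :balance

  def initialize
    @balance = 0
    @applied = Set.new
  end

  # Apply a credit at most once per payment_id; repeated calls are no-ops.
  # With a real database, recording the id and updating the balance would
  # happen inside a single transaction.
  def credit_once(payment_id, amount)
    return false if @applied.include?(payment_id)

    @applied << payment_id
    @balance += amount
    true
  end
end

# Job class in the usual Resque shape: @queue plus self.perform.
class CreditAccount
  @queue = :payments

  class << self
    attr_accessor :ledger
  end

  def self.perform(payment_id, amount)
    ledger.credit_once(payment_id, amount)
  end
end

CreditAccount.ledger = Ledger.new
CreditAccount.perform("pay-1", 50)
CreditAccount.perform("pay-1", 50)   # retry of the same job: no double credit
puts CreditAccount.ledger.balance    # prints 50
```

Because the second call is a no-op, it does not matter whether the first attempt actually finished before the worker died; re-running the job is always safe.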

Dave



On Wed, Aug 27, 2014 at 2:06 AM, Emil Kampp <emil@kampp.me> wrote:

> Very nice question :)
>
> I'd like to know as well.
>
> Cheers
> --
>
> Best regards
> Emil Kampp
> IT entrepreneur and founder
>
>
> On Wed, Aug 27, 2014 at 7:57 AM, Vincent Paca <vpaca@payrollhero.com>
> wrote:
>
>>  Hello everyone,
>>
>> How does Resque handle failed jobs in flight? For example, when job was
>> already popped out of the queue, and Ruby encounters a segfault while
>> running the job and it doesn't complete it, what happens to the job?
>>
>> Does Resque handle this gracefully?
>>
>> If not, would that job be lost? What would be the best way to handle that?
>>
>>  Thank you for your time. :)
>>
>
>


-- 
*Dave Copeland, Director of Engineering*
Washington, DC (Eastern Time Zone)
dave@stitchfix.com

Re: [resque] What happens to Resque when Ruby crashes when running a job

From:
Alice Mejia
Date:
2014-08-29 @ 20:41
On Aug 27, 2014, "Dave Copeland" wrote:
>
> If you are using the default worker, the answer is that it depends on how
> Ruby is killed:
>
> https://github.com/resque/resque/blob/1-x-stable/lib/resque/worker.rb#L371
>
> If you follow that code to shutdown! you'll see that it does a reasonable
> job of trying to exit gracefully.  Assuming your worker code isn't doing
> anything odd, it should experience an exception, which will send it through
> the normal error handling (by default: failed queue).
>
> Of course, if the process is killed more aggressively, or you are somehow
> ignoring signals, you will lose the job and some of the bookkeeping bits in
> Redis will not reflect reality (i.e. that the worker is still working).
>
> Your defense against this:
>
>    - Use database transactions so if the job is killed, you don't have a
>    half-updated database
>    - Make your jobs idempotent, so you can blindly re-try from the failed
>    queue without worrying about "double-work"
>    - Design jobs to not require any context, such that you can queue them
>    regularly and individual job failures don't matter in the grand scheme of
>    things (not necessarily possible in many cases)
>
> Dave
>
> On Wed, Aug 27, 2014 at 2:06 AM, Emil Kampp <emil@kampp.me> wrote:
>>
>> Very nice question :)
>>
>> I'd like to know as well.
>>
>> Cheers
>> --
>>
>> Best regards
>> Emil Kampp
>> IT entrepreneur and founder
>>
>>
>> On Wed, Aug 27, 2014 at 7:57 AM, Vincent Paca <vpaca@payrollhero.com>
>> wrote:
>>>
>>> Hello everyone,
>>>
>>> How does Resque handle failed jobs in flight? For example, when job was
>>> already popped out of the queue, and Ruby encounters a segfault while
>>> running the job and it doesn't complete it, what happens to the job?
>>>
>>> Does Resque handle this gracefully?
>>>
>>> If not, would that job be lost? What would be the best way to handle
>>> that?
>>>
>>> Thank you for your time. :)
>>
>
> --
> Dave Copeland, Director of Engineering