librelist archives

Migrating from Backgroundrb to Resque

From:
Reynard Hilman
Date:
2010-04-16 @ 20:24
I'm in the process of migrating from Backgroundrb to Resque for many obvious
reasons :) However, there is one thing for which I think Backgroundrb is still
the best tool, and I just want to make sure I didn't miss anything in Resque.

We have a background job that authenticates users against a 3rd-party system.
Since it's a network operation, we have to expect network errors, slow
responses, etc., so it has to run in the background. Also, since it's an
authentication process, we need to be able to run this task concurrently,
because many users can log in at the same time.
In Backgroundrb you can have a thread pool, so many authentication tasks can
run concurrently. It's also fast because of the event machine architecture: a
new thread is created as soon as there is an authentication request.
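For readers unfamiliar with the thread-pool pattern described here, a minimal
sketch in plain Ruby follows. It is not Backgroundrb's API: the `authenticate`
method, the pool size, and the user ids are all made up for illustration, and
the sleep only stands in for the slow 3rd-party call.

```ruby
require "thread"

# Hypothetical stand-in for the slow third-party call (assumption, not a real API).
def authenticate(user_id)
  sleep 0.05                      # mimics network latency
  { user_id: user_id, ok: true }
end

pool_size = 5                     # arbitrary size for the sketch
jobs      = Queue.new             # pending user ids
results   = Queue.new             # finished authentications

# Each pool thread pulls ids off the shared queue until it sees nil.
workers = Array.new(pool_size) do
  Thread.new do
    while (user_id = jobs.pop)
      results << authenticate(user_id)
    end
  end
end

user_ids = (1..20).to_a
user_ids.each { |id| jobs << id }
pool_size.times { jobs << nil }   # one stop marker per thread
workers.each(&:join)

auth_results = Array.new(results.size) { results.pop }
puts "#{auth_results.size} logins handled by #{pool_size} threads"
```

Because the threads spend their time waiting on I/O rather than computing,
one process can overlap many such waits.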

I just don't see Resque as a good replacement for this task, since it
doesn't have threading and it has a 5-second delay when checking for new jobs
(when it's idle). Is that a correct analysis?

thanks,
- reynard

Re: [resque] Migrating from Backgroundrb to Resque

From:
Chris Wanstrath
Date:
2010-04-16 @ 21:25
On Fri, Apr 16, 2010 at 4:24 PM, Reynard Hilman <reynard.list@gmail.com> wrote:

> I just don't see Resque as a good replacement for this task. since it
> doesn't have threading and it has 5 seconds delay on checking for new job
> (when it's idle). Is that a correct analysis?

This is exactly the point of Resque. We (GitHub) run 50+ workers at
any given time so that we can process jobs concurrently.

Resque holds the opinion that threads in MRI (the main Ruby) are
flawed and forking + processes are the way to handle concurrency.
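A minimal sketch of that process-based model, in plain Ruby rather than
Resque's actual worker code: several child processes run at once, each with
its own memory, and the parent only waits for them. The worker count and the
sleep standing in for a job are assumptions, and `fork` is not available on
Windows.

```ruby
worker_count = 4   # arbitrary number for the sketch

started = Time.now
pids = Array.new(worker_count) do
  fork do
    # Child process: own memory and interpreter state, nothing shared with siblings.
    sleep 0.2      # stands in for one slow job
    exit!(0)       # exit! skips at_exit hooks inherited from the parent
  end
end

# Process.waitall blocks until every child has exited and returns
# an array of [pid, Process::Status] pairs.
statuses = Process.waitall
elapsed  = Time.now - started

puts "#{statuses.size} workers finished in #{elapsed.round(2)}s"
puts "all succeeded: #{statuses.all? { |_pid, status| status.success? }}"
```

The children sleep in parallel, so the total time stays close to that of a
single job instead of growing with the number of workers.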

If you have any questions or want specific examples of how this works
in Resque, post a snippet or two and I'll do my best to show how it
works.

Good luck!

Chris

Re: [resque] Migrating from Backgroundrb to Resque

From:
Reynard Hilman
Date:
2010-04-19 @ 17:03
On Fri, Apr 16, 2010 at 5:25 PM, Chris Wanstrath <chris@ozmm.org> wrote:

>
> This is exactly the point of Resque. We (GitHub) run 50+ workers at
> any given time so that we can process jobs concurrently.
>
> Resque holds the opinion that threads in MRI (the main Ruby) are
> flawed and forking + processes are the way to handle concurrency.
>
> If you have any questions or want specific examples of how this works
> in Resque, post a snippet or two and I'll do my best to show how it
> works.
>

Hi Chris, but wouldn't 50 workers take a lot more resources than just using
Backgroundrb with threads? I suppose we'll have that many workers when our
site gets as busy as GitHub :) Currently 3 Resque workers are enough to
process all the jobs, plus 1 Backgroundrb for authentication, which can handle
50 concurrent tasks.

Regards,
- reynard

Re: [resque] Migrating from Backgroundrb to Resque

From:
Chris Wanstrath
Date:
2010-04-19 @ 19:00
Resque is not for people who care about resources; it's about
stability and scale.

It's definitely not for everyone.

I would never trust my site to Ruby green threads, though, at any scale.

Chris


Re: [resque] Migrating from Backgroundrb to Resque

From:
Reynard Hilman
Date:
2010-04-20 @ 17:58
On Mon, Apr 19, 2010 at 3:00 PM, Chris Wanstrath <chris@ozmm.org> wrote:

> Resque is not for people who care about resources, it's about stability and
> scale.
>
> It's definitely not for everyone.
>

It's definitely for me, just not for everything I need to do (yet) :)

- reynard

Re: [resque] Migrating from Backgroundrb to Resque

From:
Ian Warshak
Date:
2010-04-19 @ 20:34
On Mon, Apr 19, 2010 at 2:00 PM, Chris Wanstrath <chris@ozmm.org> wrote:

> Resque is not for people who care about resources, it's about stability and
> scale.
>
> It's definitely not for everyone.
>
> I would never trust my site to Ruby green threads, though, at any scale.
>

I agree, unless your site is diningphilosophers.com, in which case, green
threads would probably be just fine. :-)

Threading inside your Rails app is definitely a bad practice. Even if your
code is written in a threaded way, there are lots of places where blocking IO
will block the entire process, including your client's response. I found this
out the hard way trying to send off an email in a separate thread, which
completely bombed due to Ruby's DNS resolver blocking the entire process.

Ian




Re: [resque] Migrating from Backgroundrb to Resque

From:
Michael Russo
Date:
2010-04-19 @ 21:50
As a thought exercise, let's talk about how to use Resque and make this 
example (authentication against a third-party over HTTP) fast.

For maximum performance, we want to use an evented stack to make asynchronous
HTTP requests (many requests, concurrently).  This can be done in Ruby,
for example, with em-http-request. [1]

There's nothing stopping us right now from using em-http-request inside of
a Resque worker.  However, we really want to make multiple requests at
once from within the same process to take full advantage.

To accomplish this, we could store the list of users that need auth'ing in
a separate place (a Redis list would work well for this), and then a 
worker could grab a chunk of the list to process.  This is a bad hack, 
though, because you've lost the clean encapsulation of a "job".

What if we could tell a worker that there are certain classes of jobs that
should be consumed more than one-at-a-time?  The worker would grab the 
list of jobs and hand them off to the child it's just forked, and the 
child would be responsible for reporting back which jobs passed and which 
ones failed.
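A rough sketch of that idea in plain Ruby, not an existing Resque feature:
the parent hands a list of job arguments to a forked child, and the child
reports per-job outcomes back over a pipe. The `authenticate` method and its
pass/fail rule are invented for illustration, and the child runs the jobs one
after another here where a real version could issue them concurrently with an
evented HTTP client.

```ruby
# Hypothetical per-job work: succeeds for even ids, raises for odd ones.
def authenticate(user_id)
  raise "auth failed for #{user_id}" if user_id.odd?
  true
end

# Runs all job_args in one forked child and returns its report.
def process_batch(job_args)
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    report = job_args.map do |arg|
      begin
        authenticate(arg)
        [arg, :passed]
      rescue StandardError => e
        [arg, :failed, e.message]
      end
    end
    writer.write(Marshal.dump(report))  # send outcomes to the parent
    writer.close
    exit!(0)
  end

  writer.close                          # parent keeps only the read end
  report = Marshal.load(reader.read)    # read returns once the child closes its end
  reader.close
  Process.wait(pid)
  report
end

report = process_batch([1, 2, 3, 4])
passed = report.select { |entry| entry[1] == :passed }.map(&:first)
failed = report.select { |entry| entry[1] == :failed }.map(&:first)
puts "passed: #{passed.inspect}, failed: #{failed.inspect}"
```

Each job keeps its own arguments and its own outcome, which is the
encapsulation the batch-from-a-side-list approach gives up.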

Just an idea.  I realize that there are some implementation challenges 
(for starters, there is not necessarily a 1-1 mapping between job type and
queue), but it does raise some interesting possibilities.

[1]: http://github.com/igrigorik/em-http-request

-Michael


Re: [resque] Migrating from Backgroundrb to Resque

From:
Tony Arcieri
Date:
2010-04-20 @ 22:34
On Mon, Apr 19, 2010 at 3:50 PM, Michael Russo <mjrusso@gmail.com> wrote:

> To accomplish this, we could store the list of users that need auth'ing in
> a separate place (a Redis list would work well for this), and then a worker
> could grab a chunk of the list to process.  This is a bad hack, though,
> because you've lost the clean encapsulation of a "job".
>

"Bad hack"?  I would contend trying to encapsulate individual URLs within a
single job is a "bad hack" which conflates what you'd like your workers to
do and the state of a particular URL.

If you want to track the lifetime of particular URLs, I'd suggest giving
them a state which you can query and have a state machine-like model for
each URL you can track, such as what's provided by the workflow gem.

Relying on jobs for this sort of state tracking is, in my opinion, a bad
design decision.  You will always get better performance working on batches
of URLs, and the fetch state should be tracked in the database imo, since it
will need to be queried.
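To make the per-URL state idea concrete, here is a small hand-rolled sketch.
It deliberately does not use the workflow gem's API; the class name, states,
and events are all assumptions chosen for illustration, and the state lives in
memory where a real system would persist it in the database.

```ruby
class TrackedUrl
  # Allowed transitions: current state => { event => next state }
  TRANSITIONS = {
    pending:  { start: :fetching },
    fetching: { succeed: :fetched, fail: :failed },
    failed:   { retry: :pending },
    fetched:  {}
  }.freeze

  attr_reader :url, :state

  def initialize(url)
    @url = url
    @state = :pending
  end

  # Applies an event, or raises if it is not valid in the current state.
  def fire(event)
    next_state = TRANSITIONS.fetch(@state)[event]
    raise ArgumentError, "cannot #{event} from #{@state}" unless next_state
    @state = next_state
  end
end

urls = %w[http://example.com/a http://example.com/b].map { |u| TrackedUrl.new(u) }
urls.each { |u| u.fire(:start) }
urls[0].fire(:succeed)
urls[1].fire(:fail)

# The state can be queried independently of any job.
retryable = urls.select { |u| u.state == :failed }
puts "retryable: #{retryable.map(&:url).inspect}"
```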

-- 
Tony Arcieri
Medioh! A Kudelski Brand

Re: [resque] Migrating from Backgroundrb to Resque

From:
Michael Russo
Date:
2010-04-20 @ 23:12
A request comes in and creates a job.  The job is thrown on a queue.  The 
job is processed later.  Where in this design is anything getting 
"conflated"?

Yes, there are other designs, but this one is simple.

This job happens to consist of hitting a URL, which we know we can speed 
up simply by performing a huge number of (evented) requests in parallel.  
To that end, I would like to think of the optimization problem more 
generally from the perspective of "how can a worker perform multiple jobs 
of a particular type at the same time?".

-Michael 


Re: [resque] Migrating from Backgroundrb to Resque

From:
Tony Arcieri
Date:
2010-04-20 @ 23:17
On Tue, Apr 20, 2010 at 5:12 PM, Michael Russo <mjrusso@gmail.com> wrote:

> A request comes in and creates a job.  The job is thrown on a queue.  The
> job is processed later.  Where in this design is anything getting
> "conflated"?
>

You're suggesting that having a single job that works on multiple targets is
a "bad hack"


> Yes, there are other designs, but this one is simple.
>

It's also naive.  One of the earliest optimizations you can perform on this
type of system is to have it work on batches.  Believe me, been there, done
that...


> This job happens to consist of hitting a URL, which we know we can speed up
> simply by performing a huge number of (evented) requests in parallel.  To
> that end, I would like to think of the optimization problem more generally
> from the perspective of "how can a worker perform multiple jobs of a
> particular type at the same time?".
>

Or, an alternative approach is to have a single job with multiple targets...

You call it hackish, I call it... what everybody does when trying to work
at scale.  There's end-to-end overhead involved.  You're arguing it's
simpler that way.  I'm arguing it doesn't scale as well.  Batch processing
multiple URLs is not a "hack".

-- 
Tony Arcieri
Medioh! A Kudelski Brand

Re: [resque] Migrating from Backgroundrb to Resque

From:
Michael Russo
Date:
2010-04-21 @ 00:13
On 2010-04-20, at 7:17 PM, Tony Arcieri wrote:

> On Tue, Apr 20, 2010 at 5:12 PM, Michael Russo <mjrusso@gmail.com> wrote:
> A request comes in and creates a job.  The job is thrown on a queue.  
The job is processed later.  Where in this design is anything getting 
"conflated"?
> 
> You're suggesting that having a single job that works on multiple 
targets is a "bad hack"

From a monitoring and management standpoint (think: administration via
Resque Web) it is much, much better when a job doesn't have to tell its
worker to "go to this other state machine and figure out what you actually
need to do".

I have some workers that do jobs in the batch style that has been 
discussed, and I want to find a way to move away from this.

-Michael

Re: [resque] Migrating from Backgroundrb to Resque

From:
Tony Arcieri
Date:
2010-04-21 @ 00:47
On Tue, Apr 20, 2010 at 6:13 PM, Michael Russo <mjrusso@gmail.com> wrote:

> From a monitoring and management standpoint (think: administration via
> Resque Web) it is much, much better when a job doesn't have to tell its
> worker to "go to this other statemachine and figure out what you actually
> need to do".
>

It doesn't work like that.  For building jobs, you query the state of a
particular set of URLs, and extract a batch of them to process based on
their state.  The batch forms the arguments of the job.

In terms of administration, if you're dealing with massive amounts of URLs
to the point that concurrency actually becomes a pressing concern, you're
going to pretty much want to build your own admin interface around that
particular task, which can query URLs based on their state.

The state machine mechanics kick in after a particular URL is processed and
its state needs to be updated.  That can be accomplished with something like
the workflow gem.  Querying is as easy as reading a database through
ActiveRecord, which IMO is more useful for querying than the Resque web
interface (which is more useful for you figuring out what, if anything, your
workers are doing)
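A minimal sketch of "the batch forms the arguments of the job". The job class
follows the usual Resque convention of a `@queue` instance variable and a
`self.perform` class method, but the class name, queue name, batch size, and
the in-memory array standing in for an ActiveRecord query are all assumptions.
Resque itself is not loaded, so the job is called directly.

```ruby
# In-memory stand-in for a database table of URLs and their states.
records = [
  { url: "http://example.com/1", state: :pending },
  { url: "http://example.com/2", state: :fetched },
  { url: "http://example.com/3", state: :pending },
  { url: "http://example.com/4", state: :pending }
]

class FetchUrlBatch
  @queue = :fetch_batches   # hypothetical queue name

  # Resque calls self.perform with the arguments given at enqueue time,
  # so the URLs in this batch are visible as the job's arguments.
  def self.perform(*urls)
    urls.each { |url| puts "would fetch #{url}" }
    urls.size
  end
end

batch_size = 2
batch = records.select { |r| r[:state] == :pending }
               .first(batch_size)
               .map { |r| r[:url] }

# With Resque loaded this would be: Resque.enqueue(FetchUrlBatch, *batch)
processed = FetchUrlBatch.perform(*batch)
puts "processed #{processed} URLs"
```

Since the batch is fixed when the job is enqueued, inspecting the job's
arguments shows exactly which URLs it covers.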

-- 
Tony Arcieri
Medioh! A Kudelski Brand

Re: [resque] Migrating from Backgroundrb to Resque

From:
Tim Haines
Date:
2010-04-19 @ 21:54
Hey Michael,

What you've described, a job that grabs x ids from a Redis list and runs
em-http-request to go fetch data for them, is something I've implemented
for a couple of projects over the last month.  In the interests of learning
a better way, I'd like to ask you why you think it's a bad hack?

Tim.


Re: [resque] Migrating from Backgroundrb to Resque

From:
Michael Russo
Date:
2010-04-19 @ 22:06
On 2010-04-19, at 5:54 PM, Tim Haines wrote:

> Hey Michael,
> 
> What you've described in terms of having a job that grabs x ids from a 
redis list, and runs em-http-request to go fetch data for them is 
something I've implemented for a couple of projects over the last month.  
In the interests of learning a better way, I'd like to ask you why you 
think it's a bad hack?


Hi Tim,

Ideally, I would like to be able to look at the arguments of a job and 
know which users I'm logging in, or which RSS feeds I'm updating, etc.

If you have some separate, mutable data structure that you're pulling 
arguments from, then you need to do some extra acrobatics to both see what
went wrong and to recover.

I've done some similar things, but I can't help but think that there must 
be a better way :)

-Michael

Re: [resque] Migrating from Backgroundrb to Resque

From:
Tony Arcieri
Date:
2010-04-19 @ 20:27
On Fri, Apr 16, 2010 at 3:25 PM, Chris Wanstrath <chris@ozmm.org> wrote:

> Resque holds the opinion that threads in MRI (the main Ruby) are
> flawed and forking + processes are the way to handle concurrency.
>

I just thought I'd throw my two cents in and say that I strongly agree with
this statement.  If you're using MRI, the implementation of threading
provided is completely abominable.  Even though some of the newer Ruby
implementations like JRuby do actually provide a decent implementation of
threads, threads themselves still provide a lousy and error-prone mechanism
for concurrency.

I am a big fan of Resque's shared-nothing multiprocess approach.

-- 
Tony Arcieri
Medioh! A Kudelski Brand

Re: [resque] Migrating from Backgroundrb to Resque

From:
Ian Warshak
Date:
2010-04-19 @ 18:26
Reynard,

A bit off topic, but can you tell me more about how you are using
Backgroundrb to speed up the authentication process? What gains are you
getting, since you have to wait for the response of the slow service before
you can proceed? The only thing I can think of is that you are gaining speed
by eliminating the network connection setup/teardown time.

Thanks
Ian


Re: [resque] Migrating from Backgroundrb to Resque

From:
Reynard Hilman
Date:
2010-04-20 @ 17:33
Hi Ian,
I guess the main speed benefit is that Backgroundrb processes the request
immediately, instead of waiting for up to 5 seconds when there is nothing in
the queue. The main concern is that to handle 50 concurrent auth requests, I
would need to have 50 Resque workers, which would take a lot of memory (it's
about 50 MB each in my case).

I have heard about the bad things with threading in Ruby, but it has been
working fine for this purpose. The threads exist only in the background
worker, and the Rails app doesn't create any threads. Also, in this case a
thread doesn't have to sync with other threads; it just stores the result in
MySQL. So I don't have to deal with concurrent programming issues :)

- reynard



Re: [resque] Migrating from Backgroundrb to Resque

From:
Ian Warshak
Date:
2010-04-20 @ 18:14
Hi Reynard,

I think I framed my question wrong. What I am curious to know is, what does
Backgroundrb gain you right now? If you have to wait for the result of the
web service call before you can let the user log in, what does doing it in
the background gain you?

Ian


Re: [resque] Migrating from Backgroundrb to Resque

From:
Reynard Hilman
Date:
2010-04-20 @ 19:06
Hi Ian,
Most of the time the auth process is quick (less than 1 second). So when a
user logs in, Backgroundrb immediately processes the request and stores
the result in MySQL; the browser checks every second or so, and the user is
logged in almost immediately.
If I use Resque, when the queue is empty there is a chance of a delay of up
to 5 seconds (I guess that gets smaller as you have more workers, since they
don't all poll at the same 5-second interval), so there is a good chance that
a user has to wait several seconds before the auth request is processed. Of
course there is nothing we can do if the 3rd-party server is slow in
responding, but I'm just trying to minimize the delay.

- reynard



Re: [resque] Migrating from Backgroundrb to Resque

From:
Tony Arcieri
Date:
2010-04-20 @ 19:43
It sounds like your latency problem would be mitigated by the blocking pop
feature (BLPOP) in newer versions of Redis:

http://code.google.com/p/redis/wiki/BlpopCommand

As I understand it, this is slated for Resque 2.0.
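To show why a blocking pop removes the idle polling delay, here is a small
sketch that uses Ruby's built-in `Queue` as a stand-in for a Redis list: like
BLPOP, `Queue#pop` puts the consumer to sleep until an item arrives. No Redis
or Resque is involved, and the job payload and timings are made up.

```ruby
require "thread"

queue = Queue.new        # stand-in for a Redis list

picked_up_at = nil
worker = Thread.new do
  job = queue.pop        # blocks here until something is pushed
  picked_up_at = Time.now
  job
end

sleep 0.1                # the queue sits idle for a moment
enqueued_at = Time.now
queue << "auth:42"       # hypothetical job payload

job     = worker.value   # joins the thread and returns the block's result
latency = picked_up_at - enqueued_at
puts "picked up #{job} after #{(latency * 1000).round(1)} ms"
```

A polling worker that sleeps 5 seconds between checks would, by contrast,
notice the same job anywhere from immediately to 5 seconds later.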

On Tue, Apr 20, 2010 at 1:06 PM, Reynard Hilman <reynard.list@gmail.com>wrote:

> Hi Ian,
> Most of the time the auth process is quick (less than 1 second), so when a
> user logs in, Backgroundrb immediately processes the request and stores
> the result in MySQL. The browser checks every second or so, and the user is
> logged in immediately.
> If I use Resque, when the queue is empty there is a chance of a 5-second
> delay (I guess that gets smaller as you have more workers, since they don't
> all poll at the same 5-second interval), but there is a good chance that a
> user has to wait up to 5 seconds before the auth request is processed. Of
> course there is nothing we can do if the 3rd party server is slow in
> responding, but I'm just trying to minimize the delay.
>
> - reynard
>
>
>
> On Tue, Apr 20, 2010 at 2:14 PM, Ian Warshak <iwarshak@stripey.net> wrote:
>
>> Hi Reynard,
>>
>> I think I framed my question wrong. What I am curious to know is, what
>> does Backgroundrb gain you right now? If you have to wait for the result of
>> the web service call before you can let the user log in, what does doing it
>> in the background gain you?
>>
>> Ian
>>
>>
>> On Tue, Apr 20, 2010 at 12:33 PM, Reynard Hilman <reynard.list@gmail.com> wrote:
>>
>>> Hi Ian,
>>> I guess the main speed benefit is that Backgroundrb processes the request
>>> immediately, instead of waiting for 5 seconds when there is nothing in the
>>> queue. The main concern is that to handle 50 concurrent auth requests, I
>>> would need 50 Resque workers, which would take a lot of memory (it's
>>> about 50 MB each in my case).
>>>
>>> I have heard about the bad things with threading in Ruby, but it has been
>>> working fine for this purpose. The threads live only in the background
>>> worker, and the Rails app doesn't create any threads. Also, in this case a
>>> thread doesn't have to sync with other threads; it just stores the result
>>> in MySQL. So I don't have to deal with concurrent programming issues :)
>>>
>>> - reynard
>>>
>>>
>>>
>>> On Mon, Apr 19, 2010 at 2:26 PM, Ian Warshak <iwarshak@stripey.net> wrote:
>>>
>>>> Reynard,
>>>>
>>>> A bit off topic, but can you tell me more about how you are using
>>>> Backgroundrb to speed up the authentication process? What gains are you
>>>> getting, since you have to wait for the response of the slow service before
>>>> you can proceed? The only one I can think of is that you are gaining speed
>>>> by eliminating the network connection/teardown time.
>>>>
>>>> Thanks
>>>> Ian
>>>>
>>>> On Fri, Apr 16, 2010 at 3:24 PM, Reynard Hilman <reynard.list@gmail.com> wrote:
>>>>
>>>>>
>>>>> I'm in the process of migrating our Backgroundrb to Resque for many
>>>>> obvious reasons :) However, there is one thing for which I think
>>>>> Backgroundrb is still the best tool; I just want to make sure I didn't
>>>>> miss anything in Resque.
>>>>>
>>>>> We have a background job that authenticates users to a 3rd party system.
>>>>> Since it's a network operation we have to expect network errors, slow
>>>>> responses, etc., thus it has to be in the background. Also, since it's an
>>>>> authentication process, we need to be able to run this task concurrently,
>>>>> since many users can log in at the same time.
>>>>> In Backgroundrb, you can have a thread pool, so many authentication tasks
>>>>> can run concurrently, plus it's fast because of the event machine
>>>>> architecture: a new thread is created as soon as there is an
>>>>> authentication request.
>>>>>
>>>>> I just don't see Resque as a good replacement for this task, since it
>>>>> doesn't have threading and it has a 5-second delay on checking for new jobs
>>>>> (when it's idle). Is that a correct analysis?
>>>>>
>>>>> thanks,
>>>>>  - reynard
>>>>>
>>>>
>>>>
>>>
>>
>


-- 
Tony Arcieri
Medioh! A Kudelski Brand

Re: [resque] Migrating from Backgroundrb to Resque

From:
Tim Haines
Date:
2010-04-20 @ 22:28
Hey Reynard,

Just wanted to point out that the 5-second sleep is configurable (INTERVAL=1
for 1 second), but yes, as Tony said, the blocking pop will be much nicer.

Tim.
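
As a rough worked example of how the polling interval and the number of idle workers affect the wait, the sketch below does the arithmetic. It assumes each idle worker polls every `interval` seconds with an independent, uniformly random phase, in which case the expected wait until the first worker polls is `interval / (workers + 1)` and the worst case is the full interval. The helper names are hypothetical, and `poll_interval` only illustrates reading an `INTERVAL` setting with a 5-second default; it is not Resque's own code.

```ruby
# Reads the polling interval from an environment-like hash, falling back to
# the 5-second default mentioned in this thread. Hypothetical helper.
def poll_interval(env = ENV)
  Float(env.fetch('INTERVAL', 5))
end

# Expected wait before some idle worker notices a new job, assuming `workers`
# independent pollers with uniformly random phases over `interval` seconds.
# The minimum of n uniform(0, T) variables has mean T / (n + 1).
def expected_idle_latency(interval, workers)
  interval.to_f / (workers + 1)
end

# Worst case: every worker has just polled, so the job waits a full interval.
def worst_case_idle_latency(interval)
  interval.to_f
end
```

Under these assumptions a single worker at the default interval averages 2.5 seconds of idle wait, while four workers at `INTERVAL=1` average 0.2 seconds, with a worst case of 1 second.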


Re: [resque] Migrating from Backgroundrb to Resque

From:
Reynard Hilman
Date:
2010-04-21 @ 17:16
Thanks for the tips, and BLPOP looks really good for Resque. So latency
isn't really a problem, but I'm now back to the concurrency problem, which I
think is still best handled by threads (I know many people have strong
feelings against threads :) ). I just cannot think of a better way to handle
this case without threads.
Using threads is actually more scalable in this case, since a thread can be
created as needed very cheaply, so if I need to handle 100 or 200 concurrent
logins I don't need to start more dedicated processes to handle that. I
suppose this could be implemented in Resque with BLPOP and an option to create
a thread instead of a process, but that probably doesn't line up with the
Resque philosophy.

- reynard
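
The thread-based approach described above can be sketched with nothing but Ruby's standard library. This is not Backgroundrb or Resque code: `run_auth_pool` and `fake_authenticate` are hypothetical names, and the `sleep` stands in for the network call to the third-party system. The assumption behind it is that the work is I/O-bound, so the threads spend most of their time waiting rather than competing for the interpreter.

```ruby
require 'thread'

# Hypothetical stand-in for the slow third-party authentication call.
def fake_authenticate(user)
  sleep 0.05                       # simulated network round trip
  "ok:#{user}"
end

# Runs all requests through a fixed-size pool of threads that share one job
# queue, and returns a hash of user => result.
def run_auth_pool(requests, pool_size: 10)
  jobs = Queue.new
  results = Queue.new
  requests.each { |request| jobs << request }
  pool_size.times { jobs << :stop }   # one stop marker per worker thread

  workers = Array.new(pool_size) do
    Thread.new do
      # Queue#pop blocks until a job (or a stop marker) is available.
      while (request = jobs.pop) != :stop
        results << [request, fake_authenticate(request)]
      end
    end
  end
  workers.each(&:join)

  Array.new(results.size) { results.pop }.to_h
end
```

For example, `run_auth_pool(%w[alice bob carol], pool_size: 2)` authenticates three users with two threads. Each thread writes only to the thread-safe result queue, which matches the point made earlier in the thread that the jobs do not need to synchronize with each other.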


Re: [resque] Migrating from Backgroundrb to Resque

From:
Tony Arcieri
Date:
2010-04-21 @ 17:20
On Wed, Apr 21, 2010 at 11:16 AM, Reynard Hilman <reynard.list@gmail.com> wrote:

>
> Thanks for the tips, and BLPOP looks really good for Resque. So latency
> isn't really a problem, but I'm now back to the concurrency problem, which I
> think is still best handled by threads (I know many people have strong
> feelings against threads :) ). I just cannot think of a better way to handle
> this case without threads.
>

As was already suggested, you can use EventMachine to perform the requests
concurrently via an event-driven system, rather than threads:

http://github.com/eventmachine/em-http-request
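
To illustrate the event-driven idea without depending on EventMachine, here is a minimal sketch that uses only Ruby's standard `socket` library and `IO.select`. It is not the em-http-request API. `start_fake_auth_server` is a hypothetical local stand-in for the third-party service (it uses threads internally only to simulate a remote server), and `authenticate_all` is the part that matters: one thread sends every request up front and then reacts to whichever reply becomes readable first.

```ruby
require 'socket'

# Hypothetical stand-in for the third-party auth service: reads a user name,
# waits `delay` seconds, then replies "ok:<user>". Returns the port it uses.
def start_fake_auth_server(delay = 0.2)
  server = TCPServer.new('127.0.0.1', 0)   # port 0 lets the OS pick a free port
  Thread.new do
    loop do
      Thread.new(server.accept) do |client|
        user = client.gets.to_s.chomp
        sleep delay
        client.puts "ok:#{user}"
        client.close
      end
    end
  end
  server.addr[1]
end

# Event-driven client: send every request up front, then let IO.select report
# which sockets have a reply ready. One thread serves all pending logins.
def authenticate_all(port, users, timeout = 5)
  pending = {}
  users.each do |user|
    socket = TCPSocket.new('127.0.0.1', port)
    socket.puts user
    pending[socket] = user
  end

  results = {}
  until pending.empty?
    ready, = IO.select(pending.keys, nil, nil, timeout)
    break if ready.nil?                    # timed out; leave the rest unresolved
    ready.each do |socket|
      user = pending.delete(socket)
      results[user] = socket.gets.to_s.chomp
      socket.close
    end
  end
  pending.each_key(&:close)
  results
end
```

With a server started via `port = start_fake_auth_server`, a call such as `authenticate_all(port, %w[alice bob carol])` waits for all three replies at once instead of one after another, so the total time should be close to a single round trip rather than the sum of all of them.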

-- 
Tony Arcieri
Medioh! A Kudelski Brand

Re: [resque] Migrating from Backgroundrb to Resque

From:
Reynard Hilman
Date:
2010-04-21 @ 18:11
Thanks Tony, I'll look into that in my spare time :) since that means
rewriting some of the code, but it would be nice if that can handle my needs.


- reynard

