Problem
To be honest, your application is rarely ideal during start-up development. Some actions are not as fast as you would like. For instance, you may have heavy admin report requests, slow import/export actions, slow network requests to third-party services, and so on.
But what if you already have some user traffic and your application dies because of a few non-optimized requests, even though the application is fast enough overall? It simply stops responding. Why does this happen? Because of request balancing.
Regular request balancer scheme
Usually you have a fixed number of application workers that process web requests. Assume you have 4 workers. Look at the scheme below:
In this example only the first 4 “slow-request” users are “happy”. The other users are waiting for a response.
Your frontend or backend server (it depends on your hosting scheme) of course has an internal request queue and uses some algorithm to queue requests. Let’s assume the first 4 users requested resources that are not optimized, so all available workers are busy. Any subsequent request is queued and waits for a free worker, even if it is very light and fast. Under load it looks as if everything is slow or the application does not respond at all. What is the solution in this case?
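The effect described above can be reproduced with a small simulation. This is only a sketch under simplifying assumptions: all requests arrive at time 0, they are served strictly first-come-first-served, and the durations (30 seconds for a slow request, 0.1 seconds for a fast one) are made-up illustrative numbers. The function name `simulate_single_queue` is hypothetical.

```python
import heapq

def simulate_single_queue(durations, workers):
    """Return how long each request waits before a worker picks it up.

    Assumes every request arrives at time 0 and is served in list order.
    """
    free_at = [0.0] * workers           # time at which each worker becomes free
    heapq.heapify(free_at)
    waits = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        waits.append(start)             # arrival is at 0, so wait == start time
        heapq.heappush(free_at, start + duration)
    return waits

# 4 slow requests (assumed 30 s each) arrive first, then 6 fast ones (0.1 s each).
requests = [30.0] * 4 + [0.1] * 6
waits = simulate_single_queue(requests, workers=4)

for duration, wait in zip(requests, waits):
    kind = "slow" if duration > 1 else "fast"
    print(f"{kind} request waited {wait:.1f}s")
```

Every fast request waits at least as long as one slow request takes, although it needs only a fraction of a second of actual work.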
Solution
Ideally, all slow requests should be refactored into background processes. But that takes time, and you want a solution ASAP, otherwise your start-up is over. Obviously you can just increase the number of workers, but that does not help when more users request slow resources. To save the situation you need a smarter request queue.
Let’s consider the following approach. What if your balancer had two queues, one for “slow” and another for “fast” requests? The balancer can match “slow” requests by protocol, URL, etc. and put them into the “slow” queue. All other requests are put into the “fast” queue. Then you can limit parallel processing in the “slow” queue to 1 or 2 requests. That way you will always have free workers for the “fast” queue.
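A rough simulation of the two-queue idea, under the same kind of assumptions: all requests arrive at time 0, durations are made-up numbers, and the function names are hypothetical. As a deliberate simplification, fast requests are only given the workers that slow requests can never occupy (`workers - slow_limit`); a real balancer would also let them use any worker that happens to be idle, so this model is pessimistic for fast requests.

```python
import heapq

def fifo_waits(durations, slots):
    """Waiting time of each request served in order by `slots` parallel slots."""
    free_at = [0.0] * slots
    heapq.heapify(free_at)
    waits = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest free slot
        waits.append(start)
        heapq.heappush(free_at, start + duration)
    return waits

def simulate_slow_fast(requests, workers, slow_limit):
    """Split requests into two queues and cap concurrency of the slow one."""
    slow = [d for kind, d in requests if kind == "slow"]
    fast = [d for kind, d in requests if kind == "fast"]
    return {
        "slow": fifo_waits(slow, slow_limit),
        # Simplification: fast requests get only the workers reserved for them.
        "fast": fifo_waits(fast, workers - slow_limit),
    }

# 5 slow requests (assumed 30 s each) and 6 fast ones (assumed 0.1 s each).
requests = [("slow", 30.0)] * 5 + [("fast", 0.1)] * 6
result = simulate_slow_fast(requests, workers=4, slow_limit=2)

print("slow waits:", result["slow"])
print("max fast wait:", max(result["fast"]))
```

With 4 workers and a slow limit of 2, three of the five slow requests have to wait, while no fast request waits longer than a fraction of a second.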
“Slow-Fast” request balancer scheme
In this example only 3 “slow-request” users are waiting for response. All “fast-request” users are “happy”.
HAProxy as “Slow-Fast” request balancer
The main idea is to set up HAProxy as your main frontend, which uses your old frontend as its backend.
Then configure two backends as queues: one for slow requests and a default one for all other requests. In terms of HAProxy configuration it can be implemented as:
    frontend MAIN
        ...
        acl slow_path1 path_reg -i /admin/report1
        acl slow_path2 path_reg -i /admin/report2
        ...
        use_backend SLOW_APP if slow_path1 METH_GET
        use_backend SLOW_APP if slow_path2 METH_GET
        ...
        default_backend FAST_APP

    backend FAST_APP
        ...
        server app APP_SERVER_IP:80 minconn 500 maxconn 500

    backend SLOW_APP
        ...
        server app APP_SERVER_IP:80 minconn 2 maxconn 2

As we see, the balancer has two backend sections that point to the same application host, but with different limits. The FAST_APP backend limits inbound traffic to 500 concurrent connections and SLOW_APP limits it to only 2. The number of concurrent connections in the slow backend depends on the total number of your application workers. Usually it is 50% of the total. For instance, if you have 2 master unicorn processes with 2 workers each (4 workers in total), minconn should be 2.
If you are interested, check the full configuration haproxy.cfg in our GitHub repository.
It has two backends, APPLICATION_FAST and APPLICATION_SLOW. Both point to the same application server but with different queue limits. In the frontend section we define the ACLs slow_urls_get and slow_urls_post for GET and POST URLs; they read lists of URL regexps from the corresponding files. Then we specify APPLICATION_SLOW for these ACLs and APPLICATION_FAST for any other requests. That is how it works.
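To make the routing rule easier to reason about (or to unit-test a list of slow URLs before putting it into the balancer), the same decision can be sketched in Python. This is an illustration only, not part of the HAProxy setup: the function names `choose_backend` and `slow_limit` are hypothetical, the patterns are the two example paths from the configuration above, and the 50% rule of thumb is implemented as integer division with a floor of 1, which is an assumption.

```python
import re

# Patterns mirror the two example ACLs; matching is case-insensitive
# and unanchored, like a regex ACL with the -i flag.
SLOW_PATHS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"/admin/report1", r"/admin/report2")
]

def choose_backend(method: str, path: str) -> str:
    """Return the backend name a request would be routed to."""
    is_slow_path = any(p.search(path) for p in SLOW_PATHS)
    if method.upper() == "GET" and is_slow_path:
        return "SLOW_APP"
    return "FAST_APP"   # default backend for everything else

def slow_limit(total_workers: int) -> int:
    """Rule of thumb from the text: about 50% of workers, at least 1 (assumed)."""
    return max(1, total_workers // 2)

print(choose_backend("GET", "/admin/report1"))   # slow report goes to the slow queue
print(choose_backend("GET", "/products/42"))     # ordinary page goes to the fast queue
print(slow_limit(4))                             # 2 unicorn masters x 2 workers
```

Keeping the slow URL list as data makes it cheap to check that a newly discovered heavy endpoint is actually matched before the balancer is reloaded.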