High Performance AWS EC2 / RDS + Nginx + PHP-FPM setup

General description of the issue

We're currently running an application on a PaaS-type solution for PHP. Their solution is based on the AWS cloud, and because their plans don't fit our scaling needs, we've decided to migrate to AWS directly. The application performs "well" in production (~100 ms in-app response times at around 400 requests per minute), but it takes a long while to respond with my setup on AWS. Please keep in mind that each of those requests does a database insert plus some expensive selects that compute statistics.
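
For a sense of scale, the production figures quoted above can be turned into an arrival rate and an average number of requests in flight. This is only a back-of-the-envelope sketch using Little's law; the function name is made up for illustration, and the inputs are the approximate numbers from the description, not measurements.

```python
def average_concurrency(requests_per_minute: float, response_time_ms: float) -> float:
    """Little's law: requests in flight = arrival rate * time spent per request."""
    arrival_rate_per_s = requests_per_minute / 60.0
    return arrival_rate_per_s * (response_time_ms / 1000.0)


# Approximate production figures from the description above.
REQUESTS_PER_MINUTE = 400
RESPONSE_TIME_MS = 100

print(f"arrival rate: {REQUESTS_PER_MINUTE / 60.0:.2f} requests/s")
print(f"average in flight: {average_concurrency(REQUESTS_PER_MINUTE, RESPONSE_TIME_MS):.2f}")
```

By this estimate, normal production traffic averages well under one request in flight, which is worth keeping in mind when comparing it with the concurrency levels used in the benchmarks.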

Current AWS Setup attempt

  • 1 medium RDS server (which is not the bottleneck; I've checked)
  • 1 medium r3 EC2 server running nginx + PHP-FPM on Ubuntu 14.04 x64

I've been running some benchmarks, trying to simulate our normal traffic load as closely as possible, and it starts misbehaving under constant load.

Current Configs in use

Nginx

user www-data;
worker_processes 2;
pid /run/nginx.pid;
worker_rlimit_nofile 30000;

events {
    worker_connections 8192;
    #multi_accept on;
    use epoll;
}

http {

    ##
    # Basic Settings
    ##

    sendfile on;
    tcp_nopush on;
    tcp_nodelay off;
    keepalive_timeout 30;
    types_hash_max_size 2048;
    server_tokens off;

    # increase buffer and timeouts
    fastcgi_connect_timeout 60;
    fastcgi_send_timeout 180;
    fastcgi_read_timeout 180;
    fastcgi_buffer_size 128k;
    fastcgi_buffers 4 256k;
    fastcgi_busy_buffers_size 256k;
    fastcgi_temp_file_write_size 256k;
    fastcgi_intercept_errors on;
}
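
For reference, the connection ceiling implied by these settings can be worked out as below. This is a sketch based on the documented meaning of worker_processes and worker_connections (the latter also counts connections to upstreams, so a request handed to PHP-FPM occupies two); the constant and function names are illustrative only.

```python
WORKER_PROCESSES = 2          # worker_processes from nginx.conf
WORKER_CONNECTIONS = 8192     # worker_connections from the events block
WORKER_RLIMIT_NOFILE = 30000  # worker_rlimit_nofile from nginx.conf


def max_connections(processes: int, connections: int) -> int:
    """Upper bound on simultaneous connections across all workers."""
    return processes * connections


def max_fastcgi_clients(processes: int, connections: int) -> int:
    """Each request passed to FastCGI uses a client and an upstream connection."""
    return max_connections(processes, connections) // 2


print(max_connections(WORKER_PROCESSES, WORKER_CONNECTIONS))
print(max_fastcgi_clients(WORKER_PROCESSES, WORKER_CONNECTIONS))
# The per-worker descriptor limit should sit above worker_connections.
print(WORKER_RLIMIT_NOFILE > WORKER_CONNECTIONS)
```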

Nginx sites-available

server {
  listen          80;
  server_name     project.example.com;

  root            /var/www/project/public/;
  access_log      off;

  charset utf-8;

  index index.html index.htm index.php;


  location / {
      try_files $uri $uri/ /index.php?q=$uri&$args;
  } 

  # catch all
  error_page 404 /index.php;


  location ~ \.php$ {
      # Pass the PHP files to PHP FastCGI for processing

      fastcgi_split_path_info ^(.+\.php)(/.+)$;
      include /etc/nginx/fastcgi_params;
      #fastcgi_pass 127.0.0.1:9000;
      fastcgi_pass unix:/var/run/php5-fpm.sock;
      fastcgi_index index.php;
  }
}

php-fpm.conf

emergency_restart_threshold = 3
emergency_restart_interval = 1m
process_control_timeout = 5s

php-fpm pool

pm = dynamic
pm.max_children = 48
pm.start_servers = 18
pm.min_spare_servers = 16
pm.max_spare_servers = 24
pm.max_requests = 50
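
As a rough sanity check on the pool size, the sketch below compares pm.max_children with the ab concurrency used in the benchmarks further down, and with the resident memory per worker seen in the top sample. The per-worker figure is the largest RES value in that sample and overstates real usage, since RES includes shared pages; the names are illustrative only.

```python
MAX_CHILDREN = 48        # pm.max_children from the pool config
AB_CONCURRENCY = 75      # "Concurrency Level" in the benchmark output
WORKER_RES_KIB = 26616   # largest RES among the php5-fpm workers in the top sample
TOTAL_MEM_KIB = 3838876  # "KiB Mem ... total" in the top sample


def queued_requests(concurrency: int, max_children: int) -> int:
    """Requests that must wait because every PHP-FPM worker is busy."""
    return max(0, concurrency - max_children)


def pool_memory_fraction(max_children: int, worker_res_kib: int, total_mem_kib: int) -> float:
    """Pessimistic share of RAM used if every worker reached worker_res_kib."""
    return max_children * worker_res_kib / total_mem_kib


print(queued_requests(AB_CONCURRENCY, MAX_CHILDREN))
print(f"{pool_memory_fraction(MAX_CHILDREN, WORKER_RES_KIB, TOTAL_MEM_KIB):.1%}")
```

With these inputs, part of each 75-way burst has to wait for a free worker, while memory is not close to being exhausted even under the pessimistic estimate.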

/etc/sysctl.conf

net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 16384 16777216
net.core.somaxconn = 4096
net.core.netdev_max_backlog = 16384
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_congestion_control = cubic

Opcache settings

php -i | grep opcache

Additional .ini files parsed => /etc/php5/cli/conf.d/05-opcache.ini,
opcache.blacklist_filename => no value => no value
opcache.consistency_checks => 0 => 0
opcache.dups_fix => Off => Off
opcache.enable => On => On
opcache.enable_cli => Off => Off
opcache.enable_file_override => Off => Off
opcache.error_log => no value => no value
opcache.fast_shutdown => 0 => 0
opcache.file_update_protection => 2 => 2
opcache.force_restart_timeout => 180 => 180
opcache.inherited_hack => On => On
opcache.interned_strings_buffer => 4 => 4
opcache.load_comments => 1 => 1
opcache.log_verbosity_level => 1 => 1
opcache.max_accelerated_files => 50000 => 50000
opcache.max_file_size => 0 => 0
opcache.max_wasted_percentage => 5 => 5
opcache.memory_consumption => 128 => 128
opcache.optimization_level => 0xFFFFFFFF => 0xFFFFFFFF
opcache.preferred_memory_model => no value => no value
opcache.protect_memory => 0 => 0
opcache.restrict_api => no value => no value
opcache.revalidate_freq => 2 => 2
opcache.revalidate_path => Off => Off
opcache.save_comments => 1 => 1
opcache.use_cwd => On => On
opcache.validate_timestamps => On => On

php -i | grep apc

/etc/php5/cli/conf.d/20-apcu.ini,
apc
apcu
apc.coredump_unmap => Off => Off
apc.enable_cli => Off => Off
apc.enabled => On => On
apc.entries_hint => 4096 => 4096
apc.gc_ttl => 3600 => 3600
apc.mmap_file_mask => no value => no value
apc.preload_path => no value => no value
apc.rfc1867 => Off => Off
apc.rfc1867_freq => 0 => 0
apc.rfc1867_name => APC_UPLOAD_PROGRESS => APC_UPLOAD_PROGRESS
apc.rfc1867_prefix => upload_ => upload_
apc.rfc1867_ttl => 3600 => 3600
apc.serializer => php => php
apc.shm_segments => 1 => 1
apc.shm_size => 32M => 32M
apc.slam_defense => On => On
apc.smart => 0 => 0
apc.ttl => 0 => 0
apc.use_request_time => On => On
apc.writable => /tmp => /tmp

Benchmark results

AWS setup

Concurrency Level:      75
Time taken for tests:   18.836 seconds
Complete requests:      111
Failed requests:        0
Total transferred:      238269 bytes
HTML transferred:       36630 bytes
Requests per second:    5.89 [#/sec] (mean)
Time per request:       12726.963 [ms] (mean)
Time per request:       169.693 [ms] (mean, across all concurrent requests)
Transfer rate:          12.35 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       58  226 113.0    276     390
Processing:  2099 8960 3593.2   9524   16784
Waiting:     2087 8947 3591.8   9517   16783
Total:       2377 9186 3585.0   9593   17164

Percentage of the requests served within a certain time (ms)
  50%   9512
  66%  11085
  75%  11747
  80%  12323
  90%  12954
  95%  14459
  98%  15792
  99%  16201
 100%  17164 (longest request)

Production setup

 Document Length:        331 bytes

 Concurrency Level:      75
 Time taken for tests:   7.544 seconds
 Complete requests:      595
 Failed requests:        0
 Total transferred:      1220905 bytes
 HTML transferred:       196945 bytes
 Requests per second:    78.87 [#/sec] (mean)
 Time per request:       950.922 [ms] (mean)
 Time per request:       12.679 [ms] (mean, across all concurrent requests)
 Transfer rate:          158.05 [Kbytes/sec] received

 Connection Times (ms)
               min  mean[+/-sd] median   max
 Connect:       58  105  78.4     76     384
 Processing:   265  787 263.1    725    1382
 Waiting:      265  785 262.9    723    1381
 Total:        419  891 267.7    836    1742

 Percentage of the requests served within a certain time (ms)
   50%    836
   66%   1002
   75%   1071
   80%   1129
   90%   1263
   95%   1376
   98%   1662
   99%   1672
  100%   1742 (longest request)     
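
To compare the two runs, it helps to recall how ab derives its mean "Time per request" from the concurrency level and the throughput. The sketch below recomputes it from the rounded figures in the output above, so small differences from the reported values are expected; the function name is illustrative.

```python
def time_per_request_ms(concurrency: int, requests_per_second: float) -> float:
    """Mean time per request as seen by one of the concurrent clients."""
    return concurrency / requests_per_second * 1000.0


# (concurrency level, requests per second) taken from the two reports above.
runs = {
    "AWS setup": (75, 5.89),
    "production setup": (75, 78.87),
}

for name, (concurrency, rps) in runs.items():
    print(f"{name}: {time_per_request_ms(concurrency, rps):.1f} ms per request")

# Throughput ratio between the two setups at the same concurrency.
print(f"production / AWS throughput: {78.87 / 5.89:.1f}x")
```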

Sample top output:

top - 12:58:24 up 4 min,  1 user,  load average: 41.69, 16.15, 5.95
Tasks: 121 total,  51 running,  70 sleeping,   0 stopped,   0 zombie
%Cpu(s): 17.7 us, 21.0 sy,  0.0 ni, 40.2 id,  0.9 wa,  0.2 hi,  0.0 si, 20.0 st
KiB Mem:   3838876 total,   643628 used,  3195248 free,    18028 buffers
KiB Swap:  1048572 total,        0 used,  1048572 free.   169920 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 1035 www-data  20   0  354540  24272  15224 R 11.8  0.6   0:02.61 php5-fpm
 1037 www-data  20   0  356116  26092  15804 R 11.8  0.7   0:02.66 php5-fpm
 1038 www-data  20   0  355344  25036  15552 R 11.8  0.7   0:02.64 php5-fpm
 1042 www-data  20   0  355588  25392  15660 R 11.8  0.7   0:02.59 php5-fpm
 1044 www-data  20   0  354548  24820  15760 R 11.8  0.6   0:02.63 php5-fpm
 1047 www-data  20   0  356364  26416  15792 R 11.8  0.7   0:02.63 php5-fpm
 1538 www-data  20   0  356300  25092  14624 R 11.8  0.7   0:02.39 php5-fpm
 1046 www-data  20   0  356628  26616  15740 R  5.9  0.7   0:02.61 php5-fpm
 1051 www-data  20   0  356360  26572  15960 R  5.9  0.7   0:02.63 php5-fpm
 1052 www-data  20   0  354544  24780  15988 R  5.9  0.6   0:02.63 php5-fpm
 1512 www-data  20   0  353124  21904  14620 R  5.9  0.6   0:02.55 php5-fpm
 1514 www-data  20   0  355856  24540  14620 R  5.9  0.6   0:02.49 php5-fpm
 1517 www-data  20   0  355272  24028  14620 R  5.9  0.6   0:02.48 php5-fpm
 1518 www-data  20   0  355048  24176  14620 R  5.9  0.6   0:02.44 php5-fpm
 1520 www-data  20   0  355600  24264  14620 R  5.9  0.6   0:02.44 php5-fpm
 1525 www-data  20   0  355344  24460  14620 R  5.9  0.6   0:02.41 php5-fpm
 1527 www-data  20   0  355344  24436  14620 R  5.9  0.6   0:02.41 php5-fpm
 1528 www-data  20   0  354760  23848  14620 R  5.9  0.6   0:02.41 php5-fpm
 1539 www-data  20   0  356072  25200  14620 R  5.9  0.7   0:02.38 php5-fpm
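
As a reading aid for the %Cpu(s) line in the sample above, the sketch below splits it into its fields. In top's output, "id" is idle time and "st" is steal time, meaning time the hypervisor gave to other guests. The parsing function is a hypothetical helper written for this one line format, not part of any tool.

```python
import re

# The %Cpu(s) line copied from the top sample above.
CPU_LINE = ("%Cpu(s): 17.7 us, 21.0 sy,  0.0 ni, 40.2 id,  "
            "0.9 wa,  0.2 hi,  0.0 si, 20.0 st")


def parse_cpu_line(line: str) -> dict:
    """Map each two-letter top CPU label (us, sy, id, st, ...) to its percentage."""
    return {label: float(value)
            for value, label in re.findall(r"([\d.]+)\s+([a-z]{2})", line)}


fields = parse_cpu_line(CPU_LINE)
print(f"user + system: {fields['us'] + fields['sy']:.1f}%")
print(f"idle: {fields['id']:.1f}%")
print(f"steal: {fields['st']:.1f}%")
```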

Conclusions on my side

  • I tested whether nginx is the culprit by serving a .txt file at the desired load; it worked flawlessly.
  • I then tested whether php-fpm is at fault by serving a very simple .php file containing echo "ok"; on both production and AWS, and AWS actually performed a bit better under higher load.
  • The MySQL RDS database sits at 11 active connections even when I'm pummeling the server with ab -c 100 -n 10000.
  • The hardware doesn't look like the issue either: under test it shows about 30% CPU load, plenty of free RAM, and untouched swap.
  • I'm not getting any errors from either nginx or php-fpm in their respective logs, just slow responses.
  • The codebase itself cannot be the issue; otherwise, why would it perform well on the original production server?

Tags: nginx, amazon-ec2, amazon-web-services, php-fpm, high-load
asked on Server Fault Jun 19, 2014 by Carvefx • edited Jun 19, 2014 by Carvefx

0 Answers

Nobody has answered this question yet.


User contributions licensed under CC BY-SA 3.0