Multi-tasking in PHP Wednesday, December 5, 12

About Me
• Formerly CTO for Company52
• Currently work at Brandmovers on Northside Drive
• Self-taught
• Full-time geek since 2007

Use Cases
• System resources are not the bottleneck
• Batch processing involving an API:
  • E-Mail
  • Geocoding
  • Electronic billing
• Daemons

Alternatives
• Gearman
• curl_multi_*
• Other scripting languages

Theory

Two ways to multi-task

Multi-processing        Multi-threading
Separate memory         Shared memory
Errors are isolated     Errors are not isolated
Separate permissions    Same permissions
Linux/UNIX              Windows

[Diagram: Multiprocessing (processes 1..n, each with its own program, memory, virtual processor, and state: PC, stack) vs. Multithreading (threads 1..m inside one process, sharing program and memory, each with its own virtual processor and state: PC, stack). Courtesy www.fmc-modeling.org]

Multiprocessing
• “The simultaneous execution of two or more programs by separate CPUs under integrated control.”
• Clones the entire process, except resources
• Copy-on-write memory

Forking
[Diagram courtesy cnx.org]

Child Process
• A cloned copy of a parent process
• Receives a new process ID and a parent process ID
• Does some work
• Dies
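The child's life cycle can be sketched in a few lines. This is a minimal illustration, assuming the CLI SAPI with the pcntl and posix extensions loaded; the variable names are made up for the example.

```php
<?php
// Sketch: fork once and show the IDs each side sees.
$parentPid = posix_getpid();

$pid = pcntl_fork();
if ($pid === -1) {
    exit("fork failed\n");
}

if ($pid === 0) {
    // Child: a clone of the parent with its own process ID.
    printf("child:  pid=%d, parent pid=%d\n", posix_getpid(), posix_getppid());
    // ... do some work here ...
    exit(0); // the child "dies" once its work is done
}

// Parent: pcntl_fork() returned the child's process ID.
printf("parent: pid=%d, forked child pid=%d\n", $parentPid, $pid);

// Wait for the child so that it does not linger after exiting.
$reapedPid = pcntl_waitpid($pid, $status);
```

Both branches run the same script; only the return value of pcntl_fork() tells them apart.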

…sort of.
Photo Credit: Christopher Brian (2011 Toronto Zombie Walk)

Parent Responsibilities
• Reproduction
• Monitors child process status
• “Reap” zombie processes
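A rough sketch of those three duties, assuming the pcntl extension on the CLI. The exit codes are hypothetical and only there to give the parent something to monitor.

```php
<?php
// Sketch: a parent forks three children, then monitors and reaps each one.
$children = [];
foreach ([0, 1, 2] as $exitCode) {   // hypothetical exit codes, for illustration
    $pid = pcntl_fork();
    if ($pid === -1) {
        exit("fork failed\n");
    }
    if ($pid === 0) {
        usleep(100000);              // pretend to work
        exit($exitCode);
    }
    $children[$pid] = $exitCode;     // parent: remember what was started
}
$started = $children;

$reaped = [];
while (count($children) > 0) {
    // Block until any child changes state, then collect its status.
    $pid = pcntl_wait($status);
    if ($pid <= 0) {
        break;                       // no children left, or an error
    }
    if (pcntl_wifexited($status)) {
        $reaped[$pid] = pcntl_wexitstatus($status);
    } elseif (pcntl_wifsignaled($status)) {
        $reaped[$pid] = 'killed by signal ' . pcntl_wtermsig($status);
    }
    unset($children[$pid]);          // reaped: no zombie remains for this pid
}
```

Until the parent calls one of the wait functions, an exited child stays in the process table as a zombie.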

Process Signals

Signal     Description
SIGCHLD    Child process died
SIGINT     User interrupt
SIGTERM    Terminate
SIGKILL    Forcibly terminate

PHP Implementation

Requirements
• Unix-like operating system
• PHP 4.1+
• PHP PCNTL extension (compile with --enable-pcntl)
• PHP Semaphore extension, optional (--enable-sysvsem, --enable-sysvshm, --enable-sysvmsg)
• Plenty of memory
• Multiple CPU cores
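A script can check most of these requirements at startup. The sketch below is one possible way to do it; treating the posix extension as required is an assumption (the later slides call posix_kill()), and the variable names are illustrative.

```php
<?php
// Sketch: verify the software requirements before forking anything.
$required = ['pcntl', 'posix'];               // posix is assumed, for posix_kill()
$optional = ['sysvsem', 'sysvshm', 'sysvmsg'];

$problems = [];
if (stripos(PHP_OS, 'WIN') === 0) {
    $problems[] = 'a Unix-like operating system is required';
}
foreach ($required as $extension) {
    if (!extension_loaded($extension)) {
        $problems[] = "missing required extension: $extension";
    }
}

$warnings = [];
foreach ($optional as $extension) {
    if (!extension_loaded($extension)) {
        $warnings[] = "optional extension not loaded: $extension";
    }
}

foreach ($warnings as $warning) {
    echo "warning: $warning\n";
}
foreach ($problems as $problem) {
    echo "error: $problem\n";
}
```

Memory and CPU core counts are left out because PHP has no portable built-in call for them.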

Overview
1. Define signal handlers
2. Fetch a dataset
3. Fork off one child process for each item
4. Stop forking when a threshold is reached, and sleep
5. Reap a child process whenever SIGCHLD is received
6. If there’s more work to do, fork more processes
7. When all child processes have been reaped, terminate

    declare(ticks = 1);

    // Set up our signal handlers
    pcntl_signal(SIGTERM, "signal_handler");
    pcntl_signal(SIGINT, "signal_handler");
    pcntl_signal(SIGCHLD, "signal_handler");
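On PHP 7.1 and later, pcntl_async_signals() is an alternative to declare(ticks = 1) for getting handlers dispatched. A small sketch, assuming the pcntl and posix extensions; the $received array exists only to make the effect visible.

```php
<?php
// Sketch: signal handling with asynchronous dispatch instead of ticks (PHP 7.1+).
pcntl_async_signals(true);

$received = [];
pcntl_signal(SIGTERM, function ($signal) use (&$received) {
    // Record the signal instead of exiting, so the demo can inspect it.
    $received[] = $signal;
});

// Send SIGTERM to ourselves to show the handler firing.
posix_kill(posix_getpid(), SIGTERM);
usleep(1000); // give PHP a chance to dispatch the pending signal
```

The handler registration itself is unchanged; only the dispatch mechanism differs.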

    function signal_handler($signal) {
        switch ($signal) {
            case SIGINT:
            case SIGTERM:
                // kill all child processes
                exit(0);
            case SIGCHLD:
                // reap a child process
                reap_child();
                break;
        }
    }
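One way the "kill all child processes" step could look is sketched below. The terminate_children() helper is hypothetical; it assumes a global $processes map keyed by child PID, and children that keep the default SIGTERM behavior.

```php
<?php
// Sketch: terminate and reap every tracked child before the parent exits.
$processes = [];   // pid => TRUE for every running child

function terminate_children()
{
    global $processes;
    foreach (array_keys($processes) as $pid) {
        posix_kill($pid, SIGTERM);       // ask the child to stop
    }
    foreach (array_keys($processes) as $pid) {
        pcntl_waitpid($pid, $status);    // reap it so no zombie is left behind
        unset($processes[$pid]);
    }
}

// Demo: start two children that would otherwise sleep for a minute.
for ($i = 0; $i < 2; $i++) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        exit("fork failed\n");
    }
    if ($pid === 0) {
        sleep(60);
        exit(0);
    }
    $processes[$pid] = TRUE;
}

terminate_children();
```

Children inherit any handlers the parent installed before forking, so a child that should die on SIGTERM may need pcntl_signal(SIGTERM, SIG_DFL) first.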

    $pid = pcntl_fork();
    switch ($pid) {
        case 0:
            // Child process
            call_user_func($callback, $data);
            posix_kill(posix_getppid(), SIGCHLD);
            exit;
        case -1:
            // Parent process, fork failed
            throw new Exception("Out of memory!");
        default:
            // Parent process, fork succeeded
            $processes[$pid] = TRUE;
    }

Repeat for each unit of work

    function reap_child() {
        global $processes; // the parent's pid => TRUE map from the fork step

        // Check if any child process has terminated,
        // and if so remove it from memory
        $pid = pcntl_wait($status, WNOHANG);
        if ($pid < 0) {
            // -1 means the wait itself failed (for example, no children exist)
            throw new Exception("pcntl_wait() failed");
        } elseif ($pid > 0) {
            unset($processes[$pid]);
        }
    }
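Put together, the overview steps might look like the sketch below. It is not the demo gist: $items, $maxChildren, work(), on_sigchld() and wait_until_at_most() are hypothetical, and it assumes PHP 7.1+ with pcntl. The handler only sets a flag and the main loop does the reaping, which avoids a handler running between pcntl_fork() and the bookkeeping line.

```php
<?php
// Sketch: bounded pool of forked workers, reaped when SIGCHLD arrives.
pcntl_async_signals(true);

$processes   = [];            // pid => TRUE for every running child
$maxChildren = 3;             // threshold at which forking pauses
$items       = range(1, 8);   // stand-in for the fetched dataset
$reaped      = 0;
$childExited = false;

function on_sigchld($signal)
{
    global $childExited;
    $childExited = true;      // just take note; reaping happens in the main loop
}
pcntl_signal(SIGCHLD, 'on_sigchld');

function work($item)
{
    usleep(20000 * $item);    // placeholder for a slow API call
}

function wait_until_at_most($limit)
{
    global $processes, $reaped, $childExited;
    while (count($processes) > $limit) {
        if (!$childExited) {
            usleep(10000);    // sleep; an arriving signal cuts this short
            continue;
        }
        $childExited = false;
        // One SIGCHLD may stand for several exits, so drain all that are ready.
        while (($pid = pcntl_wait($status, WNOHANG)) > 0) {
            unset($processes[$pid]);
            $reaped++;
        }
    }
}

foreach ($items as $item) {
    wait_until_at_most($maxChildren - 1);   // leave room for one more child
    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new Exception('Could not fork');
    }
    if ($pid === 0) {
        work($item);
        exit(0);
    }
    $processes[$pid] = TRUE;
}

wait_until_at_most(0);        // all work handed out: reap the stragglers
```

Draining with WNOHANG in a loop matters because the kernel may deliver a single SIGCHLD for several children that exit close together.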

Demo Time! http://gist.github.com/4212160

DOs and DON’Ts
• DON’T fork a web process (CLI only!)
• DON’T overload your system
• DON’T open resources before forking
• DO respect common POSIX signals
• DO remove zombie processes
• DO force new database connections in children: mysql_connect($s, $u, $p, TRUE);
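Two of these rules can be shown in a short sketch: refuse to run outside the CLI, and open resources inside the child rather than before the fork. The temporary file stands in for any resource; a database connection would be opened at the same point in the child branch. All names here are illustrative.

```php
<?php
// Sketch: CLI-only guard, and per-child resources opened after the fork.
if (PHP_SAPI !== 'cli') {
    exit("Run this from the command line only.\n");
}

$logFile = tempnam(sys_get_temp_dir(), 'fork');   // hypothetical shared output file

$pids = [];
for ($i = 1; $i <= 3; $i++) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        exit("fork failed\n");
    }
    if ($pid === 0) {
        // Child: open the handle here so it is not shared with the parent or siblings.
        $handle = fopen($logFile, 'a');
        fwrite($handle, "child $i\n");
        fclose($handle);
        exit(0);
    }
    $pids[] = $pid;
}

foreach ($pids as $pid) {
    pcntl_waitpid($pid, $status);   // remove zombie processes
}

$lines = file($logFile, FILE_IGNORE_NEW_LINES);
sort($lines);
unlink($logFile);
```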

Challenges

Race Conditions
• A logic bug where the result is affected by the sequence or timing of uncontrollable events
• Adding debug logic can change timing
• Dirty reads
• Lost data
• Unpredictable behavior
• Deadlocks, hanging, crashing

Solutions
• Handle I/O in the parent process exclusively
• Manage resources with semaphores and/or mutexes
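One way to keep I/O in the parent is to give each child a private channel for reporting its result. The sketch below uses stream_socket_pair() for that; the squaring "work" and the variable names are hypothetical.

```php
<?php
// Sketch: children only compute; the parent alone writes the combined output.
$channels = [];
foreach ([2, 3, 4] as $n) {
    $pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
    $pid = pcntl_fork();
    if ($pid === -1) {
        exit("fork failed\n");
    }
    if ($pid === 0) {
        fclose($pair[0]);                     // child keeps only its own end
        fwrite($pair[1], ($n * $n) . "\n");   // report the result to the parent
        fclose($pair[1]);
        exit(0);
    }
    fclose($pair[1]);                         // parent keeps only the reading end
    $channels[$pid] = $pair[0];
}

$results = [];
foreach ($channels as $pid => $channel) {
    // Read until the child closes its end, then reap the child.
    $results[] = (int) trim(stream_get_contents($channel));
    fclose($channel);
    pcntl_waitpid($pid, $status);
}

// Only the parent touches the shared output.
echo implode(', ', $results), "\n";
```

Because no two processes write to the same destination, there is nothing to race on.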

Semaphores
• Semaphore = atomically updated counter
• Mutex = binary semaphore with ownership
• PHP: sem_get(), sem_acquire(), sem_release()
• Transactional databases use semaphores
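A sketch of a semaphore used as a mutex, assuming the sysvsem extension is loaded. Four children each increment a counter stored in a temporary file; the file, the loop counts and the variable names are made up for the example.

```php
<?php
// Sketch: protect a read-modify-write cycle with a System V semaphore.
$counterFile = tempnam(sys_get_temp_dir(), 'cnt');
file_put_contents($counterFile, '0');

// max_acquire = 1 makes the semaphore behave like a mutex.
$semaphore = sem_get(ftok($counterFile, 'c'), 1);

$pids = [];
for ($i = 0; $i < 4; $i++) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        exit("fork failed\n");
    }
    if ($pid === 0) {
        for ($j = 0; $j < 25; $j++) {
            sem_acquire($semaphore);          // enter the critical section
            $value = (int) file_get_contents($counterFile);
            file_put_contents($counterFile, (string) ($value + 1));
            sem_release($semaphore);          // leave the critical section
        }
        exit(0);
    }
    $pids[] = $pid;
}

foreach ($pids as $pid) {
    pcntl_waitpid($pid, $status);
}

$total = (int) file_get_contents($counterFile);
sem_remove($semaphore);
unlink($counterFile);
```

Without the sem_acquire() and sem_release() pair, two children could read the same value and one increment would be lost.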

Deadlocks
Image credit: csunplugged.org

Bonus Slides

Shared Memory
• Advanced inter-process communication
• Pass data back to the parent process
• PHP Shared Memory extension (--enable-shmop)
• PHP System V Shared Memory extension (--enable-sysvshm)
• More robust
• Compatible with other languages
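Passing a result back to the parent could look like the sketch below, which assumes the sysvshm extension. The segment size, the variable slot number and the payload are arbitrary choices for illustration.

```php
<?php
// Sketch: a child hands a PHP value back to its parent through shared memory.
$keyFile = tempnam(sys_get_temp_dir(), 'shm');
$shm = shm_attach(ftok($keyFile, 's'), 4096);   // 4 KB segment, arbitrary size

$pid = pcntl_fork();
if ($pid === -1) {
    exit("fork failed\n");
}
if ($pid === 0) {
    // Child: store a result under variable slot 1.
    $result = ['pid' => posix_getpid(), 'answer' => 6 * 7];
    shm_put_var($shm, 1, $result);
    exit(0);
}

// Parent: wait for the child, then read what it left behind.
pcntl_waitpid($pid, $status);
$fromChild = shm_get_var($shm, 1);

shm_remove($shm);    // segments outlive the process unless removed
unlink($keyFile);
```

With several children writing at once, each would need its own slot or a semaphore around the write.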

Daemonization
• Fork, kill parent
• Orphaned child process continues running
• Signal and error handling are critical
• Server daemons usually fork child processes to handle requests
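A minimal daemonization sketch, assuming PHP 7.1+ with pcntl and posix. The daemonize() helper and the --daemon flag are hypothetical, and the loop body is a placeholder for real work.

```php
<?php
// Sketch: detach from the launching process and keep running in the background.
function daemonize()
{
    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('Could not fork');
    }
    if ($pid > 0) {
        exit(0);                  // parent exits; the child is now orphaned
    }
    if (posix_setsid() === -1) {  // new session: detach from the controlling terminal
        throw new RuntimeException('Could not start a new session');
    }
    return posix_getpid();
}

// Only daemonize when explicitly asked to (hypothetical command-line flag).
if (in_array('--daemon', $argv ?? [], true)) {
    daemonize();

    pcntl_async_signals(true);
    $running = true;
    pcntl_signal(SIGTERM, function () use (&$running) {
        $running = false;         // finish the current iteration, then stop
    });

    while ($running) {
        sleep(1);                 // placeholder for the daemon's real work
    }
}
```

Handling SIGTERM matters here because nothing else is left to shut the orphaned process down cleanly.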

Thank You! @compwright http://bit.ly/atlphpm