PHP is most often used for developing web-based applications, but it can also be used for scripts that are started from the command line. Such scripts can easily be run as cron jobs, daemons, Gearman workers or through other techniques to execute a set of tasks periodically, on demand and/or in parallel. In this blog post I introduce a new open source project I started for developing a PHP daemon/process starter. Before I start rambling about the project, called PhpTaskDaemon, I will explain the problem I am trying to solve and the technologies currently available for solving it. After that I will explain more about the project itself. Finally I will outline the project requirements and introduce the future blog posts.
What? Using PHP for something other than generating HTML?
In some cases it is desirable to run a PHP script periodically to do system administration tasks. For example: periodically deleting old database records, collecting new information from the Internet, talking to a serial device to control your power outlet wirelessly, and all kinds of other jobs. Two differences from developing web applications are the time a process can take and the way scripts/functions are triggered. A process can even run forever using a simple while(true) statement.
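As a minimal sketch of such a long-running script, the loop below keeps calling a task until the task asks to stop. The helper name runLoop and its parameters are made up for this example. Passing null as the iteration limit gives the plain while(true) behaviour; the limit is only there so the example terminates.

```php
<?php
// Hypothetical helper (not part of any library): calls $task repeatedly
// until it returns false or $maxIterations is reached.
// $maxIterations = null means "loop forever", like while(true).
function runLoop(callable $task, int $sleepMicroseconds = 0, ?int $maxIterations = null): int
{
    $iterations = 0;
    while ($maxIterations === null || $iterations < $maxIterations) {
        $iterations++;
        if ($task($iterations) === false) {
            break; // the task asked to stop
        }
        if ($sleepMicroseconds > 0) {
            usleep($sleepMicroseconds); // pause between passes to avoid a busy loop
        }
    }
    return $iterations;
}

// Stand-in for a periodic job such as "delete old database records".
// It stops itself after the third pass.
$passes = runLoop(function (int $pass): bool {
    echo "Cleanup pass $pass\n";
    return $pass < 3;
});

echo "Stopped after $passes passes\n";
```

A real daemon would normally sleep between passes and only leave the loop on a shutdown signal.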
Running PHP in the background
There are different ways to run PHP as a background process. Below is a list of possible solutions.
- Command Line: The command line can be used to start PHP scripts. Such an approach is ideal for system administration tasks that need to run only once in a production environment.
- Crontab: Although the command line can be used to start scripts manually, the Linux crontab is a more conventional way to start them. But this does not solve all problems. Cron can start a task at most once per minute and does not guarantee that only a single instance of the task is running (this can be fixed by checking for it in the PHP script or by using a bash script).
- Custom Daemon: Creating a daemon is relatively simple using a while(true) statement. The hard part is keeping it running: the script has to cope with memory limits, device and network availability, disk space and any other reason an application might crash.
- Parallel script: Tasks that depend heavily on external resources such as web services do not require a lot of processing power, memory and/or disk space on the host computer itself. Therefore it can be faster to run multiple tasks of this kind in parallel. In an environment with pcntl enabled, a running process can be forked to clone it, after which the child and the parent continue at different lines. The drawback of such an approach is having to manage the creation and cleanup of child processes. Inter Process Communication can be achieved through the use of shared memory, semaphores, message queues and other php beginner stuff :-).
- Gearman: Gearman provides a good solution for running workers asynchronously in the background and for running multiple workers at the same time. A single worker can handle many different kinds of tasks by registering all the functions before starting the worker. The worker then sequentially handles the tasks offered by the Gearman job server. Multiple workers can be started in parallel by starting the processes manually or with a simple shell script that starts all the instances and runs them in the background (using &).
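The single-instance check mentioned under Crontab can be sketched with PHP's flock() function. This is only one possible approach, and the helper name and the lock file path are invented for the example. The idea is that a cron-started script first tries to take an exclusive, non-blocking lock and skips the run when another instance still holds it.

```php
<?php
// Hypothetical helper: returns an open lock handle when this process is the
// only instance, or null when another instance already holds the lock.
/** @return resource|null */
function acquireSingleInstanceLock(string $lockFile)
{
    // Mode 'c' creates the file if needed and never truncates it
    $handle = fopen($lockFile, 'c');
    if ($handle === false) {
        return null;
    }
    // LOCK_NB makes flock() return immediately instead of waiting
    if (!flock($handle, LOCK_EX | LOCK_NB)) {
        fclose($handle);
        return null;
    }
    return $handle;
}

// Assumed location of the lock file, purely for illustration
$lockFile = sys_get_temp_dir() . '/example-cron-task.lock';

$lock = acquireSingleInstanceLock($lockFile);
if ($lock === null) {
    echo "Another instance is still running, skipping this run\n";
} else {
    echo "Lock acquired, doing the work\n";
    // ... the actual task would run here ...
    flock($lock, LOCK_UN);
    fclose($lock);
}
```

The operating system releases the lock when the process ends, so a crashed run should not block the next one.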
In some of my projects I need to start multiple Gearman workers and/or a combination of the process types mentioned above. When the number of tasks and/or instances increases, it becomes hard to start and monitor all processes. I would like to see an easier solution to start and monitor multiple background tasks with different triggers. The solution I propose tries to abstract the listed examples into a single executable script.
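To give an idea of what starting and monitoring several processes involves, here is a small sketch using pcntl_fork() and pcntl_waitpid(). It is not how PhpTaskDaemon does it. The function name runInParallel and the task names are assumptions made for the example, and it needs the pcntl extension on a Unix system.

```php
<?php
// Hypothetical helper: forks one child per task, waits for all children
// and returns their exit codes keyed by task name.
function runInParallel(array $tasks): array
{
    $children = [];
    foreach ($tasks as $name => $task) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException("Could not fork for task $name");
        }
        if ($pid === 0) {
            // Child process: run the task, use its return value as exit code
            exit((int) $task());
        }
        // Parent process: remember which PID belongs to which task
        $children[$pid] = $name;
    }

    $exitCodes = [];
    foreach ($children as $pid => $name) {
        // Reap the child so it does not stay around as a zombie
        pcntl_waitpid($pid, $status);
        $exitCodes[$name] = pcntl_wifexited($status) ? pcntl_wexitstatus($status) : -1;
    }
    return $exitCodes;
}

if (extension_loaded('pcntl')) {
    // Stand-ins for tasks that mostly wait on an external resource
    $codes = runInParallel([
        'fetch-a' => function () { usleep(100000); return 0; },
        'fetch-b' => function () { usleep(100000); return 3; },
    ]);
    print_r($codes);
}
```

Both children sleep at the same time, so the whole run takes roughly as long as the slowest task rather than the sum of all tasks.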
PhpTaskDaemon is a library for creating PHP daemons for Unix environments (requirement: the pcntl and posix extensions). It provides a simple API consisting of two methods: one for loading the task queue and one for executing a single task. The tasks are run by a manager, which defines when and how tasks are executed. A single command line script is used to start, stop and monitor the daemon. The features, requirements and wishlist of the application are listed below. The source code of the project can be found on GitHub.
- Run a set of tasks with a single script
- Start multiple workers/instances of tasks
- Define a task regardless of how it is run (extend or configure a task)
- Single way for configuring daemon settings with a single config file
- Single way for logging daemon execution to one or more logfiles
- Execute in the background (daemonize)
- Single way of logging to one or more logfiles
- Single way of reading from one or more config files
- Nicely shutting down the system on interrupts
- Nicely changing user and group permissions
- Unix startup script (/etc/init.d)
- Monitoring current processes
- Statistics of historic processes
- Web interface for the daemon
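The two-method idea described above (one method loads the task queue, one executes a single task, and a manager decides when and how to run them) could look roughly like the sketch below. All names here are hypothetical and chosen for illustration; this is not the actual PhpTaskDaemon API.

```php
<?php
// Hypothetical interface, NOT the real PhpTaskDaemon API.
interface ExampleTask
{
    /** Returns the list of jobs that are waiting to be executed. */
    public function loadQueue(): array;

    /** Executes a single job and returns true on success. */
    public function executeJob($job): bool;
}

// Hypothetical task: pretends to delete old records by ID
class CleanupTask implements ExampleTask
{
    public $deleted = [];

    public function loadQueue(): array
    {
        return [101, 102, 103]; // made-up record IDs
    }

    public function executeJob($job): bool
    {
        $this->deleted[] = $job; // a real task would talk to the database here
        return true;
    }
}

// Hypothetical manager: this one simply runs every queued job once, in order.
// Other managers could run jobs on an interval or spread them over forks.
function runOnce(ExampleTask $task): int
{
    $done = 0;
    foreach ($task->loadQueue() as $job) {
        if ($task->executeJob($job)) {
            $done++;
        }
    }
    return $done;
}

$task = new CleanupTask();
$done = runOnce($task);
echo "Executed $done jobs\n";
```

Keeping the task free of scheduling logic is what would allow the same task to be run by different managers.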
Blog posts in this series
This blog post is an item in a series of blog posts about the development of the PhpTaskDaemon project. Follow the real progress of the project on the GitHub project page. The following blog posts of this series have been published in the past:
* An introduction of the PhpTaskDaemon project (this post)
Next time I will blog about the following aspects of the PhpTaskDaemon system.
* The current state of running PHP scripts using the command line (updated: 2 jan 2011)
* Defining Tasks (updated 16 jan 2011)
* Running and monitoring the daemon
* Building managers: shared memory, semaphores and sockets
* Creating a small web front end for monitoring the daemon