Jeb Wilkins

Coding and climbing in Cumbria.

Cats and Dogs (or PHP and Rails)

I’ve been thinking a little about PHP and hosting recently. Whilst Apache with mod_php is very simple to use and incredibly popular, it does have its own set of problems. Amongst these:

  • inefficient use of resources - every image/CSS/JS file that gets downloaded is served by a web process with a full copy of PHP embedded in it

  • security - all scripts run as the same user, which is whatever Apache is running as

Passenger, aka mod_rack, operates on a different mechanism - Passenger itself is small and runs within Apache, and apps are spawned as separate processes on the fly as and when needed. Apache then talks to these apps over a socket. Since they’re separate processes, they can be run as the appropriate user, avoiding the security problems.

The split nature of Passenger means it also has a largely unknown feature: it can run WSGI apps (ie Python) too, since they operate in a similar manner. So long as the app at the other end talks correctly over the socket, Passenger won’t care, and Ruby’s Rack is largely a clone of WSGI from the Python world.

In many ways this is similar to Apache and FastCGI PHP, but without the complexity involved in setting that up. Having set up servers with FastCGI running PHP as the correct user, I can attest that this is a non-trivial setup, losing all the deployment advantages PHP normally provides.

An interesting alternative would be to somehow tie PHP to Passenger - using Passenger as a proven lightweight spawner within Apache, and having it spawn PHP processes as appropriate users on demand. The two ways to do this are either:

  • writing PHP apps specifically for this system, which get loaded all at once, and then talk a Rack-like protocol to Passenger through the connecting socket, cleaning themselves up after each request (see the sketch below)

  • a custom PHP SAPI which listens on the socket to Passenger, launches PHP scripts and waits for them to exit (much like the existing mod_php), then feeds the output back down the socket to Passenger

Either of these would require a new ‘app launcher’ script in Passenger to handle the initial spawning of PHP.
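To make the first option concrete, here’s a rough sketch of what such a PHP ‘app’ might look like - a long-running worker accepting requests over a socket. The socket path and the one-line request format are invented for illustration; the protocol Passenger actually speaks is more involved.

<?php
// Sketch only: a long-running PHP worker in the spirit of Rack/WSGI.
// The socket path and one-line wire format are made up for illustration -
// the real Passenger protocol is more involved.
$server = stream_socket_server('unix:///tmp/php-app.sock', $errno, $errstr) ;
if (!$server) die("$errstr ($errno)\n") ;

while ($conn = stream_socket_accept($server, -1)) { // -1 = wait indefinitely
  $request = trim(fgets($conn)) ;                   // pretend protocol: one request line
  fwrite($conn, handle_request($request)) ;
  fclose($conn) ;                                   // clean up after each request
}

function handle_request($path) {
  return "Status: 200\r\n\r\nYou asked for $path" ;
}
?>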

I think the second option is the more interesting, since it would work with existing PHP scripts and would leave PHP operating in the normal manner. Such a SAPI is, in principle, similar to the new embedded HTTP server in PHP 5.4 - a long-running PHP process which listens and talks on sockets. Having looked at the PHP sources this looks plausible, but well outside my C skills - maybe it’s time I properly learnt to code C. Still, it’s an interesting idea and I’ve not yet found anything to suggest someone is working on something similar.

And There Went the Year!

Well, 2 posts in a year - I guess that shows how busy my year’s been. I’m now starting to count down to a xmas break; the snow’s finally arrived and I got out for some climbing on Helvellyn today, long may it last.

I’ve not had a lot of time for personal hacking, but I’ve been busy with a lot of interesting tech for client projects over the last year. It’s been the usual mix of PHP and Ruby. On the PHP side I returned to Joomla for a project, and was a little surprised at how little had changed - at least that made it easy to pick up.

I’ve also been using the Kohana framework, which a co-worker recommended to me some years ago. As a Rails programmer, Kohana felt pretty natural and seemed well thought out and easily extensible. That last point was the real plus for me, allowing me to add tricks like standardised form builders, rendering views as PDF, etc. The only real downer was that its ORM reads the database structure on every page request - eeek! Thankfully this too can be overridden. Another project I came across which I’d not used before was DOMpdf - a very nice PDF generation library allowing you to use HTML and CSS to create your PDFs.
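For a flavour of DOMpdf, here’s a minimal sketch using the classic API of that era - the include path is an assumption, so adjust it for wherever the library lives:

<?php
// Minimal DOMpdf sketch - the config file location is an assumption.
require_once 'dompdf/dompdf_config.inc.php' ;

$html = '<h1>Invoice</h1><p style="color: #555;">Styled with ordinary CSS.</p>' ;

$dompdf = new DOMPDF() ;
$dompdf->load_html($html) ;      // feed it plain HTML + CSS
$dompdf->render() ;              // lay the document out
$dompdf->stream('invoice.pdf') ; // send the generated PDF to the browser
?>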

On the Ruby side, I’ve been busy with a mobile site using jQuery Mobile. Integration with the Ruby site was straightforward, and as ever Rails handled adding mobile views of the pages brilliantly. jQM itself left me with mixed feelings - it made it very easy to create a mobile site, and the form handling and list layouts are brilliant. You are quite limited to their styling though - custom layouts proved hard to implement. Theming was also quite restrictive, though I found a neat way of rewriting the various CSS colours using HSL (as opposed to RGB) to get the shades I wanted. The biggest concern I have with jQM is that it requires the whole of jQuery; combined they weigh in pretty heavy for a GPRS connection. This seems unnecessary and excessive if most of your audience are on Mobile IE9 or something WebKit-based (BB, iPhone, Android). A neat project for the future would be to pull the form, and possibly the list, styling out into a separate library and make that run on top of Zepto.js.

Looking forward to the start of next year, I’ve got some interesting work lined up with full-text searching (probably using Sphinx) and possibly a location-based project using Google Maps. I may even get round to developing some of the products I always have knocking around in my head!

PHP and Mod_fcgid

I’ve just moved the server this blog’s hosted on, and one of my objectives was to set the new machine up using mod_fcgid with PHP via FastCGI rather than the more traditional Apache + mod_php. Why?

  1. Apache + mod_php is not the most memory-efficient solution - every time a static asset is requested, Apache is forked complete with an entire copy of PHP, itself much larger than the copy of Apache it’s embedded within.

  2. Security - mod_php is pretty bad from a security point of view. All PHP code runs as the web server user, hence if an attacker manages to get code onto the server (eg via someone’s insecure uploader) that code can read not only the files from the site they have compromised, but those of every other site on the box.

  3. I want to run Rails code on the machine as well, and it seems insane to have a request destined for a Rails app being processed by a small Apache process with a large copy of PHP embedded within it (see point 1).

In reality this has proven more difficult than I’d anticipated, and is probably not as efficient as I’d hoped.

Installing Apache is relatively straightforward. The computer in question is running Ubuntu 10.04 LTS, so installing apache2-mpm-worker will get you an up-and-running web server, and you can configure vhosts as normal.

To run PHP scripts you also need to install libapache2-mod-fcgid, php5-cgi and apache2-suexec. You’ll then need an executable FastCGI wrapper script - one per user account - and these need to exist within /var/www, so I’d recommend disabling /var/www as a web root, otherwise these scripts will be exposed to the web. The other gotcha with these scripts is that they need to be owned by the user account you are suexec’ing to, and so does their parent directory.

I created the following in /var/www/fcgi-wrappers/jebw/php-fcgi-script

#!/bin/sh
# Point PHP at the CGI php.ini
PHPRC=/etc/php5/cgi/
export PHPRC
# Recycle each PHP process after 5000 requests
export PHP_FCGI_MAX_REQUESTS=5000
# How many PHP children each wrapper keeps running
export PHP_FCGI_CHILDREN=2
exec /usr/lib/cgi-bin/php

You then need to set ownership and permissions

chown -R jebw:jebw /var/www/fcgi-wrappers/jebw
chmod 755 /var/www/fcgi-wrappers/jebw/php-fcgi-script

At this point you can set up fcgid in your Apache vhost

<VirtualHost *:80>
  ServerName www.jdwilkins.co.uk
  DocumentRoot /home/jebw/website/
  <IfModule mod_fcgid.c>
    SuexecUserGroup jebw jebw
    <Directory /home/jebw/website/>
      Options +ExecCGI
      AllowOverride All
      AddHandler fcgid-script .php
      FCGIWrapper /var/www/fcgi-wrappers/jebw/php-fcgi-script .php
      Order allow,deny
      Allow from all
    </Directory>
  </IfModule>
</VirtualHost>

One final tweak is to php.ini (in /etc/php5/cgi/php.ini - since we’re using FastCGI)

cgi.fix_pathinfo=1

At that point you’re all set to go. The bonus of this method is that things like file uploads get the correct file ownership; the definite downside is that you end up with separate copies of PHP running for every user account.

PHP Data Factory

I’ve recently written a super simple Data Factory for PHP, called FactoryLib. This can be used to create test data when unit testing your PHP code.

Including this in your tests is very simple

require 'factory_lib.php' ;
Factory::$factory_data['my_table_name'] = array('name' => 'Jeb Wilkins', 'age' => '101') ;

There are two static methods which can be used to create data

  • Create - predictably used to create rows - Factory::create('my_other_table')

  • Hash - returns an associative array of the data that would otherwise have been inserted - Factory::hash('my_table_name')
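For example, with the 'my_table_name' defaults registered above, hash hands the row back without touching the database (behaviour sketched from those defaults, rather than verified output):

$row = Factory::hash('my_table_name') ;
// $row is array('name' => 'Jeb Wilkins', 'age' => '101')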

Column values can be overridden from the defaults by supplying an array of overriding values as the second parameter, eg

Factory::create('my_table_name', array('name' => 'Alter Ego')) ;

Dynamic Definitions

You can also generate differing string data dynamically using the magic string, eg

Factory::$factory_data['some_table'] = array('name' => 'Test User ', 'age' => 17) ;

for($i = 0; $i < 10; $i++)
  Factory::create('some_table') ;

would create 10 users named ‘Test User 1’ through to ‘Test User 10’. To reset the counter, use either Factory::reset_counter() to reset the counter for all tables, or Factory::reset_counter('my_table_name') to reset it for a specific table.

To use this in your code you can do something like the following

require_once 'PHPUnit/Framework.php' ;
require_once 'factory_lib.php' ;

$GLOBALS['show_db_errors'] = true ; // causes factory lib to output errors if something goes wrong

mysql_connect() ; //may need some connection parameters depending upon your setup
mysql_select_db('my_test_db') ; // You do _not_ want this to be your development (or production) database

Factory::$factory_data['person'] = array('name' => 'Test User ', 'address1' => 'First Line of Address') ;

class personTest extends PHPUnit_Framework_TestCase {

  function setUp() {
    Factory::truncate('person') ; // function not implemented yet but will be shortly
  }

  function testCountingUsers() {
    for($i = 0; $i < 5; $i++)
       Factory::create('person') ;

    $this->assertEquals(5, MyPersonClass::countAllPeople()) ;
  }
}

Get the code from GitHub

Hot and Cold

It’s been a busy couple of weeks. It’s been pretty cold and icy here in Cumbria, the same as much of the country, and I’ve managed to take advantage of the ice and get a few good ice climbs in. Last week I was surfing and chilling in Fuerteventura, where it was considerably warmer at 25 degrees, so coming home was a bit of a shock to the system. Amongst all that I’ve been working on a booking and payment component for Joomla for a client, which is now nearly finished, plus the odd bit of Ruby on Rails work on one of my main projects.

Using Git and FTP to Manage Webservers

One of the clients I work with uses FTP to deploy software to their servers, but when I do the development work I’d rather keep my changes in a local copy, managed under the Git version control system, hence my current workflow consists of:

  1. Download Source Code

  2. Make some changes and commit

  3. Make some more changes and commit again

  4. Download any new changes from the server (eg from other developers)

  5. Figure out all changed files from Git’s logs and FTP those up to the server

I wanted a way to automate this workflow but couldn’t find anything which would do it, so in the best hacker fashion I’ve written my own. It’s called Munkey; it’s written in Ruby and uses the Git command line and Ruby’s Net::FTP module behind the scenes. It maintains a separate branch representing upstream, which it merges from and to, and which it uses to optimise the uploads/downloads via FTP.

I can clone an FTP source with

munkey clone ftp://user:pass@server/path/to/code localfolder

I can then work with the code as normal in the created folder. When I want to pull in changes from the server as I go along, I stash any local changes and do

munkey pull

When I’ve completed a piece of work and it’s ready to be uploaded, I do

munkey pull
munkey push

The first pull is to make sure my copy of the files is up to date before I do the push. Both push and pull are optimised to be as quick as possible - pull only downloads files newer than the last pull (or with a different file size to the local copy), and push only uploads files which have changed in my master branch.

Munkey’s been up on GitHub for a while, but I’ve now published it as a gem so you can just

sudo gem install munkey

This will also install FtpSync - a useful library I wrote for Munkey which can sync FTP folders with local ones.

The source to both FtpSync and Munkey is on my GitHub page: http://github.com/jebw

The Web Without a Webserver

Desktop software seems to be becoming more and more of a niche requirement when it comes to bespoke software. In comparison to web-based software it tends to be slower to develop and more time-consuming to roll out updates for.

Recently a client came to me with a requirement to write a data capture application to work in parallel with a website they were developing. There was nothing in particular that required a desktop application, except that it needed to be available when the users were offline. I suggested they instead write the data capture part in HTML5 and depend upon their users using a modern web browser (eg anything but IE, or an IE9 preview). In reality this is no more arduous than requiring specific versions of .NET or a JVM to be installed, and it has the advantage of leveraging the existing work they’d done for the rest of the site.

How does this work? Pages which are required offline need to have a manifest listing their resources; these resources will be cached and made available later when offline. Users can then bookmark these pages, and even if the browser is in offline mode they will still be able to get to them. This in turn brings in a second problem: data storage. Normally developers use the likes of MySQL for persistence, with PHP, Rails or something else generating the pages. If data is held on the server and needs to be available offline then it can be stored in localStorage instead - a very simple hash-based API for storing data, eg

window.localStorage.setItem('mykey', 'data to save') ;
window.localStorage.getItem('mykey') ;

I took the approach of just JSON encoding the various data structures to store them in localStorage. WebKit-based browsers have a more advanced storage option - an SQLite database embedded within the browser - but since that’s not an accepted standard yet, Firefox doesn’t support it, and there’s no word on whether IE9 will, I chose to keep things simple.

The final difference is the lack of a preprocessor - the page can be run through PHP or similar at the point it’s generated, but if it’s going to need to change after it’s cached in the browser, or it’s showing data from localStorage, then those parts will need to be generated by hand in JavaScript.
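For completeness, the cache manifest mentioned above is just a plain text file, referenced from the page’s html element via a manifest attribute and served with the text/cache-manifest MIME type. A minimal example (file names invented for illustration):

CACHE MANIFEST
# v1 - change this comment to force browsers to re-fetch everything
capture.html
js/capture.js
css/mobile.css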

The project’s not live yet, but the data capture now feels like a seamless part of the website, and the users won’t need to worry about maintaining or updating a separate capture program. Since I started this project Google have helpfully published a lot of information relating to the new specs and features in HTML5:

http://www.html5rocks.com/

Hacking Gstreamer in C

C has never been my strong point, but I’ve just been playing around with gstreamer and have managed to write a super simple audio player. It’s using decodebin (not playbin) so it can handle the various audio formats which gstreamer supports.

I was pleasantly surprised at the progress I was able to make in such a short time - hats off to the gst devs who made this so simple. With any luck I might actually be able to achieve my goal of a gstreamer plugin for MPD eventually.

Update: Got tag reading working as well - woot!

Awesome Week on Skye

Just had an awesome holiday on Skye. Met friends up in Glen Brittle and spent a day on the ridge and a day climbing on the Cioch in stunning weather. We woke up to rain the next day so headed to Raasay, a very picturesque island off the coast of Skye, where it was still sunny. Stayed there for a couple of days before returning for a mammoth day’s climbing back on the Cioch - eight 2 or 3 star pitches in a row!

Next day we headed up to the sea cliffs in the north by Neist Point. Unfortunately it bucketed down just as we turned up, but the evening was sunny enough and we just sat watching the sun set into the sea, very cool. The last day we cragged on the sea cliffs; we ab’d in and belayed in the sun right down by the water on the wave-cut platform at Destitution Point (cheery name). Later on we even saw a basking shark go past!

We headed back to the Sligachan campsite that evening, commenting on how we’d avoided going there in the past because of horror stories about midges, and how we’d had no problems. Spoke too soon - the midges got in at one in the morning and were eating us alive. We gave up and packed up there and then, drove 3 hours further south to where it was windier, and slept for the remainder of the night before carrying on home.

My First Talk

I’ve not had much time for hacking lately, but I did my first talk the other day for the newly formed GeekUp in Lancaster. They were scouting round for someone to do a talk and in a fit of enthusiasm I volunteered. It seemed to go well despite me being really nervous beforehand - once I was talking I seemed to find my stride.

The talk was an overview of version control, covering both existing centralised systems like Subversion and the newer distributed systems like Git and Bazaar. Most of the audience seemed pretty familiar with Subversion, so I skipped through the first half quickly and then did an ad-hoc expansion of the distributed material. Overall the talk seemed very well received, which made it quite rewarding.