Apologies for the short notice--I've been rushed just trying to get the darn thing ready--but I will be giving a talk at Iowa Code Camp on Saturday, November 1st.

If you can't attend, or want to review my talk before/after I give it, you can find it here. Feel free to offer any feedback. It's much too late to make major changes at this point, but feedback is welcome regardless.

(Technically the talk is about PHP, but tagged Lisp because I blame Common Lisp for inspiring a bunch of it.)

I was highly amused by the reactions to my previous post on the programming and php subreddits, so I spent a little time isolating the issue with type-hint conversion of optional arguments.

It’s not me, it’s PHP. And frankly, I don’t understand the point of the backtrace going to the trouble of making things references if doing things with those references isn’t even supported.

That bug description isn’t entirely accurate as to what’s going on, however, as some test cases show.

<?php

function f(int $a, int $b = null, int $c = null) {
  error_log(json_encode([$a, $b, $c]));
  error_log(json_encode(func_get_args()));
}

function changeArg($code, $str) {
  preg_match('/^Argument (\\d+) /', $str, $m);
  $i = $m[1]-1;
  $bt = debug_backtrace(null, 2);
  ++$bt[1]['args'][$i];
  return true;
}

set_error_handler('changeArg', E_ALL);

f(3, 2, 1);

Outputs:

[4,2,1]
[4,3,2]

Note that the required argument changes, but the optional argument does not. Also note that func_get_args() returns the altered argument, in disagreement with the actual argument.

Where it gets interesting is if we alter that $i to be $i+1. That is, if we change the argument after the one we got an error about.

<?php

function f(int $a, int $b = null, int $c = null) {
  error_log(json_encode([$a, $b, $c]));
  error_log(json_encode(func_get_args()));
}

function changeArg($code, $str) {
  preg_match('/^Argument (\\d+) /', $str, $m);
  $i = $m[1]-1;
  $bt = debug_backtrace(null, 2);
  if (count($bt[1]['args']) > $i+1) {
    ++$bt[1]['args'][$i+1];
  }
  return true;
}

set_error_handler('changeArg', E_ALL);

f(3, 2, 1);

Outputs:

[3,3,2]
[3,3,2]

Then the change sticks, and the optional argument is modified.

By switching to $i-1, you’ll notice that even the required argument fails to be modified. Which means in the default case where one tries to modify the argument an error was triggered about, the argument’s reference is being broken slightly earlier for optional arguments than for required arguments. That, to me, strongly suggests a bug, because the behavior is inconsistent.

But my use-case isn’t supported, and presumably it doesn’t affect any other use-cases, so I’ll either have to live with it or write a code pre-processor which fixes up the issue. Ah well.

I'm left confused about why there exists such a thing as continuable errors if fixing the error and continuing on isn't supported, but ... PHP isn't exactly known for its strong sense of feature coherency.

Controller Methods

What we call a “controller method”, at least, is simply a function which is called on the web side via AJAX and returns some value back (generally either JSON or a snippet of HTML).

When I started at my current employer, controller methods looked like this:

class MyController extends Controller {
  function f() {
    if (!$this->required('foo', 'bar', 'baz')) return;
    if (!is_numeric($this->requestvars['foo'])) return;
    // lots of code
    if (isset($this->requestvars['quux'])) {
      // do something with quux
    }
    // more code
    return $someString;
  }
}

Aside from being hideous, this has a number of glaring problems:

  1. The error handling is terrible.
  2. Validation is hard, and thus incredibly easy to screw up or forget entirely.
  3. Even something that should be easy―merely figuring out how to call the method―requires reading and understanding the entire method.

This just won’t do.

Surely it would be much nicer if controller methods could simply be defined like a regular function. Fortunately, PHP offers some manner of reflection capabilities, meaning we can ask it what arguments a function takes. We can then match up GET/POST parameters by name, and send the function the proper arguments.

In other words, we can define the function more like:

class MyController extends Controller {
  function f($foo, $bar, $baz, $quux = null) {
    // lots of code
    if (isset($quux)) {
      // do something with quux
    }
    // more code
    return new Response\HTML($someString);
  }
}

And have it actually work. That’s much nicer. Now, we can call the method from PHP as easily as we call it from JavaScript, and we don’t have to read the entire function to figure out what arguments it takes.
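
As a rough illustration of the name-matching idea (not our actual dispatcher), a minimal sketch might look like the following. DemoController and callWithNamedParams() are hypothetical names invented for this example, and the $request array stands in for $_GET/$_POST.

```php
<?php

// Hypothetical controller, used only for illustration.
class DemoController {
    function greet($name, $greeting = 'Hello') {
        return "$greeting, $name!";
    }
}

// Match request parameters to a method's arguments by name.
// $request stands in for $_GET/$_POST (an assumption of this sketch).
function callWithNamedParams($controller, $method, array $request) {
    $ref = new ReflectionMethod($controller, $method);
    $args = [];
    foreach ($ref->getParameters() as $param) {
        $name = $param->getName();
        if (array_key_exists($name, $request)) {
            $args[] = $request[$name];
        } elseif ($param->isDefaultValueAvailable()) {
            // Optional argument that was not sent: fall back to its default.
            $args[] = $param->getDefaultValue();
        } else {
            throw new InvalidArgumentException("Missing required parameter: $name");
        }
    }
    return $ref->invokeArgs($controller, $args);
}

echo callWithNamedParams(new DemoController(), 'greet', ['name' => 'World']), "\n";
// prints "Hello, World!"
```

A missing required parameter becomes a single, uniform exception here, which is one way to replace the scattered `required()` checks from the old style.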

(The astute reader will also notice I’ve moved to returning an object, so the response has a type. This is super-handy, because now it’s easy to ensure we send the appropriate content-type, enabling the JS side to do more intelligent things with it.)

Of course, this only tells us which arguments it takes, and whether they’re optional or required. We still need easier data validation. PHP provides type hints, but they only work for classes. Or do they?

Type Hints

In a brazen display of potentially ill-advised hackery (our code is a little more involved, but that should give you the general idea), I added an error handler that enables us to define non-class types to validate things.

So now we can do this:

class MyController extends Controller {
  function f(
    int $foo,
    string $bar,
    string $baz,
    int $quux = null
  ) {
    // lots of code
    if (isset($quux)) {
      // do something with quux
    }
    // more code
    return new Response\HTML($someString);
  }
}

And all the machinery ensures that by the time f() is executing, $foo looks like an integer, as does $quux if it was provided.

Now the caller of the code can readily know what the value of the variables should look like, and the programmer of the function doesn’t really have an excuse for not picking a type because it’s so easy.

Of course, this isn’t sufficient yet either. For instance, if I’d like to be able to pass a date into the controller, it has to be a string. Then the writer of the controller has to convert it to an appropriate class. Surely it’d be much nicer if the author of the controller method could say “I want a DateTime object”, which would be automagically converted from a specially-formatted string sent by the client.

Type Conversion via Typehints

Because PHP provides references via the backtrace mechanism, we can modify the parameters a function was called with.

class MyController extends Controller {
  function f(
    int $foo,
    string $bar,
    DateTime $baz,
    int $quux = null
  ) {
    // lots of code
    if (isset($quux)) {
      // do something with quux
    }
    // more code
    return new Response\HTML($someString);
  }
}

So while $baz might be POSTed as baz=2014-08-16, what f() gets is a PHP DateTime object representing that date. Due to the implementation mechanism, even something as simple as:

$mycontroller->f(1, "bar", "2014-08-16");

Will result in $baz being a DateTime object inside f().
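
For readers who want to experiment with the idea without the error-handler trick, here is a minimal sketch that uses a different mechanism: it inspects the declared parameter types via reflection and converts the raw strings before the call. It assumes PHP 7.1 or later (where scalar types are built in), and schedule() and coerceArgs() are hypothetical names invented for this example. It is not how our code does it.

```php
<?php

// Hypothetical function standing in for a controller method.
function schedule(int $foo, string $bar, DateTime $baz, ?int $quux = null) {
    return [$foo, $bar, $baz->format('Y-m-d'), $quux];
}

// Convert raw (string) values to the declared parameter types.
// Only int and DateTime are handled; anything else passes through untouched.
function coerceArgs(ReflectionFunctionAbstract $fn, array $raw) {
    $out = [];
    foreach ($fn->getParameters() as $i => $param) {
        if (!array_key_exists($i, $raw)) {
            break; // leave remaining optional parameters to their defaults
        }
        $value = $raw[$i];
        $type = $param->getType();
        $typeName = $type instanceof ReflectionNamedType ? $type->getName() : null;
        if ($typeName === 'int' && is_string($value) && is_numeric($value)) {
            $value = (int) $value;
        } elseif ($typeName === 'DateTime' && is_string($value)) {
            $value = new DateTime($value);
        }
        $out[] = $value;
    }
    return $out;
}

$fn = new ReflectionFunction('schedule');
$result = $fn->invokeArgs(coerceArgs($fn, ['1', 'bar', '2014-08-16']));
echo json_encode($result), "\n";
// prints [1,"bar","2014-08-16",null]
```

Because the conversion happens before the call rather than inside an error handler, optional arguments are converted the same way as required ones in this sketch.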

Caveat

There is an unfortunate caveat, and I have yet to figure out if it’s a quirk of the way I implemented things, or a quirk in the way PHP is implemented, but optional arguments do not change. That is, SomeClass $var = null will result in $var still being a string. func_get_args() will contain the altered value, however.

Multiple Inheritance and Method Combinations

PHP is a single inheritance language. Traits add some ability to build mixins, which is super-handy, but they come with some annoying restrictions, particularly around calling methods: you can’t define a method in a trait, override it in a class which uses that trait, and then call the trait method from the class method. At least, not easily and generally.

Plus there’s no concept of method combinations. It’d be really handy to be able to say “hey, add this stuff to the return value” (e.g., by appending to an array) and have it just happen, rather than having to know how to combine your stuff with the parent method’s stuff.

While I’m sad to say I don’t have this working generally across any class, I have managed to get it working for a particular base class where it’s most useful to our codebase. Subclasses and traits can define certain methods, and when called, the class hierarchy will be automatically walked and the results of calling each method in the hierarchy will be combined.

trait BobsJams {
  static function BobsJams_getAdditionalJams() {
    return [ new CranberryJam(), new StrawberryJam() ];
  }
}

trait JimsJams {
  static function JimsJams_getAdditionalJams() {
    return [ new BlackberryJam() ];
  }
}

class Jams {
  function getJams() {
    return (new MethodCombinator([], 'array_merge'))
      ->execute(new ReflectionClass(get_called_class()), "getAdditionalJams");
  }
}

class FewJams extends Jams {
  static function getAdditionalJams() {
    return [ new PineappleJam() ];
  }
}

class LotsOJams extends FewJams {
  use BobsJams;
  use JimsJams;

  static function getAdditionalJams() {
    return [ new OrangeJam() ];
  }
}

(new LotsOJams())->getJams();
// => [ OrangeJam, CranberryJam, StrawberryJam, BlackberryJam, PineappleJam ]

(The somewhat annoying prefix on the traits’ method names is to avoid forcing users of a trait to deal with name collisions.)

Naturally, all the magic of the getJams() method is hidden away in the MethodCombinator class, but it just walks the class hierarchy―traits included―using the C3 Linearization algorithm, calls those methods, and then combines them all using the combinator function (in this case, array_merge).
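
To give a feel for what such a combinator does, here is a deliberately simplified sketch. It is an assumption-laden stand-in, not our MethodCombinator: it walks the parent chain from the most-derived class upward and visits each class’s directly used traits, rather than implementing C3 Linearization. SimpleCombinator is a hypothetical name, and the jams are plain strings to keep the example self-contained.

```php
<?php

// Simplified stand-in for the combinator described above. Assumption: a plain
// walk up the parent chain (most-derived first), NOT C3 Linearization.
class SimpleCombinator {
    private $initial;
    private $combine;

    function __construct($initial, callable $combine) {
        $this->initial = $initial;
        $this->combine = $combine;
    }

    function execute(ReflectionClass $class, $method) {
        $result = $this->initial;
        for ($c = $class; $c !== false; $c = $c->getParentClass()) {
            // Call the method only where it is declared, not where it is inherited.
            if ($c->hasMethod($method)
                && $c->getMethod($method)->getDeclaringClass()->getName() === $c->getName()) {
                $result = call_user_func($this->combine, $result, $c->getMethod($method)->invoke(null));
            }
            // Traits used directly by this class contribute TraitName_method.
            foreach ($c->getTraits() as $trait) {
                $prefixed = $trait->getShortName() . '_' . $method;
                if ($c->hasMethod($prefixed)) {
                    $result = call_user_func($this->combine, $result, $c->getMethod($prefixed)->invoke(null));
                }
            }
        }
        return $result;
    }
}

trait BobsJams {
    static function BobsJams_getAdditionalJams() { return ['Cranberry', 'Strawberry']; }
}

trait JimsJams {
    static function JimsJams_getAdditionalJams() { return ['Blackberry']; }
}

class Jams {
    function getJams() {
        return (new SimpleCombinator([], 'array_merge'))
            ->execute(new ReflectionClass(get_called_class()), 'getAdditionalJams');
    }
}

class FewJams extends Jams {
    static function getAdditionalJams() { return ['Pineapple']; }
}

class LotsOJams extends FewJams {
    use BobsJams;
    use JimsJams;

    static function getAdditionalJams() { return ['Orange']; }
}

echo json_encode((new LotsOJams())->getJams()), "\n";
// prints ["Orange","Cranberry","Strawberry","Blackberry","Pineapple"]
```

For a simple single-parent chain like this one, the plain walk produces the same order as the example above; the linearization only starts to matter once traits use other traits or the same trait shows up at several levels.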

This, as you might imagine, greatly simplifies some code.

Oh, but you’re not impressed by shoehorning some level of multiple inheritance into a singly-inherited language? Fine, how about…

Context-Sensitive Object Behavior

Web code tends to be live, while mobile code is harshly asynchronous (as in: it still needs to function when you have no signal, and then do something reasonable with data changes when you do have signal again). So what we care about differs between our Mobile API and our Web code, and yet we’d still like to share the basic structure of any given piece of data, so we don’t have to write things twice or keep twice as much in our heads.

Heavily inspired by Pascal Costanza’s Context-Oriented Programming, we define our data structures something like this:

class MyThing extends Struct {
  public $partA;
  public $userID;
  // ...
  function getAdditionalDefaultContextualComponents() {
    return [ new MyThingWebUI(), new MyThingMobileAPI() ];
  }
}

class MyThingWebUI extends Contextual {
  public $isReadOnly;
  // ...
  function getApplicableLayer() { return "WebUI"; }
}

class MyThingMobileAPI extends Contextual {
  public $partB;
  // ...
  function getApplicableLayer() { return "MobileAPI"; }
}

The two Contextual subclasses define things that are only available within particular contexts (layers). Thus, within the context of WebUI, MyThing appears from the outside to look like:

{
  "partA": "foo",
  "userID": 12,
  "isReadOnly": false
}

But within the Mobile API, that same $myThing object looks like:

{
  "partA": "foo",
  "userID": 12,
  "partB": "bar"
}

In addition to adding new properties, each layer can also exclude keys from JSON serialization, add aliases for keys (thus allowing mobiles to send/fetch data using old_key, when we rename something to new_key), and probably a few other things I’m forgetting.
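
A minimal sketch of the layering idea, under heavy assumptions: the Struct and Contextual base classes below are guesses written for this example (ours do considerably more), renderForLayer() is a hypothetical helper, and the property values live directly on the component objects purely to keep things short.

```php
<?php

// Hypothetical base classes; the names mirror the ones above, the bodies are guesses.
abstract class Contextual {
    abstract function getApplicableLayer();
}

abstract class Struct {
    function getAdditionalDefaultContextualComponents() { return []; }
}

class MyThingWebUI extends Contextual {
    public $isReadOnly = false;
    function getApplicableLayer() { return "WebUI"; }
}

class MyThingMobileAPI extends Contextual {
    public $partB = "bar";
    function getApplicableLayer() { return "MobileAPI"; }
}

class MyThing extends Struct {
    public $partA = "foo";
    public $userID = 12;
    function getAdditionalDefaultContextualComponents() {
        return [ new MyThingWebUI(), new MyThingMobileAPI() ];
    }
}

// Merge the struct's public properties with those of every component
// whose layer matches the active one. Called from outside any class,
// get_object_vars() only sees public properties.
function renderForLayer(Struct $struct, $layer) {
    $data = get_object_vars($struct);
    foreach ($struct->getAdditionalDefaultContextualComponents() as $component) {
        if ($component->getApplicableLayer() === $layer) {
            $data = array_merge($data, get_object_vars($component));
        }
    }
    return $data;
}

$thing = new MyThing();
echo json_encode(renderForLayer($thing, "WebUI")), "\n";
// prints {"partA":"foo","userID":12,"isReadOnly":false}
echo json_encode(renderForLayer($thing, "MobileAPI")), "\n";
// prints {"partA":"foo","userID":12,"partB":"bar"}
```

Key exclusion and aliasing are left out of this sketch; they would presumably be further per-layer hooks applied to the merged array before serialization.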

Conclusion

PHP is remarkably malleable. error_handlers can be used as a poor-man’s handler-bind (unlike exceptions, they run before the stack is unwound, but you’re stuck dispatching on regular expressions if you want more than one); scalar type hints can be provided as a library; and traits can be abused to provide a level of multiple inheritance well beyond what was intended. While this malleability is certainly handy, I miss writing code in a language that doesn’t require jumping through hoops to provide what feel like basic facilities. But I’m also incredibly glad I can draw from the well of ideas in Common Lisp and bring some of that into the lives of developers with less exposure to the fantastic facilities Lisp provides.

Bonus!

My employer is desperate for user feedback, and as such is offering a free eight week trial. So if you want to poke at stuff and mock me when things don’t work very well (my core areas are nutritional analysis for recipes and food-related search results), that’s a thing you can do.

If you're outside the US, I should warn you that we have a number of known bugs and shortcomings you're much more likely to hit (we use a US-based product database; searching for things outside ASCII doesn't work due to MySQL having columns marked as the wrong charset; and there's a lot of weirdness around time because most user times end up stored as unix timestamps). The latter two are bugs that will be fixed eventually, but since they're complicated and the US is our target market, they're not exactly at the top of the list.

The job continues to go well. Yay! I've even managed to pick up a few lessons.

It's not bad code, it's just different
As someone who has been historically quick to declare code crap, it's a refreshing change of pace to pick up on the attitude that maybe this code isn't crap, maybe I just don't understand it very well yet. And I'm trying to take that to heart: there's plenty of things in the code base that aren't the way I'd have written them, but that doesn't make the way they're written any less valid.
It's okay to go home
In spite of working at a start-up, all of my coworkers are older than I am, and nearly all of them have families. That means they go home; they understand being sick and prefer you stay home and not infect everyone else; they get that sometimes you need to call a plumber and will be working from home that day.
Bring up issues, then fix them
Being a lone developer for so long, I got used to "see thing I think is problem, fix thing I think is problem". I'm still trying to get the hang of when I can just do something and when I need to discuss it with other developers first.

Of course, my coworkers aren't perfect

Sometimes they worry about micro-optimizations

Is it better to use autoloading, or explicit requires? That leads to the question of which is faster, and regrettably, I fall into the same trap and respond in kind: "Well, X is faster because of Y." The real answer, of course, is that it's a micro-optimization which is entirely irrelevant to the speed of our application[1], and we should be worried about which is easier to maintain.

(NB: We've since switched entirely to autoloading for code maintenance reasons.)

Limited experience "at scale"
Nearly all the other developers worked together at a prior company building a C++ application where each installation served maybe a hundred users. Naturally, this affects their perception of how to structure a program and what things are efficient, so there's a lot of functions which do things like return a resultset instead of a data structure to avoid the extra loop. Much of the code passes around row ids, rather than objects or arrays of data, so the db gets queried for the same data a lot. (We are making progress on this front, but new features tend to take priority.)
Limited web experience
I've found and either closed, or built tools which help avoid, numerous classes of web-specific security vulnerabilities.

But overall, I have the distinct privilege of working with some intelligent people who are passionate about their work and have strong opinions formed through years of experience, yet who still maintain an openness to trying new ways of doing things (and that's no platitude: I've spearheaded at least half a dozen major changes thus far[2]).

But then, neither am I

Have yet to fully assimilate the formatting convention
The biggest annoyance to my coworkers is probably my habit of doing

if (condition) do_thing();

rather than

if (condition) {
  do_thing();
}

But it's certainly not the only formatting habit I haven't managed to rid myself of. And even without formatting differences, I tend to write code in a very different style: if you ever come across the words 'curry' or 'filter' in our codebase, there are very good odds I touched that code.
Or, really, any of the existing conventions
Lowercase file names were the convention, until I introduced a lib/ directory and started filling it with mixed-case filenames which matched the case of the class names.
Accidentally co-opted the name of the mobile API
The API used by the phones was referred to as "the mobile API", or often just "API" for short. I introduced a new convention for routing AJAX requests to controller methods, and called it the "web API", or usually just "API" for short. This caused enough confusion that the mobile API was renamed "services". I still wince a little whenever somebody says "services" because it's my fault.

Of course, liking work does have some downsides

Because work is fulfilling, it's much easier to spend time working on work, rather than taking the hit of context-switching to a side-project on the weekends. I sometimes manage to not work on weekends, but that's very different from working on something else. So the various projects I maintain, or have plans to write, make no progress.

Footnotes

  1. Our deployment servers have APC turned on and configured to not even bother checking the filesystem for updates. That's a bigger performance win than any quibbling over autoloading-vs-require_once we could do, and even that is a pretty negligible difference.
  2. Our most recent hire has begun spearheading a few, as well. Hooray for new hires not yet being numb to the pain points!

(Why yes, I am still alive. Just very bad at writing lately. :'()

Much to my surprise, the FTP protocol has managed to not die yet, and I recently received a patch from Rafael Jesús Alcántara Pérez to get cl-ftp running on ABCL.

So thanks to Rafael's patch, I've finally taken the time to convert cl-ftp from darcs to git, and tossed it up on GitHub like all the cool kids are doing these days. Supposedly it even runs on ABCL now.

Enjoy!
