As we previously announced, Wikipedia migrated its platform to HHVM, and Ori Livneh shared some interesting details of that experience in a Wikimedia blog post. Today we get more details on the migration from the HHVM side: Brett Simmers spent four weeks in July and August of 2014 working at the Wikimedia Foundation office in San Francisco to help with some final migration issues.
Most of that time was spent fixing HHVM’s DOMDocument support, since Wikimedia uses a custom PHP stream wrapper that HHVM’s implementation of DOMDocument didn’t properly support. Brett explained:
The main problem blocking its merge was a potential security issue: the default PHP settings allow XML external entity injection attacks, which we didn’t want to allow in HHVM.
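To make the risk concrete, here is a minimal sketch of what an XML external entity payload looks like and how a PHP caller can parse it defensively. It is not the fix that went into HHVM. The payload and the variable names are made up for illustration, and the exact behaviour depends on the PHP and libxml2 versions in use.

```php
<?php
// Hypothetical attacker-controlled input: the DTD declares an external
// entity that points at a local file, and the body references it.
$xml = <<<XML
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<note>&xxe;</note>
XML;

$doc = new DOMDocument();

// LIBXML_NOENT (which asks libxml to substitute entities) is deliberately
// NOT passed, and LIBXML_NONET forbids network access while parsing.
$doc->loadXML($xml, LIBXML_NONET);

// With entity substitution off, the file contents should not end up in
// the parsed text.
$text = $doc->documentElement->textContent;
var_dump(strpos($text, 'root:') === false);
```

On PHP versions before 8.0, code of that era typically also called `libxml_disable_entity_loader(true)` before parsing; that function is deprecated as of PHP 8.0.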
A good amount of time was also spent debugging an issue with MediaWiki’s Lua scripting extension. This extension is compatible with both PHP 5.x and HHVM thanks to HHVM’s Zend compatibility layer; if you’re thinking of switching to HHVM but have some native extensions you depend on, this is a great example to follow. After fixing all these issues, the team spent some time looking at HHVM’s performance on the MediaWiki workload.
It’s true that HHVM has already learned a lot from Facebook’s PHP code, which contains recurring idioms and design patterns, but it’s interesting to note that the team still found two easy wins:
the first of which was simply that the default Ubuntu PCRE package didn’t have the JIT enabled. We rebuilt PCRE with the JIT option enabled, which HHVM knows how to take advantage of, and saw an 8% improvement in the parsing benchmark.
The second win is related to PHP class destructors, and it yielded another 4% improvement on the wikitext parser benchmark:
PHP classes can have destructor functions that are run when instances of that class are destroyed, even if that only happens at the very end of the request because the object was kept alive by a global variable or a cyclic reference. Exactly matching PHP’s behavior here requires a small but measurable amount of extra bookkeeping at runtime, so HHVM has an option to not run destructors for objects that are alive at the end of a request.
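The destructor behaviour described above can be sketched in plain PHP. The `Tracked` class and its log are hypothetical, written only to show when `__destruct` fires; the sketch does not show HHVM's option itself, only the PHP semantics that the option relaxes.

```php
<?php
// Hypothetical class for illustration: records its name when destroyed.
class Tracked
{
    public static $log = array();
    public $peer = null;
    private $name;

    public function __construct($name)
    {
        $this->name = $name;
    }

    public function __destruct()
    {
        self::$log[] = $this->name;
    }
}

// 1. Last reference dropped: the destructor runs immediately.
$a = new Tracked('short-lived');
unset($a);

// 2. Cyclic reference: the two objects keep each other alive after unset(),
//    so their destructors wait for the cycle collector or request shutdown.
$b = new Tracked('cycle-b');
$c = new Tracked('cycle-c');
$b->peer = $c;
$c->peer = $b;
unset($b, $c);

// 3. Global variable: the object stays alive until the end of the request.
$GLOBALS['keeper'] = new Tracked('global');

// At this point only the first object has been destroyed.
print_r(Tracked::$log);
```

The objects in cases 2 and 3 are the ones whose destructors PHP still has to run at the very end of the request, which is the bookkeeping that HHVM's option lets you skip.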
Brett also pointed out that Wikipedia migrated from a “fairly old” PHP 5.3 version, while PHP 5.6 has some real performance improvements over 5.3. The graph below compares HHVM with PHP 5.3 and shows a gain of 45% with HHVM. It would be great to run some benchmarks against PHP 5.6 to see the difference:
You can also see the difference in CPU usage before and after moving to HHVM, which is amazing:
A great experience, and big thanks to the Wikipedia and HHVM guys for sharing these details with the community.