Introduction

What now?

The world needs a serious PHP to PHP transpiler … period! So I started this one right here.

Why ???

Because it’s 2016 and we all know readable, testable code beats fast code now. A nice introduction to the reasoning behind this viewpoint can be found here. Then again, if two pieces of code do exactly the same thing, and the only reason one of them is slower is so that it can be tested and understood nicely, why should the end user of the code have to suffer because of it? The end user surely doesn’t care about running the tests or reading the source code …

So if one were to find a way of transforming testable and easily readable code into faster, logically equivalent code, developers could continue doing what they’re doing … and end users of the code would get a boost in performance for free.

In compiled languages this kind of thing is done by the compiler to some degree. The readability aspect of things can easily be seen by anyone. Just take your fun little hello world for example:

  1. Put this into test.c:
#include <stdio.h>

int main(){
	printf("Hello World");
	return 0;
}
  2. Your compiler turns that into assembly code first, giving you this mess:
	.file	"test.c"
	.intel_syntax noprefix
	.section	.rodata
.LC0:
	.string	"Hello World"
	.text
	.globl	main
	.type	main, @function
main:
.LFB0:
	.cfi_startproc
	push	rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	mov	rbp, rsp
	.cfi_def_cfa_register 6
	mov	edi, OFFSET FLAT:.LC0
	mov	eax, 0
	call	printf
	mov	eax, 0
	pop	rbp
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE0:
	.size	main, .-main
	.ident	"GCC: (Debian 5.3.1-4) 5.3.1 20151219"
	.section	.note.GNU-stack,"",@progbits

You can generate the above with, for example:

gcc -m64 -masm=intel -S test.c -o test.s
  3. Your assembler and linker then take this and turn it into something your CPU can understand:
gcc -m64 test.c -o test

This gives you your good old binary that you can execute. Its contents are optimized for machine readability and performance, but you would not want to be in the business of writing them from scratch. You can get a sense of the complexity in the compiled hello world executable using readelf.

readelf -a test

This shows you the following information about the file:

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x4003f0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          4584 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         8
  Size of section headers:           64 (bytes)
  Number of section headers:         31
  Section header string table index: 28

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .interp           PROGBITS         0000000000400200  00000200
       000000000000001c  0000000000000000   A       0     0     1
  [ 2] .note.ABI-tag     NOTE             000000000040021c  0000021c
       0000000000000020  0000000000000000   A       0     0     4
  [ 3] .note.gnu.build-i NOTE             000000000040023c  0000023c
       0000000000000024  0000000000000000   A       0     0     4
  [ 4] .gnu.hash         GNU_HASH         0000000000400260  00000260
       000000000000001c  0000000000000000   A       5     0     8
  [ 5] .dynsym           DYNSYM           0000000000400280  00000280
       0000000000000060  0000000000000018   A       6     1     8
  [ 6] .dynstr           STRTAB           00000000004002e0  000002e0
       000000000000003f  0000000000000000   A       0     0     1
  [ 7] .gnu.version      VERSYM           0000000000400320  00000320
       0000000000000008  0000000000000002   A       5     0     2
  [ 8] .gnu.version_r    VERNEED          0000000000400328  00000328
       0000000000000020  0000000000000000   A       6     1     8
  [ 9] .rela.dyn         RELA             0000000000400348  00000348
       0000000000000018  0000000000000018   A       5     0     8
  [10] .rela.plt         RELA             0000000000400360  00000360
       0000000000000030  0000000000000018  AI       5    24     8
  [11] .init             PROGBITS         0000000000400390  00000390
       000000000000001a  0000000000000000  AX       0     0     4
  [12] .plt              PROGBITS         00000000004003b0  000003b0
       0000000000000030  0000000000000010  AX       0     0     16
  [13] .plt.got          PROGBITS         00000000004003e0  000003e0
       0000000000000008  0000000000000000  AX       0     0     8
  [14] .text             PROGBITS         00000000004003f0  000003f0
       0000000000000182  0000000000000000  AX       0     0     16
  [15] .fini             PROGBITS         0000000000400574  00000574
       0000000000000009  0000000000000000  AX       0     0     4
  [16] .rodata           PROGBITS         0000000000400580  00000580
       0000000000000010  0000000000000000   A       0     0     4
  [17] .eh_frame_hdr     PROGBITS         0000000000400590  00000590
       0000000000000034  0000000000000000   A       0     0     4
  [18] .eh_frame         PROGBITS         00000000004005c8  000005c8
       00000000000000f4  0000000000000000   A       0     0     8
  [19] .init_array       INIT_ARRAY       00000000006006c0  000006c0
       0000000000000008  0000000000000000  WA       0     0     8
  [20] .fini_array       FINI_ARRAY       00000000006006c8  000006c8
       0000000000000008  0000000000000000  WA       0     0     8
  [21] .jcr              PROGBITS         00000000006006d0  000006d0
       0000000000000008  0000000000000000  WA       0     0     8
  [22] .dynamic          DYNAMIC          00000000006006d8  000006d8
       00000000000001d0  0000000000000010  WA       6     0     8
  [23] .got              PROGBITS         00000000006008a8  000008a8
       0000000000000008  0000000000000008  WA       0     0     8
  [24] .got.plt          PROGBITS         00000000006008b0  000008b0
       0000000000000028  0000000000000008  WA       0     0     8
  [25] .data             PROGBITS         00000000006008d8  000008d8
       0000000000000010  0000000000000000  WA       0     0     8
  [26] .bss              NOBITS           00000000006008e8  000008e8
       0000000000000008  0000000000000000  WA       0     0     1
  [27] .comment          PROGBITS         0000000000000000  000008e8
       0000000000000025  0000000000000001  MS       0     0     1
  [28] .shstrtab         STRTAB           0000000000000000  000010db
       000000000000010c  0000000000000000           0     0     1
  [29] .symtab           SYMTAB           0000000000000000  00000910
       0000000000000600  0000000000000018          30    47     8
  [30] .strtab           STRTAB           0000000000000000  00000f10
       00000000000001cb  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

There are no section groups in this file.

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x00000000000001c0 0x00000000000001c0  R E    8
  INTERP         0x0000000000000200 0x0000000000400200 0x0000000000400200
                 0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x00000000000006bc 0x00000000000006bc  R E    200000
  LOAD           0x00000000000006c0 0x00000000006006c0 0x00000000006006c0
                 0x0000000000000228 0x0000000000000230  RW     200000
  DYNAMIC        0x00000000000006d8 0x00000000006006d8 0x00000000006006d8
                 0x00000000000001d0 0x00000000000001d0  RW     8
  NOTE           0x000000000000021c 0x000000000040021c 0x000000000040021c
                 0x0000000000000044 0x0000000000000044  R      4
  GNU_EH_FRAME   0x0000000000000590 0x0000000000400590 0x0000000000400590
                 0x0000000000000034 0x0000000000000034  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     10

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .interp 
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame 
   03     .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss 
   04     .dynamic 
   05     .note.ABI-tag .note.gnu.build-id 
   06     .eh_frame_hdr 
   07     

Dynamic section at offset 0x6d8 contains 24 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000c (INIT)               0x400390
 0x000000000000000d (FINI)               0x400574
 0x0000000000000019 (INIT_ARRAY)         0x6006c0
 0x000000000000001b (INIT_ARRAYSZ)       8 (bytes)
 0x000000000000001a (FINI_ARRAY)         0x6006c8
 0x000000000000001c (FINI_ARRAYSZ)       8 (bytes)
 0x000000006ffffef5 (GNU_HASH)           0x400260
 0x0000000000000005 (STRTAB)             0x4002e0
 0x0000000000000006 (SYMTAB)             0x400280
 0x000000000000000a (STRSZ)              63 (bytes)
 0x000000000000000b (SYMENT)             24 (bytes)
 0x0000000000000015 (DEBUG)              0x0
 0x0000000000000003 (PLTGOT)             0x6008b0
 0x0000000000000002 (PLTRELSZ)           48 (bytes)
 0x0000000000000014 (PLTREL)             RELA
 0x0000000000000017 (JMPREL)             0x400360
 0x0000000000000007 (RELA)               0x400348
 0x0000000000000008 (RELASZ)             24 (bytes)
 0x0000000000000009 (RELAENT)            24 (bytes)
 0x000000006ffffffe (VERNEED)            0x400328
 0x000000006fffffff (VERNEEDNUM)         1
 0x000000006ffffff0 (VERSYM)             0x400320
 0x0000000000000000 (NULL)               0x0

Relocation section '.rela.dyn' at offset 0x348 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
0000006008a8  000300000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0

Relocation section '.rela.plt' at offset 0x360 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
0000006008c8  000100000007 R_X86_64_JUMP_SLO 0000000000000000 printf@GLIBC_2.2.5 + 0
0000006008d0  000200000007 R_X86_64_JUMP_SLO 0000000000000000 __libc_start_main@GLIBC_2.2.5 + 0

The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.

Symbol table '.dynsym' contains 4 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND printf@GLIBC_2.2.5 (2)
     2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@GLIBC_2.2.5 (2)
     3: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__

Symbol table '.symtab' contains 64 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000400200     0 SECTION LOCAL  DEFAULT    1 
     2: 000000000040021c     0 SECTION LOCAL  DEFAULT    2 
     3: 000000000040023c     0 SECTION LOCAL  DEFAULT    3 
     4: 0000000000400260     0 SECTION LOCAL  DEFAULT    4 
     5: 0000000000400280     0 SECTION LOCAL  DEFAULT    5 
     6: 00000000004002e0     0 SECTION LOCAL  DEFAULT    6 
     7: 0000000000400320     0 SECTION LOCAL  DEFAULT    7 
     8: 0000000000400328     0 SECTION LOCAL  DEFAULT    8 
     9: 0000000000400348     0 SECTION LOCAL  DEFAULT    9 
    10: 0000000000400360     0 SECTION LOCAL  DEFAULT   10 
    11: 0000000000400390     0 SECTION LOCAL  DEFAULT   11 
    12: 00000000004003b0     0 SECTION LOCAL  DEFAULT   12 
    13: 00000000004003e0     0 SECTION LOCAL  DEFAULT   13 
    14: 00000000004003f0     0 SECTION LOCAL  DEFAULT   14 
    15: 0000000000400574     0 SECTION LOCAL  DEFAULT   15 
    16: 0000000000400580     0 SECTION LOCAL  DEFAULT   16 
    17: 0000000000400590     0 SECTION LOCAL  DEFAULT   17 
    18: 00000000004005c8     0 SECTION LOCAL  DEFAULT   18 
    19: 00000000006006c0     0 SECTION LOCAL  DEFAULT   19 
    20: 00000000006006c8     0 SECTION LOCAL  DEFAULT   20 
    21: 00000000006006d0     0 SECTION LOCAL  DEFAULT   21 
    22: 00000000006006d8     0 SECTION LOCAL  DEFAULT   22 
    23: 00000000006008a8     0 SECTION LOCAL  DEFAULT   23 
    24: 00000000006008b0     0 SECTION LOCAL  DEFAULT   24 
    25: 00000000006008d8     0 SECTION LOCAL  DEFAULT   25 
    26: 00000000006008e8     0 SECTION LOCAL  DEFAULT   26 
    27: 0000000000000000     0 SECTION LOCAL  DEFAULT   27 
    28: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
    29: 00000000006006d0     0 OBJECT  LOCAL  DEFAULT   21 __JCR_LIST__
    30: 0000000000400420     0 FUNC    LOCAL  DEFAULT   14 deregister_tm_clones
    31: 0000000000400460     0 FUNC    LOCAL  DEFAULT   14 register_tm_clones
    32: 00000000004004a0     0 FUNC    LOCAL  DEFAULT   14 __do_global_dtors_aux
    33: 00000000006008e8     1 OBJECT  LOCAL  DEFAULT   26 completed.6971
    34: 00000000006006c8     0 OBJECT  LOCAL  DEFAULT   20 __do_global_dtors_aux_fin
    35: 00000000004004c0     0 FUNC    LOCAL  DEFAULT   14 frame_dummy
    36: 00000000006006c0     0 OBJECT  LOCAL  DEFAULT   19 __frame_dummy_init_array_
    37: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS test.c
    38: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
    39: 00000000004006b8     0 OBJECT  LOCAL  DEFAULT   18 __FRAME_END__
    40: 00000000006006d0     0 OBJECT  LOCAL  DEFAULT   21 __JCR_END__
    41: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS 
    42: 00000000006006c8     0 NOTYPE  LOCAL  DEFAULT   19 __init_array_end
    43: 00000000006006d8     0 OBJECT  LOCAL  DEFAULT   22 _DYNAMIC
    44: 00000000006006c0     0 NOTYPE  LOCAL  DEFAULT   19 __init_array_start
    45: 0000000000400590     0 NOTYPE  LOCAL  DEFAULT   17 __GNU_EH_FRAME_HDR
    46: 00000000006008b0     0 OBJECT  LOCAL  DEFAULT   24 _GLOBAL_OFFSET_TABLE_
    47: 0000000000400570     2 FUNC    GLOBAL DEFAULT   14 __libc_csu_fini
    48: 00000000006008d8     0 NOTYPE  WEAK   DEFAULT   25 data_start
    49: 00000000006008e8     0 NOTYPE  GLOBAL DEFAULT   25 _edata
    50: 0000000000400574     0 FUNC    GLOBAL DEFAULT   15 _fini
    51: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND printf@@GLIBC_2.2.5
    52: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@@GLIBC_
    53: 00000000006008d8     0 NOTYPE  GLOBAL DEFAULT   25 __data_start
    54: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
    55: 00000000006008e0     0 OBJECT  GLOBAL HIDDEN    25 __dso_handle
    56: 0000000000400580     4 OBJECT  GLOBAL DEFAULT   16 _IO_stdin_used
    57: 0000000000400500   101 FUNC    GLOBAL DEFAULT   14 __libc_csu_init
    58: 00000000006008f0     0 NOTYPE  GLOBAL DEFAULT   26 _end
    59: 00000000004003f0    42 FUNC    GLOBAL DEFAULT   14 _start
    60: 00000000006008e8     0 NOTYPE  GLOBAL DEFAULT   26 __bss_start
    61: 00000000004004e6    26 FUNC    GLOBAL DEFAULT   14 main
    62: 00000000006008e8     0 OBJECT  GLOBAL HIDDEN    25 __TMC_END__
    63: 0000000000400390     0 FUNC    GLOBAL DEFAULT   11 _init

Version symbols section '.gnu.version' contains 4 entries:
 Addr: 0000000000400320  Offset: 0x000320  Link: 5 (.dynsym)
  000:   0 (*local*)       2 (GLIBC_2.2.5)   2 (GLIBC_2.2.5)   0 (*local*)    

Version needs section '.gnu.version_r' contains 1 entries:
 Addr: 0x0000000000400328  Offset: 0x000328  Link: 6 (.dynstr)
  000000: Version: 1  File: libc.so.6  Cnt: 1
  0x0010:   Name: GLIBC_2.2.5  Flags: none  Version: 2

Displaying notes found at file offset 0x0000021c with length 0x00000020:
  Owner                 Data size	Description
  GNU                  0x00000010	NT_GNU_ABI_TAG (ABI version tag)
    OS: Linux, ABI: 2.6.32

Displaying notes found at file offset 0x0000023c with length 0x00000024:
  Owner                 Data size	Description
  GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 854ac116e151f58ae6dc574453a7fa3498216a33

Much of this you, as the developer of the hello world program, did not have to think about; it got generated and optimized without any effort on your part.

In PHP a very similar thing happens. The interpreter does not generate an executable that can be run directly, but it compiles your code into opcodes, which PHP's virtual machine then interprets and which eventually result in machine code instructions like those in the above output being executed.

The important difference between the compiler and the interpreter in this case, though, is that the compiler can take its time and optimize as best it can from whatever code it gets. The PHP interpreter cannot do that to the same extent. An easy to understand example would be this code:

<?php

if(strlen($string) > 5)

vs:

<?php

if(isset($string[5]))

The latter is one of those generally known performance micro-optimizations in PHP. It is fairly easy to understand why the latter is faster than the former: the former computes the length of the string and then checks whether it is longer than 5 chars, while the latter simply checks whether the string has a char at index 5 (position 6) and from that concludes that the string must be longer than 5 chars. Arguably the former is easier to read, as it is clear to anyone what is going on; the latter needs some interpretation as to why we’re checking index 5 on a string, but it is faster.
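To make the equivalence concrete, here is a minimal sketch that runs both checks against a few strings around the 5-char boundary. The helper names are made up for illustration, and the equivalence only holds under the assumption that the value really is a string (for arrays or undefined variables the two checks behave differently).

```php
<?php
// Hypothetical helper names, used only for this illustration.
// Both functions answer the question "is $string longer than 5 bytes?".

function longerThanFiveStrlen(string $string): bool
{
    // Readable version: compute the length, then compare it.
    return strlen($string) > 5;
}

function longerThanFiveIsset(string $string): bool
{
    // Micro-optimized version: a byte at offset 5 can only exist
    // if the string holds at least 6 bytes.
    return isset($string[5]);
}

foreach (['', 'abcde', 'abcdef', 'abcdefg'] as $candidate) {
    // Both checks should agree for every string input.
    var_dump(longerThanFiveStrlen($candidate) === longerThanFiveIsset($candidate));
}
```

Both versions count bytes, not characters, so they also agree on multibyte input.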

Now you could argue that PHP shouldn’t discriminate here and should simply handle both code snippets the same way, given their logical equivalence. Surely this would make the former run as fast as the latter: whoever implements this in the interpreter would know that the latter runs faster and would have PHP create the opcodes for it instead of for the strlen version. But even though it’s 2016 now … reality is no picnic, and establishing logical equivalence obviously costs CPU cycles too. Sure, we do have opcode caches now and all that, but to a certain degree the interpreter still has to decide whether it is cheaper to optimize code snippet X before compiling it into opcodes or whether it is just cheaper to simply run the code.

If you think this example through, the interpreter would have to decide along these lines for example:

  1. I compare the output of strlen with 5.
  2. The output of strlen is only used for this comparison and not subject to a nested assignment.
  3. Hence I will turn the strlen call into an isset.
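The decision steps above can be sketched as a tiny source-to-source rewrite. This is a deliberately naive, text-level illustration and not how the PHP-Transpiler is implemented: the function name rewriteStrlenComparison is hypothetical, the pattern only recognizes a direct comparison of strlen($var) against an integer literal, and it assumes the variable is known to hold a string. A real tool would work on the parsed syntax tree rather than on raw text.

```php
<?php
// Naive sketch: rewrite "strlen($var) > N" into "isset($var[N])".
// rewriteStrlenComparison() is a hypothetical name for illustration only.

function rewriteStrlenComparison(string $code): string
{
    // Step 1: find strlen($var) that is compared directly with an integer literal.
    // Step 2: a nested assignment such as ($n = strlen($var)) > 5 does not match,
    //         because a closing parenthesis sits between the call and the ">".
    // Step 3: replace the whole comparison with an isset() on offset N.
    return preg_replace(
        '/\bstrlen\(\s*(\$[A-Za-z_][A-Za-z0-9_]*)\s*\)\s*>\s*(\d+)/',
        'isset($1[$2])',
        $code
    );
}

echo rewriteStrlenComparison('if (strlen($string) > 5) { echo "long"; }');
```

Even this toy version has to inspect the text around the strlen call, which hints at why doing the same analysis on every request is unattractive for the interpreter.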

In order to do this, though, the interpreter would need to constantly keep looking ahead and/or behind when turning tokenized syntax into opcodes. Depending on the code in question, this could be very expensive compared to simply executing the code. In essence this is the reason interpreted languages are generally slower than compiled ones in terms of their CPU cycle use.

Adding a strong transpiler into the mix can, depending on the code in question, do some of the optimizations the interpreter cannot do at runtime and lessen the negative performance impact of interpreter use and writing readable, testable code.

Additionally, PHP offers another angle a transpiler can work on: it’s just a way too trivial language. PHP is probably as easy as it gets in terms of just doing thing X and not having to worry about many details or conventions. Oftentimes, though, following a few rules you don’t strictly have to follow will greatly improve the results you’re getting in terms of performance.

Hence the motivation behind the PHP-Transpiler is to remove testability and readability complications from code going into production, after they have served their purpose in development, as well as to correct design flaws that incur a performance penalty in any case.

Let’s look at two examples …

Example One: Including and Requiring Files

Once your PHP project has grown beyond entirely trivial size, you are likely going to use multiple files to hold your code. Oftentimes though these files will still all be loaded on every run of your script. Their presence is merely a means of convenience for you while coding. Loading those additional files does come with some performance drawbacks and can be optimized away by the PHP-Transpiler. An example of this in action would be the following code consisting of two files in the same folder:

parent.php:

<?php
$start = microtime(true);
for($i=0; $i < 1000000; $i++){
	include 'child.php';
}

echo 'Running me took ' . (microtime(true) - $start) . 's';

child.php:

<?php
'a' . 'b';

Let’s run this.

On my machine it comes out to 0.274s. Not too bad for 1M runs through that loop, but still, let’s see what the transpiler makes of this. I put the files for this in the folder ‘include’ and want my transpilation result to end up in ‘include-out’, so I run:

php-transpiler transpile include include-out

Now my parent.php looks like this:

<?php $start = microtime(true);for ($i = 0;$i < 1000000;$i++){'a'.'b';}echo 'Running me took ' . (microtime(true) - $start) . 's';

Not very readable … but now the whole thing takes 0.018s all of a sudden. A more than 10x speedup in execution! We paid a lot for having that include in this admittedly academic example. Think about your huge projects though: these things do add up in some cases.
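The idea behind this transformation can be sketched in a few lines. The following is only an illustration under simplifying assumptions, not the PHP-Transpiler's actual implementation: inlineIncludes is a hypothetical function, the files are passed in as an in-memory map instead of being read from disk, and only literal single-quoted include statements are handled. Included files that return a value or declare functions would need far more care.

```php
<?php
// Hypothetical sketch: splice the body of included files into the parent source.
// $files maps a file name to its PHP source; nothing is read from disk.

function inlineIncludes(string $code, array $files): string
{
    return preg_replace_callback(
        "/\\binclude\\s+'([^']+)'\\s*;/",
        function (array $match) use ($files): string {
            if (!isset($files[$match[1]])) {
                // Unknown file: leave the include statement untouched.
                return $match[0];
            }
            // Drop the opening tag of the included file before splicing it in.
            return trim(preg_replace('/^<\?php\s*/', '', $files[$match[1]]));
        },
        $code
    );
}

$files = ['child.php' => "<?php\n'a' . 'b';\n"];
$parent = "for (\$i = 0; \$i < 3; \$i++) {\n\tinclude 'child.php';\n}\n";

// Prints the loop with the include replaced by the child's statement.
echo inlineIncludes($parent, $files);
```

Inlining works here because an included file runs in the scope of the including line, so pasting its statements in place keeps the behavior for simple cases like this one.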

Example Two: Failing to Define Object Properties

An example of where sloppy coding can be optimized away automatically would be failing to define object properties that are eventually used at runtime.

<?php

class SloppyClass {

	public function __construct($a, $b){
		$this->a = $a;
		$this->b = $b;
	}

	public function concat(){
		return $this->a . $this->b;
	}
}

$start = microtime(true);
for($i=0; $i < 1000000; $i++){
	(new SloppyClass('a', 'b'))->concat();
}

echo 'Running me took ' . (microtime(true)- $start ) . 's';

This code will run, but your PHP bytecode compiler will not be able to optimize the hashtables it sets up for the sloppy class, because $a and $b are never declared; instead it will have to set up a dynamic hashtable for them at runtime. At any rate, running this takes 0.508s on my machine.

Let’s run it through the transpiler :) The transpiler produces:

<?php class SloppyClass{public $b;public $a;public function __construct($a, $b){$this->a = $a;$this->b = $b;}public function concat(){return $this->a . $this->b;}}$start = microtime(true);for ($i = 0;$i < 1000000;$i++){(new SloppyClass('a', 'b'))->concat();}echo 'Running me took ' . (microtime(true) - $start) . 's';

Not all that nice looking, but let’s run it … 0.415s … just because you couldn’t be bothered to declare $a and $b, you wasted that tenth of a second … The transpiler caught the issue, though, and fixed it.

When you run it, the transpiler also tells you about your mistake in the crude CLI output currently implemented.

That’s it for Now

Hope this was somewhat interesting for the time being, let’s see how far I can take this project :)