PHP Transpiler
Introduction
What now?
The world needs a serious PHP to PHP transpiler … period! So I started this one right here …
Why?
Because it’s 2016 and we all know readable, testable code beats fast code now. A nice introduction to the reasoning behind this viewpoint can be found here. Then again, if two pieces of code do exactly the same thing and the only reason one of them is slower is that it can be tested and understood nicely, why should the end user of the code have to suffer because of it? The end user surely doesn’t care about running the tests or reading the source code …
So if one were to find a way of transforming testable and easily readable code into faster, logically equivalent code, developers could continue doing what they’re doing … and end users of the code would get a boost in performance for free.
In compiled languages this kind of thing is done by the compiler to some degree. The readability aspect of things can easily be seen by anyone. Just take your fun little hello world for example:
- Put this into test.c:
#include <stdio.h>
int main(){
    printf("Hello World");
    return 0;
}
- Your compiler can turn that into assembly code first, giving you this mess:
.file "test.c"
.intel_syntax noprefix
.section .rodata
.LC0:
.string "Hello World"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
push rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
mov rbp, rsp
.cfi_def_cfa_register 6
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
mov eax, 0
pop rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Debian 5.3.1-4) 5.3.1 20151219"
.section .note.GNU-stack,"",@progbits
You can generate the above with, for example:
gcc -m64 -masm=intel -S test.c -o test.s
- Your assembler and linker then take this and turn it into something your CPU can understand:
gcc -m64 test.c -o test
This gives you your good old binary that you can execute. Its contents are optimized for machine readability and performance, but you would not want to be in the business of writing them from scratch. You can get a sense of the complexity in the compiled hello world executable using readelf.
readelf -a test
This shows you the following information about the file:
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x4003f0
Start of program headers: 64 (bytes into file)
Start of section headers: 4584 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 8
Size of section headers: 64 (bytes)
Number of section headers: 31
Section header string table index: 28
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .interp PROGBITS 0000000000400200 00000200
000000000000001c 0000000000000000 A 0 0 1
[ 2] .note.ABI-tag NOTE 000000000040021c 0000021c
0000000000000020 0000000000000000 A 0 0 4
[ 3] .note.gnu.build-i NOTE 000000000040023c 0000023c
0000000000000024 0000000000000000 A 0 0 4
[ 4] .gnu.hash GNU_HASH 0000000000400260 00000260
000000000000001c 0000000000000000 A 5 0 8
[ 5] .dynsym DYNSYM 0000000000400280 00000280
0000000000000060 0000000000000018 A 6 1 8
[ 6] .dynstr STRTAB 00000000004002e0 000002e0
000000000000003f 0000000000000000 A 0 0 1
[ 7] .gnu.version VERSYM 0000000000400320 00000320
0000000000000008 0000000000000002 A 5 0 2
[ 8] .gnu.version_r VERNEED 0000000000400328 00000328
0000000000000020 0000000000000000 A 6 1 8
[ 9] .rela.dyn RELA 0000000000400348 00000348
0000000000000018 0000000000000018 A 5 0 8
[10] .rela.plt RELA 0000000000400360 00000360
0000000000000030 0000000000000018 AI 5 24 8
[11] .init PROGBITS 0000000000400390 00000390
000000000000001a 0000000000000000 AX 0 0 4
[12] .plt PROGBITS 00000000004003b0 000003b0
0000000000000030 0000000000000010 AX 0 0 16
[13] .plt.got PROGBITS 00000000004003e0 000003e0
0000000000000008 0000000000000000 AX 0 0 8
[14] .text PROGBITS 00000000004003f0 000003f0
0000000000000182 0000000000000000 AX 0 0 16
[15] .fini PROGBITS 0000000000400574 00000574
0000000000000009 0000000000000000 AX 0 0 4
[16] .rodata PROGBITS 0000000000400580 00000580
0000000000000010 0000000000000000 A 0 0 4
[17] .eh_frame_hdr PROGBITS 0000000000400590 00000590
0000000000000034 0000000000000000 A 0 0 4
[18] .eh_frame PROGBITS 00000000004005c8 000005c8
00000000000000f4 0000000000000000 A 0 0 8
[19] .init_array INIT_ARRAY 00000000006006c0 000006c0
0000000000000008 0000000000000000 WA 0 0 8
[20] .fini_array FINI_ARRAY 00000000006006c8 000006c8
0000000000000008 0000000000000000 WA 0 0 8
[21] .jcr PROGBITS 00000000006006d0 000006d0
0000000000000008 0000000000000000 WA 0 0 8
[22] .dynamic DYNAMIC 00000000006006d8 000006d8
00000000000001d0 0000000000000010 WA 6 0 8
[23] .got PROGBITS 00000000006008a8 000008a8
0000000000000008 0000000000000008 WA 0 0 8
[24] .got.plt PROGBITS 00000000006008b0 000008b0
0000000000000028 0000000000000008 WA 0 0 8
[25] .data PROGBITS 00000000006008d8 000008d8
0000000000000010 0000000000000000 WA 0 0 8
[26] .bss NOBITS 00000000006008e8 000008e8
0000000000000008 0000000000000000 WA 0 0 1
[27] .comment PROGBITS 0000000000000000 000008e8
0000000000000025 0000000000000001 MS 0 0 1
[28] .shstrtab STRTAB 0000000000000000 000010db
000000000000010c 0000000000000000 0 0 1
[29] .symtab SYMTAB 0000000000000000 00000910
0000000000000600 0000000000000018 30 47 8
[30] .strtab STRTAB 0000000000000000 00000f10
00000000000001cb 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
There are no section groups in this file.
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000400040 0x0000000000400040
0x00000000000001c0 0x00000000000001c0 R E 8
INTERP 0x0000000000000200 0x0000000000400200 0x0000000000400200
0x000000000000001c 0x000000000000001c R 1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000006bc 0x00000000000006bc R E 200000
LOAD 0x00000000000006c0 0x00000000006006c0 0x00000000006006c0
0x0000000000000228 0x0000000000000230 RW 200000
DYNAMIC 0x00000000000006d8 0x00000000006006d8 0x00000000006006d8
0x00000000000001d0 0x00000000000001d0 RW 8
NOTE 0x000000000000021c 0x000000000040021c 0x000000000040021c
0x0000000000000044 0x0000000000000044 R 4
GNU_EH_FRAME 0x0000000000000590 0x0000000000400590 0x0000000000400590
0x0000000000000034 0x0000000000000034 R 4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 10
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame
03 .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag .note.gnu.build-id
06 .eh_frame_hdr
07
Dynamic section at offset 0x6d8 contains 24 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x000000000000000c (INIT) 0x400390
0x000000000000000d (FINI) 0x400574
0x0000000000000019 (INIT_ARRAY) 0x6006c0
0x000000000000001b (INIT_ARRAYSZ) 8 (bytes)
0x000000000000001a (FINI_ARRAY) 0x6006c8
0x000000000000001c (FINI_ARRAYSZ) 8 (bytes)
0x000000006ffffef5 (GNU_HASH) 0x400260
0x0000000000000005 (STRTAB) 0x4002e0
0x0000000000000006 (SYMTAB) 0x400280
0x000000000000000a (STRSZ) 63 (bytes)
0x000000000000000b (SYMENT) 24 (bytes)
0x0000000000000015 (DEBUG) 0x0
0x0000000000000003 (PLTGOT) 0x6008b0
0x0000000000000002 (PLTRELSZ) 48 (bytes)
0x0000000000000014 (PLTREL) RELA
0x0000000000000017 (JMPREL) 0x400360
0x0000000000000007 (RELA) 0x400348
0x0000000000000008 (RELASZ) 24 (bytes)
0x0000000000000009 (RELAENT) 24 (bytes)
0x000000006ffffffe (VERNEED) 0x400328
0x000000006fffffff (VERNEEDNUM) 1
0x000000006ffffff0 (VERSYM) 0x400320
0x0000000000000000 (NULL) 0x0
Relocation section '.rela.dyn' at offset 0x348 contains 1 entries:
Offset Info Type Sym. Value Sym. Name + Addend
0000006008a8 000300000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
Relocation section '.rela.plt' at offset 0x360 contains 2 entries:
Offset Info Type Sym. Value Sym. Name + Addend
0000006008c8 000100000007 R_X86_64_JUMP_SLO 0000000000000000 printf@GLIBC_2.2.5 + 0
0000006008d0 000200000007 R_X86_64_JUMP_SLO 0000000000000000 __libc_start_main@GLIBC_2.2.5 + 0
The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.
Symbol table '.dynsym' contains 4 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND printf@GLIBC_2.2.5 (2)
2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.2.5 (2)
3: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__
Symbol table '.symtab' contains 64 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000400200 0 SECTION LOCAL DEFAULT 1
2: 000000000040021c 0 SECTION LOCAL DEFAULT 2
3: 000000000040023c 0 SECTION LOCAL DEFAULT 3
4: 0000000000400260 0 SECTION LOCAL DEFAULT 4
5: 0000000000400280 0 SECTION LOCAL DEFAULT 5
6: 00000000004002e0 0 SECTION LOCAL DEFAULT 6
7: 0000000000400320 0 SECTION LOCAL DEFAULT 7
8: 0000000000400328 0 SECTION LOCAL DEFAULT 8
9: 0000000000400348 0 SECTION LOCAL DEFAULT 9
10: 0000000000400360 0 SECTION LOCAL DEFAULT 10
11: 0000000000400390 0 SECTION LOCAL DEFAULT 11
12: 00000000004003b0 0 SECTION LOCAL DEFAULT 12
13: 00000000004003e0 0 SECTION LOCAL DEFAULT 13
14: 00000000004003f0 0 SECTION LOCAL DEFAULT 14
15: 0000000000400574 0 SECTION LOCAL DEFAULT 15
16: 0000000000400580 0 SECTION LOCAL DEFAULT 16
17: 0000000000400590 0 SECTION LOCAL DEFAULT 17
18: 00000000004005c8 0 SECTION LOCAL DEFAULT 18
19: 00000000006006c0 0 SECTION LOCAL DEFAULT 19
20: 00000000006006c8 0 SECTION LOCAL DEFAULT 20
21: 00000000006006d0 0 SECTION LOCAL DEFAULT 21
22: 00000000006006d8 0 SECTION LOCAL DEFAULT 22
23: 00000000006008a8 0 SECTION LOCAL DEFAULT 23
24: 00000000006008b0 0 SECTION LOCAL DEFAULT 24
25: 00000000006008d8 0 SECTION LOCAL DEFAULT 25
26: 00000000006008e8 0 SECTION LOCAL DEFAULT 26
27: 0000000000000000 0 SECTION LOCAL DEFAULT 27
28: 0000000000000000 0 FILE LOCAL DEFAULT ABS crtstuff.c
29: 00000000006006d0 0 OBJECT LOCAL DEFAULT 21 __JCR_LIST__
30: 0000000000400420 0 FUNC LOCAL DEFAULT 14 deregister_tm_clones
31: 0000000000400460 0 FUNC LOCAL DEFAULT 14 register_tm_clones
32: 00000000004004a0 0 FUNC LOCAL DEFAULT 14 __do_global_dtors_aux
33: 00000000006008e8 1 OBJECT LOCAL DEFAULT 26 completed.6971
34: 00000000006006c8 0 OBJECT LOCAL DEFAULT 20 __do_global_dtors_aux_fin
35: 00000000004004c0 0 FUNC LOCAL DEFAULT 14 frame_dummy
36: 00000000006006c0 0 OBJECT LOCAL DEFAULT 19 __frame_dummy_init_array_
37: 0000000000000000 0 FILE LOCAL DEFAULT ABS test.c
38: 0000000000000000 0 FILE LOCAL DEFAULT ABS crtstuff.c
39: 00000000004006b8 0 OBJECT LOCAL DEFAULT 18 __FRAME_END__
40: 00000000006006d0 0 OBJECT LOCAL DEFAULT 21 __JCR_END__
41: 0000000000000000 0 FILE LOCAL DEFAULT ABS
42: 00000000006006c8 0 NOTYPE LOCAL DEFAULT 19 __init_array_end
43: 00000000006006d8 0 OBJECT LOCAL DEFAULT 22 _DYNAMIC
44: 00000000006006c0 0 NOTYPE LOCAL DEFAULT 19 __init_array_start
45: 0000000000400590 0 NOTYPE LOCAL DEFAULT 17 __GNU_EH_FRAME_HDR
46: 00000000006008b0 0 OBJECT LOCAL DEFAULT 24 _GLOBAL_OFFSET_TABLE_
47: 0000000000400570 2 FUNC GLOBAL DEFAULT 14 __libc_csu_fini
48: 00000000006008d8 0 NOTYPE WEAK DEFAULT 25 data_start
49: 00000000006008e8 0 NOTYPE GLOBAL DEFAULT 25 _edata
50: 0000000000400574 0 FUNC GLOBAL DEFAULT 15 _fini
51: 0000000000000000 0 FUNC GLOBAL DEFAULT UND printf@@GLIBC_2.2.5
52: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@@GLIBC_
53: 00000000006008d8 0 NOTYPE GLOBAL DEFAULT 25 __data_start
54: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__
55: 00000000006008e0 0 OBJECT GLOBAL HIDDEN 25 __dso_handle
56: 0000000000400580 4 OBJECT GLOBAL DEFAULT 16 _IO_stdin_used
57: 0000000000400500 101 FUNC GLOBAL DEFAULT 14 __libc_csu_init
58: 00000000006008f0 0 NOTYPE GLOBAL DEFAULT 26 _end
59: 00000000004003f0 42 FUNC GLOBAL DEFAULT 14 _start
60: 00000000006008e8 0 NOTYPE GLOBAL DEFAULT 26 __bss_start
61: 00000000004004e6 26 FUNC GLOBAL DEFAULT 14 main
62: 00000000006008e8 0 OBJECT GLOBAL HIDDEN 25 __TMC_END__
63: 0000000000400390 0 FUNC GLOBAL DEFAULT 11 _init
Version symbols section '.gnu.version' contains 4 entries:
Addr: 0000000000400320 Offset: 0x000320 Link: 5 (.dynsym)
000: 0 (*local*) 2 (GLIBC_2.2.5) 2 (GLIBC_2.2.5) 0 (*local*)
Version needs section '.gnu.version_r' contains 1 entries:
Addr: 0x0000000000400328 Offset: 0x000328 Link: 6 (.dynstr)
000000: Version: 1 File: libc.so.6 Cnt: 1
0x0010: Name: GLIBC_2.2.5 Flags: none Version: 2
Displaying notes found at file offset 0x0000021c with length 0x00000020:
Owner Data size Description
GNU 0x00000010 NT_GNU_ABI_TAG (ABI version tag)
OS: Linux, ABI: 2.6.32
Displaying notes found at file offset 0x0000023c with length 0x00000024:
Owner Data size Description
GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring)
Build ID: 854ac116e151f58ae6dc574453a7fa3498216a33
Much of this is stuff that you, as the developer of the hello world program, did not have to think about and that got optimized without your doing.
In PHP a very similar thing happens. The interpreter does not generate an executable that can be run directly, but it compiles your code into opcodes that are interpreted by PHP’s virtual machine, which eventually results in machine code instructions like those in the above output being executed.
The important difference between the compiler and the interpreter in this case, though, is that the compiler can take its time and optimize as best it can whatever code it gets. The PHP interpreter cannot do that to the same extent. An easy-to-understand example would be this code:
<?php
if(strlen($string) > 5)
vs:
<?php
if(isset($string[5]))
The latter is one of those generally known performance micro-optimizations in PHP. It is fairly easy to understand why the latter is faster than the former: the former computes the length of a string and then checks whether it is longer than 5 chars, while the latter simply looks whether the string has a char at index 5 (position 6) and, if so, concludes that the string must be longer than 5 chars. Arguably the former is easier to read, as it is clear to anyone what is going on; the latter needs some interpretation as to why we’re checking index 5 on a string, but is faster.
Now you could argue that PHP shouldn’t discriminate here and should simply handle both code snippets the same way, given their logical equivalence. Surely this would make the former run as fast as the latter: the guy implementing this in the interpreter would surely know that the latter runs faster and have PHP create the opcodes for it instead of for the strlen version. But even though it’s 2016 now … reality is not a picnic, and establishing logical equivalence obviously costs CPU cycles too. Sure, we do have opcode caches now and all that, but to a certain degree the interpreter still has to decide whether it will be cheaper to optimize code snippet X before compiling it into opcodes or whether it isn’t eventually just cheaper to simply run the code.
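To see that the two checks agree, here is a minimal sketch, assuming PHP 7 or later and assuming that $string really holds a string (for other types the two expressions are not interchangeable). The function names are made up for illustration:

```php
<?php
// Hypothetical helper names; both variants assume the argument is a string.
function longerThanFiveStrlen(string $string): bool
{
    return strlen($string) > 5;
}

function longerThanFiveIsset(string $string): bool
{
    // Index 5 is the sixth byte, so it only exists if the length is at least 6.
    return isset($string[5]);
}

// Compare both variants, including the boundary cases of 5 and 6 characters.
foreach (['', 'abcde', 'abcdef', 'abcdefg'] as $sample) {
    var_dump(longerThanFiveStrlen($sample) === longerThanFiveIsset($sample));
}
```

Each line of output reports whether both variants gave the same answer for one sample.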
If you think this example through, the interpreter would have to decide along these lines for example:
- I compare the output of strlen with 5.
- The output of strlen is only used for this comparison and not subject to a nested assignment.
- Hence I will turn the strlen call into an isset.
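The decision steps above can be sketched as a tiny source rewrite. This is an illustration only, not how the PHP-Transpiler or the PHP interpreter actually does it: the function name rewriteStrlenComparison is hypothetical, and a regular expression cannot verify the second step (that the result of strlen is used nowhere else) or that the variable is a string. A serious implementation would presumably work on the token stream or a syntax tree instead.

```php
<?php
// Hypothetical sketch: rewrite `strlen($var) > N` into `isset($var[N])`.
// Only a plain variable compared against an integer literal is matched,
// and the match is skipped when arithmetic follows the literal.
function rewriteStrlenComparison(string $code): string
{
    $pattern = '/\bstrlen\(\s*(\$[A-Za-z_]\w*)\s*\)\s*>\s*(\d+)\b(?!\s*[-+*\/%.])/';

    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            // $m[1] is the variable, $m[2] the integer it was compared with.
            return 'isset(' . $m[1] . '[' . $m[2] . '])';
        },
        $code
    );

    // preg_replace_callback returns null on a regex error; keep the input then.
    return $result ?? $code;
}

echo rewriteStrlenComparison('if (strlen($string) > 5) {}'), PHP_EOL;
```

The pattern is deliberately conservative: it leaves mb_strlen calls, >= comparisons and comparisons followed by arithmetic alone, but it will still happily rewrite text inside string literals or comments, which is one reason a regex is not enough for real code.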
In order for it to do this though, the interpreter would need to constantly keep looking ahead and/or behind when turning tokenized syntax into Op-Codes. Depending on the code in question this could be very expensive compared to simple execution of the code. In essence this is the reason interpreted languages are generally slower than compiled ones in terms of their CPU cycle use.
A strong transpiler added into the mix can, depending on the code in question, perform some of the optimizations the interpreter cannot do at runtime and lessen the negative performance impact of using an interpreter and of writing readable, testable code.
Additionally, PHP offers another angle a transpiler can work on: it’s just a way too trivial language. PHP probably is as easy as it gets in terms of just doing thing X and not having to worry about many details or conventions. Oftentimes, though, following a few rules you don’t have to follow will greatly improve the results you’re getting in terms of performance.
Hence the motivation behind the PHP-Transpiler is to remove testability and readability complications from code going into production, after they have served their purpose in development, as well as to correct design flaws that incur a performance penalty in any case.
Let’s look at two examples …
Example One: Including and Requiring Files
Once your PHP project has grown beyond entirely trivial size, you are likely going to use multiple files to hold your code. Oftentimes though these files will still all be loaded on every run of your script. Their presence is merely a means of convenience for you while coding. Loading those additional files does come with some performance drawbacks and can be optimized away by the PHP-Transpiler. An example of this in action would be the following code consisting of two files in the same folder:
parent.php:
<?php
$start = microtime(true);
for($i=0; $i < 1000000; $i++){
    include 'child.php';
}
echo 'Running me took ' . (microtime(true) - $start) . 's';
child.php:
<?php
'a' . 'b';
Let’s run this:
On my machine it’ll come out to 0.274s. Not too bad for 1M runs through that loop, but still, let’s see what the transpiler makes of this. I put the files for this in the folder ‘include’ and want my transpilation result to end up in ‘include-out’, so I run:
php-transpiler transpile include include-out
Now my parent.php looks like this:
<?php $start = microtime(true);for ($i = 0;$i < 1000000;$i++){'a'.'b';}echo 'Running me took ' . (microtime(true) - $start) . 's';
Not very readable … but now the whole thing takes 0.018s all of a sudden, a more than 10x speedup in execution! We paid a lot for having that include in this admittedly academic example. Think about your huge projects though: these things do add up in some cases.
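To give a rough idea of what such an inlining step could look like, here is a minimal sketch. It is not the PHP-Transpiler’s actual implementation: the function name inlineLiteralIncludes is hypothetical, and it only handles include statements with a single-quoted literal path, ignores include_once/require, and assumes the included file contains nothing but plain statements after its opening tag.

```php
<?php
// Hypothetical sketch: replace `include 'file.php';` with the file's body.
// $baseDir is the folder that relative include paths are resolved against.
function inlineLiteralIncludes(string $code, string $baseDir): string
{
    $pattern = '/\binclude\s+\'([^\']+)\'\s*;/';

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($baseDir): string {
            $path = $baseDir . '/' . $m[1];
            if (!is_file($path)) {
                return $m[0]; // leave includes we cannot resolve untouched
            }
            $child = (string) file_get_contents($path);
            // Drop the child's opening tag so its statements fit into the parent.
            return trim((string) preg_replace('/^\s*<\?php\s*/', '', $child));
        },
        $code
    );

    // Fall back to the unchanged input if the regex engine reported an error.
    return $result ?? $code;
}
```

With the two files from the example, calling inlineLiteralIncludes(file_get_contents('include/parent.php'), 'include') would be one way to try it. Anything beyond this simple case (return values of include, closing tags, files that declare functions or classes) needs real parsing rather than a regex.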
Example Two: Failing to Define Object Properties
An example of where sloppy coding can be optimized away automatically would be failing to define object properties that are eventually used at runtime.
<?php
class SloppyClass {
    public function __construct($a, $b){
        $this->a = $a;
        $this->b = $b;
    }
    public function concat(){
        return $this->a . $this->b;
    }
}
$start = microtime(true);
for($i=0; $i < 1000000; $i++){
    (new SloppyClass('a', 'b'))->concat();
}
echo 'Running me took ' . (microtime(true)- $start ) . 's';
This code will run, but your PHP bytecode compiler will not be able to optimize the hashtables it sets up for the sloppy class; instead it will have to set up a dynamic hashtable for $a and $b. At any rate, running this takes 0.508s on my machine.
Let’s run it through the transpiler :) The transpiler produces:
<?php class SloppyClass{public $b;public $a;public function __construct($a, $b){$this->a = $a;$this->b = $b;}public function concat(){return $this->a . $this->b;}}$start = microtime(true);for ($i = 0;$i < 1000000;$i++){(new SloppyClass('a', 'b'))->concat();}echo 'Running me took ' . (microtime(true) - $start) . 's';
Not all that nice looking, but let’s run it … 0.415s … just because you couldn’t be bothered to declare $a and $b you wasted almost a tenth of a second … The transpiler caught the issue though and fixed it.
When run, it also tells you about your mistake in the crude CLI output currently implemented.
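Spotting this kind of mistake can be sketched in a few lines. The following is an assumption-laden illustration, not the transpiler’s real detection logic: findUndeclaredProperties is a hypothetical name, and the regex approach ignores inheritance, traits, typed or comma-separated declarations and magic __set handlers.

```php
<?php
// Hypothetical sketch: list properties assigned via $this->name that the
// class body never declares.
function findUndeclaredProperties(string $classSource): array
{
    // Declarations such as `public $a;`, `private static $b = 1;`, `var $c;`.
    preg_match_all(
        '/\b(?:public|protected|private|var)\s+(?:static\s+)?\$(\w+)/',
        $classSource,
        $declared
    );
    // Assignments such as `$this->a = ...`, but not comparisons like `==`.
    preg_match_all('/\$this->(\w+)\s*=(?!=)/', $classSource, $assigned);

    return array_values(array_diff(array_unique($assigned[1]), $declared[1]));
}

$source = <<<'PHP'
class SloppyClass {
    public function __construct($a, $b){
        $this->a = $a;
        $this->b = $b;
    }
}
PHP;

echo implode(', ', findUndeclaredProperties($source)), PHP_EOL;
```

A tool could then use such a list to emit a warning and to add the missing declarations, as in the transpiled output above.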
That’s It for Now
Hope this was somewhat interesting for the time being, let’s see how far I can take this project :)