
Generic Syntax Highlighting with Regular Expressions in pure PHP

Because of Google AMP (Accelerated Mobile Pages), I have been looking for an effective way to do syntax highlighting in pure PHP, without JavaScript.

I was about to write my own when I found an older article from phoboslab. Thanks, Dominic, for saving me some time ;) It's not perfect, but close enough.

His simple syntax highlighting class does just that. However, it does not work on current PHP versions, as it uses preg_replace() with the /e modifier, which was deprecated in PHP 5.5 and removed in PHP 7.0.
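As a rough sketch of what moving away from the /e modifier looks like in general: the old style evaluated the replacement string as PHP code, while preg_replace_callback() hands each match to a function. The helper name doubleNumbers() is made up for this illustration and is not part of the highlighter.

```php
<?php
// Old style, no longer available since PHP 7.0: the replacement was eval'd.
// $out = preg_replace('/\d+/e', '$0 * 2', $in);

// Hypothetical helper, for illustration only: doubles every integer in a string.
function doubleNumbers(string $in): string
{
    return preg_replace_callback(
        '/\d+/',
        function (array $m): string {
            // $m[0] holds the full match, $m[1..n] the capture groups
            return (string) ((int) $m[0] * 2);
        },
        $in
    );
}

echo doubleNumbers('3 apples and 10 pears'), "\n"; // 6 apples and 20 pears
```

The callback receives the match array instead of the $1-style placeholders, so no code is ever built from the input string.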

It will not cover every case, but it's better than nothing :) I will also add a section to my AMP tweaks article to showcase the integration of GeSHi.

Here is an updated version using the preg_replace_callback() function.

THE SYNTAX HIGHLIGHT CLASS

    class SyntaxHighlight {

        // Filled by the regexp callback: placeholder id => highlighted HTML
        static $tokens = array();

        public static function process($s) {
            // Start with an empty token list on every call
            self::$tokens = array();

            // ENT_COMPAT leaves single quotes unescaped on every PHP version
            // (since PHP 8.1 the default flags would turn ' into &#039;)
            $s = htmlspecialchars($s, ENT_COMPAT);

            // Workaround for escaped backslashes
            $s = str_replace('\\\\','\\\\<e>', $s);

            $regexp = array(

                // Punctuation
                '/([\-\!\%\^\*\(\)\+\|\~\=\`\{\}\[\]\:\"\'<>\?\,\.\/]+)/'
                => '<span class="P">$1</span>',

                // Numbers (also look for hex)
                '/(?<!\w)(
                    (0x|\#)[\da-f]+|
                    \d+|
                    \d+(px|em|cm|mm|rem|s|\%)
                )(?!\w)/ix'
                => '<span class="N">$1</span>',

                // Make the bold assumption that an
                // all uppercase word has a special meaning
                '/(?<!\w|>|\#)(
                    [A-Z_0-9]{2,}
                )(?!\w)/x'
                => '<span class="D">$1</span>',

                // Keywords
                '/(?<!\w|\$|\%|\@|>)(
                    and|or|xor|for|do|while|foreach|as|return|die|exit|if|then|else|
                    elseif|new|delete|try|throw|catch|finally|class|function|string|
                    array|object|resource|var|bool|boolean|int|integer|float|double|
                    real|string|array|global|const|static|public|private|protected|
                    published|extends|switch|true|false|null|void|this|self|struct|
                    char|signed|unsigned|short|long
                )(?!\w|=")/ix'
                => '<span class="K">$1</span>',

                // PHP/Perl-style vars: $var, %var, @var
                '/(?<!\w)(
                    (\$|\%|\@)(\->|\w)+
                )(?!\w)/ix'
                => '<span class="V">$1</span>'

            );

            // Pull comments and strings out first and replace them with ids.
            // The input is already HTML-escaped at this point, so an HTML
            // comment reads &lt;!-- ... --&gt; and a double quote reads &quot;
            $s = preg_replace_callback('/(
                    \/\*.*?\*\/|
                    \/\/.*?\n|
                    \#.[^a-fA-F0-9]+?\n|
                    &lt;!\-\-[\s\S]+?\-\-&gt;|
                    (?<!\\\)&quot;.*?(?<!\\\)&quot;|
                    (?<!\\\)\'(.*?)(?<!\\\)\'
                )/isx', array('SyntaxHighlight', 'replaceId'), $s);

            $s = preg_replace(array_keys($regexp), array_values($regexp), $s);

            // Paste the comments and strings back in again
            $s = str_replace(array_keys(self::$tokens), array_values(self::$tokens), $s);

            // Delete the "Escaped Backslash Workaround Token" (TM)
            // and replace tabs with four spaces.
            $s = str_replace(array('<e>', "\t"), array('', '    '), $s);

            return '<pre>'.$s.'</pre>';
        }

        // Regexp callback to replace every comment or string with a uniqid and
        // save the matched text in an array.
        // This way, strings and comments are stripped out and won't be processed
        // by the other expressions searching for keywords etc.
        static function replaceId($match) {
            $id = "##r" . uniqid() . "##";

            // String or comment? A comment starts with //, /*, # or,
            // after htmlspecialchars(), with &lt;!-- (7 characters).
            $start = substr($match[1], 0, 2);
            if ($start == '//' || $start == '/*' || $match[1][0] == '#' || substr($match[1], 0, 7) == '&lt;!--') {
                self::$tokens[$id] = '<span class="C">' . $match[1] . '</span>';
            } else {
                self::$tokens[$id] = '<span class="S">' . $match[1] . '</span>';
            }

            return $id;
        }
    }
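The placeholder trick from replaceId() can be looked at in isolation. The following is only a minimal sketch under a few assumptions: boldEchoOutsideStrings() is a made-up function, it protects double-quoted strings only, and it uses a simple counter for the ids where the class uses uniqid().

```php
<?php
// Hypothetical demo, not part of the class: wraps the word "echo" in <b> tags,
// but leaves anything inside double-quoted strings untouched.
function boldEchoOutsideStrings(string $code): string
{
    $tokens = array();

    // 1. Swap every double-quoted string for a placeholder id and remember it.
    $code = preg_replace_callback(
        '/"[^"]*"/',
        function (array $m) use (&$tokens): string {
            $id = '##r' . count($tokens) . '##'; // counter instead of uniqid()
            $tokens[$id] = $m[0];
            return $id;
        },
        $code
    );

    // 2. Run the highlighting rule; it can no longer see the string contents.
    $code = preg_replace('/\becho\b/', '<b>echo</b>', $code);

    // 3. Paste the original strings back in.
    return str_replace(array_keys($tokens), array_values($tokens), $code);
}

echo boldEchoOutsideStrings('echo "echo this";'), "\n"; // <b>echo</b> "echo this";
```

Without step 1, the rule in step 2 would also mark the "echo" inside the string, which is exactly what the token array in the class is there to prevent.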

THE CSS

    pre {
        /* generic family names such as monospace must not be quoted */
        font-family: 'Courier New', 'Bitstream Vera Sans Mono', monospace;
        font-size: 9pt;
        border-top: 1px solid #333;
        border-bottom: 1px solid #333;
        padding: 0.4em;
        color: #fff;
    }
    pre span.N { color: #f2c47f; } /* Numbers */
    pre span.S { color: #42ff00; } /* Strings */
    pre span.C { color: #838383; } /* Comments */
    pre span.K { color: #ff0078; } /* Keywords */
    pre span.V { color: #70d6ff; } /* Vars */
    pre span.D { color: #ff9a5d; } /* Defines */

USAGE

    echo SyntaxHighlight::process( $your_code );

@GitHub portalzine/UtilityBelt/SyntaxHighlight

Enjoy coding …

Alex

