I've been on a long journey trying to find a great code highlighter, and I've used so many of them that I can't even remember them all. These are the ones I can remember right now: SyntaxHighlighter, Google Prettifier, highlighter.js, and Geshi.
Right now I'm using highlighter.js, but it isn't exactly what I want. I want to be able to highlight most "words" or reserved words, such as built-in functions, objects, etc., which this highlighter and most of the others miss. I know it's not an important thing, but it has been stuck in my head until now.
Finally, I've found Pygments, the one that matches what I've been looking for, and it's the same one used by GitHub. The only obstacle right now is that it's a Python-based syntax highlighter and I'm using WordPress, which is built on PHP.
But hey, we can get over it. There is a solution: first, we need to get Python installed on our server so we can use Pygments.
We aren't going to go too deep into installation, because there are so many OS flavors out there and the steps could be slightly different on each one of them.
First of all, check whether you already have Python installed by typing python on your command line.
If it is not installed, take a look at the Python Downloads page and download the installer for your OS.
According to its site, there are two ways to install pip:
The first and recommended way is to download get-pip.py and run it on your command line:
python get-pip.py
The second way is to use a package manager by running one of these two commands. As mentioned before, which one applies depends on your server OS.
sudo apt-get install python-pip
Or:
sudo yum install python-pip
NOTE: you can use any package manager you prefer, such as easy_install. I used pip for the sake of example and because it is the one used on the Pygments site.
To install pygments you need to run this command:
pip install Pygments
If you are on a server where your user doesn't have root access, you won't be able to install it with the previous command. In that case, run it with the --user flag to install the module in the user directory.
pip install --user Pygments
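Before moving on, it may be worth confirming that the module can actually be imported by the interpreter PHP will call. This is a minimal sketch using only the standard library; the helper name is_installed is made up for illustration and is not part of Pygments or pip.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be found by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "pygments" is the import name of the Pygments package
    print("Pygments available:", is_installed("pygments"))
```

If this prints False after a --user install, the python on the command line is probably not the same interpreter pip installed into.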
Everything is installed now, so what we have to do is work with PHP and some Python code.
The way it works is by executing a Python script from PHP using exec(), sending it the language name and the name of the file containing the code to be highlighted.
The first thing we are going to do is create the Python script that converts plain code into highlighted code using Pygments.
So let's go step by step through how to create the Python script.
First we import all the required modules:
import sys
from pygments import highlight
from pygments.formatters import HtmlFormatter
The sys module provides the argv list, which contains all the arguments passed to the Python script.
highlight from pygments is the main function; together with a lexer it generates the highlighted code. You will read a bit more about lexers below.
HtmlFormatter defines how we want the generated code to be formatted, and we are going to use the HTML format. Here is a list of available formatters in case you are wondering.
# Get the code
language = (sys.argv[1]).lower()
filename = sys.argv[2]
f = open(filename, 'rb')
code = f.read()
f.close()
This block takes the second argument (sys.argv[1]) and transforms it to lowercase, just to make sure it is always lowercase, because "php" !== "PHP". The third argument, sys.argv[2], is the path of the file containing the code, so we open it, read its contents and close it. The first argument, sys.argv[0], is the Python script's name.
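The argument handling described above can be tried in isolation. This is a small sketch, not the article's script: parse_args is a hypothetical helper, and it takes the argv list as a parameter so it can be exercised without a real command line.

```python
def parse_args(argv):
    """Split an argv-style list into (language, filename).

    argv[0] is the script name, argv[1] the language, argv[2] the file path.
    """
    if len(argv) != 3:
        raise ValueError("expected exactly two arguments: language and filename")
    language = argv[1].lower()  # normalize so "PHP" and "php" are treated alike
    filename = argv[2]
    return language, filename


if __name__ == "__main__":
    # Simulated command line: python build.py PHP /tmp/example.txt
    print(parse_args(["build.py", "PHP", "/tmp/example.txt"]))
```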
# Importing Lexers
# PHP
if language == 'php':
    from pygments.lexers import PhpLexer
    lexer = PhpLexer(startinline=True)
# GUESS
elif language == 'guess':
    from pygments.lexers import guess_lexer
    lexer = guess_lexer( code )
# GET BY NAME
else:
    from pygments.lexers import get_lexer_by_name
    lexer = get_lexer_by_name( language )
So it's time to import the lexer. This block creates a lexer depending on the language we need to analyze. A lexer analyzes our code and picks out each reserved word, symbol, built-in function, and so forth.
In this case, after the lexer analyzes all the code, the result is formatted into HTML, wrapping each "word" in an HTML element with a class. By the way, the class names are not descriptive at all, so a function does not get the class "function", but this is not something to worry about right now.
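To make the idea of a lexer feeding an HTML formatter more concrete, here is a deliberately tiny toy version built on the standard re module. It is not how Pygments is implemented; the token rules and the function name toy_highlight are invented for this demo, and the short class names only mimic the terse style of the real output.

```python
import html
import re

# Toy token rules: (short CSS class, regex). These rules are made up for the demo.
TOKEN_RULES = [
    ("s", r'"[^"]*"'),                      # double-quoted string
    ("nv", r"\$[A-Za-z_]\w*"),              # PHP-style variable
    ("k", r"\b(?:echo|return|if|else)\b"),  # a few reserved words
    ("nb", r"\b(?:strlen|count)\b"),        # a few built-in functions
]
MASTER = re.compile("|".join("(?P<%s>%s)" % (name, rx) for name, rx in TOKEN_RULES))


def toy_highlight(code):
    """Wrap every recognized token in a <span> with a short class name."""
    out = []
    pos = 0
    for match in MASTER.finditer(code):
        out.append(html.escape(code[pos:match.start()]))  # untouched text in between
        out.append('<span class="%s">%s</span>' % (match.lastgroup, html.escape(match.group())))
        pos = match.end()
    out.append(html.escape(code[pos:]))
    return "".join(out)


if __name__ == "__main__":
    print(toy_highlight('echo strlen($name);'))
```

Each recognized "word" ends up in its own span, which is what lets a stylesheet color reserved words and built-in functions differently.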
The variable language contains the name of the language we want to highlight. We use lexer = get_lexer_by_name( language ) to get any lexer by its name; the function is self-explanatory. But why do we check for php and guess first, you may ask? We check for php because if we use get_lexer_by_name('php') and the PHP code does not have the required opening PHP tag, the code is not highlighted well, or at least not as we expect. Instead we need to create the specific PHP lexer like this: lexer = PhpLexer(startinline=True). Passing startinline=True as a parameter means the opening PHP tag is not required anymore. guess is a string we pass from PHP to let Pygments know that we don't know which language it is, or that the language was not provided, and it needs to be guessed.
There is a list of available lexers on their site.
The final step in Python is creating the HTML formatter, performing the highlighting and outputting the HTML containing the highlighted code.
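formatter = HtmlFormatter(linenos=False, encoding='utf-8', nowrap=True)
highlighted = highlight(code, lexer, formatter)
# With encoding set, highlight() returns bytes; write them out unchanged (Python 3)
sys.stdout.buffer.write(highlighted)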
The formatter is given linenos=False so it does not generate line numbers and nowrap=True so it does not wrap the generated code in a div. This is a personal decision; the code will be wrapped using PHP.
Next, highlight is given code, containing the actual code, lexer, containing the language lexer, and the formatter we just created in the line above, which tells highlight how we want our code formatted.
Finally, the highlighted code is output.
That's about it for Python; that is the script that is going to build the highlighted code.
Here is the complete file: build.py
import sys
from pygments import highlight
from pygments.formatters import HtmlFormatter

# If there aren't exactly 2 args something weird is going on
expecting = 2
if len(sys.argv) != expecting + 1:
    sys.exit(128)

# Get the code
language = (sys.argv[1]).lower()
filename = sys.argv[2]
f = open(filename, 'rb')
code = f.read()
f.close()

# PHP
if language == 'php':
    from pygments.lexers import PhpLexer
    lexer = PhpLexer(startinline=True)
# GUESS
elif language == 'guess':
    from pygments.lexers import guess_lexer
    lexer = guess_lexer( code )
# GET BY NAME
else:
    from pygments.lexers import get_lexer_by_name
    lexer = get_lexer_by_name( language )

# OUTPUT
formatter = HtmlFormatter(linenos=False, encoding='utf-8', nowrap=True)
highlighted = highlight(code, lexer, formatter)
# With encoding set, highlight() returns bytes; write them out unchanged (Python 3)
sys.stdout.buffer.write(highlighted)
Let’s jump to WordPress and create a basic plugin to handle the code that needs to be highlighted.
It doesn't matter if you have never created a WordPress plugin in your entire life. This plugin is just a file with PHP functions in it, so you will be fine without WordPress plugin development knowledge, though you do need some knowledge of WordPress development.
Create a folder inside wp-content/plugins named wp-pygments (it can be whatever you want), copy build.py, the Python script we just created, into it, and create a new PHP file named wp-pygments.php (preferably the same name as the directory).
The code below just lets WordPress know the plugin's name and other information. This code goes at the top of wp-pygments.php.
<?php
/*
 * Plugin Name: WP Pygments
 * Plugin URI: http://wellingguzman.com/wp-pygments
 * Description: A brief description of the Plugin.
 * Version: 0.1
 * Author: Welling Guzman
 * Author URI: http://wellingguzman.com
 * License: MEH
 */
?>
Add a filter on the_content to look for <pre> tags. The code expected is:
<pre class="php">
<code>
$name = "World";
echo "Hello, " . $name;
</code>
</pre>
Where class is the language of the code inside the pre tags. If there is no class, or it is empty, guess is passed to build.py.
add_filter( 'the_content', 'mb_pygments_content_filter' );
function mb_pygments_content_filter( $content )
{
    $content = preg_replace_callback('/<pre(\s?class\="(.*?)")?[^>]?.*?>.*?<code>(.*?)<\/code>.*?<\/pre>/sim', 'mb_pygments_convert_code', $content);
    return $content;
}
The preg_replace_callback function executes the mb_pygments_convert_code callback every time there is a match in the content for the regex pattern provided: /<pre(\s?class\="(.*?)")?[^>]?.*?>.*?<code>(.*?)<\/code>.*?<\/pre>/sim. It should match any <pre><code> in a post or page content.
What about sim? These are three pattern modifier flags. From php.net:
s: If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines.
i: If this modifier is set, letters in the pattern match both upper and lower case letters.
m: By default, PCRE treats the subject string as consisting of a single "line" of characters (even if it actually contains several newlines). When this modifier is set, the "start of line" and "end of line" constructs match immediately after or immediately before any newline in the subject string.
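The pattern and its three modifiers can be tried outside WordPress. The sketch below ports the same regex to Python's re module, mapping s, i and m to re.DOTALL, re.IGNORECASE and re.MULTILINE; the function name find_code_blocks is hypothetical and only serves the demonstration.

```python
import re

# Same pattern as the PHP version, with the s, i and m modifiers mapped to
# re.DOTALL, re.IGNORECASE and re.MULTILINE.
PRE_CODE = re.compile(
    r'<pre(\s?class="(.*?)")?[^>]?.*?>.*?<code>(.*?)</code>.*?</pre>',
    re.DOTALL | re.IGNORECASE | re.MULTILINE,
)


def find_code_blocks(content):
    """Return (language, code) pairs; language is None when <pre> has no class."""
    return [(m.group(2), m.group(3).strip()) for m in PRE_CODE.finditer(content)]


if __name__ == "__main__":
    sample = '<p>Intro</p>\n<pre class="php">\n<code>\n$name = "World";\n</code>\n</pre>'
    print(find_code_blocks(sample))
```

Group 2 carries the language and group 3 the code, which are the same positions the PHP callback reads from its $matches array.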
This can be done with DOMDocument as well. Replace the preg_replace_callback() line in mb_pygments_content_filter() with this:
// This prevents libxml from throwing errors on imperfect HTML
libxml_use_internal_errors(true);
// Get all pre from post content
$dom = new DOMDocument();
$dom->loadHTML($content);
// Copy the live node list so replacing nodes does not disturb the loop
$pres = iterator_to_array($dom->getElementsByTagName('pre'));
foreach ($pres as $pre) {
    // getAttribute() returns an empty string when there is no class
    $class = $pre->getAttribute('class');
    $code = $pre->nodeValue;
    $args = array(
        2 => $class, // Element at position [2] is the class
        3 => $code   // And element at position [3] is the code
    );
    // convert the code
    $new_code = mb_pygments_convert_code($args);
    // Replace the actual pre with the new one.
    $new_pre = $dom->createDocumentFragment();
    $new_pre->appendXML($new_code);
    $pre->parentNode->replaceChild($new_pre, $pre);
}
// Save the HTML of the new code.
$content = $dom->saveHTML();
The code below is the mb_pygments_convert_code function.
define( 'MB_WPP_BASE', dirname(__FILE__) );
function mb_pygments_convert_code( $matches )
{
    $pygments_build = MB_WPP_BASE . '/build.py';
    $source_code    = isset($matches[3])?$matches[3]:'';
    $class_name     = isset($matches[2])?$matches[2]:'';
    // Creates a temporary filename
    $temp_file = tempnam(sys_get_temp_dir(), 'MB_Pygments_');
    // Populate temporary file
    $filehandle = fopen($temp_file, "wb");
    fwrite($filehandle, html_entity_decode($source_code, ENT_COMPAT, 'UTF-8') );
    fclose($filehandle);
    // Creates pygments command
    $language = $class_name?$class_name:'guess';
    $command  = sprintf('python %s %s %s', $pygments_build, $language, $temp_file);
    // Executes the command
    $retVal = -1;
    exec( $command, $output, $retVal );
    unlink($temp_file);
    // Returns Source Code
    // Wrap the result; this wrapper markup is an assumption, adjust it to your theme
    $format = '<div class="highlight highlight-%s"><pre><code>%s</code></pre></div>';
    if ( $retVal == 0 )
        $source_code = implode("\n", $output);
    $highlighted_code = sprintf($format, $language, $source_code);
    return $highlighted_code;
}
Reviewing the code above:
define( 'MB_WPP_BASE', dirname(__FILE__) );
This defines a constant holding the absolute path of the plugin's directory.
$pygments_build = MB_WPP_BASE . '/build.py';
$source_code = isset($matches[3])?$matches[3]:'';
$class_name = isset($matches[2])?$matches[2]:'';
$pygments_build is the full path where the Python script is located. Every time there is a match, an array called $matches is passed, containing 4 elements. Take this as an example of matched code from post/page content:
<pre class="php">
<code>
$name = "World";
echo "Hello, " . $name;
</code>
</pre>
The element at position [0] is the whole <pre> match, and its value is:
<pre class="php">
<code>
$name = "World";
echo "Hello, " . $name;
</code>
</pre>
The element at position [1] is the class attribute name with its value, and its value is:
class="php"
The element at position [2] is the class attribute value without its name, and its value is:
php
The element at position [3] is the code itself without its pre tags, and its value is:
$name = "World";
echo "Hello, " . $name;
// Creates a temporary filename
$temp_file = tempnam(sys_get_temp_dir(), 'MB_Pygments_');
This creates a temporary file to hold the code that will be passed to the Python script. It is a better way to hand over the code: passing the whole thing as a command-line parameter would be a mess.
// Populate temporary file
$filehandle = fopen($temp_file, "wb");
fwrite($filehandle, html_entity_decode($source_code, ENT_COMPAT, 'UTF-8') );
fclose($filehandle);
This writes the code to the file, decoding all the HTML entities first so Pygments can convert them properly.
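The same two steps, decoding entities and writing a temporary file, look like this with Python's standard library. It is only an illustrative counterpart to the PHP above; write_temp_source is a made-up name, and html.unescape is used here as a rough equivalent of html_entity_decode.

```python
import html
import os
import tempfile


def write_temp_source(source_code):
    """Decode HTML entities and write the result to a new temporary file.

    Returns the path; the caller is responsible for deleting the file.
    """
    decoded = html.unescape(source_code)
    fd, path = tempfile.mkstemp(prefix="MB_Pygments_")
    with os.fdopen(fd, "wb") as handle:
        handle.write(decoded.encode("utf-8"))
    return path


if __name__ == "__main__":
    path = write_temp_source("if ($a &lt; $b) { echo &quot;smaller&quot;; }")
    with open(path, "rb") as handle:
        print(handle.read().decode("utf-8"))
    os.remove(path)
```

Without the decoding step the highlighter would see &lt; instead of <, and would tokenize the entity rather than the operator.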
// Creates pygments command
$language = $class_name?$class_name:'guess';
$command = sprintf('python %s %s %s', $pygments_build, $language, $temp_file);
This creates the Python command to be used. It produces:
python /path/to/build.py php /path/to/temp.file
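Because the language comes from a class attribute in the post content, it is worth thinking about quoting when values are placed into a shell command. The sketch below shows the idea in Python with shlex.quote (in PHP, escapeshellarg() plays a similar role). The helper names build_command and as_shell_string are hypothetical, and this is not the article's code.

```python
import shlex
import sys


def build_command(script_path, language, temp_file):
    """Return the argument list for running the highlighter script."""
    return [sys.executable, script_path, language, temp_file]


def as_shell_string(args):
    """Render the argument list as one shell-safe string (POSIX quoting)."""
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    cmd = build_command("/path/to/build.py", "php", "/path/to/temp.file")
    # Skip the interpreter path, which differs from machine to machine
    print(as_shell_string(cmd[1:]))
```

Ordinary values pass through unchanged, while a value containing spaces or shell metacharacters is wrapped in single quotes and treated as one argument.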
// Executes the command
$retVal = -1;
exec( $command, $output, $retVal );
unlink($temp_file);
// Returns Source Code
// Wrap the result; this wrapper markup is an assumption, adjust it to your theme
$format = '<div class="highlight highlight-%s"><pre><code>%s</code></pre></div>';
if ( $retVal == 0 )
    $source_code = implode("\n", $output);
$highlighted_code = sprintf($format, $language, $source_code);
This executes the command just created. If it returns 0, everything worked fine in the Python script. exec() fills $output with an array of the lines output by the Python script, so we join that array into one string to be the source code. If not, we stick with the code without highlighting.
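The "use the output on exit code 0, otherwise keep the original" logic can be sketched in Python with subprocess. The function name run_or_fallback is invented for this example, and small inline scripts stand in for build.py so the snippet runs without Pygments.

```python
import subprocess
import sys


def run_or_fallback(command, fallback):
    """Run command; return its stdout on exit code 0, otherwise the fallback text."""
    result = subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if result.returncode != 0:
        return fallback
    lines = result.stdout.decode("utf-8").splitlines()
    return "\n".join(lines)  # roughly what implode("\n", $output) does in PHP


if __name__ == "__main__":
    ok = [sys.executable, "-c", "print('<span>hi</span>')"]
    bad = [sys.executable, "-c", "import sys; sys.exit(128)"]
    print(run_or_fallback(ok, "plain code"))
    print(run_or_fallback(bad, "plain code"))
```

The second call mimics the script exiting with 128 on a wrong argument count, so the unhighlighted text is returned instead.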
So by now it works fine, but we have to save time and processing. Imagine 100 <pre> tags in a post: it would create 100 files and call the Python script 100 times, so let's cache this baby.
<h3>Transient API</h3>
WordPress provides the ability to store data in the database temporarily with the Transient API.
First, let's add an action to the save_post hook, so every time the post is saved we convert the code and cache it.
add_action( 'save_post', 'mb_pygments_save_post' );
function mb_pygments_save_post( $post_id )
{
    if ( wp_is_post_revision( $post_id ) )
        return;
    $content = get_post_field( 'post_content', $post_id );
    mb_pygments_content_filter( $content );
}
Then, at the start of mb_pygments_content_filter(), add some lines to check whether there is a cached version of the post.
function mb_pygments_content_filter( $content )
{
    if ( FALSE !== ( $cached_post = get_post_cache() ) && !post_cache_needs_update() )
        return $cached_post['content'];
    clear_post_cache();
And at the end of mb_pygments_content_filter() add a line to save the post cache.
save_post_cache( $content );
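The helpers used above (get_post_cache(), post_cache_needs_update(), clear_post_cache() and save_post_cache()) are not shown in this excerpt, so the sketch below does not try to reproduce them. It only illustrates the general check-then-store pattern in Python, under the assumption that a cached entry is considered stale when a hash of the content changes; a plain dict stands in for the database.

```python
import hashlib

_cache = {}  # post_id -> {"hash": ..., "content": ...}; stands in for the database


def cached_highlight(post_id, content, highlighter):
    """Return highlighted content, re-running highlighter only when content changed."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    entry = _cache.get(post_id)
    if entry is not None and entry["hash"] == digest:
        return entry["content"]  # cache hit
    result = highlighter(content)  # the expensive step
    _cache[post_id] = {"hash": digest, "content": result}
    return result


if __name__ == "__main__":
    calls = []

    def fake_highlighter(text):
        calls.append(text)
        return text.upper()

    cached_highlight(1, "echo 1;", fake_highlighter)
    cached_highlight(1, "echo 1;", fake_highlighter)  # served from cache
    cached_highlight(1, "echo 2;", fake_highlighter)  # content changed, recomputed
    print(len(calls))
```

The expensive highlighter runs once per distinct version of a post, which is the saving the transient-based cache is after.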
Finally, when the plugin is uninstalled we need to remove all the cache we created. This is a bit tricky, so we use the $wpdb object to delete everything with a query.
register_uninstall_hook(__FILE__, 'mb_wp_pygments_uninstall');
function mb_wp_pygments_uninstall() {
    global $wpdb;
    // $wpdb->options resolves to the options table with the site's own prefix
    $wpdb->query( "DELETE FROM `{$wpdb->options}` WHERE option_name LIKE '_transient_post_%_content' " );
}
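To see which rows a LIKE pattern of this shape selects, it can be run against a throwaway table. The sketch below uses Python's built-in sqlite3 with an in-memory database as a stand-in for MySQL; the option names are sample data, and delete_matching is a hypothetical helper. In LIKE, % matches any sequence of characters and _ matches any single character.

```python
import sqlite3


def delete_matching(option_names, pattern):
    """Insert names into a throwaway table, delete rows matching pattern, return the rest."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE options (option_name TEXT)")
    conn.executemany("INSERT INTO options VALUES (?)", [(n,) for n in option_names])
    conn.execute("DELETE FROM options WHERE option_name LIKE ?", (pattern,))
    rest = [row[0] for row in conn.execute("SELECT option_name FROM options ORDER BY rowid")]
    conn.close()
    return rest


if __name__ == "__main__":
    names = ["siteurl", "_transient_post_12_content", "_transient_post_7_content", "blogname"]
    print(delete_matching(names, "_transient_post_%_content"))
```

Only the rows whose names fit the transient pattern are removed; unrelated options such as siteurl and blogname stay in place.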
Read the full article at: Pygments on PHP & WordPress

