Syntax Highlight Module

screen shot

UPDATE: Now that I'm playing around in the developer part of Drupal.org, I do see module projects similar to this. I'm going to look those over and possibly hop on board to help out.

I wrote a Syntax Highlight Drupal module that incorporates lessons learned from Highlight Test One and Highlight Test Two. This module is a filter that may be used with the "Filtered HTML" text format (or similar formats). It makes choosing either a client-side solution or a server-side solution easy for the Drupal administrator.

The client-side highlighter is SyntaxHighlighter; the server-side highlighter is GeSHi. The filter searches node content for code wrapped in "---" delimiter lines and replaces it with either GeSHi-generated markup or markup that SyntaxHighlighter picks up in the browser, depending on your text format configuration.

Simply wrap your code like this:

--- languagename
your
code
---

Optional non-code text comes afterwards. Multiple blocks of code (possibly in different languages) are legal.
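To make the delimiter syntax concrete, here is a minimal standalone sketch (plain PHP, outside Drupal) of how text in this format can be split into a language name and a code body. The function name sketch_wrap_code_blocks, the stricter same-line pattern, and the escaping step are assumptions made for illustration; the module's own pattern and output appear in the listing further down.

```php
<?php
// Hypothetical helper for illustration; not part of the module.
// Replaces every "--- lang ... ---" block with a <pre> element.
function sketch_wrap_code_blocks($text) {
  // m: ^ and $ match at line boundaries; s: "." also matches newlines.
  $pattern = '/^---[ \t]+(\S+)[ \t]*\n(.*?)\n---[ \t]*$/ms';
  return preg_replace_callback($pattern, function ($m) {
    $lang = htmlspecialchars(strtolower($m[1]));
    // Assumption: the code has not been HTML-escaped by an earlier filter.
    $code = htmlspecialchars($m[2]);
    return '<pre class="brush: ' . $lang . '">' . $code . '</pre>';
  }, $text);
}

$input = "Intro text.\n--- php\necho 1 < 2;\n---\nOutro text.";
echo sketch_wrap_code_blocks($input), "\n";
```

Text outside the delimiters passes through untouched, which is why several blocks in different languages can share one node.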

This "Syntax highlighting" filter must run after the "Limit allowed HTML tags" filter and before the "Convert line breaks into HTML" filter (as shown in the image above). The filter can be configured (by the admin or anyone with the "Administer text formats and filters" permission) to run either server-side (PHP) or client-side (JavaScript). This is a site-wide setting for the text format being administered, so it affects all content that uses that format (e.g. all "Filtered HTML" nodes).

DONE: Supply configuration options to specify the paths to your GeSHi and/or SyntaxHighlighter installs.

DONE: Use leading underscores for private methods.

TODO: Provide a list of available language names (supported languages). Don't call output functions on unknown languages. (DONE for client-side.)
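One possible way to cover the server-side half of that TODO is sketched below. It assumes GeSHi keeps one <language>.php file per language in its geshi subdirectory, and that "text" is an acceptable fallback name; the helper sketch_resolve_geshi_language is hypothetical and is not in the module.

```php
<?php
// Hypothetical helper for illustration; not part of the module.
// Returns $lang if a language file for it exists under
// "$geshi_dir/geshi", otherwise returns $default.
function sketch_resolve_geshi_language($lang, $geshi_dir, $default = 'text') {
  $lang = strtolower(trim($lang));
  // Reject names containing slashes or dots so that user input
  // cannot point outside the language directory.
  if (!preg_match('/^[a-z0-9_-]+$/', $lang)) {
    return $default;
  }
  return is_file("$geshi_dir/geshi/$lang.php") ? $lang : $default;
}
```

The server-side callback could pass its language name through a check like this before constructing the GeSHi object, mirroring the silent fallback to "plain" that the client side already does.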

Example Highlighting:

<?php

// Copyright (c) 2014 Jay Glascoe, GNU Public License
//
// http://www.gnu.org/licenses/gpl-3.0.txt
//

/**
 * Implements hook_menu().
 */
function syntax_highlight_menu() {
  $items['syntax_highlight'] = array(
    'title' => 'Syntax Highlight',
    'page callback' => '_syntax_highlight_information',
    'access callback' => TRUE,
  );
  return $items;
}

/**
 * Implements hook_help().
 */
function syntax_highlight_help($path, $arg) {
  switch ($path) {
    case 'admin/help#syntax_highlight':
      return _syntax_highlight_information();
  }
}

/**
 * Returns information about Syntax Highlight filter.
 */
function _syntax_highlight_information() {
  return t("<p>This filter marks up delimited code embedded in text using either SyntaxHighlighter (!syntax_highlighter) or GeSHi (!geshihighlighter).</p>
    <p>Your site administrator must install at least one of these libraries.</p>
    <p>Code to be marked up should be surrounded by lines beginning with three dashes.  The top delimiter should be something like \"--- javascript\", the bottom delimiter simply looks like \"---\":</p>
    <pre>--- languagename\nyour\ncode\n---</pre>
    <p>To use this filter, go to !link and configure an input format, or create a new one.</p>",
    array(
      '!link' => l(t('admin/config/content/formats'), 'admin/config/content/formats'),
      // l() treats a path without a scheme as internal, so use full URLs.
      '!syntax_highlighter' => l(t('alexgorbatchev.com/SyntaxHighlighter/'), 'http://alexgorbatchev.com/SyntaxHighlighter/'),
      '!geshihighlighter' => l(t('qbnz.com/highlighter/'), 'http://qbnz.com/highlighter/'),
    )
  );
}

/**
 * Implements hook_filter_info().
 */
function syntax_highlight_filter_info() {
  $filters = array();
  $filters['syntax_highlight'] = array(
    'title' => t('Syntax Highlight'),
    'description' => t('Embedded code delimited by triple dashes (---) is syntax highlighted by either the client or the server.  See the <a href="/admin/help/syntax_highlight">Syntax Highlight Help</a> page.'),
    'process callback' => '_syntax_highlight_filter',
    'settings callback' => '_syntax_highlight_settings',
    'tips callback' => '_syntax_highlight_tips',
    'default settings' => array(
      'clientorserver' => 'server',
      'geshidirectory' => 'includes',
      'shdirectory' => 'scripts/syntaxhighlighter',
    ),
  );
  return $filters;
}

/**
 * Settings callback for Syntax Highlight filter.
 */
function _syntax_highlight_settings($form, &$form_state, $filter, $format, $defaults, $filters) {
  $root = DRUPAL_ROOT;
  $elements = array();
  $elements['clientorserver'] = array(
    '#type' => 'select',
    '#title' => t('Client or Server'),
    '#options' => array(
      'client' => t('client-side (JavaScript)'),
      'server' => t('server-side (PHP)'),
    ),
    '#default_value' => isset($filter->settings['clientorserver']) ?
      $filter->settings['clientorserver'] : $defaults['clientorserver'],
  );
  $elements['shdirectory'] = array(
    '#type' => 'textfield',
    '#title' => t('Client-side: Location of SyntaxHighlighter install'),
    '#default_value' => isset($filter->settings['shdirectory']) ?
      $filter->settings['shdirectory'] : $defaults['shdirectory'],
    '#description' => t('Location of SyntaxHighlighter install relative to DRUPAL_ROOT (@root).  This should be a relative URL to the described directory.  The directory should have both scripts and styles as subdirectories.', array('@root' => $root)),
  );
  $elements['geshidirectory'] = array(
    '#type' => 'textfield',
    '#title' => t('Server-side: Location of GeSHi install'),
    '#default_value' => isset($filter->settings['geshidirectory']) ?
      $filter->settings['geshidirectory'] : $defaults['geshidirectory'],
    '#description' => t('Location of GeSHi install relative to DRUPAL_ROOT (@root).  A geshi.php file must be in this location as well as a geshi subdirectory.', array('@root' => $root)),
  );
  return $elements;
}

/**
 * Filter tips callback for Syntax Highlight filter.
 */
function _syntax_highlight_tips($filter, $format, $long = FALSE) {
  return t("Delimited code is syntax highlighted.  Delimit your code like this:<pre>--- javascript\nyour\ncode\n---</pre>");
}

/**
 * Syntax highlighting filter process callback.
 * The actual filtering work happens here and
 * inside HighlightProcess objects created here.
 */
function _syntax_highlight_filter($text, $filter, $format, $langcode, $cache, $cache_id) {
  // create an object.
  $highlight_process = new HighlightProcess($text, $filter);

  // and then let it do all the work.
  return $highlight_process->process();
}

class HighlightProcess {
  protected $_text;
  protected $_filter;
  protected $_geshi_dir;
  protected $_synhi_dir;
  protected $_languages_used;
  protected $_callback;
  protected $_highlights_made;

  public function __construct($text, $filter) {
    // keep the text and the filter settings; neither is mutated
    $this->_text = $text;
    $this->_filter = $filter;

    // _languages_used array will be
    // populated to look like this:
    // array(
    //   'shBrushPerl.js' => true,
    //   'shBrushJava.js' => true,
    // )
    $this->_languages_used = array();

    // keep track of whether any highlights
    // to code have been made
    $this->_highlights_made = false;

    // set _callback and _geshi_dir or _synhi_dir
    if ($this->_filter->settings['clientorserver'] == "client") {
      $this->_callback = array($this, '_client_side_callback');
      $set_synhi_dir = $this->_filter->settings['shdirectory'];
      $this->_synhi_dir = substr($set_synhi_dir, 0, 1) == "/" ?
        $set_synhi_dir : '/' . $set_synhi_dir;
    }
    else {
      $this->_callback = array($this, '_server_side_callback');
      $this->_geshi_dir = DRUPAL_ROOT . '/' . $this->_filter->settings['geshidirectory'];
    }
  }

  // preg_replace_callback() passes the matches by value
  protected function _client_side_callback($array) {
    $this->_highlights_made = true;
    return $this->_get_highlighter_output($array[1], $array[2]);
  }

  // preg_replace_callback() passes the matches by value
  protected function _server_side_callback($array) {
    return $this->_get_geshi_output($array[1], $array[2]);
  }

  public function process() {
    // we're looking out for all lines that begin with
    // three dashes (---).  the first text being
    // captured is the language name, the second
    // text captured is the code to be highlighted.
    $pattern = "/
      ^---\s+(\S+)\s*$
      (.*?)
      ^---\s*$
    /mxs";

    // we avoid a lot of unnecessary coding (loops,
    // if branches, regex matching, and error
    // handling) by using this higher-order
    // regular expression function and supplying
    // a suitable callback function.
    $result = preg_replace_callback($pattern, $this->_callback, $this->_text);

    // do whatever is left to do with result and return
    return $this->_post_handle_client_or_server($result);
  }

  protected function _post_handle_client_or_server(&$text) {
    if ($this->_filter->settings['clientorserver'] == "client") {
      if ($this->_highlights_made) {
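        // two separate statements; joining them with "." would
        // put a copy of the footer inside the header
        $header = $this->_get_client_side_header();
        $footer = $this->_get_client_side_footer();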
        return "$header$text$footer";
      }
      else {
        return $text;
      }
    }
    else {
      return $text;
    }
  }

  protected function _get_geshi_output(&$lang, &$code) {
    $code = trim($code);
    $lang = strtolower(trim($lang));
    include_once($this->_geshi_dir . '/geshi.php');
    $code = htmlspecialchars_decode($code);
    $geshi = new GeSHi($code, $lang);
    $geshi->enable_line_numbers(GESHI_FANCY_LINE_NUMBERS);
    $geshi->set_header_type(GESHI_HEADER_DIV);
    return $geshi->parse_code();
  }

  protected function _get_highlighter_output(&$lang, &$code) {
    $code = trim($code);
    $lang = strtolower(trim($lang));
    $is_unknown_language = !array_key_exists($lang, self::$_known_languages);
    if ($is_unknown_language) {
      $lang = self::$_default_language;
    }
    $this->_languages_used[self::$_known_languages[$lang]] = true;
    $output = "
      <pre class=\"brush: $lang\">$code</pre>
    ";
    return $output;
  }

  protected function _get_autoloader_brushes() {
    $brushes_rev = array();
    foreach (self::$_known_languages as $key => $value) {
      if (!array_key_exists($value, $brushes_rev)) {
        $brushes_rev[$value] = array( $key );
      }
      else {
        array_push($brushes_rev[$value], $key);
      }
    }
    $autoloader_brushes = array();
    foreach ($brushes_rev as $value => $keys) {
      $brush_dir = "$this->_synhi_dir/scripts/$value";
      array_push($keys, $brush_dir);
      array_push($autoloader_brushes, "[ '" . implode("', '", $keys) . "' ]");
    }
    return '[' . implode(', ', $autoloader_brushes) . ']';
  }

  protected function _get_client_side_header() {
    $jstext = file_get_contents(DRUPAL_ROOT . "$this->_synhi_dir/scripts/shCore.js");
    $brushes = array();
    foreach ($this->_languages_used as $key => $value) {
      array_push($brushes, "
        <script type=\"text/javascript\" src=\"$this->_synhi_dir/scripts/$key\"></script>
      ");
    }
    $brushes_string = implode("\n", $brushes);
    $output = "
      <script type=\"text/javascript\">
if (typeof(window._myfilegetcontents_shcorejs) === 'undefined') {
  window._myfilegetcontents_shcorejs = true;
  $jstext
  ;
}
      </script>
      <!-- BEGIN BRUSHES -->
      $brushes_string
      <!-- END BRUSHES -->
      <link href=\"$this->_synhi_dir/styles/shCore.css\" rel=\"stylesheet\" type=\"text/css\" />
      <link href=\"$this->_synhi_dir/styles/shThemeDefault.css\" rel=\"stylesheet\" type=\"text/css\" />
    ";
    return $output;
  }

  protected function _get_client_side_footer() {
    $output = "
      <script type=\"text/javascript\">
if (window._syntaxhighlighter_already_run !== true) {
  window._syntaxhighlighter_already_run = true;
  SyntaxHighlighter.config.stripBrs = true;
  SyntaxHighlighter.all();
}
      </script>
    ";
    return $output;
  }

  // this must be a key in the _known_languages array.
  protected static $_default_language = 'plain';

  protected static $_known_languages = array(
    'applescript' => 'shBrushAppleScript.js',
    'as3' => 'shBrushAS3.js',
    'bash' => 'shBrushBash.js',
    'shell' => 'shBrushBash.js',
    'coldfusion' => 'shBrushColdFusion.js',
    // some languages have multiple recognized names
    'cpp' => 'shBrushCpp.js',
    'c++' => 'shBrushCpp.js',
    'csharp' => 'shBrushCSharp.js',
    'c#' => 'shBrushCSharp.js',
    'css' => 'shBrushCss.js',
    'delphi' => 'shBrushDelphi.js',
    'diff' => 'shBrushDiff.js',
    'erlang' => 'shBrushErlang.js',
    'groovy' => 'shBrushGroovy.js',
    'java' => 'shBrushJava.js',
    'javafx' => 'shBrushJavaFX.js',
    'jscript' => 'shBrushJScript.js',
    'javascript' => 'shBrushJScript.js',
    'perl' => 'shBrushPerl.js',
    'php' => 'shBrushPhp.js',
    // if a user tries to use an unknown language,
    // it will (silently) be recognized as "plain"
    'plain' => 'shBrushPlain.js',
    'powershell' => 'shBrushPowerShell.js',
    'pshell' => 'shBrushPowerShell.js',
    'python' => 'shBrushPython.js',
    'ruby' => 'shBrushRuby.js',
    'sass' => 'shBrushSass.js',
    'scala' => 'shBrushScala.js',
    'sql' => 'shBrushSql.js',
    'vb' => 'shBrushVb.js',
    'vbasic' => 'shBrushVb.js',
    'visualbasic' => 'shBrushVb.js',
    'xml' => 'shBrushXml.js',
  );
}

?>

See the comments below where authenticated user jglascoe highlights some JavaScript code using
--- javascript
his code
---

Comments

jglascoe's picture

Example JavaScript (--- javascript) highlighting:

// factorial(4) => 24
function factorial(i)
{ 
  // tail-recursive helper
  var $fac_helper = function(i, acc)
  { 
    if (i < 2)
      return acc;
    else
      return $fac_helper(i - 1, acc * i);
  }
  return $fac_helper(i, 1);
}

Multiple code blocks are legal.

Example Python (--- python) highlighting:

# addall(n) => 1 + 2 + ... + n = (n^2 + n) / 2
def addall(n):
  def addall_helper(n, acc):
    if n < 1:
      return acc
    else:
      return addall_helper(n - 1, acc + n) 
  return addall_helper(n, 0)

print while (<>); # perl
jglascoe's picture

class Foo:
    def bar(self): 
        print("hello\n")