Running with Code: Like with scissors, only more dangerous

13 Jan 08

A Reminder of Why We Like Object Orientation

I love Simple Machines Forum.  It’s a fantastic community software package, it’s open-source, it’s pretty fast, free, and it’s got boatloads of features.  It’s so great, in fact, that I’m basing one of my personal projects on it.

SMF is written in PHP (I’m running into too many acronyms already, and this is, after all, a post about OOP).  PHP is a scripting language with strong string handling, so it seems natural that SMF parses UBBC (Universal Bulletin Board Code; I’ll just call it BBCode) in PHP itself rather than handing it off to a library function.  (That’s a moot point anyway, since there isn’t a library function for it.)

BBCode is some of the coolest stuff around: it lets people format their posts without writing raw HTML, which is a good way to stop worrying about script attacks and the other evil things people can do to each other in a community environment.  It looks something like this:

[b][i]This is important[/i][/b] – [url=http://geekswithblogs.net/robp/]Check out my blog[/url] and my latest info.

If I look at what SMF produces, I’ll see the following HTML:

<b><i>This is important</i></b> – <a href="http://geekswithblogs.net/robp/" target="_blank">Check out my blog</a> and my latest info.

Anyway, I’ve been trying to come up with an interesting way to implement this in C#.  My latest idea was to keep a stack of the currently open BBCode tags and let nested tags combine their output.  For instance, I’d like the [b] tag to render as a <span> with the CSS font-weight: bold; value.  If an [i] tag is nested within it, I’d like a single <span> tag to carry both CSS values.
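Here is a minimal sketch of that stack idea, not a finished parser.  The class name BBCodeSketch, the Render method, and the tag-to-CSS table are all made up for illustration, and only [b], [i], and [u] are handled; anything else (including [url=...]) passes through as plain text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text;

public static class BBCodeSketch
{
    // Hypothetical tag-to-CSS table; only simple formatting tags are covered.
    private static readonly Dictionary<string, string> Css =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "b", "font-weight: bold;" },
            { "i", "font-style: italic;" },
            { "u", "text-decoration: underline;" },
        };

    public static string Render(string input)
    {
        var open = new Stack<string>();    // currently open tags, innermost on top
        var output = new StringBuilder();
        var text = new StringBuilder();    // text collected since the last tag change
        int pos = 0;

        while (pos < input.Length)
        {
            int close = input[pos] == '[' ? input.IndexOf(']', pos) : -1;
            if (close > pos)
            {
                string body = input.Substring(pos + 1, close - pos - 1);
                bool closing = body.StartsWith("/", StringComparison.Ordinal);
                string name = closing ? body.Substring(1) : body;

                // A closing tag only counts if it matches the innermost open tag.
                bool valid = Css.ContainsKey(name) &&
                    (!closing || (open.Count > 0 &&
                        string.Equals(open.Peek(), name, StringComparison.OrdinalIgnoreCase)));

                if (valid)
                {
                    Flush(open, text, output);   // emit text under the old set of tags
                    if (closing) open.Pop(); else open.Push(name);
                    pos = close + 1;
                    continue;
                }
            }
            text.Append(input[pos]);   // ordinary character or unrecognized tag
            pos++;
        }

        Flush(open, text, output);
        return output.ToString();
    }

    // Writes the pending text, wrapped in one <span> carrying every open tag's CSS.
    private static void Flush(Stack<string> open, StringBuilder text, StringBuilder output)
    {
        if (text.Length == 0) return;
        string encoded = WebUtility.HtmlEncode(text.ToString());
        if (open.Count == 0)
        {
            output.Append(encoded);
        }
        else
        {
            // A Stack enumerates from the top, so reverse it to list outer tags first.
            string style = string.Join(" ", open.Reverse().Select(t => Css[t]).Distinct());
            output.Append("<span style=\"").Append(style).Append("\">")
                  .Append(encoded).Append("</span>");
        }
        text.Clear();
    }
}
```

Calling BBCodeSketch.Render("[b][i]This is important[/i][/b]") should give a single <span style="font-weight: bold; font-style: italic;"> around the text.  The trade-off of flattening like this is that every change in the open tags starts a new <span>, so "[b]bold [i]both[/i][/b]" comes out as two sibling spans rather than nested elements.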

I was thinking that this might be kind of annoying to do: it would require a lot of classes, one for each tag.  I thought it might be cool to implement it this way for other reasons (configuration files, for example), but I was somewhat concerned.

I was also interested in taking a look at how the PHP works – particularly, I’d like to see the parameters and things like that.  My first guess was that it was in Sources/subs.php, and I was right.  They were kind enough to include a list of subroutines in the file in comments at the top – and sure enough, I found parse_bbc!

So I copied the function out into another file – it turned out the single function is 50 KB.  One function.

Looking further into the function, I found an 81-line comment block explaining, basically, how they used associative arrays to create a series of objects.  Here’s an example:

$codes = array(
    array(
        'tag' => 'abbr',
        'type' => 'unparsed_equals',
        'before' => '<abbr title="$1">',
        'after' => '</abbr>',
        'quoted' => 'optional',
        'disabled_after' => ' ($1)',
    ),
    array(
        'tag' => 'acronym',
        'type' => 'unparsed_equals',
        'before' => '<acronym title="$1">',
        'after' => '</acronym>',
        'quoted' => 'optional',
        'disabled_after' => ' ($1)',
    ),
    array(
        'tag' => 'anchor',
        'type' => 'unparsed_equals',
        'test' => '[#]?([A-Za-z][A-Za-z0-9_\-]*)\]',
        'before' => '<span id="post_$1" />',
        'after' => '',
    ),
    array(
        'tag' => 'b',
        'before' => '<b>',
        'after' => '</b>',
    ),
);

As you can see, each "object" has a series of properties.  Each object also has a type (the lack of a "type" attribute, as in the [b] entry, presumably indicates a default type).  I could do the same thing easily enough in C#, maybe even using object initializers: var item = new UnparsedEqualsBBCode { Tag = "abbr", Before = "<abbr title=\"$1\">", After = "</abbr>", Quoted = QuoteRequirements.Optional, DisabledAfter = " ($1)" };  Undoubtedly there are other ways to do this too.
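To make the comparison concrete, here is one way the typed version might look.  This is only a sketch: UnparsedEqualsBBCode, QuoteRequirements, and the property names come from the initializer above, while the BBCode base class, the Wrap method, the BBCodeTable holder, and the enum members other than Optional are my own assumptions.  The HTML encoding of the tag argument is also my choice, not something taken from SMF.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

// Only Optional appears in the PHP sample; None and Required are assumed.
public enum QuoteRequirements { None, Optional, Required }

// Hypothetical base type for a plain tag such as [b]: fixed HTML on each side.
public class BBCode
{
    public string Tag { get; set; }
    public string Before { get; set; }
    public string After { get; set; }

    public virtual string Wrap(string content, string argument = null)
    {
        return Before + content + After;
    }
}

// A tag of the form [abbr=value]...[/abbr]; $1 stands for the value.
public class UnparsedEqualsBBCode : BBCode
{
    public QuoteRequirements Quoted { get; set; }
    public string DisabledAfter { get; set; }

    public override string Wrap(string content, string argument = null)
    {
        if (argument == null)
            throw new ArgumentNullException(nameof(argument),
                "[" + Tag + "] needs a value after '='.");

        // Encode the value so it cannot break out of the HTML attribute.
        string value = WebUtility.HtmlEncode(argument);
        return Before.Replace("$1", value) + content + After.Replace("$1", value);
    }
}

public static class BBCodeTable
{
    // Two of the entries from the PHP array, rewritten as typed objects.
    public static readonly IReadOnlyList<BBCode> Codes = new List<BBCode>
    {
        new UnparsedEqualsBBCode
        {
            Tag = "abbr",
            Before = "<abbr title=\"$1\">",
            After = "</abbr>",
            Quoted = QuoteRequirements.Optional,
            DisabledAfter = " ($1)",
        },
        new BBCode { Tag = "b", Before = "<b>", After = "</b>" },
    };
}
```

With this, BBCodeTable.Codes[0].Wrap("SMF", "Simple Machines Forum") should return <abbr title="Simple Machines Forum">SMF</abbr>.  A misspelled property name or a Quoted value that doesn’t exist becomes a compile error instead of an array key that is silently ignored.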

It’s just scary.  Looking through the function tonight has reminded me why I like my object-oriented programming and type safety.  Adding items to this – or changing it – seems like a maintenance nightmare.
