Ode is simple! (Simple means that you know how it works.)

Hello, and welcome to news.ode-is-simple.com.

This is a weblog dedicated to Ode (ode-is-simple.com) and other topics relevant to the project.

If you're looking for general info about Ode you may want to start at the project homepage at ode.simple.com/home

To stay up to date with the newest news and info related to Ode, subscribe to this site's RSS 2.0 using Google Reader or your preferred feed reader.

Posts

Fri, 24 Sep 2010

Introducing: M_rkd_wn

Before we get to M_rkd_wn, let's discuss Markdown and the Markdown addin on which it is based.

Download the M_rkd_wn addin (.zip)

Markdown

Ode has always supported the Markdown syntax via the Markdown addin, which is included (and enabled by default) with Ode itself. That addin is no more than a very thin wrapper around the original Markdown Perl script (markdown.pl).

All I did was include the necessary Ode interface bits to work Markdown conversion into Ode.

For anyone who might be interested, here's what that looks like:

The access_title_tags_and_body_early interface allows addins to look at and modify the content of each post by providing separate references to the post's title, tags, and body.

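sub access_title_tags_and_body_early
{
  # $body_ref_l is a reference to the post body (argument handling is omitted here).
  # Replace the body with its Markdown-converted version.
  $$body_ref_l = Markdown($$body_ref_l);

  # Return a true value so the caller knows everything went as expected.
  1;
}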

The Markdown addin simply passes the body of the post (as a reference) to the Markdown routine, which handles the Markdown conversion. Not so hard to understand, right? The actual routine isn't quite this simple, but everything else is just overhead. This is really all the Markdown addin is doing.

(That 1; before the closing curly brace lets the thing that calls the routine, Ode in this case, know that everything went as expected.)

I really like Markdown and I think it does just about everything we need it to do. Sure it doesn't provide a syntax for tables, definition lists, footnotes, abbreviations, and other sorts of things, but I'm not convinced that there is a substantial benefit to expanding the syntax to do any of that.

Here is what Markdown's creator has to say about it:

Markdown is not a replacement for HTML, or even close to it. Its syntax is very small, corresponding only to a very small subset of HTML tags. The idea is not to create a syntax that makes it easier to insert HTML tags. In my opinion, HTML tags are already easy to insert. The idea for Markdown is to make it easy to read, write, and edit prose. HTML is a publishing format; Markdown is a writing format. Thus, Markdown’s formatting syntax only addresses issues that can be conveyed in plain text.

That little blurb makes a lot of sense to me. Fortunately, Markdown can be used seamlessly with HTML, so what markdown doesn't do isn't impossible - and it's not any more difficult than it would be without Markdown.

I think we should appreciate that HTML really is a pretty good markup language itself, and the fact that it's so prevalent (much more so than Markdown or any of the other simplified markup syntaxes) has its advantages when it comes to doing fancier/more structural stuff.

Having said that, and for what it's worth, between Ode's themes which separate the structure of a page from post content, and given the subset of HTML that Markdown does support, I rarely find myself writing in a way that requires me to include a lot of explicit HTML.

There are a few very specific things about Markdown that bother me after using Markdown for years.

Here they are:

  1. I think the way underscores are used for emphasis is essentially broken.

  2. I sometimes like to add height and width attributes to images and Markdown's img syntax does not allow for that.

  3. I sometimes want Markdown conversion to happen for content inside block level HTML elements and under Markdown this never happens.

The M_rkd_wn addin is a markdown derivative that is very similar to markdown (and the markdown addin) but addresses these 3 issues.

I'll discuss each of these briefly.

1. M_rkd_wn changes the way underscores are interpreted for emphasis (both em and strong).

Underscores appear frequently in ordinary text documents (at least in my text documents). They are rarely, if ever, intended to indicate emphasis (especially when I'm not intentionally writing to the Markdown syntax).

For example I tend to link multiple words together with underscores when naming things for which I cannot reliably use spaces (e.g. paths, variable names).

It's not so much that I have a problem with underscores for emphasis, but that, as implemented, Markdown's behavior was causing problems for me unnecessarily.

When I do emphasize text it tends to be whole words and phrases, for example:

_word_
_some words_

I almost never emphasize in the middle of words. (In fact I could probably drop the almost and just leave it at never.)

For example, I almost never intend the following examples to mean emphasis:

_wo_rd
wo_rd_
some_more_words

But Markdown treats all of these as emphasis.

With m_rkd_wn underscores are used for emphasis only if they appear at the beginning and end of a word or phrase. Internal underscores never indicate emphasis.

So the following DO indicate emphasis and are converted to the appropriate em or strong tags:

_word_
__word__
_some words_
_An entire sentence can be emphasized._
_multiple_words_connected_with_underscores_can_be_emphasized_like_this_

While the following underscores DO NOT mean emphasis (and are not converted):

_wo_rd
wo_rd_
some_more_words

It turns out that this is exactly the behavior I want. (I like it so much I named m_rkd_wn after it. Under markdown this would be mrkdwn)
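To make the rule concrete, here is a minimal sketch in Python. It is not the addin's actual Perl code, and the names STRONG, EM and convert_emphasis are made up for this illustration. It assumes that "beginning and end of a word or phrase" means the opening underscore is not preceded by a letter, digit, or underscore, and the closing underscore is not followed by one.

```python
import re

# Hypothetical patterns for illustration; the real addin may differ.
# An opening marker must not follow a word character, and a closing
# marker must not be followed by one, so internal underscores are ignored.
STRONG = re.compile(r"(?<!\w)__(?=\S)(.+?)(?<=\S)__(?!\w)")
EM = re.compile(r"(?<!\w)_(?=\S)(.+?)(?<=\S)_(?!\w)")


def convert_emphasis(text):
    """Convert leading/trailing underscores to strong and em tags."""
    # Handle double underscores first so they are not mistaken for em.
    text = STRONG.sub(r"<strong>\1</strong>", text)
    return EM.sub(r"<em>\1</em>", text)


for sample in ["_word_", "__word__", "_some words_",
               "_wo_rd", "wo_rd_", "some_more_words"]:
    print(sample, "->", convert_emphasis(sample))
```

The first three samples come back wrapped in em or strong tags, while the last three are returned unchanged.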

2. M_rkd_wn expands Markdown's syntax for images, adding the ability to specify height and width attributes.

This works for both reference and inline style images.

Reference style images

For reference style images, the syntax of the reference is not changed:

![alt text][id]

But the corresponding link syntax is different.

The original link syntax:

^[id]: url "optional title"

The new link syntax:

^[id]: url "optional title", "optional height", "optional width"

Note that both height and width are optional. When both are excluded the syntax is exactly the same as the original. This means the new syntax should not break any existing Markdown formatted pages.

The following are all examples of valid links under the new definition:

[some_img]: link/to/something.png "title of some_img"
[some_img]: link/to/something.png "title of some_img", "100", "250"
[some_img]: link/to/something.png "title of some_img", "100px", "250px"
[some_img]: link/to/something.png "title of some_img", "100%", "250%"
[some_img]: link/to/something.png "title of some_img", "100", "-"
[some_img]: link/to/something.png "title of some_img", "-", "250"

A few things to note:

  • As already mentioned the height and width values are optional.

  • These new optional values are comma separated.

  • You can include units or leave them off. The units recognized by HTML are: px (pixels) and % (percentage).

The default unit for HTML is pixels - so if you do not specify a unit, pixels will be used. This means the following two examples are equivalent:

..., "100", "250"
..., "100px", "250px"

  • You can specify both values like: ..."100px", "250px" or just one of the values: "-", "250px"

  • Note the use of a single dash '-' in place of the numerical value to indicate that the value is unspecified.

Though height and width are optional, you must use both or neither (substituting a dash for a value you don't want to provide). Why?

Because height must precede width (the values are indistinguishable except for this order), which means that if we allowed for just one of the values and not the other, a single value would always mean height, with no way to specify only the width (excluding height altogether).

  • IMPORTANT: What happens when only one of height or width is specified is not well defined, which is to say that the behavior may differ from one client to the next.

This is an issue with interpretation of HTML itself, and varying client implementations, not m_rkd_wn.

When you type:

[some_img]: link/to/something.png "title of some_img", "-", "250"

The resulting HTML is:

<img src="link/to/something.png" alt="alt text" title="title of some_img" width="250" />

Notice that height is left out altogether.
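The mapping from a definition line to the img tag can be sketched like this. This is a Python illustration under my own assumptions, not the addin's Perl code: the names DEFINITION, parse_definition and render_img are hypothetical, and the pattern only covers the forms shown in the examples above (a quoted title, optionally followed by a quoted height and width).

```python
import re
from html import escape

# Hypothetical pattern covering: [id]: url "title", "height", "width"
DEFINITION = re.compile(
    r'^\[(?P<id>[^\]]+)\]:\s*(?P<url>\S+)'
    r'(?:\s+"(?P<title>[^"]*)")?'
    r'(?:\s*,\s*"(?P<height>[^"]*)"\s*,\s*"(?P<width>[^"]*)")?\s*$'
)


def parse_definition(line):
    """Split a definition line into its parts; a dash means 'unspecified'."""
    match = DEFINITION.match(line.strip())
    if match is None:
        return None
    parts = match.groupdict()
    for key in ("height", "width"):
        if parts[key] in (None, "-"):
            parts[key] = None
    return parts


def render_img(definition, alt):
    """Build the img tag, leaving out any attribute that was not given."""
    attrs = [("src", definition["url"]), ("alt", alt)]
    for key in ("title", "height", "width"):
        if definition[key] is not None:
            attrs.append((key, definition[key]))
    rendered = " ".join(f'{name}="{escape(value)}"' for name, value in attrs)
    return f"<img {rendered} />"


line = '[some_img]: link/to/something.png "title of some_img", "-", "250"'
print(render_img(parse_definition(line), "alt text"))
```

Because the dash is turned into an unspecified value, the height attribute never reaches the output, which matches the HTML shown above.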

Many clients (all of the clients I've tried, including all modern web browsers) will automatically scale the image so that its height maintains the original aspect ratio at the specified width.

So if the original image was 1000px wide x 561px high, specifying a width of 250px would cause the client to display the image at 250px wide x 140px high.
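The arithmetic behind that number is just the aspect ratio applied to the new width. As a small Python sketch (scaled_height is a made-up helper, and rounding to the nearest pixel is my assumption; clients may round differently):

```python
def scaled_height(original_width, original_height, new_width):
    """Height that keeps the original aspect ratio at the new width."""
    return round(original_height * new_width / original_width)


# 561 * 250 / 1000 = 140.25, which rounds to 140
print(scaled_height(1000, 561, 250))
```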

Note that this does not resize the image. It only forces the larger image into a smaller box. Though this may be convenient because it doesn't require you to actually resize and resave the image, the image itself only appears smaller.

In fact the file size of the image is exactly the same (because the file is not changed).

To understand why this is a problem, think about a photo taken with a modern 10 - 14 MP camera which will be many MBs in size. Imagine fitting it into a 250px x 140px box using this technique. The file will still be many MBs in size, whereas if you had actually resized the image it would probably be less than 100KB.

Inline style images

The original syntax for inline images is:

![alt text](url "optional title")

m_rkd_wn extends markdown's link syntax to support optional height and width values for images.

The new link syntax looks like:

![alt text](url "optional title", "optional height", "optional width")

Again note that as with the syntax for reference style images, both height and width are optional. When both are excluded the syntax is exactly the same as the original. This means the new syntax should not break any existing Markdown formatted pages.

The following are all examples of valid links under the new definition:

![alt text](link/to/something.png "title of some_img")
![alt text](link/to/something.png "title of some_img", "100", "250")
![alt text](link/to/something.png "title of some_img", "100px", "250px")
![alt text](link/to/something.png "title of some_img", "100%", "250%")
![alt text](link/to/something.png "title of some_img", "100", "-")
![alt text](link/to/something.png "title of some_img", "-", "250")
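For the inline form, a rough Python sketch of how the extended syntax ![alt text](url "optional title", "optional height", "optional width") might be picked out of a post. Again this is only an illustration with hypothetical names (INLINE_IMAGE, find_inline_images), not how the addin's Perl code actually does it.

```python
import re

# Hypothetical pattern for: ![alt](url "title", "height", "width")
INLINE_IMAGE = re.compile(
    r'!\[(?P<alt>[^\]]*)\]\(\s*(?P<url>[^\s)]+)'
    r'(?:\s+"(?P<title>[^"]*)")?'
    r'(?:\s*,\s*"(?P<height>[^"]*)"\s*,\s*"(?P<width>[^"]*)")?\s*\)'
)


def find_inline_images(text):
    """Return one dict per inline image found in the text."""
    images = []
    for match in INLINE_IMAGE.finditer(text):
        image = match.groupdict()
        # A single dash stands for an unspecified value.
        for key in ("height", "width"):
            if image[key] == "-":
                image[key] = None
        images.append(image)
    return images


post = 'Before ![alt text](link/to/something.png "title of some_img", "100", "-") after'
print(find_inline_images(post))
```

An image written in the original syntax (no height or width) is still matched, with both values left unspecified.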

3. M_rkd_wn allows for markdown-like conversion within block type elements by including the attribute/value pair markdown="1" in the opening tag.

(The attribute is removed from the converted text before it is returned to the caller.)

Under Markdown, conversion does not take place within block level HTML elements.

More specifically conversion is skipped within the following:

p, div, h[1-6], blockquote, pre, table, dl, ol, ul, script, noscript, form, fieldset, iframe, math, ins, del

M_rkd_wn allows you to override this behavior by adding a 'markdown' attribute with a value of 1 to the opening tag for these elements.

For example

This __will not__ be markdown converted because it is within a block level element and does not include the markdown attribute.

<p>This __will not__ be markdown converted because it is within a block level element and does not include the markdown attribute.</p>

This will be markdown converted because it is within a block level element and includes the markdown attribute.

<p markdown="1">This __will__ be markdown converted because it is within a block level element and includes the markdown attribute.</p>

For the sake of symmetry, M_rkd_wn also recognizes a value of "0" which mimics the original behavior turning off markdown conversion.

IMPORTANT: The value must be one of '0' or '1' or else the markdown attribute will not be recognized. Not only will the text inside the block level element not be converted (which you might expect) but the attribute will not be removed from the text and will appear on the rendered page eventually returned to the browser (which you might not expect).

This will invalidate the resulting page - because markdown is not a valid attribute for these elements.
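Here is a small Python sketch of the attribute handling described above. It is an illustration only, not the addin's Perl code: inspect_opening_tag and MARKDOWN_ATTR are hypothetical names, the sketch only looks at double-quoted values, and it assumes the attribute is stripped for "0" as well as "1".

```python
import re

# Only the values "0" and "1" are recognized; both attribute names work.
MARKDOWN_ATTR = re.compile(r'\s+(?:markdown|m_rkd_wn)="([01])"')


def inspect_opening_tag(tag):
    """Return (tag_without_attribute, should_convert) for an opening tag."""
    match = MARKDOWN_ATTR.search(tag)
    if match is None:
        # Missing or unrecognized value: leave the tag alone, do not convert.
        return tag, False
    cleaned = tag[:match.start()] + tag[match.end():]
    return cleaned, match.group(1) == "1"


for tag in ['<p markdown="1">', '<p markdown="0">', '<p markdown="yes">']:
    print(tag, "->", inspect_opening_tag(tag))
```

The last tag shows the problem case: the unrecognized value is left in place, so the attribute would end up in the rendered page.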

Note the attribute name discussed above is 'markdown' rather than 'm_rkd_wn'. It's not a typo. Both are recognized but I'd recommend you use markdown for a couple of reasons.

  1. It seems to be the way other Markdown derivatives that allow for conversion inside of block level elements do it (e.g. PHP Markdown Extra, MultiMarkdown).
  2. It's easier to type.

The second is probably more important to you than the first, unless you're coming to Ode and the M_rkd_wn addin with a bunch of Markdown type files which include markdown="1" already.

Even in this case, changing every markdown="1" to m_rkd_wn="1" across any number of files is easy to do. If your text editor doesn't allow you to do it, there are any number of tools that will - including Perl (and if you're using Ode I know you have that :).
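As one way to do that bulk change, here is a short Python sketch (a Perl one-liner would work just as well). The helper names and the "*.txt" file pattern are assumptions for illustration; adjust the pattern to whatever extension your posts use, and try it on a copy of your files first.

```python
import re
from pathlib import Path

# Match the attribute only when it carries one of the recognized values.
ATTR = re.compile(r'\bmarkdown="([01])"')


def rename_attribute(text):
    """Rewrite markdown="0|1" as m_rkd_wn="0|1", leaving other text alone."""
    return ATTR.sub(r'm_rkd_wn="\1"', text)


def rename_in_directory(directory, pattern="*.txt"):
    """Apply the rename to every matching file and return the changed paths."""
    changed = []
    for path in sorted(Path(directory).rglob(pattern)):
        original = path.read_text(encoding="utf-8")
        updated = rename_attribute(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed


print(rename_attribute('<p markdown="1">text</p>'))
```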

That's it, m_rkd_wn in a nutshell.

Installing the addin is as easy as dragging it to your addins directory. You'll want to disable the markdown addin while you're there. Remember you can disable an addin by moving it out of the addins directory or simply appending an underscore to the end of its name.

For example to disable the markdown addin you'd change the name from 'markdown' to 'markdown_'.

You'll find complete installation and usage instructions in the readme included with the addin.

I should also mention that I have a JavaScript version of m_rkd_wn based on John Fraser's Showdown (which is itself a port of the original markdown.pl). The JavaScript version, which I call 'Sh_wd_wn', is intended to be used in combination with Ode's soon to be released Editedit addin. Together, Sh_wd_wn and Editedit allow users to create and edit posts via a browser-based form with live previews that include m_rkd_wn conversion. When M_rkd_wn and Sh_wd_wn are used together, it's a nice WYSIWYG complement to using a text editor for posts.

If you'd like to get a sense of how Editedit works, here's a working demo.

If you have any questions or comments please don't hesitate to contact me. You can email me directly or connect with other Ode users on the community forum.

Enjoy!