Python Markdown Syntax



  • Overview
  • Block Elements
  • Span Elements
  • Miscellaneous

Note: This document is itself written using Markdown; you can see the source for it by adding ‘.text’ to the URL.

Overview

Philosophy

Markdown is intended to be as easy-to-read and easy-to-write as is feasible.

Readability, however, is emphasized above all else. A Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions. While Markdown’s syntax has been influenced by several existing text-to-HTML filters — including Setext, atx, Textile, reStructuredText, Grutatext, and EtText — the single biggest source of inspiration for Markdown’s syntax is the format of plain text email.

To this end, Markdown’s syntax consists entirely of punctuation characters, carefully chosen so as to look like what they mean. E.g., asterisks around a word actually look like *emphasis*. Markdown lists look like, well, lists. Even blockquotes look like quoted passages of text, assuming you’ve ever used email.

Inline HTML

Markdown’s syntax is intended for one purpose: to be used as a format for writing for the web.

Markdown is not a replacement for HTML, or even close to it. Its syntax is very small, corresponding only to a very small subset of HTML tags. The idea is not to create a syntax that makes it easier to insert HTML tags. In my opinion, HTML tags are already easy to insert. The idea for Markdown is to make it easy to read, write, and edit prose. HTML is a publishing format; Markdown is a writing format. Thus, Markdown’s formatting syntax only addresses issues that can be conveyed in plain text.

For any markup that is not covered by Markdown’s syntax, you simply use HTML itself. There’s no need to preface it or delimit it to indicate that you’re switching from Markdown to HTML; you just use the tags.

The only restrictions are that block-level HTML elements — e.g. <div>, <table>, <pre>, <p>, etc. — must be separated from surrounding content by blank lines, and the start and end tags of the block should not be indented with tabs or spaces. Markdown is smart enough not to add extra (unwanted) <p> tags around HTML block-level tags.

For example, to add an HTML table to a Markdown article:

Note that Markdown formatting syntax is not processed within block-level HTML tags. E.g., you can’t use Markdown-style *emphasis* inside an HTML block.

Span-level HTML tags — e.g. <span>, <cite>, or <del> — can be used anywhere in a Markdown paragraph, list item, or header. If you want, you can even use HTML tags instead of Markdown formatting; e.g. if you’d prefer to use HTML <a> or <img> tags instead of Markdown’s link or image syntax, go right ahead.

Unlike block-level HTML tags, Markdown syntax is processed within span-level tags.

Automatic Escaping for Special Characters

In HTML, there are two characters that demand special treatment: < and &. Left angle brackets are used to start tags; ampersands are used to denote HTML entities. If you want to use them as literal characters, you must escape them as entities, e.g. &lt;, and &amp;.

Ampersands in particular are bedeviling for web writers. If you want to write about ‘AT&T’, you need to write ‘AT&amp;T’. You even need to escape ampersands within URLs. Thus, if you want to link to:

you need to encode the URL as:

in your anchor tag href attribute. Needless to say, this is easy to forget, and is probably the single most common source of HTML validation errors in otherwise well-marked-up web sites.

Markdown allows you to use these characters naturally, taking care of all the necessary escaping for you. If you use an ampersand as part of an HTML entity, it remains unchanged; otherwise it will be translated into &amp;.

So, if you want to include a copyright symbol in your article, you can write:

and Markdown will leave it alone. But if you write:

Markdown will translate it to:

Similarly, because Markdown supports inline HTML, if you use angle brackets as delimiters for HTML tags, Markdown will treat them as such. But if you write:

Markdown will translate it to:

However, inside Markdown code spans and blocks, angle brackets and ampersands are always encoded automatically. This makes it easy to use Markdown to write about HTML code. (As opposed to raw HTML, which is a terrible format for writing about HTML syntax, because every single < and & in your example code needs to be escaped.)
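The escaping rule for ordinary text can be approximated with two regular expressions. This is a minimal sketch using only the standard library, not the code any Markdown implementation actually ships; the function name `escape_specials` and the example URL are assumptions made for illustration.

```python
import re

# An ampersand that already begins an entity (&copy;, &#169;, &#xA9;) is left alone.
_BARE_AMP = re.compile(r"&(?!#?[xX]?(?:[0-9a-fA-F]+|\w+);)")
# A "<" that cannot start a tag: not followed by a letter, "/", "?", "$" or "!".
_BARE_LT = re.compile(r"<(?![a-zA-Z/?$!])")


def escape_specials(text: str) -> str:
    """Escape stray ampersands and left angle brackets (illustrative only)."""
    text = _BARE_AMP.sub("&amp;", text)  # ampersands first, so &lt; is not re-escaped
    return _BARE_LT.sub("&lt;", text)


print(escape_specials("AT&T"))                      # AT&amp;T
print(escape_specials("&copy;"))                    # &copy;
print(escape_specials("4 < 5"))                     # 4 &lt; 5
print(escape_specials("http://example.com/?a=1&b=2"))  # ...?a=1&amp;b=2
```

Inside code spans and code blocks the rule is simpler, since every ampersand and angle bracket is encoded unconditionally.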

Block Elements

Paragraphs and Line Breaks

A paragraph is simply one or more consecutive lines of text, separated by one or more blank lines. (A blank line is any line that looks like a blank line — a line containing nothing but spaces or tabs is considered blank.) Normal paragraphs should not be indented with spaces or tabs.

The implication of the “one or more consecutive lines of text” rule is that Markdown supports “hard-wrapped” text paragraphs. This differs significantly from most other text-to-HTML formatters (including Movable Type’s “Convert Line Breaks” option) which translate every line break character in a paragraph into a <br /> tag.

When you do want to insert a <br /> break tag using Markdown, you end a line with two or more spaces, then type return.

Yes, this takes a tad more effort to create a <br />, but a simplistic “every line break is a <br />” rule wouldn’t work for Markdown. Markdown’s email-style blockquoting and multi-paragraph list items work best — and look better — when you format them with hard breaks.

Headers

Markdown supports two styles of headers, Setext and atx.

Setext-style headers are “underlined” using equal signs (for first-level headers) and dashes (for second-level headers). For example:

Any number of underlining =’s or -’s will work.

Atx-style headers use 1-6 hash characters at the start of the line, corresponding to header levels 1-6. For example:

Optionally, you may “close” atx-style headers. This is purely cosmetic — you can use this if you think it looks better. The closing hashes don’t even need to match the number of hashes used to open the header. (The number of opening hashes determines the header level.):
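Both header styles can be recognized with short regular expressions. The sketch below is an illustration under simplifying assumptions (one header per call, no surrounding context); `parse_atx` and `parse_setext` are hypothetical helper names, not part of any library.

```python
import re

# Opening hashes set the level; trailing hashes are cosmetic and discarded.
_ATX = re.compile(r"^(#{1,6})[ \t]*(.+?)[ \t]*#*$")


def parse_atx(line: str):
    """Return (level, title) for an atx-style header line, or None."""
    match = _ATX.match(line)
    if match is None:
        return None
    return len(match.group(1)), match.group(2)


def parse_setext(title: str, underline: str):
    """Return (level, title) for a Setext header given its underline, or None."""
    if re.fullmatch(r"=+[ \t]*", underline):
        return 1, title.strip()
    if re.fullmatch(r"-+[ \t]*", underline):
        return 2, title.strip()
    return None


print(parse_atx("# This is an H1"))                       # (1, 'This is an H1')
print(parse_atx("### This is an H3 ######"))              # (3, 'This is an H3')
print(parse_setext("This is an H2", "-------------"))     # (2, 'This is an H2')
```

Note that the closing hashes in the second call do not change the level, which is taken from the opening run only.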

Blockquotes

Markdown uses email-style > characters for blockquoting. If you’re familiar with quoting passages of text in an email message, then you know how to create a blockquote in Markdown. It looks best if you hard wrap the text and put a > before every line:

Markdown allows you to be lazy and only put the > before the first line of a hard-wrapped paragraph:

Blockquotes can be nested (i.e. a blockquote-in-a-blockquote) by adding additional levels of >:

Blockquotes can contain other Markdown elements, including headers, lists, and code blocks:

Any decent text editor should make email-style quoting easy. For example, with BBEdit, you can make a selection and choose Increase Quote Level from the Text menu.

Lists

Markdown supports ordered (numbered) and unordered (bulleted) lists.

Unordered lists use asterisks, pluses, and hyphens — interchangeably — as list markers:

is equivalent to:

and:

Ordered lists use numbers followed by periods:

It’s important to note that the actual numbers you use to mark the list have no effect on the HTML output Markdown produces. The HTML Markdown produces from the above list is:

If you instead wrote the list in Markdown like this:

or even:

you’d get the exact same HTML output. The point is, if you want to, you can use ordinal numbers in your ordered Markdown lists, so that the numbers in your source match the numbers in your published HTML. But if you want to be lazy, you don’t have to.

If you do use lazy list numbering, however, you should still start the list with the number 1. At some point in the future, Markdown may support starting ordered lists at an arbitrary number.

List markers typically start at the left margin, but may be indented by up to three spaces. List markers must be followed by one or more spaces or a tab.

To make lists look nice, you can wrap items with hanging indents:

But if you want to be lazy, you don’t have to:

If list items are separated by blank lines, Markdown will wrap the items in <p> tags in the HTML output. For example, this input:

will turn into:

But this:

will turn into:

List items may consist of multiple paragraphs. Each subsequent paragraph in a list item must be indented by either 4 spaces or one tab:

It looks nice if you indent every line of the subsequent paragraphs, but here again, Markdown will allow you to be lazy:

To put a blockquote within a list item, the blockquote’s > delimiters need to be indented:

To put a code block within a list item, the code block needs to be indented twice — 8 spaces or two tabs:

It’s worth noting that it’s possible to trigger an ordered list by accident, by writing something like this:

In other words, a number-period-space sequence at the beginning of a line. To avoid this, you can backslash-escape the period:
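The list-marker rules from this section (up to three spaces of indent, a bullet or a number plus period, then at least one space or tab) fit in one regular expression. The following is a rough sketch with a hypothetical helper name, `starts_list_item`; the sample sentences are made up for illustration.

```python
import re

# Up to three spaces of indent, then a bullet (*, + or -) or digits plus a
# period, then at least one space or tab.
_LIST_MARKER = re.compile(r"^ {0,3}(?:[*+-]|\d+\.)[ \t]+")


def starts_list_item(line: str) -> bool:
    """Report whether a line would open a list item under the rules above."""
    return _LIST_MARKER.match(line) is not None


print(starts_list_item("* Red"))                      # True
print(starts_list_item("1986. What a great year."))   # True  (accidental list)
print(starts_list_item(r"1986\. What a great year.")) # False (period escaped)
```

The escaped form fails to match because the backslash sits between the digits and the period, which is exactly why the escape prevents the accidental list.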

Code Blocks

Pre-formatted code blocks are used for writing about programming or markup source code. Rather than forming normal paragraphs, the lines of a code block are interpreted literally. Markdown wraps a code block in both <pre> and <code> tags.

To produce a code block in Markdown, simply indent every line of the block by at least 4 spaces or 1 tab. For example, given this input:

Markdown will generate:

One level of indentation — 4 spaces or 1 tab — is removed from each line of the code block. For example, this:

will turn into:

A code block continues until it reaches a line that is not indented (or the end of the article).

Within a code block, ampersands (&) and angle brackets (< and >) are automatically converted into HTML entities. This makes it very easy to include example HTML source code using Markdown — just paste it and indent it, and Markdown will handle the hassle of encoding the ampersands and angle brackets. For example, this:

will turn into:

Regular Markdown syntax is not processed within code blocks. E.g., asterisks are just literal asterisks within a code block. This means it’s also easy to use Markdown to write about Markdown’s own syntax.
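The two transformations described for code blocks, removing one level of indentation and encoding ampersands and angle brackets, can be sketched as follows. `render_code_block` is a hypothetical helper that assumes every non-blank line it receives is already indented by at least 4 spaces or 1 tab; it is an illustration, not any implementation's actual code.

```python
import html


def render_code_block(lines):
    """Turn indented source lines into a <pre><code> block (illustrative only)."""
    outdented = []
    for line in lines:
        if line.startswith("\t"):
            outdented.append(line[1:])   # drop one tab
        elif line.startswith("    "):
            outdented.append(line[4:])   # drop four spaces
        else:
            outdented.append("")         # blank line inside the block
    # quote=False encodes only &, < and >, as the section describes.
    body = html.escape("\n".join(outdented), quote=False)
    return "<pre><code>" + body + "\n</code></pre>"


print(render_code_block(["    <p>Tom & Jerry</p>", "        *not emphasis*"]))
```

Deeper indentation survives the outdent, and the asterisks pass through untouched because no Markdown processing happens inside the block.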

Horizontal Rules

You can produce a horizontal rule tag (<hr />) by placing three or more hyphens, asterisks, or underscores on a line by themselves. If you wish, you may use spaces between the hyphens or asterisks. Each of the following lines will produce a horizontal rule:
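As a rough check of this rule, the sketch below tests whether a line consists of three or more of the same rule character with optional spaces in between. The helper name `is_horizontal_rule` is an assumption, and the pattern ignores context (a row of hyphens directly under a line of text would be read as a Setext header instead).

```python
import re

# One rule character, then at least two more of the same, spaces optional.
_HR = re.compile(r"^([-*_])(?: *\1){2,} *$")


def is_horizontal_rule(line: str) -> bool:
    """Report whether a line on its own would produce an <hr /> tag."""
    return _HR.match(line) is not None


for candidate in ["* * *", "***", "*****", "- - -", "---------------", "--"]:
    print(repr(candidate), is_horizontal_rule(candidate))
```

Only the last candidate is rejected, because it has fewer than three characters.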

Span Elements

Links

Markdown supports two styles of links: inline and reference.

In both styles, the link text is delimited by [square brackets].

To create an inline link, use a set of regular parentheses immediately after the link text’s closing square bracket. Inside the parentheses, put the URL where you want the link to point, along with an optional title for the link, surrounded in quotes. For example:

Will produce:

If you’re referring to a local resource on the same server, you can use relative paths:

Reference-style links use a second set of square brackets, inside which you place a label of your choosing to identify the link:

You can optionally use a space to separate the sets of brackets:

Then, anywhere in the document, you define your link label like this, on a line by itself:

That is:

  • Square brackets containing the link identifier (optionally indented from the left margin using up to three spaces);
  • followed by a colon;
  • followed by one or more spaces (or tabs);
  • followed by the URL for the link;
  • optionally followed by a title attribute for the link, enclosed in double or single quotes, or enclosed in parentheses.

The following three link definitions are equivalent:
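To illustrate that equivalence, here is a small parser sketch for link definition lines. The regular expression and the `parse_link_definition` name are assumptions made for this example, the URL is a placeholder, and the pattern is deliberately loose (it does not check that the opening and closing title delimiters match).

```python
import re

_LINK_DEF = re.compile(
    r"""^ {0,3}\[(.+?)\]:[ \t]+<?(\S+?)>?(?:[ \t]+["'(](.+?)["')])?[ \t]*$"""
)


def parse_link_definition(line: str):
    """Return (label, url, title) for a link definition line, or None."""
    match = _LINK_DEF.match(line)
    if match is None:
        return None
    label, url, title = match.groups()
    return label.lower(), url, title  # labels are not case sensitive


double = parse_link_definition('[foo]: http://example.com/  "Optional Title Here"')
single = parse_link_definition("[foo]: http://example.com/  'Optional Title Here'")
parens = parse_link_definition("[foo]: http://example.com/  (Optional Title Here)")
print(double == single == parens)  # True
```

Lowercasing the label is one simple way to model the case-insensitive lookup of link names.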

Note: There is a known bug in Markdown.pl 1.0.1 which prevents single quotes from being used to delimit link titles.

The link URL may, optionally, be surrounded by angle brackets:

You can put the title attribute on the next line and use extra spaces or tabs for padding, which tends to look better with longer URLs:

Link definitions are only used for creating links during Markdown processing, and are stripped from your document in the HTML output.

Link definition names may consist of letters, numbers, spaces, and punctuation — but they are not case sensitive. E.g. these two links:

are equivalent.

The implicit link name shortcut allows you to omit the name of the link, in which case the link text itself is used as the name. Just use an empty set of square brackets — e.g., to link the word “Google” to the google.com web site, you could simply write:

And then define the link:

Because link names may contain spaces, this shortcut even works for multiple words in the link text:

And then define the link:

Link definitions can be placed anywhere in your Markdown document. I tend to put them immediately after each paragraph in which they’re used, but if you want, you can put them all at the end of your document, sort of like footnotes.

Here’s an example of reference links in action:

Using the implicit link name shortcut, you could instead write:

Both of the above examples will produce the following HTML output:

For comparison, here is the same paragraph written using Markdown’s inline link style:

The point of reference-style links is not that they’re easier to write. The point is that with reference-style links, your document source is vastly more readable. Compare the above examples: using reference-style links, the paragraph itself is only 81 characters long; with inline-style links, it’s 176 characters; and as raw HTML, it’s 234 characters. In the raw HTML, there’s more markup than there is text.

With Markdown’s reference-style links, a source document much more closely resembles the final output, as rendered in a browser. Because you can move the markup-related metadata out of the paragraph, you can add links without interrupting the narrative flow of your prose.

Emphasis

Markdown treats asterisks (*) and underscores (_) as indicators of emphasis. Text wrapped with one * or _ will be wrapped with an HTML <em> tag; double *’s or _’s will be wrapped with an HTML <strong> tag. E.g., this input:

will produce:

You can use whichever style you prefer; the lone restriction is that the same character must be used to open and close an emphasis span.
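The emphasis rules can be sketched with two substitutions, strong first and then em. This is a simplified illustration modeled on the regex-based approach of early Markdown implementations; it ignores backslash escapes and nesting edge cases, and `render_emphasis` is a hypothetical name.

```python
import re

# The backreference \1 enforces that the same delimiter opens and closes a span.
_STRONG = re.compile(r"(\*\*|__)(?=\S)(.+?[*_]*)(?<=\S)\1")
_EM = re.compile(r"(\*|_)(?=\S)(.+?)(?<=\S)\1")


def render_emphasis(text: str) -> str:
    """Convert *em* / _em_ and **strong** / __strong__ spans (illustrative only)."""
    text = _STRONG.sub(r"<strong>\2</strong>", text)
    return _EM.sub(r"<em>\2</em>", text)


print(render_emphasis("*single asterisks*"))      # <em>single asterisks</em>
print(render_emphasis("__double underscores__"))  # <strong>double underscores</strong>
print(render_emphasis("un*frigging*believable"))  # un<em>frigging</em>believable
print(render_emphasis("a * b * c"))               # a * b * c
```

The lookarounds require a non-space character just inside each delimiter, which is why an asterisk surrounded by spaces stays literal.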

Emphasis can be used in the middle of a word:

But if you surround an * or _ with spaces, it’ll be treated as a literal asterisk or underscore.

To produce a literal asterisk or underscore at a position where it would otherwise be used as an emphasis delimiter, you can backslash escape it:

Code

To indicate a span of code, wrap it with backtick quotes (`). Unlike a pre-formatted code block, a code span indicates code within a normal paragraph. For example:

will produce:

To include a literal backtick character within a code span, you can use multiple backticks as the opening and closing delimiters:

which will produce this:

The backtick delimiters surrounding a code span may include spaces — one after the opening, one before the closing. This allows you to place literal backtick characters at the beginning or end of a code span:

will produce:

With a code span, ampersands and angle brackets are encoded as HTML entities automatically, which makes it easy to include example HTML tags. Markdown will turn this:

into:

You can write this:

to produce:
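The code span rules in this section (matching backtick runs, optional padding spaces, automatic entity encoding) can be approximated in a few lines. This is a sketch under simplifying assumptions, with a hypothetical `render_code_spans` helper; it does not handle backslash-escaped backticks.

```python
import html
import re

# A run of backticks opens the span; the same-length run closes it.
_CODE_SPAN = re.compile(r"(`+)(.+?)(?<!`)\1(?!`)", re.DOTALL)


def render_code_spans(text: str) -> str:
    """Replace backtick-delimited spans with <code> elements (illustrative only)."""
    def replace(match):
        code = match.group(2).strip(" \t")  # drop the optional padding spaces
        return "<code>" + html.escape(code, quote=False) + "</code>"
    return _CODE_SPAN.sub(replace, text)


print(render_code_spans("Use the `printf()` function."))
print(render_code_spans("``There is a literal backtick (`) here.``"))
print(render_code_spans("Please don't use any `<blink>` tags."))
```

In the second call the single backtick inside the span cannot close it, because the closing delimiter must be as long as the opening one.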

Images

Admittedly, it’s fairly difficult to devise a “natural” syntax for placing images into a plain text document format.

Markdown uses an image syntax that is intended to resemble the syntax for links, allowing for two styles: inline and reference.

Inline image syntax looks like this:

That is:

  • An exclamation mark: !;
  • followed by a set of square brackets, containing the alt attribute text for the image;
  • followed by a set of parentheses, containing the URL or path to the image, and an optional title attribute enclosed in double or single quotes.

Reference-style image syntax looks like this:

Where “id” is the name of a defined image reference. Image references are defined using syntax identical to link references:

As of this writing, Markdown has no syntax for specifying the dimensions of an image; if this is important to you, you can simply use regular HTML <img> tags.

Miscellaneous

Automatic Links

Markdown supports a shortcut style for creating “automatic” links for URLs and email addresses: simply surround the URL or email address with angle brackets. What this means is that if you want to show the actual text of a URL or email address, and also have it be a clickable link, you can do this:

Markdown will turn this into:

Automatic links for email addresses work similarly, except that Markdown will also perform a bit of randomized decimal and hex entity-encoding to help obscure your address from address-harvesting spambots. For example, Markdown will turn this:

into something like this:

which will render in a browser as a clickable link to the email address.

(This sort of entity-encoding trick will indeed fool many, if not most, address-harvesting bots, but it definitely won’t fool all of them. It’s better than nothing, but an address published in this way will probably eventually start receiving spam.)
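The randomized entity-encoding idea can be sketched as follows. This is a simplified illustration, not the exact scheme any Markdown implementation uses: every character is encoded as either a decimal or a hex entity with equal probability. The `obfuscate` name and the sample address are assumptions.

```python
import html
import random


def obfuscate(address: str, rng=random) -> str:
    """Encode each character as a decimal or hex HTML entity, chosen at random."""
    parts = []
    for ch in address:
        if rng.random() < 0.5:
            parts.append(f"&#{ord(ch)};")     # decimal entity
        else:
            parts.append(f"&#x{ord(ch):X};")  # hexadecimal entity
    return "".join(parts)


encoded = obfuscate("user@example.com")
print(encoded)                 # differs from run to run
print(html.unescape(encoded))  # user@example.com
```

A browser decodes the entities just as `html.unescape` does, so the rendered link is unchanged while the raw source no longer contains the plain address.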

Backslash Escapes

Markdown allows you to use backslash escapes to generate literal characters which would otherwise have special meaning in Markdown’s formatting syntax. For example, if you wanted to surround a word with literal asterisks (instead of an HTML <em> tag), you can use backslashes before the asterisks, like this:

Markdown provides backslash escapes for the following characters:
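As a sketch, the escapable set can be written as a Python string and applied with a small loop. The character set below is the one given in John Gruber's original syntax documentation (backslash, backtick, asterisk, underscore, curly braces, square brackets, parentheses, hash mark, plus sign, hyphen, dot, exclamation mark); the helper name `unescape_backslashes` is hypothetical.

```python
# Characters that a preceding backslash turns into literals.
ESCAPABLE = "\\`*_{}[]()#+-.!"


def unescape_backslashes(text: str) -> str:
    """Drop the backslash in front of each escapable character (illustrative only)."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in ESCAPABLE:
            out.append(text[i + 1])  # keep the character, discard the backslash
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(unescape_backslashes(r"\*literal asterisks\*"))  # *literal asterisks*
print(unescape_backslashes(r"C:\temp"))                # C:\temp
```

A backslash in front of any other character, such as the `t` in the second call, is left in place.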

This is a Python implementation of John Gruber’s Markdown. It is almost completely compliant with the reference implementation, though there are a few very minor differences. See John’s Syntax Documentation for the syntax rules.

First and foremost, Python-Markdown is intended to be a Python library module used by various projects to convert Markdown syntax into HTML.

The Basics

To use markdown as a module:
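A minimal usage sketch, assuming the third-party `markdown` package is installed (typically with `pip install markdown`); the sample text is made up for illustration.

```python
import markdown  # third-party package, assumed to be installed

source = "Hello *world*"
html = markdown.markdown(source)  # takes a str, returns a str of HTML
print(html)  # <p>Hello <em>world</em></p>
```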

The Details

Python-Markdown provides two public functions (markdown() and markdownFromFile()) both of which wrap the public class Markdown. If you’re processing one document at a time, the functions will serve your needs. However, if you need to process multiple documents, it may be advantageous to create a single instance of the Markdown class and pass multiple documents through it.

markdown.markdown(text[, **kwargs])

The following options are available on the markdown.markdown function:

  • text (required): The source text string.

    Note that Python-Markdown expects Unicode as input (although a simple ASCII string may work) and returns output as Unicode. Do not pass encoded strings to it! If your input is encoded (e.g. as UTF-8), it is your responsibility to decode it. For example:

    If you want to write the output to disk, you must encode it yourself:

  • extensions: A list of extensions.

    Python-Markdown provides an API for third parties to write extensions to the parser adding their own additions or changes to the syntax. A few commonly used extensions are shipped with the markdown library. See the [extension documentation](extensions/index.html) for a list of available extensions.

    The list of extensions may contain instances of extensions or strings of extension names. If an extension name is provided as a string, the extension must be importable as a python module either within the markdown.extensions package or on your PYTHONPATH with a name starting with mdx_, followed by the name of the extension. Thus, extensions=['extra'] will first look for the module markdown.extensions.extra, then a module named mdx_extra.

  • extension_configs: A dictionary of configuration settings for extensions.

    The dictionary must be of the following format:

    See the documentation specific to the extension you are using for help in specifying configuration settings for that extension.

  • output_format: Format of output.

    Supported formats are:

    • “xhtml1”: Outputs XHTML 1.x. Default.
    • “xhtml5”: Outputs XHTML style tags of HTML 5
    • “xhtml”: Outputs latest supported version of XHTML (currently XHTML 1.1).
    • “html4”: Outputs HTML 4
    • “html5”: Outputs HTML style tags of HTML 5
    • “html”: Outputs latest supported version of HTML (currently HTML 4).

    Note that it is suggested that the more specific formats (“xhtml1”, “html5”, & “html4”) be used as “xhtml” or “html” may change in the future if it makes sense at that time. The values can either be lowercase or uppercase.

  • safe_mode: Disallow raw html.

    If you are using Markdown on a web system which will transform text provided by untrusted users, you may want to use the “safe_mode” option which ensures that the user’s HTML tags are either replaced, removed or escaped. (They can still create links using Markdown syntax.)

    The following values are accepted:

    • False (Default): Raw HTML is passed through unaltered.
    • replace: Replace all HTML blocks with the text assigned to html_replacement_text. To maintain backward compatibility, setting safe_mode=True will have the same effect as safe_mode='replace'.

      To replace raw HTML with something other than the default, do:

    • remove: All raw HTML will be completely stripped from the text with no warning to the author.
    • escape: All raw HTML will be escaped and included in the document.

    For example, the following source:

    Will result in the following HTML:

    Note that “safe_mode” also alters the default value for the enable_attributes option.

  • html_replacement_text: Text used when safe_mode is set to replace. Defaults to [HTML_REMOVED].

  • tab_length: Length of tabs in the source. Default: 4

  • enable_attributes: Enable the conversion of attributes. Defaults to True, unless safe_mode is enabled, in which case the default is False.

    Note that safe_mode only overrides the default. If enable_attributes is explicitly set, the explicit value is used regardless of safe_mode. However, this could potentially allow an untrusted user to inject JavaScript into your documents.

  • smart_emphasis: Treat _connected_words_ intelligently. Default: True

  • lazy_ol: Ignore number of first item of ordered lists. Default: True

    Given the following list:

    By default markdown will ignore the fact that the first line started with item number “4” and the HTML list will start with a number “1”. If lazy_ol is set to False, markdown instead keeps the starting number, so the HTML list opens with <ol start="4">.

markdown.markdownFromFile(**kwargs)

With a few exceptions, markdownFromFile() accepts the same options as markdown(). It does not accept a text (or Unicode) string. Instead, it accepts the following options:

  • input (required): The source text file.

    input may be set to one of three options:

    • a string which contains a path to a readable file on the file system,
    • a readable file-like object,
    • or None (default) which will read from stdin.
  • output: The target which output is written to.

    output may be set to one of three options:

    • a string which contains a path to a writable file on the file system,
    • a writable file-like object,
    • or None (default) which will write to stdout.
  • encoding: The encoding of the source text file. Defaults to “utf-8”. The same encoding will always be used for input and output. The ‘xmlcharrefreplace’ error handler is used when encoding the output.

    Note: This is the only place that decoding and encoding of unicode takes place in Python-Markdown. If this rather naive solution does not meet your specific needs, it is suggested that you write your own code to handle your encoding/decoding needs.
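Handling the decoding and encoding yourself might look like the sketch below. To keep it free of third-party dependencies, the converter is passed in as a parameter: in real use it would be `markdown.markdown`, while here a placeholder function stands in for it. The names `convert_bytes` and `wrap_paragraph` are hypothetical.

```python
from typing import Callable


def convert_bytes(
    raw: bytes,
    convert: Callable[[str], str],
    input_encoding: str = "utf-8",
    output_encoding: str = "utf-8",
) -> bytes:
    """Decode raw bytes, run a str -> str converter, and re-encode the result."""
    text = raw.decode(input_encoding)  # bytes -> str before conversion
    html = convert(text)               # e.g. markdown.markdown in real use
    # Characters the target encoding cannot represent become numeric references.
    return html.encode(output_encoding, errors="xmlcharrefreplace")


def wrap_paragraph(text: str) -> str:
    """Placeholder converter used only to keep this sketch dependency-free."""
    return "<p>" + text + "</p>"


print(convert_bytes("café".encode("utf-8"), wrap_paragraph, output_encoding="ascii"))
# b'<p>caf&#233;</p>'
```

Separating the input and output encodings is the main thing this buys over markdownFromFile(), which always uses one encoding for both.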

class markdown.Markdown([**kwargs])

The same options are available when initializing the Markdown class as on the markdown() function, except that the class does not accept a source text string on initialization. Rather, the source text string must be passed to one of two instance methods:

Markdown.convert(source)

The source text must meet the same requirements as the text argument of the markdown() function.

You should also use this method if you want to process multiple strings without creating a new instance of the class for each string:

Note that depending on which options and/or extensions are being used, the parser may need its state reset between each call to convert:

You can also chain calls to reset together:
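A short sketch of reusing one instance across several documents, assuming the third-party `markdown` package is installed. It relies on `reset()` returning the instance itself, which is what makes the chained call possible; the sample sources are made up.

```python
import markdown  # third-party package, assumed to be installed

md = markdown.Markdown()
sources = ["# First", "Second *document*"]

# reset() clears per-document state and returns the instance, so the call chains.
results = [md.reset().convert(text) for text in sources]

for html in results:
    print(html)
```

Resetting before every conversion is harmless when no stateful extension is loaded, and it avoids state from one document leaking into the next when one is.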

Markdown.convertFile(**kwargs)

The arguments of this method are identical to the arguments of the same name on the markdownFromFile() function (input, output, and encoding). As with the convert() method, this method should be used to process multiple files without creating a new instance of the class for each document. State may need to be reset between each call to convertFile() as is the case with convert().