Commit 5389174

HeaderId Extension marked as Pending Deprecation.
Use the Table of Contents Extension instead. The HeaderId Extension will raise a PendingDeprecationWarning.

The last few features of the HeaderId Extension were migrated to TOC, including the `baselevel` and `separator` config options. Also, the `marker` config option of TOC can be set to an empty string to disable searching for a marker.

The `slugify`, `unique` and `stashedHTML2text` functions are now defined in the TOC Extension in preparation for the HeaderId Extension being removed. All corresponding tests are now run against the TOC Extension.

The meta-data support of the HeaderId Extension was not migrated and no plan exists to make that migration. The `forceid` config option makes no sense in the TOC Extension, and the only other config setting supported by meta-data was `header_level`. However, as that depends on the template, it makes more sense for it not to be defined at the document level.
1 parent fef6e28 commit 5389174

6 files changed: +289 -157 lines changed

docs/extensions/header_id.txt

Lines changed: 8 additions & 1 deletion
@@ -15,6 +15,13 @@ elements (`h1`-`h6`) in the resulting HTML document.

 This extension is included in the standard Markdown library.

+!!! warning
+    This extension is **Pending Deprecation**. The [Table of Contents][toc]
+    Extension should be used instead, which offers most the features of this
+    extension and more.
+
+[toc]: toc.html
+
 Syntax
 ------

@@ -55,7 +62,7 @@ The following options are provided to configure the output:
         >>> text = '''
         ... #Some Header
         ... ## Next Level'''
-        >>> from markdown.extensions.headerid import HeaderIdExtension
+        >>> from markdown.extensions.headerid import HeaderIdExtension
         >>> html = markdown.markdown(text, extensions=[HeaderIdExtension(level=3)])
         >>> print html
         <h3 id="some_header">Some Header</h3>
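
As a rough illustration of what the `level` option in the example above does (and what the `baselevel` option of the TOC Extension is documented to do), the header-level shift can be sketched as below. This is a standalone sketch, not the extension's code: the function name `shift_header_level` is hypothetical, and capping the result at `h6` is an assumption.

```python
def shift_header_level(tag, baselevel):
    """Shift a header tag so that h1 maps to h<baselevel>.

    Hypothetical helper for illustration only; the cap at h6 is an
    assumption about how overflow is handled.
    """
    # 'h2'[-1] -> '2'; a baselevel of 3 adds an offset of 2.
    level = int(tag[-1]) + (baselevel - 1)
    return 'h%d' % min(level, 6)


# With baselevel=3, a top-level header becomes h3 and the next level h4.
shifted = [shift_header_level(tag, 3) for tag in ('h1', 'h2', 'h5')]
```

Under these assumptions `shifted` holds `['h3', 'h4', 'h6']`, which matches the `<h3>` shown for `#Some Header` in the documentation example.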

docs/extensions/toc.txt

Lines changed: 67 additions & 25 deletions
@@ -18,6 +18,20 @@ This extension is included in the standard Markdown library.
 Syntax
 ------

+By default, all headers will automatically have unique `id` attributes
+generated based upon the text of the header. Note this example, in which all
+three headers would have the same `id`:
+
+    #Header
+    #Header
+    #Header
+
+Results in:
+
+    <h1 id="header">Header</h1>
+    <h1 id="header_1">Header</h1>
+    <h1 id="header_2">Header</h1>
+
 Place a marker in the document where you would like the Table of Contents to
 appear. Then, a nested list of all the headers in the document will replace the
 marker. The marker defaults to `[TOC]` so the following document:
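
The unique-`id` behavior documented in this hunk can be reproduced with a standalone copy of the `unique` helper that this commit moves out of `headerid.py`. The sketch below follows the removed lines shown later in this diff; running it outside the library, with a plain `set` standing in for the processor's id registry, is an assumption made for illustration.

```python
import re

# Matches ids that already end in a numeric suffix such as "header_2".
IDCOUNT_RE = re.compile(r'^(.*)_([0-9]+)$')


def unique(id, ids):
    """Ensure id is unique in the set of ids, appending '_1', '_2', ... if not."""
    while id in ids or not id:
        m = IDCOUNT_RE.match(id)
        if m:
            # Already suffixed: bump the counter.
            id = '%s_%d' % (m.group(1), int(m.group(2)) + 1)
        else:
            # First collision: start counting at 1.
            id = '%s_%d' % (id, 1)
    ids.add(id)
    return id


seen = set()
result = [unique('header', seen) for _ in range(3)]
```

Here `result` is `['header', 'header_1', 'header_2']`, the same sequence of ids as in the three-header example.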
@@ -41,6 +55,14 @@ would generate the following output:
     <h1 id="header-1">Header 1</h1>
     <h1 id="header-2">Header 2</h1>

+Regardless of whether a `marker` is found in the document (or disabled), the Table of
+Contents is available as an attribute (`toc`) on the Markdown class. This allows
+one to insert the Table of Contents elsewhere in their page template. For example:
+
+    >>> md = markdown.Markdown(extensions=['markdown.extensions.toc'])
+    >>> html = md.convert(text)
+    >>> page = render_some_template(context={'body': html, 'toc': md.toc})
+
 Usage
 -----

@@ -53,37 +75,57 @@ configuring extensions.
 The following options are provided to configure the output:

 * **`marker`**:
-    Text to find and replace with the Table of Contents. Defaults
-    to `[TOC]`.
+    Text to find and replace with the Table of Contents. Defaults to `[TOC]`.
+
+    Set to an empty string to disable searching for a marker, which may save some time,
+    especially on long documents.

-    Regardless of whether a `marker` is found in the document, the Table of Contents is
-    also available as an attribute (`toc`) of the Markdown class. This allows one to insert
-    the Table of Contents elsewhere in their page template. For example:
+* **`title`**:
+    Title to insert in the Table of Contents' `<div>`. Defaults to `None`.

-        >>> text = '''
-        # Header 1
+* **`anchorlink`**:
+    Set to `True` to cause all headers to link to themselves. Default is `False`.

-        ## Header 2
-        '''
-        >>> md = markdown.Markdown(extensions=['markdown.extensions.toc'])
-        >>> html = md.convert(text)
-        >>> render_some_template(context={'body': html, 'toc': md.toc})
+* **`permalink`**:
+    Set to `True` or a string to generate permanent links at the end of each header.
+    Useful with Sphinx stylesheets.
+
+    When set to `True` the paragraph symbol (&para; -- `&para;`) is used as the link
+    text. When set to a string, the provided string is used as the link text.
+
+* **`baselevel`**:
+    Base level for headers.
+
+    Default: `1`
+
+    The `baselevel` setting allows the header levels to be automatically adjusted to
+    fit within the hierarchy of your html templates. For example, suppose the
+    Markdown text for a page should not contain any headers higher than level 3
+    (`<h3>`). The following will accomplish that:
+
+        >>> text = '''
+        ... #Some Header
+        ... ## Next Level'''
+        >>> from markdown.extensions.toc import TocExtension
+        >>> html = markdown.markdown(text, extensions=[TocExtension(baselevel=3)])
+        >>> print html
+        <h3 id="some_header">Some Header</h3>
+        <h4 id="next_level">Next Level</h4>'

 * **`slugify`**:
-    Callable to generate anchors based on header text. Defaults to a built in
-    `slugify` method. The callable must accept two arguments, the first
-    contains the text content of the header and the second contains the
-    separator. It should then return a string which will be used as the anchor
-    text.
+    Callable to generate anchors.

-* **`title`**:
-    Title to insert in the Table of Contents' `<div>`. Defaults to `None`.
+    Default: `markdown.extensions.headerid.slugify`

-* **`anchorlink`**:
-    Setting to `True` will cause the headers link to themselves. Default is
-    `False`.
+    In order to use a different algorithm to define the id attributes, define and
+    pass in a callable which takes the following two arguments:

-* **`permalink`**:
-    Set to `True` to have this extension generate a Sphinx-style permanent links
-    near the headers (for use with Sphinx stylesheets).
+    * `value`: The string to slugify.
+    * `separator`: The Word Separator.
+
+    The callable must return a string appropriate for use in HTML `id` attributes.
+
+* **`separator`**:
+    Word separator. Character which replaces whitespace in id.

+    Default: `-`
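
The `slugify` option documented above accepts any callable taking `value` and `separator` and returning a string suitable for an HTML `id`. A minimal sketch of such a callable follows. The name `prefixed_slugify` and the `sec` prefix are hypothetical and chosen only to show a behavior that differs from the default; the ASCII folding is modeled on the `slugify` function removed from `headerid.py` in this commit.

```python
import re
import unicodedata


def prefixed_slugify(value, separator):
    """Hypothetical slugify callable: ASCII-fold, lowercase, prefix with 'sec'."""
    # Decompose accented characters and drop anything outside ASCII.
    value = unicodedata.normalize('NFKD', value)
    value = value.encode('ascii', 'ignore').decode('ascii')
    # Remove punctuation, keeping word characters, whitespace and hyphens.
    value = re.sub(r'[^\w\s-]', '', value).strip().lower()
    # Collapse runs of whitespace (and of the separator) into one separator.
    slug = re.sub(r'[%s\s]+' % re.escape(separator), separator, value)
    # The prefix guarantees the id starts with a letter.
    return 'sec' + separator + slug


hyphenated = prefixed_slugify('Some Header', '-')
underscored = prefixed_slugify('Next  Level!', '_')
```

Such a callable would presumably be supplied through the `slugify` config option, in the same keyword style as the `TocExtension(baselevel=3)` example, with the `separator` option controlling the second argument.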

docs/release-2.6.txt

Lines changed: 35 additions & 8 deletions
@@ -96,6 +96,19 @@ Backwards-incompatible Changes
     be used instead. See the [documentation](reference.html#extension-configs)
     for a full explaination of the current behavior.

+* The [HeaderId][hid] Extension is pending deprecation and will raise a
+  **`PendingDeprecationWarning`** in version 2.6. The extension will be
+  deprecated in version 2.7 and raise an error in version 2.8. Use the
+  [Table of Contents][TOC] Extension instead, which offers most of the
+  features of the HeaderId Extension and more (support for meta data is missing).
+
+    Extension authors who have been using the `slugify` and `unique` functions
+    defined in the HeaderId Extension should note that those functions are now
+    defined in the Table of Contents extension and should adjust their import
+    statements accordingly (`from markdown.extensions.toc import slugify, unique`).
+
+[hid]: extensions/headerid.html
+
 What's New in Python-Markdown 2.6
 ---------------------------------

@@ -110,15 +123,29 @@ What's New in Python-Markdown 2.6
 [Meta-Data]: extensions/meta_data.html
 [YAML]: http://yaml.org/

-* The [TOC] Extension has been refactored. Significantly, the extension now
-  assigns the Table of Contents to the `toc` attrbibute of the Markdown class
-  regardless of whether a "marker" was found in the document. Third party
-  frameworks no longer need to insert a "marker," run the document through
-  Markdown, then extract the TOC from the document.
+* The [Table fo Contents][TOC] Extension has been refactored and some new features
+  have been added. See the documentation for a full explaination of each feature
+  listed below:
+
+    * The extension now assigns the Table of Contents to the `toc` attribute of
+      the Markdown class regardless of whether a "marker" was found in the document.
+      Third party frameworks no longer need to insert a "marker," run the document
+      through Markdown, then extract the TOC from the document.

-    Additionaly, the TOC Extension is now a "registered extension." Therefore,
-    when the `reset` method of the Markdown class is called, the `toc` attribute
-    on the Markdown class is cleared (set to an empty string).
+    * The TOC Extension is now a "registered extension." Therefore, when the `reset`
+      method of the Markdown class is called, the `toc` attribute on the Markdown
+      class is cleared (set to an empty string).
+
+    * When the `marker` config option is set to an empty string, the parser completely
+      skips the process of searching the document for markers. This should save parsing
+      time when the TOC Extension is being used only to assign ids to headers.
+
+    * A `separator` config option has been added allowing users to override the
+      separator character used by the slugify function.
+
+    * A `baselevel` config option has been added allowing users to set the base level
+      of headers in their documents (h1-h6). This allows the header levels to be
+      automatically adjusted to fit within the hierarchy of an html template.

 [TOC]: extensions/toc.html

markdown/extensions/headerid.py

Lines changed: 10 additions & 56 deletions
@@ -19,64 +19,13 @@
 from __future__ import unicode_literals
 from . import Extension
 from ..treeprocessors import Treeprocessor
-from ..util import HTML_PLACEHOLDER_RE, parseBoolValue
-import re
+from ..util import parseBoolValue
+from .toc import slugify, unique, stashedHTML2text
 import logging
-import unicodedata
+import warnings

 logger = logging.getLogger('MARKDOWN')
-
-IDCOUNT_RE = re.compile(r'^(.*)_([0-9]+)$')
-
-
-def slugify(value, separator):
-    """ Slugify a string, to make it URL friendly. """
-    value = unicodedata.normalize('NFKD', value).encode('ascii', 'ignore')
-    value = re.sub('[^\w\s-]', '', value.decode('ascii')).strip().lower()
-    return re.sub('[%s\s]+' % separator, separator, value)
-
-
-def unique(id, ids):
-    """ Ensure id is unique in set of ids. Append '_1', '_2'... if not """
-    while id in ids or not id:
-        m = IDCOUNT_RE.match(id)
-        if m:
-            id = '%s_%d' % (m.group(1), int(m.group(2))+1)
-        else:
-            id = '%s_%d' % (id, 1)
-    ids.add(id)
-    return id
-
-
-def itertext(elem):
-    """ Loop through all children and return text only.
-
-    Reimplements method of same name added to ElementTree in Python 2.7
-
-    """
-    if elem.text:
-        yield elem.text
-    for e in elem:
-        for s in itertext(e):
-            yield s
-        if e.tail:
-            yield e.tail
-
-
-def stashedHTML2text(text, md):
-    """ Extract raw HTML, reduce to plain text and swap with placeholder. """
-    def _html_sub(m):
-        """ Substitute raw html with plain text. """
-        try:
-            raw, safe = md.htmlStash.rawHtmlBlocks[int(m.group(1))]
-        except (IndexError, TypeError):
-            return m.group(0)
-        if md.safeMode and not safe:
-            return ''
-        # Strip out tags and entities - leaveing text
-        return re.sub(r'(<[^>]+>)|(&[\#a-zA-Z0-9]+;)', '', raw)
-
-    return HTML_PLACEHOLDER_RE.sub(_html_sub, text)
+logging.captureWarnings(True)


 class HeaderIdTreeprocessor(Treeprocessor):
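
The `logging.captureWarnings(True)` call added in this hunk reroutes messages issued through the `warnings` module to the standard library's `py.warnings` logger. The sketch below shows that mechanism in isolation. The in-memory handler, the message text, and the explicit filter are assumptions made so the example is self-contained; it does not import the extension.

```python
import io
import logging
import warnings

# Collect log output in memory so the effect can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
py_warnings_logger = logging.getLogger('py.warnings')
py_warnings_logger.addHandler(handler)
py_warnings_logger.setLevel(logging.WARNING)

logging.captureWarnings(True)  # warnings now go to the 'py.warnings' logger
try:
    with warnings.catch_warnings():
        # PendingDeprecationWarning is ignored by default, so enable it here.
        warnings.simplefilter('always', PendingDeprecationWarning)
        warnings.warn(
            'hypothetical feature is pending deprecation',
            PendingDeprecationWarning
        )
finally:
    # Restore normal warning display and detach the temporary handler.
    logging.captureWarnings(False)
    py_warnings_logger.removeHandler(handler)

captured = stream.getvalue()
```

Because the call in `headerid.py` runs at import time, it affects every warning in the process, not only those raised by this extension, which is worth keeping in mind when reviewing the change.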
@@ -94,7 +43,7 @@ def run(self, doc):
                     if "id" in elem.attrib:
                         id = elem.get('id')
                     else:
-                        id = stashedHTML2text(''.join(itertext(elem)), self.md)
+                        id = stashedHTML2text(''.join(elem.itertext()), self.md)
                         id = slugify(id, sep)
                     elem.set('id', unique(id, self.IDs))
                 if start_level:
@@ -127,6 +76,11 @@ def __init__(self, *args, **kwargs):

         super(HeaderIdExtension, self).__init__(*args, **kwargs)

+        warnings.warn(
+            'The HeaderId Extension is pending deprecation. Use the TOC Extension instead.',
+            PendingDeprecationWarning
+        )
+
     def extendMarkdown(self, md, md_globals):
         md.registerExtension(self)
         self.processor = HeaderIdTreeprocessor()
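
Python's default warning filters ignore `PendingDeprecationWarning`, so the warning added to `__init__` above stays silent unless a caller opts in. The following sketch shows one way to surface and inspect such a warning. The function `hypothetical_legacy_api` is a stand-in for constructing the extension and is not part of the library.

```python
import warnings


def hypothetical_legacy_api():
    """Stand-in for code that is pending deprecation (illustrative only)."""
    warnings.warn(
        'hypothetical_legacy_api is pending deprecation.',
        PendingDeprecationWarning
    )


with warnings.catch_warnings(record=True) as caught:
    # Opt in explicitly; otherwise the default filters would drop the warning.
    warnings.simplefilter('always', PendingDeprecationWarning)
    hypothetical_legacy_api()

messages = [str(w.message) for w in caught]
categories = [w.category for w in caught]
```

From the command line, running a script with `python -W default::PendingDeprecationWarning` should have a similar effect without code changes.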
