WackoWiki: WackoWiki Formatting Workflow

https://wackowiki.org/doc     Version: 4 (05/11/2026 09:17)

Comprehensive documentation for the WackoWiki formatting workflow.


1. Overview


The WackoWiki text formatting system is a multi-stage pipeline that transforms raw wiki markup into rendered HTML. Understanding the order and purpose of each stage is critical to properly extending or debugging the formatter.

The complete workflow consists of six primary stages, applied in this order:

User Input
    ↓
[1] PRE_WACKO - Macro Resolution
    ↓
[2] WACKO FORMATTER - Wiki Markup Parsing
    ↓
[3] TYPOGRAFICA - Typography Enhancement
    ↓
[4] PARAGRAFICA - Paragraph Structuring
    ↓
[5] HIGHLIGHTER - Syntax Highlighting (if applicable)
    ↓
[6] POST_WACKO - Link & Action Resolution
    ↓
Final HTML Output	
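
As a rough illustration of the diagram above, the pipeline can be pictured as an ordered list of callables, each receiving the previous stage's output. This is a sketch only: `run_pipeline()` and the stand-in closures are hypothetical and are not part of WackoWiki.

```php
<?php
// Hypothetical helper: apply each stage to the output of the previous one.
// Stage names mirror the diagram; the closures are simplified stand-ins.
function run_pipeline(string $text, array $stages, ?array &$trace = null): string
{
    $trace = [];
    foreach ($stages as $name => $stage) {
        $text = $stage($text);
        $trace[$name] = $text; // keep every intermediate result for inspection
    }
    return $text;
}

$stages = [
    'pre_wacko'   => fn(string $t): string => $t, // no macros in this input
    'wacko'       => fn(string $t): string => preg_replace('/\*\*(.+?)\*\*/', '<strong>$1</strong>', $t),
    'typografica' => fn(string $t): string => str_replace('(c)', '©', $t), // assumed rule
    'paragrafica' => fn(string $t): string => '<p>' . $t . '</p>',
    'post_wacko'  => fn(string $t): string => $t, // no link or action markers here
];

$html = run_pipeline('**Hi** (c) 2026', $stages, $trace);
echo $html, "\n";
```

Keeping the intermediate results in `$trace` makes it easy to see which stage introduced a given change.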



2. Stage 1: Pre-Wacko (Macro Resolution)


File: src/formatter/pre_wacko.php
Class: PreFormatter (src/formatter/class/pre_wacko.php)

2.1. Purpose

The Pre-Wacko stage processes user-defined macros and preserves special formatting markers before the main wiki parsing begins. This stage acts as a preprocessing layer that resolves temporal and user-specific macros.

2.2. Input

Raw text containing wiki markup with embedded macros, formatter markers, and escaped text.

2.3. Output

Text with macros expanded, but formatter markers and escaped text preserved for later stages.

2.4. Processing Details


The PreFormatter class uses regex patterns to identify and process four types of constructs:

2.4.1. 1. Formatter Text Preservation (`` ... ``)

2.4.2. 2. Formatter Text Preservation (%%...%%)

2.4.3. 3. Escaped Text (""..."")

2.4.4. 4. User Macros (Special ::sequences)

2.5. Workflow


php
// 1. Instantiate PreFormatter with current context
$parser = new PreFormatter($this);

// 2. Apply regex to find all macro/marker patterns
$text = preg_replace_callback($parser->PRE_REGEX, [&$parser, 'precallback'], $text);

// 3. precallback() processes each match:
//    - Identifies the type of construct (formatter, escaped, macro)
//    - Expands macros or preserves markers
//    - Returns the processed construct	
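
The callback pattern above can be tried in isolation with a reduced stand-in. The sketch below assumes a single macro, `::+::`, that expands to a date; the class name `PreSketch` and its regex are illustrative and are not the real `PRE_REGEX`.

```php
<?php
// Illustrative only: a tiny stand-in for the PreFormatter callback pattern.
final class PreSketch
{
    // Preserved regions are listed before the macro, so a macro inside
    // %%...%%, ``...`` or ""..."" is consumed as part of the region.
    public const PRE_REGEX = '/(%%.*?%%|``.*?``|"".*?""|::\+::)/s';

    private string $date;

    public function __construct(string $date)
    {
        $this->date = $date;
    }

    public function precallback(array $m): string
    {
        if ($m[1] === '::+::') {
            return $this->date; // expand the (assumed) date macro
        }
        return $m[1];           // formatter and escaped text pass through untouched
    }
}

$parser = new PreSketch('2026-05-11');
$out = preg_replace_callback(
    PreSketch::PRE_REGEX,
    [$parser, 'precallback'],
    'Signed ::+:: but not %%::+::%%'
);
echo $out, "\n";
```

The second `::+::` survives because the whole `%%...%%` region matches first and is returned unchanged.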

2.6. Key Points



3. Stage 2: Wacko Formatter (Wiki Markup Parsing)


File: src/formatter/wiki.php
Class: WackoFormatter (src/formatter/class/wackoformatter.php)

3.1. Purpose

The main formatting engine that parses WackoWiki markup syntax and converts it to intermediate HTML. This stage handles:

3.2. Input

Text from Pre-Wacko stage (with macros expanded and markers preserved).

3.3. Output

Intermediate HTML with:

3.4. Processing Details


The WackoFormatter class is the most complex component. It:
  1. Tokenizes wiki markup syntax
  2. Builds an AST (Abstract Syntax Tree) of the document structure
  3. Traverses the tree to generate HTML
  4. Handles nesting of elements (lists within lists, emphasis within links, etc.)
  5. Preserves marked sections (formatter text, escaped text, code blocks)

3.5. Key Features

3.6. Example Transformations


**bold** → <strong>bold</strong>
//italic// → <em>italic</em>
= Header = → <h1>Header</h1>
[[link]] → <!--link:begin-->link==link<!--link:end-->
{{action}} → <!--action:begin-->action<!--action:end-->	
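
The mappings listed above can be reproduced with a handful of ordered regex rules. This is not how WackoFormatter works internally (it is described above as a tokenizer plus tree traversal); `wacko_sketch()` is a hypothetical helper that only mirrors the listed input/output pairs.

```php
<?php
// Hypothetical sketch: ordered regex rules that mimic the example transformations.
function wacko_sketch(string $text): string
{
    $rules = [
        '/\*\*(.+?)\*\*/'       => '<strong>$1</strong>',
        '/(?<!:)\/\/(.+?)\/\//' => '<em>$1</em>',   // lookbehind skips the // in http://
        '/^= (.+?) =$/m'        => '<h1>$1</h1>',
        '/\[\[(.+?)\]\]/'       => '<!--link:begin-->$1==$1<!--link:end-->',
        '/\{\{(.+?)\}\}/'       => '<!--action:begin-->$1<!--action:end-->',
    ];
    // preg_replace applies the patterns in array order.
    return preg_replace(array_keys($rules), array_values($rules), $text);
}

echo wacko_sketch('**bold** and //italic// with [[link]]'), "\n";
```

Links and actions are only wrapped in markers at this point; they are resolved later by the Post-Wacko stage.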

3.7. Workflow


php
// In wiki.php
$text = $this->format($text, 'wacko');

// This calls WackoFormatter which:
// 1. Parses entire wiki markup into HTML
// 2. Wraps links and actions in special markers
// 3. Preserves typography-sensitive sections
// 4. Returns intermediate HTML ready for next stage	

3.8. Key Points



4. Stage 3: Typografica (Typography Enhancement)


File: src/formatter/class/typografica.php
Class: Typografica

4.1. Purpose

Enhance typography and readability of text within HTML. This stage processes already-parsed HTML to apply language-specific typography rules and improve spacing/punctuation.

4.2. Input

Intermediate HTML from WackoFormatter stage.

4.3. Output

HTML with typography improvements applied:

4.4. Processing Details


The Typografica::correct() method applies 10 transformation phases:

4.4.1. Phase -2: Preserve Ignored Regions

4.4.2. Phase -1: HTML Tag Stripping (Optional)

4.4.3. Phase 0: HTML Tag Preservation

4.4.4. Phase 1: Spacing Corrections

4.4.5. Phase 2: Special Character Replacements

Depending on settings:


Pattern           Replacement     Code Point
" (inches)        "               U+0022
' (apostrophe)    ’               U+2019 (configurable)
"..." (English)   “...”           U+201C/U+201D
"..." (angle)     «...»           U+00AB/U+00BB
(en-dash)         –               U+2013
(em-dash)         —               U+2014
©                 ©               U+00A9
®                 ®               U+00AE
(trademark)       ™               U+2122
§                 §               U+00A7
±                 ±               U+00B1
^C / ^F / ^K      °C / °F / °K    U+00B0

4.4.6. Phase 3: Short Word Spacing

4.4.7. Phase 4: Hyphenated Word Wrapping

4.4.8. Phase 5: Macro Processing

4.4.9. Phase ∞: Tag Restoration

4.4.10. Phase ∞+1: Ignored Region Restoration

4.5. Workflow


php
$typo = new Typografica($this, $options);
$text = $typo->correct($text);	
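
The phase list in 4.4 brackets the text rules with protect and restore steps, so that replacements never touch tag attributes. A minimal sketch of that idea follows; `typo_sketch()`, its placeholder bytes, and its three rules are assumptions for illustration and are not the Typografica implementation.

```php
<?php
// Hypothetical sketch: protect tags, apply text-only rules, restore tags.
function typo_sketch(string $html): string
{
    $saved = [];

    // Phase 0 analogue: swap every tag for a numbered placeholder.
    $text = preg_replace_callback('/<[^>]*>/', function (array $m) use (&$saved): string {
        $saved[] = $m[0];
        return "\x01" . (count($saved) - 1) . "\x02";
    }, $html);

    // Phase 1 and 2 analogues: a few text-only rules.
    $text = preg_replace('/ {2,}/', ' ', $text);          // collapse runs of spaces
    $text = preg_replace('/"([^"]*)"/u', '“$1”', $text);   // straight to curly quotes
    $text = preg_replace('/\^([CFK])\b/', '°$1', $text);   // ^C becomes °C

    // Final phase analogue: put the original tags back.
    return preg_replace_callback(
        '/\x01(\d+)\x02/',
        fn(array $m): string => $saved[(int) $m[1]],
        $text
    );
}

echo typo_sketch('<a href="x">"quoted"  text</a> at 20^C'), "\n";
```

The quotes inside `href="x"` are left alone because the tag is hidden behind a placeholder while the quote rule runs.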

4.6. Key Points



5. Stage 4: Paragrafica (Paragraph Structuring)


File: src/formatter/paragrafica.php
Class: Paragrafica (src/formatter/class/paragrafica.php)

5.1. Purpose

Insert semantic <p> tags around text blocks that lack explicit block-level markup. Converts loose text and <br> sequences into proper paragraphs while respecting already-structured block elements.

5.2. Input

HTML from Typografica stage (with proper typography but possibly lacking paragraph tags).

5.3. Output

HTML with:

5.4. Processing Details


The Paragrafica::correct() method uses a sophisticated “terminator” system with special markers:

5.4.1. Marker System


Marker   Purpose
<:t->    Start paragraph zone (left terminator)
<:-t>    End paragraph zone (right terminator)
<:::>    Wronginator: indicates problematic nesting (table cells, list items)
<:-:>    Ultimate wronginator: never insert paragraphs here

5.4.2. Processing Steps


Step -2: Preserve Ignored Regions

Step -1: Remove Prefix

Step 1: Insert Terminators

Step 2: Clean Up Whitespace

Step 3: Generate Paragraph Tags

5.4.3. Example Transformation


html
Before:
Text without paragraph
<table>
  <tr><td>In table</td></tr>
</table>
More text

After:
<p id="p12345-1" class="auto">Text without paragraph</p>
<table>
  <tr><td>In table</td></tr>
</table>
<p id="p12345-2" class="auto">More text</p>	
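
The before/after pair above can be reproduced with a much simpler approach than the terminator system: split the text around block elements and wrap whatever is left. The sketch below does that for `<table>` only; `para_sketch()` and its `$page_id` parameter are hypothetical and do not use the `<:t->` style markers.

```php
<?php
// Hypothetical sketch: wrap loose text in auto-numbered <p> tags,
// leaving <table> blocks (a stand-in for all block-level elements) untouched.
function para_sketch(string $html, string $page_id): string
{
    // The capturing group makes preg_split keep the table blocks in the result.
    $parts = preg_split('/(<table\b.*?<\/table>)/s', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    $n   = 0;
    $out = [];
    foreach ($parts as $part) {
        $trimmed = trim($part);
        if ($trimmed === '') {
            continue;                 // skip whitespace between blocks
        }
        if (strpos($trimmed, '<table') === 0) {
            $out[] = $trimmed;        // block element: emit as-is
            continue;
        }
        $n++;
        $out[] = sprintf('<p id="p%s-%d" class="auto">%s</p>', $page_id, $n, $trimmed);
    }
    return implode("\n", $out);
}

$before = "Text without paragraph\n<table>\n  <tr><td>In table</td></tr>\n</table>\nMore text";
echo para_sketch($before, '12345'), "\n";
```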

5.5. TOC (Table of Contents) Extraction


After paragraph insertion, Paragrafica builds a TOC by:
  1. Finding all <h1>...<h6> headers with IDs 
  2. Extracting header depth from tag name
  3. Finding all auto-generated <p> tags with IDs 
  4. Identifying included pages via {{include page="..."}} actions
  5. Storing in $para->toc array for later use
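
Steps 1 and 2 above amount to one regex pass over the HTML. The sketch below shows that part only; `toc_sketch()` and the shape of the returned entries are assumptions, and the real `$para->toc` array may be structured differently.

```php
<?php
// Hypothetical sketch: collect headers that carry an id, with their depth.
function toc_sketch(string $html): array
{
    $toc = [];
    // \1 ensures the closing tag matches the opening level.
    preg_match_all(
        '/<h([1-6])\s+id="([^"]+)"[^>]*>(.*?)<\/h\1>/s',
        $html,
        $matches,
        PREG_SET_ORDER
    );
    foreach ($matches as $m) {
        $toc[] = [
            'id'    => $m[2],
            'depth' => (int) $m[1],                 // depth taken from the tag name
            'title' => trim(strip_tags($m[3])),     // drop the self-link anchor
        ];
    }
    return $toc;
}

$html = '<h2 id="h12345-1" class="heading">My Header<a class="self-link" href="#h12345-1"></a></h2>'
      . '<h3 id="h12345-2">Sub</h3>';
print_r(toc_sketch($html));
```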

5.6. Workflow


php
$para = new Paragrafica($this);
$result = $para->correct($text);
$this->set_toc_array($para->toc);  // Store TOC for later	

5.7. Key Points



6. Stage 5: Highlighter (Syntax Highlighting)


File: src/formatter/class/highlighter.php (if applicable based on configuration)
Purpose: Apply syntax highlighting to code blocks

6.1. Note

The highlighter stage is optional and depends on:

6.2. Input

HTML with <pre> or <code> blocks marked for highlighting.

6.3. Output

HTML with syntax-highlighted code blocks (language-specific color and markup).


7. Stage 6: Post-Wacko (Link & Action Resolution)


File: src/formatter/post_wacko.php
Class: PostWacko (src/formatter/class/post_wacko.php)

7.1. Purpose

Process dynamically-deferred links and actions. By the time we reach this stage, the document is structurally complete; now we resolve references that depend on:

7.2. Input

HTML from previous stages containing:

7.3. Output

Final HTML with:

7.4. Processing Details


The PostWacko::postcallback() method handles three types of constructs (forced links, image links, and actions); marker stripping is an optional fourth step:

7.4.1. 1. Forced Links

Marker Format: <!--link:begin-->URL==DESCRIPTION<!--link:end-->

Processing:

Example:
<!--link:begin-->http://example.com==Click here<!--link:end-->
↓
<a href="http://example.com">Click here</a>	
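
A minimal sketch of the marker-to-anchor rewrite shown in the example, assuming an absolute URL on the left of `==`. The function `resolve_links_sketch()` is hypothetical; the real stage also has to resolve wiki page names, which this does not attempt.

```php
<?php
// Hypothetical sketch: turn forced-link markers into plain anchors.
function resolve_links_sketch(string $html): string
{
    return preg_replace_callback(
        '/<!--link:begin-->(.*?)==(.*?)<!--link:end-->/s',
        function (array $m): string {
            $url  = htmlspecialchars($m[1], ENT_QUOTES, 'UTF-8'); // safe inside an attribute
            $text = $m[2];                                        // already HTML from earlier stages
            return '<a href="' . $url . '">' . $text . '</a>';
        },
        $html
    );
}

echo resolve_links_sketch('<!--link:begin-->http://example.com==Click here<!--link:end-->'), "\n";
```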

7.4.2. 2. Image Links

Marker Format: <!--imglink:begin-->LINK_TARGET==IMAGE_SOURCE<!--imglink:end-->

Processing:

Example:
<!--imglink:begin-->page.html==file:image.jpg<!--imglink:end-->
↓
<a href="page.html"><img src="/file/image.jpg" /></a>	

7.4.3. 3. Actions

Marker Format: <!--action:begin-->ACTION_NAME param1="value1" param2=value2<!--action:end-->

Processing:

Parameter Parsing:
{{include page="MyPage" notoc=1 limit=5}}
↓
$action = 'include'
$params = [
  0 => 'MyPage',       // positional
  'page' => 'MyPage',  // named
  1 => 1,              // positional
  'notoc' => 1,        // named
  2 => 5,              // positional
  'limit' => 5         // named
]	
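
The parameter layout above (every value stored once by position and once by name) can be produced with a short parser. The sketch below is an assumption about how such parsing might look; `parse_action_sketch()` is a hypothetical helper that takes the text between the action markers.

```php
<?php
// Hypothetical sketch: split an action body into its name and parameters.
function parse_action_sketch(string $body): array
{
    $parts  = preg_split('/\s+/', trim($body), 2);
    $action = $parts[0];
    $params = [];

    if (isset($parts[1])) {
        // name="quoted value" or name=bare_value
        preg_match_all('/(\w+)=(?:"([^"]*)"|(\S+))/', $parts[1], $matches, PREG_SET_ORDER);
        foreach ($matches as $m) {
            $value = $m[3] ?? $m[2];       // group 3 exists only for bare values
            if (ctype_digit($value)) {
                $value = (int) $value;     // notoc=1 becomes integer 1
            }
            $params[]      = $value;       // positional copy
            $params[$m[1]] = $value;       // named copy
        }
    }
    return [$action, $params];
}

[$action, $params] = parse_action_sketch('include page="MyPage" notoc=1 limit=5');
print_r($params);
```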

7.4.4. 4. Formatter Marker Stripping (Optional)

If $options['strip_marker'] is true:

7.5. Workflow


php
// In wiki.php, if post_wacko processing requested
if (isset($options['post_wacko']))
{
    $options['strip_marker'] = true;
    include Ut::join_path(FORMATTER_DIR, 'post_wacko.php');
}

// In post_wacko.php
$parser = new PostWacko($this, $options);
$text = preg_replace_callback(
    '/(<!--link:begin-->...<!--link:end-->|' .
    '<!--imglink:begin-->...<!--imglink:end-->|' .
    '<!--action:begin-->...<!--action:end-->)/usm',
    [&$parser, 'postcallback'],
    $text
);

// Optionally strip temporary markers
if ($options['strip_marker'])
{
    $text = str_replace(['<!--noinclude-->', '<!--/noinclude-->', ...], '', $text);
}	

7.6. Key Points



8. Complete Flow Example


Let's trace a simple example through the entire pipeline:

8.1. Input

== My Header ==

This is a **bold** paragraph with [[a link]].

{{include page="Template"}}	

8.2. After Pre-Wacko

== My Header ==

This is a **bold** paragraph with [[a link]].

{{include page="Template"}}	

(unchanged; no macros or special markers)

8.3. After Wacko Formatter

<h2 id="h12345-1" class="heading">My Header<a class="self-link" href="#h12345-1"></a></h2>

This is a <strong>bold</strong> paragraph with <!--link:begin-->a link==a link<!--link:end-->.

<!--action:begin-->include page="Template"<!--action:end-->	

8.4. After Typografica

<h2 id="h12345-1" class="heading">My Header<a class="self-link" href="#h12345-1"></a></h2>

This is a <strong>bold</strong> paragraph with <!--link:begin-->a link==a link<!--link:end-->.

<!--action:begin-->include page="Template"<!--action:end-->	

(unchanged; no special typography rules triggered)

8.5. After Paragrafica

<h2 id="h12345-1" class="heading">My Header<a class="self-link" href="#h12345-1"></a></h2>

<p id="p12345-1" class="auto">This is a <strong>bold</strong> paragraph with <!--link:begin-->a link==a link<!--link:end-->.</p>

<!--action:begin-->include page="Template"<!--action:end-->	

8.6. After Post-Wacko

<h2 id="h12345-1" class="heading">My Header<a class="self-link" href="#h12345-1"></a></h2>

<p id="p12345-1" class="auto">This is a <strong>bold</strong> paragraph with <a href="/a link" class="link">a link</a>.</p>

<div class="template-content">
  <!-- included template content here -->
</div>	



9. Architecture Decisions & Trade-offs

9.1. Multi-Stage Design Benefits

  1. Separation of Concerns: Each stage has a specific responsibility
  2. Debuggability: Output of each stage can be inspected independently
  3. Extensibility: New stages can be inserted or existing ones modified
  4. Reusability: Stages can potentially be used in different contexts
  5. Performance: Some stages can be cached independently

9.2. Marker-Based Approach

9.3. Preserved Sections Strategy



10. Extending the Formatter

10.1. Adding a Custom Formatter Stage

  1. Create a new class following the pattern: class MyFormatter { public function process(&$wacko, $text) {...} }
  2. Insert in the pipeline after appropriate stage (consider data dependencies)
  3. Handle markers – respect existing formatter markers from previous stages
  4. Add to handler – modify the show handler or renderer to call your stage
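
A sketch of steps 1 and 3, following the class pattern given in step 1. The transformation itself (highlighting the word TODO) is invented for illustration, and the marker regex only covers the three marker types described in this document.

```php
<?php
// Hypothetical custom stage: highlight "TODO" in text, but never inside
// link, image-link, or action markers left by earlier stages.
class MyFormatter
{
    public function process(&$wacko, $text)
    {
        // With DELIM_CAPTURE, even indexes are plain text and odd indexes are markers.
        $parts = preg_split(
            '/(<!--(?:link|imglink|action):begin-->.*?<!--(?:link|imglink|action):end-->)/s',
            $text,
            -1,
            PREG_SPLIT_DELIM_CAPTURE
        );

        foreach ($parts as $i => $part) {
            if ($i % 2 === 0) {
                $parts[$i] = preg_replace('/\bTODO\b/', '<mark>TODO</mark>', $part);
            }
        }
        return implode('', $parts);
    }
}

$wacko = null; // stand-in for the engine object, unused in this sketch
$stage = new MyFormatter();
$out   = $stage->process($wacko, 'TODO see <!--link:begin-->TODO==TODO list<!--link:end-->');
echo $out, "\n";
```

A stage like this would need to run before Post-Wacko, since the markers it relies on are gone once links and actions have been resolved.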

10.2. Modifying Wiki Syntax

10.3. Adding Typography Rules



11. Performance Considerations

  1. Regex Complexity: Most stages are regex-heavy; optimize patterns if processing large documents
  2. Caching: Consider caching formatter output if documents are static
  3. Stage Skipping: If not using certain features (actions, syntax highlighting), skip those stages
  4. Memory: Large documents with many markers may consume significant memory
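
For point 2, one simple approach is to key the cache on a hash of the source text, so an unchanged document is formatted only once. The sketch below uses an in-memory array; `format_cached()` is a hypothetical helper, and output that depends on the current user or time would need more than the text in its key.

```php
<?php
// Hypothetical sketch: memoize formatter output by a hash of the input text.
function format_cached(string $text, callable $format, array &$cache): string
{
    $key = hash('sha256', $text);
    if (!isset($cache[$key])) {
        $cache[$key] = $format($text); // only runs on a cache miss
    }
    return $cache[$key];
}

$calls  = 0;
$cache  = [];
$format = function (string $t) use (&$calls): string {
    $calls++;                          // count how often the formatter really runs
    return '<p>' . $t . '</p>';
};

$first  = format_cached('same text', $format, $cache);
$second = format_cached('same text', $format, $cache);
echo $calls, "\n";
```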


12. Debugging Tips

  1. Enable Stage Output: Echo output after each stage to see transformations
  2. Check Marker Integrity: Ensure markers remain balanced and properly nested
  3. Test Individual Stages: Create test input at each stage's interface
  4. Review Regex: Use online regex testers to verify complex patterns
  5. Trace Callback Functions: These are often where logic errors occur
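
Tip 2 can be partly automated by counting begin and end markers after each stage. The sketch below checks balance only, not nesting; `check_markers_sketch()` is a hypothetical helper covering the three marker types described in this document.

```php
<?php
// Hypothetical sketch: report marker types whose begin/end counts differ.
function check_markers_sketch(string $html): array
{
    $problems = [];
    foreach (['link', 'imglink', 'action'] as $kind) {
        $open  = substr_count($html, "<!--{$kind}:begin-->");
        $close = substr_count($html, "<!--{$kind}:end-->");
        if ($open !== $close) {
            $problems[] = "{$kind}: {$open} begin vs {$close} end";
        }
    }
    return $problems;
}

print_r(check_markers_sketch('<!--action:begin-->include page="Template"'));
```

An empty result means every begin marker has a matching end marker of the same type.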


13. Summary Table


Stage            Input               Output                   Purpose                 Key Class
1. Pre-Wacko     Raw text + macros   Text + expanded macros   Resolve user macros     PreFormatter
2. Wacko         Text + markup       Intermediate HTML        Parse wiki syntax       WackoFormatter
3. Typografica   Intermediate HTML   Typographic HTML         Typography rules        Typografica
4. Paragrafica   Typographic HTML    Structured HTML          Add <p> tags + TOC      Paragrafica
5. Highlighter   Structured HTML     Highlighted HTML         Syntax highlighting     (Optional)
6. Post-Wacko    Marked HTML         Final HTML               Resolve links/actions   PostWacko