Typografica Revisited
see Project Typografica[link1]
In its current form Typografica focuses only on Latin and Cyrillic. With Unicode support it is turned off by default because it has no rule sets for non-Latin languages.
Rule sets for non-Latin languages can be added and applied in the corresponding context.
1. Typografica
- static: parsed once when saving a page (body_r) or file description
- when used in a handler or action like files
Transforms " and &quot; quotes into either English quotes or angle quotes when enabled.
However, which quotes are chosen should be determined by the provided context, which can be the
- text language
- page language
- user language
Elar2000[link2]:
It took me some time to read about quotes in various languages: there are not only Latin and Cyrillic quotes but many quote types that are the default for various keyboard layouts:
https://en.wikipedia.org/wiki/Quotation_mark
https://en.wikipedia.org/wiki/Keyboard_layout
Additional options for Typografica may include moving native quote symbols (and maybe some other native symbols) to lang.XX.php variables for certain languages to ensure proper typography in various languages.
At the same time, the regexps in Typografica should evolve to work with the lang.XX.php variables. This should be done carefully, taking performance into consideration.
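The suggestion above can be sketched roughly as follows. This is only an illustration: the array layout, the keys quote_open and quote_close, and both function names are hypothetical and do not exist in WackoWiki. The pair is looked up once per call and used in a single pass, which keeps the performance cost small.

```php
<?php
// Hypothetical per-language quote pairs, as they might be defined
// in lang.XX.php files. The keys are illustrative only.
$quoteSymbols = [
    'en' => ['quote_open' => '“', 'quote_close' => '”'],
    'de' => ['quote_open' => '„', 'quote_close' => '“'],
    'ru' => ['quote_open' => '«', 'quote_close' => '»'],
];

// Return the quote pair for a language, or null if none is defined.
function quote_pair(array $symbols, string $lang): ?array
{
    return $symbols[$lang] ?? null;
}

// Replace straight-quoted runs with the given pair.
// Without a pair the text is returned unchanged.
function apply_quotes(string $text, ?array $pair): string
{
    if ($pair === null) {
        return $text;
    }

    $result = preg_replace_callback(
        '/"([^"<>]+)"/u', // quoted run without inner quotes or tags
        function (array $m) use ($pair): string {
            return $pair['quote_open'] . $m[1] . $pair['quote_close'];
        },
        $text
    );

    return $result ?? $text; // fall back to the input on a regex error
}
```

With this sketch, apply_quotes('say "hallo" now', quote_pair($quoteSymbols, 'de')) gives the German „hallo“ form, while a language without an entry leaves the straight quotes alone.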
Turn off English quotes for non-English Latin pages (e.g. Spanish, German):
diff --git a/src/formatter/class/typografica.php b/src/formatter/class/typografica.php
index beeb383..ce607bb 100644
--- a/src/formatter/class/typografica.php
+++ b/src/formatter/class/typografica.php
@@ -222,7 +222,7 @@
         }
         // 1. English quotes (\p{Latin} only)
-        if ($this->settings['quotes'])
+        if ($this->settings['quotes'] && $this->options['lang'] == 'en')
         {
             $data = str_replace('""', '""', $data);
             $data = str_replace('"."', '"."', $data);
2. Quotes
Character | Name | Code point | Example |
---|---|---|---|
English quotes | | | |
“ | left double quotation mark | U+201C | “simple data” |
” | right double quotation mark | U+201D | |
„ | double low quotation mark | U+201E | „simple data‟ |
‟ | double high reversed quotation mark | U+201F | |
angle quotes | | | |
« | left guillemet | U+00AB | «simple data» |
» | right guillemet | U+00BB | |
- English quotes: \p{Latin} and English only
- angle quotes: \p{Cyrillic} and \p{Latin} only?
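One possible reading of the two rules above, as a minimal sketch. It assumes angle quotes for any quoted text containing Cyrillic and English quotes for Latin text only when the language is 'en'. Whether Latin text should also get angle quotes is left open here, as in the list. The function names are hypothetical.

```php
<?php
// Decide the quote style from the script of the quoted text and the language.
function quote_style_for(string $quoted, string $lang): string
{
    if (preg_match('/\p{Cyrillic}/u', $quoted) === 1) {
        return 'angle';
    }

    if ($lang === 'en' && preg_match('/\p{Latin}/u', $quoted) === 1) {
        return 'english';
    }

    return 'none'; // leave the straight quotes untouched
}

// Convert each straight-quoted run according to its own style.
function convert_quotes(string $text, string $lang): string
{
    $result = preg_replace_callback(
        '/"([^"<>]+)"/u',
        function (array $m) use ($lang): string {
            switch (quote_style_for($m[1], $lang)) {
                case 'angle':
                    return '«' . $m[1] . '»';
                case 'english':
                    return '“' . $m[1] . '”';
                default:
                    return $m[0];
            }
        },
        $text
    );

    return $result ?? $text; // fall back to the input on a regex error
}
```

Deciding per quoted run rather than per page lets a mixed line such as the first test string in the next section get both styles.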
3. Tests
one "english" two "русский" three
"english"
"русский"
<text>
(123) 567-890
- dash
-- long dash
(c)
(r)
(tm)
(p)
+-
^C
^F
4. Settings
- pass lang as option -> page_lang -> user_lang (English quotes, etc.)
- %%(notypo) %% wrapper -> page setting to turn it off (?)
- (c), (r), (p) are rather problematic because this syntax may conflict with inline counters or lists.
Suggestion: disable all three options by default.
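If the arrows in the lang item are read as a fallback order (explicit option first, then page language, then user language), the lookup could be sketched like this. That reading is an assumption, and resolve_typo_lang is a hypothetical helper, not an existing WackoWiki function.

```php
<?php
// Pick the language handed to Typografica: the first non-empty value wins.
function resolve_typo_lang(?string $optionLang, ?string $pageLang, ?string $userLang): ?string
{
    foreach ([$optionLang, $pageLang, $userLang] as $lang) {
        if ($lang !== null && $lang !== '') {
            return $lang;
        }
    }

    // No context available: language-specific rules would stay off.
    return null;
}
```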
4.1. Defaults
dashglue is turned off by default (see formatter/typografica.php)
public array $settings = [
    'inches'    => 1, // convert inches into "
    'apostroph' => 0, // apostrophe converter
    'laquo'     => 1, // angle quotes
    'quotes'    => 1, // English quotes
    'dash'      => 1, // (150) - middle dash
    'emdash'    => 1, // (151) - long dash by two minus
    '(c)'       => 1, // special characters, as you know
    '(r)'       => 1,
    '(tm)'      => 1,
    '(p)'       => 1,
    '+-'        => 1,
    'degrees'   => 1, // degree character
    '[--]'      => 1, // indents like $Indent*
    'dashglue'  => 1, // dash glue
    'wordglue'  => 1, // word glue
    'spacing'   => 1, // comma and spacing, exchange
    'phones'    => 1, // phone number processing
    'html'      => 0  // HTML tags ban
];
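A caller could switch off individual rules by layering overrides on top of such defaults. The sketch below uses a shortened copy of the array and a hypothetical helper typo_settings. How formatter/typografica.php actually applies its overrides is not shown in this page, so this is only one way to do it.

```php
<?php
// Merge overrides into the defaults, ignoring keys that are not known settings.
function typo_settings(array $defaults, array $overrides): array
{
    $known = array_intersect_key($overrides, $defaults);

    return array_replace($defaults, $known);
}

// Shortened defaults, for illustration only.
$defaults = [
    '(c)'      => 1,
    '(r)'      => 1,
    '(tm)'     => 1,
    '(p)'      => 1,
    'dashglue' => 1,
];

// Disable the three conflicting options and dashglue.
$settings = typo_settings($defaults, [
    '(c)'      => 0,
    '(r)'      => 0,
    '(p)'      => 0,
    'dashglue' => 0,
    'unknown'  => 1, // dropped: not a known setting
]);
```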
5. Special symbols
The main function is auto-correction: typed sequences are replaced with the proper typographic characters. Note the handling of angle quotes (guillemets) and of minus signs: a minus is converted to a dash only if it is surrounded by delimiters (spaces, line breaks, tabs).
Input | Output | HTML |
---|---|---|
"русский" | «русский» | « » |
"english" | “english” | “ ” |
- (minus) | – | – en dash |
-- (two minuses) | — | — em dash |
(c) | © | © |
(r) | ® | <sup>®</sup> |
(p) | § | § |
+- | ± | ± |
^C | °C | °C |
^F | °F | °F |
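The delimiter rule for minus signs and a few of the replacements from the table can be sketched as below. This is a simplified illustration, not the Typografica code: it emits literal characters rather than HTML markup, and the function name is hypothetical.

```php
<?php
// Replace a few typed sequences with typographic characters.
function replace_special(string $text): string
{
    // Two minuses first, so they are not treated as two single minuses.
    // (?<!\S) and (?!\S) require whitespace or a string boundary on each side.
    $text = preg_replace('/(?<!\S)--(?!\S)/u', '—', $text) ?? $text;
    $text = preg_replace('/(?<!\S)-(?!\S)/u', '–', $text) ?? $text;

    // Plain substitutions; strtr() does not rescan replaced text.
    return strtr($text, [
        '(c)' => '©',
        '(r)' => '®',
        '(p)' => '§',
        '+-'  => '±',
        '^C'  => '°C',
        '^F'  => '°F',
    ]);
}
```

A hyphen inside a word such as well-known is left alone because it has no whitespace around it.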
6. Macros
Paragraph indent ("red line")
Small indent: <-> (inserts /z.gif with a length of 25 pixels).
Large indent: <-> (inserts /z.gif with a length of 50 pixels).
7. Heuristics
Line breaks
When enabled, this heuristic replaces line feeds with <br>. It can be limited to replacing only double line feeds (as an alternative to paragraphs).
Prepositions and &nbsp;
A bold heuristic which assumes that short words (1-3 characters) should not be separated from the word that follows them. It therefore replaces the separator after a short word with a non-breaking space.
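A minimal sketch of this heuristic, assuming the separator is a single space and that a "short word" means one to three letters. The function name is hypothetical and the real rule may differ in detail.

```php
<?php
// Glue words of 1-3 letters to the following word with a non-breaking space.
function glue_short_words(string $text): string
{
    $result = preg_replace(
        // a run of 1-3 letters that starts a word, then one space, then more text
        '/(?<![\p{L}\p{N}])(\p{L}{1,3}) (?=\S)/u',
        '${1}' . "\u{00A0}",
        $text
    );

    return $result ?? $text; // fall back to the input on a regex error
}
```

Called on 'walk in the park', the sketch leaves the space after the four-letter word and glues "in" and "the" to the words after them.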
Dashes and &nbsp;
Wraps all words joined by dashes in <nobr> tags.
Commas and spaces
Compensates for author carelessness: removes stray spaces before commas and periods.
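As a rough sketch, this rule can be expressed with a single regular expression. The function name is hypothetical. A real rule set would need exceptions, for example for a leading dot in a file name such as " .htaccess".

```php
<?php
// Remove spaces and tabs that directly precede a comma or a period.
function fix_punctuation_spacing(string $text): string
{
    return preg_replace('/[ \t]+([,.])/u', '$1', $text) ?? $text;
}
```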
Prevent HTML tags
Prevents the use of HTML tags and special sequences &xxxx; in the text.

- [link1] https://wackowiki.org/doc/Dev/Projects/Typografica
- [link2] https://wackowiki.org/doc/Users?profile=Elar2000