Customizing page addresses

I would like to ask more experienced users: is it possible to use transliteration from Cyrillic to Latin in the addresses of WackoWiki pages? Or is there a way to give a page an address that differs from the page name? I am using version 6.0.10 (Unicode). Thank you in advance.

Comments

  1. Re: Customizing page addresses

    Transliteration formerly served two purposes, one of which has been dropped with the switch to UTF-8.

    transliterator_transliterate(
    	"Any-Latin;
    	Latin-ASCII;
    	[\u0100-\u7fff] remove;
    	Lower()",
    	$text);

    http://userguide.icu-project.org/transforms/general

    It is not suitable as a supertag because the result is not unique:
    /わびさび
    /WabiSabi
    Both transliterate to the same result: wabisabi

    It is not suited for validation, but it can be used for normalization or sanitization of filenames, usernames, search terms or slugs. Therefore the transliteration of the supertag and the page tag has been removed, because the page tag must be unique.
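
    A minimal sketch of that collision, for illustration only (not WackoWiki code): the helper names slugify() and find_collisions() and the sample tags are assumptions, and the script needs PHP's intl extension. It groups page tags by their transliterated form and keeps the groups where several tags end up with the same result.

    ```php
    <?php
    // Illustration only; requires the intl extension (ICU).
    // slugify() and find_collisions() are hypothetical helpers, not WackoWiki functions.
    function slugify(string $text): string
    {
    	$slug = transliterator_transliterate(
    		'Any-Latin; Latin-ASCII; [\u0100-\u7fff] remove; Lower()',
    		$text
    	);

    	// transliterator_transliterate() returns false on failure
    	return $slug === false ? '' : $slug;
    }

    // Group page tags by their transliterated form to expose collisions.
    function find_collisions(array $tags): array
    {
    	$groups = [];

    	foreach ($tags as $tag)
    	{
    		$groups[slugify($tag)][] = $tag;
    	}

    	// keep only the transliterations shared by more than one tag
    	return array_filter($groups, fn($group) => count($group) > 1);
    }

    print_r(find_collisions(['わびさび', 'WabiSabi', 'Начало']));
    ```

    Any group that is printed contains page tags that could no longer be told apart if the transliteration were used as the unique key.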

    What worked within a specific charset for a language no longer works with Unicode.

    For old backlinks you can create redirects in your .htaccess file or add pages with redirects. Furthermore, the page tag is now also accent- and case-sensitive.
    • WikiAdmin
    • 12.08.2020 11:38 edited
  2. Re: Customizing page addresses

    The reasons for dropping transliteration when switching to Unicode are clear to me.
    I just created a new site and no redirection to the old pages is needed.
    But when copying the link
    https://example.com/Начало (rus)
    I get
    https://example.com/%D0%9D%D0%B0%D1%87%D0%B0%D0%BB%D0%B
    and it's awful!
    Even this view would suit me
    https://example.com/view.php?f=33&t=13270
    or
    https://example.com/Nachalo
    but I don't know how to implement it. : ((
    • megaego
    • 12.08.2020 14:05 edited
  3. Re: Customizing page addresses

    A. URL in the address bar of the browser
    There is a solution, but each browser may use different default settings regarding the IRI format.

    In Firefox you can set browser.urlbar.decodeURLsOnCopy to true via about:config, so that a copied URL is not
    https://ru.wikipedia.org/wiki/[...]81%D0%B0%D0%B1%D0%B8
    but
    https://ru.wikipedia.org/wiki/Ваби-саби
    If you copy a URL from the Firefox address bar but leave out the first letter, Firefox also copies the string as is, e.g. ttp://example.com/Начало
    Opera and Vivaldi, for example, use the IRI format by default.

    Chrome Extension
    Copy Unicode URLs

    B. HTML links
    However, the mini_href() function in WackoWiki percent-encodes all URIs to ensure backwards compatibility with systems that do not support the IRI format.
    // percent-encode the non-ASCII bytes (rfc3986)
    if ($encode)
    {
    	$tag = str_replace(['%2F', '%3F', '%3D'], ['/', '?', '='], rawurlencode($tag));
    }

    rawurlencode
    So you can try changing the default of the $encode parameter in the href() function from true to false. This needs more feedback and evaluation; maybe we can add a config option to partially disable the percent-encoding. Some functions strictly require the RFC 3986 compliant format. HTTP does not allow IRIs, only URIs. IRIs are a generalization of URIs that allows Unicode characters.

    Format  Example
    URI     https://ru.wikipedia.org/wiki/%D0%92%D0%B0%D0%B1%D0%B8-%D1%81%D0%B0%D0%B1%D0%B8
    IRI     https://ru.wikipedia.org/wiki/Ваби-саби
    RFC 3986
    RFC 3987
    Uniform Resource Identifier
    Internationalized Resource Identifier
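
    To make the URI/IRI difference concrete, here is a small standalone sketch. The helper names encode_tag() and decode_tag() are assumptions for illustration and do not exist in WackoWiki; only the str_replace()/rawurlencode() line mirrors the snippet above. It percent-encodes a page tag and then decodes it back into the readable form.

    ```php
    <?php
    // Illustration only; encode_tag() and decode_tag() are hypothetical helpers.

    // Percent-encode the non-ASCII bytes (RFC 3986), but keep '/', '?' and '=' readable.
    function encode_tag(string $tag): string
    {
    	return str_replace(['%2F', '%3F', '%3D'], ['/', '?', '='], rawurlencode($tag));
    }

    // Decode the percent-encoded form back into the readable IRI form.
    function decode_tag(string $tag): string
    {
    	return rawurldecode($tag);
    }

    $uri = encode_tag('Начало');

    echo $uri, "\n";             // %D0%9D%D0%B0%D1%87%D0%B0%D0%BB%D0%BE
    echo decode_tag($uri), "\n"; // Начало
    ```

    Both strings address the same page; the decoded form is only easier to read, which is why it is fine for display but not for places that require the RFC 3986 format.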

    C. Alternate links
    Alternate links, such as the permalink, can be created via the URI router. Such links can serve any purpose.
    Request handling
    URI Router
    • WikiAdmin
    • 22.08.2020 05:42 edited
  4. Re: Customizing page addresses

    I'm sorry, I'm a noob: where exactly is the file in which I have to change the configuration to enable IRI?
    Searching didn't help me.
    • megaego
    • 14.08.2020 10:40 edited
  5. Re: Customizing page addresses

    Do you want to copy the IRI of the current page or of an arbitrary link in the page?
    Do you need the IRI as a reference in an email, another application or a database?

    Until we find a common solution, I suggest you copy the IRI from the address bar of the browser. I also found some browser plugins which transform the URI into the IRI format. Isn't the IRI format conversion a much more common issue? From the user's point of view, the conversion to IRI format mainly serves the readability of the link.

    The sitemap.xml and the canonical link use the IRI format. Some search engines show the IRI-formatted link, others only the URI. It will take some time before we get the bigger picture; until then we can share our insights, workarounds and best practices. I hope for some feedback on how this has been solved elsewhere.

    Regarding WackoWiki, we must evaluate where we can (optionally) use the IRI format without breaking the application.
    • WikiAdmin
    • 30.08.2020 16:58 edited