%% This is part of the OpTeX project, see http://petr.olsak.net/optex

\_codedecl \makeindex {Makeindex and sorting <2024-12-01>} % preloaded in format

   \_doc -----------------------------
   \^`\makeindex` implements sorting algorithm at \TeX/ macro-language level.
   You need not any external program. The sorting can be used for various
   other applications, see an example in
   \ulink[http://petr.olsak.net/optex/optex-tricks.html\#sort]{\OpTeX/ trick~0068}.

   There are two passes in the sorting algorithm. The primary pass does not
   distinguish between a group of letters (typically non-accented and
   accented). If the result of comparing two string is equal in primary pass
   then the secondary pass is started. It distinguishes between variously accented
   letters. Czech rules, for example, says: not accented before dieresis
   before acute before circumflex before ring. At less priority: lowercase
   letters must be before uppercase letters.

   The \`\_sortingdatalatin`
   implements these rules for the languages with latin alphabets.
   The groups between commas are not distinguished in the first
   pass. The second pass distinguishes all characters mentioned in the
   \^`\_sortingdatalatin` (commas are ignored). The order of letters
   in the \^`\_sortingdatalatin` macro is significant for the sorting algorithm.
   \_cod -----------------------------

\_def \_sortingdatalatin {%
  |,<,/,{ },_,-,&,@,%
  aAàÀâÂäÄáÁ,%
  ąĄ,%
  bB,%
  cC,%
  ćĆčČ,%
  dDďĎ,%
  eEèÈéÉëËêÊěĚ,%
  ęĘ,%
  fF,%
  gG,%
  hH,%
  ^^T^^U^^V,% ch Ch CH
  iIíÍïÏîÎ,%
  jJ,%
  kK,%
  lLĺĹľĽ,%
  łŁ,%
  mM,%
  nNňŇ,%
  ńŃñÑ,%
  oOöÖóÓôÔ,%
  pP,%
  qQ,%
  rRŕŔ,%
  řŘ,%
  sSß,%
  śŚšŠ,%
  tTťŤ,%
  uUùÙûÛüÜúÚůŮűŰ,%
  vV,%
  wW,%
  xX,%
  yYýÝÿŸ,%
  zZ,%
  žŽ,%
  źŹ,%
  żŻ,%
  ^^Z,% Hungarian: cz:c^^Z, etc., see \_compoundcharshu in lang-data.opm
  0,1,2,3,4,5,6,7,8,9,',>%
}

   \_doc -----------------------------
   Characters to be ignored during sorting are declared in  \`\_ignoredcharsgeneric`.
   These characters are ignored in the first pass
   without additional condition. All characters are taken into account in
   the second pass: ASCII characters with code $\string<65$ are sorted first if they
   are not mentioned in the `\_sortingdata...` macro.
   Others not mentioned characters have undefined behavior during sorting.
   \_cod -----------------------------

\_def \_ignoredcharsgeneric  {.,;?!:'"()[]=+-}

   \_doc -----------------------------
   Sorting is always processed by rules of a given language. The macros
   \`\_sortingdata``<lang-tag>`, \`\_ignoredchars``<lang-tag>`
   and \`\_compoundchars``<lang-tag>` declare these rules.
   The `<lang-tag>` is ISO code of the language: en, cs, de, pl, es for example.
   The English language is implemented here. Other languages are implemented
   in the `lang-data.opm` file (see section~\ref[langdata]).
   \_cod -----------------------------

\_let \_sortingdataen = \_sortingdatalatin   % English alphabet is subset of Latin
\_let \_ignoredcharsen = \_ignoredcharsgeneric
\_def \_compoundcharsen {}  % English doesn't have compound characters like DZ

   \_doc -----------------------------
   The \^`\_compoundchars``<lang-tag>` can declare changes performed before
   sorting. For example Czech language declares:
   \begtt
   \_let \_sortingdatacs = \_sortingdatalatin  % Czech alphabet is subset of Latin
   \_def \_compoundcharscs {ch:^^T Ch:^^U CH:^^V}
   \endtt
   It transforms two-letters `ch` to single character `^^T` because ch is
   treated as single compound character by Czech rules and CH is sorted between H and I.
   See \^`\_sortingdatalatin` where `^^T` is used. This declaration makes more
   transformations of Ch and CH too. The declarations of the form `x:y` in the
   \^`\_compoundchars``<lang-tag>` are separated by space.

   You can declare a transformation from single letter to more letters too. For
   example German rules sets ß equal to ss during sorting:
   \begtt
   \_let \_sortingdatade = \_sortingdatalatin  % German alphabet is subset of Latin
   \_def \_compoundcharsde {ß:ss}
   \endtt
   If there are two words equal after first pass of sorting: Masse (mass)
   and Maße (measures) for example, then second pass must decide about the
   order. DIN 5007, section 6.1 says: ss must be before ß in this case. So,
   we want to switch off the \^`\_compoundchars` declaration for the second
   pass and use the order of s and ß given in `\_sortingdata`. This is
   possible if the \`\_xcompoundchars``<lang-tag>` is defined.
   It has precedence in the second pass of sorting. We declare for German:
   \begtt
   \_def \_xcompoundcharsde {}
   \endtt
   Geman rules mention alternative sorting for phone-books or similar lists of
   names. The letters ä ö ü should be interpreted as ae, oe and ue. So we get
   Mueller $\lt$ Müller $\lt$ Muff. If this rule is not taken into account, we get
   Mueller $\lt$ Muff $\lt$ Müller. The rule can be implemented by:
   \begtt
   \_def \_compoundcharsde {ß:ss Ä:AE Ö:OE Ü:UE ä:ae ö:oe ü:ue}
   \endtt
   Because u $\lt$ ü in `\_sortingdata` and because `\_xcompoundcharsde` is
   empty, we have Mueller $\lt$ Müller after second pass of the sorting.

   You can declare these macros for more languages if you wish to use
   `\makeindex` with sorting rules with respect to your language.
   Note: if you need to map compound characters to a character, don't use
   `^^I`, `^^J` or `^^M` because these characters have very specific category codes.
   
   If you created `\_sortingdata` etc. for your language, please, send them
   to me. I am ready to add them to the file `lang-data.opm` in a new \OpTeX/ release.
   See also section~\ref[langdata].

   French sorting rule says: if the words are the same except for accents
   then accented letters are sorted after unaccented leters but
   read the words from their end in the second pass.
   For example correct sorting is: cote $\lt$ côte $\lt$ coté
   $\lt$ côté. This rule can be activated if the control sequence
   \`\_secondpass``<lang-tag>` is set to \^`\_reversewords`.
   For example, `lang-data.opm` declares `\_let\_secondpassfr=\_reversewords`.

   \bigskip
   Preparing to primary pass is performed by the \`\_setprimarysorting` macro
   implemented here.
   The <lang-tag> is saved to the \^`\_sortinglang` macro when sorting is
   initialized in \^`\_dosorting` (it is typically derived from current `\language` value).
   The \^`\_setprimarysorting` is called from \^`\_dosorting` macro and all
   processing of sorting is in a group. It sets actual
   \^`\_sortingdata`, \^`\_compoundchars` and \^`\_ignoredchars` if given language
   declares them. If not then warning will be printed using \`\_nold` macro
   and English data are used. The `\lccode` of all characters from
   \^`\_sortingdata` and \^`\_ignoredchars` are set. The sorted words will be converted
   using \^`\_compoundchars` followed by `\lowercase` before first pass is run.
   \_cod -----------------------------

\_def\_setprimarysorting {%
   \_ea\_let \_ea\_sortingdata \_csname _sortingdata\_sortinglang\_endcsname
   \_ea\_let \_ea\_compoundchars \_csname _compoundchars\_sortinglang\_endcsname
   \_ea\_let \_ea\_ignoredchars \_csname _ignoredchars\_sortinglang\_endcsname
   \_def\_nold{}%
   \_ifx \_sortingdata\_relax \_addto\_nold{ sortingdata}%
       \_let \_sortingdata = \_sortingdataen \_fi
   \_ifx \_compoundchars\_relax \_addto\_nold{ compoundchars}%
       \_let \_compoundchars = \_compoundcharsen \_fi
   \_ifx \_ignoredchars\_relax \_addto\_nold{ ignoredchars}%
       \_let \_ignoredchars = \_ignoredcharsen \_fi
   \_ifx\_nold\_empty\_else \_opwarning{Missing\_nold\_space for language (\_sortinglang)}\_fi
   \_ifx \_compoundchars\_empty \_else
      \_edef \_compoundchars {\_detokenize\_ea{\_compoundchars} }\_fi % all must be catcode 12
   \_def \_act ##1{\_ifx##1\_relax \_else
      \_ifx##1,\_advance\_tmpnum by1
      \_else \_lccode`##1=\_tmpnum \_fi
      \_ea\_act \_fi}%
   \_tmpnum=65 \_ea\_act \_sortingdata \_relax
   \_def \_act ##1{\_ifx##1\_relax \_else
      \_lccode`##1=`\^^I
      \_ea\_act \_fi}%
   \_ea\_act \_ignoredchars \_relax
}

   \_doc -----------------------------
   Preparing to secondary pass is implemented by the \`\_setsecondarysorting` macro.
   \_cod -----------------------------

\_def\_setsecondarysorting {%
   \_def \_act ##1{\_ifx##1\_relax \_else
      \_ifx##1,\_else \_advance\_tmpnum by1 \_lccode`##1=\_tmpnum \_fi
      \_ea\_act \_fi}%
  \_tmpnum=64 \_ea\_act \_sortingdata \_relax
}

   \_doc -----------------------------
   Strings to be sorted are prepared in `\,<string>` control sequences
   (to save `\TeX` memory).
   The \`\_preparesorting` `\,<string>` converts `<string>` to `\_tmpb`
   with respect to the data initialized in \^`\_setprimarysorting` or
   \^`\_setsecondarysorting`.\nl
   The part of the string after `^^^` is ignored (you can have the same
   sorting key for different things) and
   the compoud characters are converted by the \`\_docompound` macro.
   \_cod -----------------------------

\_def \_preparesorting #1{%
   \_edef \_tmpb {\_ea\_ignoreit\_csstring #1}%        \,<string> -> <string>
   \_edef\_tmpb{\_ea \_stripfromcaret \_tmpb ^^^\_fin}% <string>^^^<ignore> -> <string>
   \_ea \_docompound \_compoundchars \_relax:{}       % replace compound characters
   \_lowercase \_ea{\_ea\_def \_ea\_tmpb \_ea{\_tmpb}}% convert in respect to \_sortingdata
   \_ea\_replstring \_ea\_tmpb \_ea{\_csstring\^^I}{}% remove ignored characters
}
\_def \_docompound #1:#2 {%
   \_ifx\_relax#1\_else \_replstring\_tmpb {#1}{#2}\_ea\_docompound \_fi
}
\_def\_stripfromcaret #1^^^#2\_fin{#1}

   \_doc -----------------------------
   Macro \`\_isAleB` `\,<string1> \,<string2>` returns the result of comparison
   of given two strings to \`\_ifAleB` control sequence. Usage:
   `\_isAleB \,<string1> \,<string2> \_ifAleB ... \_else ... \_fi`
   The converted strings (in respect of the data prepared for first pass)
   must be saved as values of `\,<string1>` and `\,<string2>` macros.
   The reason is speed:  we don't want to convert them repeatedly in each
   comparison.
   \nl
   The macro
   \`\_testAleB` `<converted-string1>&\_relax<converted-string2>&\_relax \,<string1>\,<string2>`\nl
   does the real work. It reads the first character from both converted strings, compares them
   and if it is equal then calls itself recursively else gives the result.
   \_cod -----------------------------

\_newifi \_ifAleB

\_def\_isAleB #1#2{%
   \_edef\_tmpb {#1&\_relax#2&\_relax}%
   \_ea \_testAleB \_tmpb #1#2%
}
\_def\_testAleB #1#2\_relax #3#4\_relax #5#6{%
  \_if #1#3\_if #1&\_testAleBsecondary #5#6%   goto to the second pass::
          \_else \_testAleB #2\_relax #4\_relax #5#6%
          \_fi
  \_else \_ifnum `#1<`#3 \_AleBtrue \_else \_AleBfalse \_fi
  \_fi
}

   \_doc -----------------------------
   The \`\_testAleBsecondary` `\,<string1> \,<string2>` is run if the
   words are equal in the primary pass. It runs \^`\_setsecondarysorting` if it
   was not initialized already. Then prepares compared words to `\_tmpa` and
   `\_tmpb` and corrects them by \^`\_prepsecondpass` if needed.
   Finally, the test is recursively done by the macro \`\_testAleBsecondaryX`
   `<converted-string1>0\_relax<converted-string2>1\_relax`
   \_cod -----------------------------


\_def\_testAleBsecondary#1#2{%
  \_setsecondarysorting \_let\_setsecondarysorting=\_relax
  \_preparesorting#1\_let\_tmpa=\_tmpb \_preparesorting#2%
  \_prepsecondpass
  \_edef\_tmpb{\_tmpa0\_relax\_tmpb1\_relax}%
  \_ea\_testAleBsecondaryX \_tmpb
}
\_def\_testAleBsecondaryX #1#2\_relax #3#4\_relax {%
   \_if #1#3\_testAleBsecondaryX #2\_relax #4\_relax
   \_else \_ifnum `#1<`#3 \_AleBtrue \_else \_AleBfalse \_fi
   \_fi
}

   \_doc -----------------------------
   Merge sort is very effectively implemented by \TeX/ macros. The following
   code is created by my son Miroslav.
   The \`\_mergesort` macro expects that all items in `\_iilist` are separated
   by a comma when it starts. It ends with sorted items in `\_iilist` without commas.
   So \^`\_dosorting` macro must prepare commas between items.
   \_cod -----------------------------

\_def\_mergesort #1#2,#3{% by Miroslav Olsak
   \_ifx,#1%                      % prazdna-skupina,neco,  (#2=neco #3=pokracovani)
      \_toksapp\_tmptoks{#2,}%        % dvojice skupin vyresena
      \_sortreturn{\_fif\_mergesort#3}%   % \mergesort pokracovani
   \_fi
   \_ifx,#3%                      % neco,prazna-skupina,  (#1#2=neco #3=,)
      \_toksapp\_tmptoks{#1#2,}%      % dvojice skupin vyresena
      \_sortreturn{\_fif\_mergesort}% % \mergesort dalsi
   \_fi
   \_ifx\_fin#3%                  % neco,konec (#1#2=neco)
      \_if;\_the\_tmptoks;%           % neco=kompletni setrideny seznam
         \_tmptoks{#1#2}%
         \_sortreturn{\_fif\_fif\_gobbletoend}%   % koncim
      \_else                      % neco=posledni skupina nebo \end
         \_sortreturn{\_fif\_fif       % spojim \indexbuffer+necoa cele znova
                      \_tmptoks\_ea{\_ea}\_ea\_mergesort\_the\_tmptoks#1#2,#3}%
   \_fi\_fi                      % zatriduji: p1+neco1,p2+neco2, (#1#2=p1+neco1 #3=p2)
   \_isAleB #1#3\_ifAleB         % p1<p2
      \_toksapp\_tmptoks{#1}%    % p1 do bufferu
      \_sortreturn{\_fif\_mergesort#2,#3}%         % \mergesort neco1,p2+neco2,
   \_else                       % p1>p2
      \_toksapp\_tmptoks{#3}%   % p2 do bufferu
      \_sortreturn{\_fif\_mergesort#1#2,}%         % \mergesort p1+neco1,neco2,
   \_fi
   \_relax % zarazka, na ktere se zastavi \sortreturn
}
\_def\_sortreturn#1#2\_fi\_relax{#1} \_def\_fif{\_fi}
\_def\_gobbletoend #1\_fin{}

   \_doc -----------------------------
   The \`\_dosorting` `\list` macro redefines `\list` as sorted `\list`.
   The `\list` have to include control sequences in the form `\<c><string>`.
   These control sequences will be sorted with respect to <strings> without
   change of meanings of these control sequences. Their meanings are
   irrelevant when sorting. The first character <c> in `\<c><string>` should
   be whatever. It does not influence the sorting. \OpTeX/ uses comma at
   this place for sorting indexes: `\,<word1> \,<word2> \,<word3> ...`.

   The current language (chosen for hyphenation patterns) is used for
   sorting data. If the macro \`\_sortinglang`
   is defined as `<lang-tag>` (for example `\def\_sortinglang{de}` for German)
   then this has precedence and current language is not used.
   Moreover, if you specify \`\_asciisortingtrue` then ASCII
   sorting will be processed and all language sorting data will be ignored.
   \_cod -----------------------------

\_newifi \_ifasciisorting  \_asciisortingfalse
\_def\_dosorting #1{%
   \_begingroup
      \_ifasciisorting \_def\_sortinglang{ASCII}\_fi
      \_ifx\_sortinglang\_undefined \_edef\_sortinglang{\_cs{_lan:\_the\_language}}\_fi
      \_sortmessage{OpTeX: Sorting \_string#1 (\_sortinglang) ...^^J}%
      \_ismacro\_sortinglang{ASCII}\_iftrue
          \_def \_preparesorting##1{\_edef\_tmpb{\_ea\_ignoreit\_csstring##1}}%
          \_let \_setsecondarysorting=\_relax
      \_else
         \_setprimarysorting
      \_fi
      \_def \_act##1{\_preparesorting ##1\_edef##1{\_tmpb}}%
      \_ea\_xargs \_ea\_act #1;% \_preparesorting for first pass of sorting applied
      \_ifcsname _xcompoundchars\_sortinglang\_endcsname
          \_ea\_let \_ea\_compoundchars \_csname _xcompoundchars\_sortinglang\_endcsname
      \_fi  % \_compoundchars can differ in the second pass of sorting
      \_csname _secondpass\_sortinglang \_endcsname % activates \_reversewords if needed
      \_def \_act##1{\_toksapp\_tmptoks{##1,}}%
      \_tmptoks{}%
      \_edef #1{\_ea}\_ea\_xargs \_ea\_act #1;% commas between items added, mergesort initialized
      \_tmptoks\_ea{\_ea}\_ea\_mergesort \_the\_tmptoks\_fin,\_fin
   \_ea\_endgroup
   \_ea\_def\_ea#1\_ea{\_the\_tmptoks}%
}
\_let\_sortmessage=\_message % User can do \let\_sortmessage=\_ignoreit, for example

   \_doc -----------------------------
   French rules needs reverse reading the words in the second pass.
   The \`\_reversewords` is activated in this case and it adds
   new job to the macro \`\_prepsecondpass`: it reverses the letters in
   the compared words (saved in `\_tmpa` and `\_tmpb`) by the expandable
   \`\_sortrevers` macro. The \^`\_prepsecondpass` macro is used in the
   \^`\_testAleBsecondary` and it is empty by default.
   \_cod -----------------------------

\_def\_prepsecondpass{}
\_def\_reversewords{%
   \_addto\_prepsecondpass{\_edef\_tmpa{\_ea\_sortrevers\_tmpa\_relax}%
                           \_edef\_tmpb{\_ea\_sortrevers\_tmpb\_relax}}%
}
\_def\_sortrevers #1#2\_relax{\_ifx^#2^#1\_else \_sortrevers#2\_relax #1\_fi}

   \_doc -----------------------------
   The \`\makeindex` prints the index. First, it sorts the `\_iilist`
   second, it prints the sorted `\_iilist`, each item is printed
   using \^`\_printindexitem`.\nl
   We set `\leftskip=\iindent` and we suppose that each index entry
   starts by `\noindent\hskip-\iindent` (see the macro \~`\_printii`).
   Then the next lines of the same index
   entry (if the page list is broken to more pages) is indented by
   `\leftskip=\iindent`.
   \_cod -----------------------------

\_def\_makeindex{\_par
  \_ifx\_iilist\_empty \_opwarning{index data-buffer is empty. TeX me again}%
  \_incr\_unresolvedrefs
  \_else
    \_dosorting \_iilist % sorting \_iilist
    \_bgroup
       \_rightskip=0pt plus1fil \_exhyphenpenalty=10000 \_leftskip=\_iindent
       \_ea\_xargs \_ea\_printindexitem \_iilist ;\_par
    \_egroup
  \_fi
}
\_public \makeindex ;

   \_doc -----------------------------
   The \`\_printindexitem` `\,<word>` prints one item to the index.
   If `\_,<word>` is defined then this is used instead real <word>
   (this exception is declared by `\iis` macro). Else <word> is printed by
   \^`\_printii`. Finally, \^`\_printiipages` prints the value of `\,<word>`,
   i.e. the list of pages.
   \_cod -----------------------------

\_def\_printindexitem #1{%
   \_ifcsname _\_csstring #1\_endcsname
      \_ea\_ea\_ea \_printii \_csname _\_csstring #1\_endcsname &%
   \_else
      \_ea\_ea\_ea\_printii \_ea\_ignoreit \_csstring #1&%
   \_fi
   \_ea\_printiipages #1&
}

   \_doc -----------------------------
   \`\_printii` `<word>&` does more intelligent work because we are working with
    words in the form `<main-word>/<sub-word>/<sub-sub-word>`.
    The \^`\everyii` tokens register is applied before `\noindent`. User can
    declare something special here.

   The \`\_newiiletter``{<letter>}{<word>}` macro is empty by default. It is invoked
   if first letter of index entry is changed. You can declare a design between
   index entries here. You can try, for example:
   \begtt
   \def\_newiiletter#1#2{%
       \bigskip \hbox{\setfontsize{at15pt}\bf #1}\nobreak\medskip}
   \endtt
   \`\_definefirstii` `<word>&` macro defines \`\_firstii` which is used as the
   <letter> parameter of the macro \^`\_newiiletter` and for testing if the
   \"first letter" of the index entry was changed.
   The `\uppercase` of the real first letter is used by default here.
   You can re-implement \^`\_definefirstii` if you want. For example,
   you want to ignore accents above letters for index sub-headers:
   \begtt
   \def\_definefirstii#1#2&{%
      \uppercase{\bgroup \iicodes \uppercase{\egroup\def\_firstii{#1}}}}
   \def\iicodes{}
   \def\setiicodes #1#2,{\_ifx^#1^\_else
      \foreach #2\do{\_addto\iicodes{\uccode`##1=`#1}}
      \_ea\setiicodes \_fi
   }
   \setiicodes AÀÂÄÁ,ĆČ,DĎ,EÈÉËÊĚ,IÍÏÎ,LĹĽ,OÖÓÔ,RŔ,ŚŠ,TŤ,UÙÛÜÚŮŰ,YÝŸ,{},
   \endtt
   If the first character of the <word> is \string< or > then it isn't printed.
   It can be used as first character of the index entry in order to put the entry
   to a group entries sorted \string<before or >after whole normal entries.
   \_cod -----------------------------

\_def\_printii #1&{\_definefirstii #1&%
   \_ifx\_firstii\_lastii\_else
      \_ea\_newiiletter\_ea{\_firstii}{#1}\_let\_lastii=\_firstii\_fi
   \_gdef\_currii{#1}\_the\_everyii\_noindent
   \_hskip-\_iindent \_ignorespaces \_isnextchar<{\_ea\_printiiA\_ignoreit}%
      {\_isnextchar>{\_ea\_printiiA\_ignoreit}{\_printiiA}}#1//}
\_def\_printiiA #1/{\_if^#1^\_let\_previi=\_currii \_else
   \_ea\_scanprevii\_previi/&\_edef\_tmpb{\_detokenize{#1}}%
   \_ifx\_tmpa\_tmpb \_iiemdash \_else#1 \_gdef\_previi{}\_fi
   \_ea\_printiiA\_fi
}
\_def\_definefirstii #1#2&{\_uppercase{\_def\_firstii{#1}}}
\_def\_iiemdash{\_kern.1em---\_space}
\_def\_lastii{}
\_def\_newiiletter#1#2{}

\_def\_scanprevii#1/#2&{\_def\_previi{#2}\_edef\_tmpa{\_detokenize{#1}}}
\_def\_previi{} % previous index item

   \_doc -----------------------------
   \`\_printiipages` `<pglist>&` gets `<pglist>` in the form
   `<pg>:<type>,<pg>:<type>,...<pg>:<type>` and it converts them to
   `<pg>, <pg>, <from>--<to>, <pg>` etc. The same pages must be printed only once
   and continuous consequences of pages must be compressed to the form <from>-<to>.
   Moreover, the consequence is continuous only if all pages have the same <type>.
   Empty <type> is most common, pages with `b` <type> must be printed as bold
   and with `i` `<type>` as italics.
   Moreover, the `<pg>` mentioned here are <gpageno>, but we have to print
   <pageno>. The following macros solve these tasks.
   \_cod -----------------------------

\_def\_printiipages#1&{\_let\_pgtype=\_undefined \_tmpnum=0 \_printpages #1,:,\_par}
\_def\_printpages#1:#2,{%  state automaton for compriming pages
   \_ifx,#1,\_uselastpgnum
   \_else \_def\_tmpa{#2}%
      \_ifx\_pgtype\_tmpa \_else
         \_let\_pgtype=\_tmpa
         \_uselastpgnum \_usepgcomma \_pgprint#1:{#2}%
         \_tmpnum=#1 \_returnfi \_fi
      \_ifnum\_tmpnum=#1 \_returnfi \_fi
      \_advance\_tmpnum by1
      \_ifnum\_tmpnum=#1 \_ifx\_lastpgnum\_undefined \_usepgdash\_fi
                         \_edef\_lastpgnum{\_the\_tmpnum:{\_pgtype}}%
                         \_returnfi \_fi
      \_uselastpgnum \_usepgcomma \_pgprint#1:{#2}%
      \_tmpnum=#1
      \_relax
   \_ea\_printpages \_fi
}
\_def\_returnfi #1\_relax{\_fi}
\_def\_uselastpgnum{\_ifx\_lastpgnum\_undefined
   \_else \_ea\_pgprint\_lastpgnum \_let\_lastpgnum=\_undefined \_fi
}
\_def\_usepgcomma{\_ifnum\_tmpnum>0, \_fi} % comma+space between page numbers
\_def\_usepgdash{\_hbox{--}}               % dash in the <from>--<to> form

   \_doc -----------------------------
   You can re-define \`\_pgprint` `<gpageno>:{<iitype>}`
   if you need to implement more <iitypes>.
   \_cod -----------------------------

\_def\_pgprint #1:#2{%
   \_ifx ,#2,\_pgprintA{#1}\_returnfi \_fi
   \_ifx b#2{\_bf \_pgprintA{#1}}\_returnfi \_fi
   \_ifx i#2{\_it \_pgprintA{#1}}\_returnfi \_fi
   \_ifx u#2\_pgu{\_pgprintA{#1}}\_returnfi \_fi
  \_pgprintA{#1}\_relax
}
\_def\_pgprintA #1{\_ilink[pg:#1]{\_cs{_pgi:#1}}} % \ilink[pg:<gpageno>]{<pageno>}
\_def\_pgu#1{\_leavevmode\_vtop{\_hbox{#1}\kern.3ex\_hrule}}

   \_doc -----------------------------
   The \`\iindex``{<word>}` puts one <word> to the index. It writes
   \^`\_Xindex``{<word>}{<iitype>}` to the `.ref` file.
   All other variants of indexing macros expand internally to `\iindex`.
   \_cod -----------------------------

\_def\_iindex#1{\_isempty{#1}\_iffalse
   \_openref{\_def~{ }\_ewref\_Xindex{{#1}{\_iitypesaved}}}\_fi}
\_public \iindex ;

   \_doc -----------------------------
   The \`\_Xindex``{<word>}{<iitype>}` stores `\,<word>` to the `\_iilist` if
   there is the first occurrence of the <word>. The list of pages where `<word>`
   occurs, is the value of the macro `\,<word>`, so the `<gpageno>:<iitype>`
   is appended to this list.
   Moreover, we need a mapping from <gpageno> to `<pageno>`, because we print
   `<pageno>` in the index, but hyperlinks are implemented by `<gpageno>`.
   So, the macro `\_pgi:<gpageno>` is defined as `<pageno>`.
   \_cod -----------------------------

\_def \_iilist {}
\_def \_Xindex #1#2{\_ea\_XindexA \_csname ,#1\_ea\_endcsname \_currpage {#2}}
\_def \_XindexA #1#2#3#4{% #1=\,<word> #2=<gpageno> #3=<pageno> #4=<iitype>
   \_ifx#1\_relax \_global\_addto \_iilist {#1}%
                 \_gdef #1{#2:#4}%
   \_else \_global\_addto #1{,#2:#4}%
   \_fi
   \_sxdef{_pgi:#2}{#3}%
}

   \_doc -----------------------------
   The implementation of macros \`\ii`, \`\iid`, \`\iis` follows.
   Note that `\ii` works in the horizontal mode in order to the `\write` whatsit
   is not broken from the following word. If you need to keep vertical mode,
   use \^`\iindex``{<word>}` directly.
   \nl
   The \`\iitype` `{<type>}` saves the `<type>` to the \`\_iitypesaved` macro. It is
   used in the \^`\iindex` macro.
   \_cod -----------------------------

\_def\_ii #1 {\_leavevmode\_def\_tmp{#1}\_iiA #1,,\_def\_iitypesaved{}}

\_def\_iiA #1,{\_if$#1$\_else\_def\_tmpa{#1}%
   \_ifx\_tmpa\_iiatsign \_ea\_iiB\_tmp,,\_else\_iindex{#1}\_fi
   \_ea\_iiA\_fi}
\_def\_iiatsign{@}

\_def\_iiB #1,{\_if$#1$\_else \_iiC#1/\_relax \_ea\_iiB\_fi}
\_def\_iiC #1/#2\_relax{\_if$#2$\_else\_iindex{#2#1}\_fi}

\_def\_iid #1 {\_leavevmode\_iindex{#1}\_def\_iitypesaved{}%
   \_isnextchar<{\_ignoreit}{\_isnextchar>{\_ignoreit}{}}#1\_futurelet\_tmp\_iiD}
\_def\_iiD{\_ifx\_tmp,\_else\_ifx\_tmp.\_else\_space\_fi\_fi}

\_def\_iis #1 #2{{\_def~{ }\_global\_sdef{_,#1}{#2}}\_ignorespaces}

\_def\_iitypesaved{}
\_def\_iitype #1{\_def\_iitypesaved{#1}\_ignorespaces}

\_public \ii \iid \iis \iitype ;

\_endcode % -------------------------------------

2024-12-01 < first, > last character for sorting, unprinted by \makindex
2024-11-24 | set as first in sortingdata: it can be used as item separator
2024-10-31 \mergesort time-optimized using \toksapp instead \addto
2023-06-02 \_stripfromcaret introduced
2023-03-12 \_definefirstii introduced
2022-06-28 \_reversewords for French sorting introduced
2022-06-28 \_sortingdatalatin covers more languages
2022-06-28 \_xcompoundchars introduced, comments upgraded (German sorting mentioned)
2022-03-18 \iid didn't ignore space, bug fixed
2022-02-19 \_sotringdataen etc. moved to lang-data.opm file
2021-02-15 \_expandafter -> \_ea
2021-02-01 secondary sorting: start from code 65
2020-04-21 \isempty \iffalse ... \fi added to \iindex
2020-03-26 introduced
