1 %#! lualatex -shell-escape manual.ins
4 \documentclass[a4paper,titlepage]{article}
5 \usepackage[margin=20mm]{geometry}
8 \documentclass[a4paper,titlepage]{bxjsarticle}
9 \setpagelayout*{margin=20mm}
10 \def\headfont{\normalfont\bfseries}
% \def\headfont{\sffamily\gtfamily} is needed in ordinary documents
14 \usepackage{amsmath,amssymb,xcolor,pict2e}
15 \usepackage{booktabs,listings,lltjlisting,showexpl,multicol}
17 \usepackage[unicode=true]{hyperref}
22 \title{The Lua\TeX-ja package}
23 \author{The Lua\TeX-ja project team}
26 \title{Lua\TeX-jaパッケージ}
27 \author{Lua\TeX-jaプロジェクトチーム}
31 basicstyle=\ttfamily\small, pos=o, breaklines=true,
32 numbers=none, rframe={}
35 \parskip=\smallskipamount
38 \def<#1>{{\normalfont\rm\itshape$\langle$#1$\rangle$}}
45 {\Large\bf This documentation is far from complete. It may have many
46 grammatical (and contextual) errors.}
\textbf{\large This document is still far from complete.
Also, because the English and Japanese versions are generated together using the docstrip program,
57 \section{Introduction}
60 The Lua\TeX-ja package is a macro package for typesetting high-quality
61 Japanese documents in Lua\TeX.
The Lua\TeX-ja package is a macro package that aims to realize, on top of Lua\TeX\ (the next-generation standard \TeX),
Japanese typesetting of a quality equal to or better than that of p\TeX.
\subsection{Background}
Traditionally, ASCII p\TeX, an extension of \TeX, and its derivatives
have been used to typeset Japanese documents in \TeX. p\TeX\ is an engine
extension of \TeX, so it can produce high-quality Japanese documents
without using very complicated macros. But this is a mixed
blessing: p\TeX\ has been left behind by other extensions of \TeX,
especially $\varepsilon$-\TeX\ and pdf\TeX, and by changes in how
computers process Japanese (\textit{e.g.}, the UTF-8 encoding).
Recently, extensions of p\TeX, namely up\TeX\ (a Unicode implementation
of p\TeX) and $\varepsilon$-p\TeX\ (a merge of p\TeX\ and the
$\varepsilon$-\TeX\ extension), have been developed to fill those gaps to some
extent, but gaps still exist.
However, the appearance of Lua\TeX\ changed the whole situation. By
using Lua `callbacks', users can customize the internal processing of
Lua\TeX. So there is no need to modify the engine sources to
support Japanese typesetting: we only have to write Lua
scripts for the appropriate callbacks.
89 \subsection{Major Changes from p\TeX}
The Lua\TeX-ja package is strongly influenced by the p\TeX\ engine. The initial
target of development was to implement the features of p\TeX. However,
\emph{Lua\TeX-ja is not just a port of p\TeX; unnatural
specifications/behaviors of p\TeX\ were not adopted}.

The following are the major changes from p\TeX:
97 \item A Japanese font is a tuple of a `real' font, a Japanese font
98 metric (\textbf{JFM}, for short), and an optional string called
\item In p\TeX, a line break after a Japanese character is ignored (and
doesn't yield a space), since line breaks (in source files) are
permitted almost everywhere in Japanese text. However, Lua\TeX-ja
doesn't fully provide this feature, because of a specification
106 \item The insertion process of glues/kerns between two Japanese
107 characters and between a Japanese character and other characters
(we refer to these glues/kerns as \textbf{JAglue}) is rewritten from
112 \item As Lua\TeX's internal character handling is `node-based'
113 (\textit{e.g.}, \verb+of{}fice+ doesn't prevent ligatures), the
114 insertion process of \textbf{JAglue} is now `node-based'.
\item Furthermore, nodes between two characters which have no effect on
line breaking (\textit{e.g.}, \verb+\special+ nodes) are ignored in the
\item In the process, two Japanese fonts which differ only in their `real'
fonts are treated as the same font.
\item At present, vertical typesetting (\emph{tategaki}) is not
supported in Lua\TeX-ja.
125 For detailed information, see Part~\ref{part-imp}.
127 \subsection{Notations}
128 In this document, the following terms and notations are used:
130 \item Characters are divided into two types:
132 \item \textbf{JAchar}: standing for Japanese characters such as
133 Hiragana, Katakana, Kanji and other punctuation marks for
\item \textbf{ALchar}: standing for all other characters, such as letters of the alphabet.
We call fonts used for \textbf{ALchar} `alphabetic fonts', and fonts used for \textbf{JAchar} `Japanese fonts'.
139 \item A word in a sans-serif font (like \textsf{prebreakpenalty})
140 represents an internal parameter for Japanese typesetting, and it
is used as a key in the \verb+\ltjsetparameter+ command.
\item The word `primitive' is used not only for primitives in Lua\TeX,
but also for control sequences that are defined in the core module of
145 \item In this document, natural numbers start from~0.
148 \subsection{About the project}
149 \paragraph{Project Wiki} Project Wiki is under construction.
151 \item \url{http://sourceforge.jp/projects/luatex-ja/wiki/FrontPage%28en%29} (English)
152 \item \url{http://sourceforge.jp/projects/luatex-ja/wiki/FrontPage} (Japanese)
155 This project is hosted by SourceForge.JP.
158 % \begin{multicols}{2}
160 % \item Hironori KITAGAWA
162 % \item Takayuki YATO
163 % \item Yusuke KUROKI
165 % \item Munehiro YAMAMOTO
166 % \item Tomoaki HONDA
172 \section{Getting Started}
173 \subsection{Installation}
174 To install the Lua\TeX-ja\ package, you will need:
176 \item Lua\TeX\ (version 0.65.0-beta or later) and its supporting packages.\\
177 If you are using \TeX~Live\ 2011 or W32\TeX, you don't have to worry.
178 \item The source archive of Lua\TeX-ja, of course{\tt:)}
181 The installation methods are as follows:
183 \item Download the source archive.
At present, Lua\TeX-ja has no official release, so you have to retrieve
186 the archive from the repository.
187 You can retrieve the Git repository via
189 $ git clone git://git.sourceforge.jp/gitroot/luatex-ja/luatexja.git
191 or download the archive of HEAD in \texttt{master} branch from
193 \url{http://git.sourceforge.jp/view?p=luatex-ja/luatexja.git;a=snapshot;h=HEAD;sf=tgz}.
195 \item Extract the archive. You will see {\tt src/} and several other sub-directories.
\item Copy all the contents of {\tt src/} into one of your \texttt{TEXMF} trees.
\item If the filename database needs to be updated, run {\tt mktexlsr}.
200 \subsection{Cautions}
202 \item The encoding of your source file must be UTF-8.
\item The package is not well tested. In particular, the default setting of the range
204 of \textbf{JAchar} in the present version does not coexist with
205 other packages which use Unicode fonts.
208 \subsection{Using in plain \TeX}\label{ssec-plain}
209 To use Lua\TeX-ja in plain \TeX, simply put the following at the beginning of the document:
This performs minimal setup (like {\tt ptex.tex}) for typesetting Japanese documents:
216 \item The following 6~Japanese fonts are preloaded:
218 \begin{tabular}{ccccc}
220 \textbf{classification}&\textbf{font name}&\textbf{13.5\,Q}&\textbf{9.5\,Q}&\textbf{7\,Q}\\\midrule
221 \emph{mincho}&Ryumin-Light &\verb+\tenmin+&\verb+\sevenmin+&\verb+\fivemin+\\
222 \emph{gothic}&GothicBBB-Medium&\verb+\tengt+ &\verb+\sevengt+ &\verb+\fivegt+\\
227 \item The `Q' is a unit used in Japanese phototypesetting, and
228 $1\,\textrm{Q}=0.25\,\textrm{mm}$. This length is stored in a
229 dimension \verb+\jQ+.
\item It is widely accepted that the fonts `Ryumin-Light' and
`GothicBBB-Medium' aren't embedded into PDF files, and that PDF readers
substitute some external Japanese fonts for them (\textit{e.g.},
Kozuka Mincho is used in Adobe Reader). We adopt this custom to
\item You may notice that the size of the above fonts is slightly smaller than
that of their alphabetic counterparts: for example, the size of
\verb+\tenmin+ is $13.5\,\textrm{Q}\simeq 9.60444\,\textrm{pt}$. This is intentional: ...
\item A character in Unicode is treated as \textbf{JAchar} if and only
if its code point is greater than or equal to U+0100.
\item The amount of glue that is inserted between a \textbf{JAchar} and
an \textbf{ALchar} (the parameter \textsf{xkanjiskip}) is set to
245 0.25\,\hbox{\verb+\zw+}^{+1\,\text{pt}}_{-1\,\text{pt}} = \frac{27}{32}\,\mathrm{mm}^{+1\,\text{pt}}_{-1\,\text{pt}}.
Here \verb+\zw+ is the counterpart of \texttt{em} for Japanese fonts, that is, the length of `full-width' in the current Japanese font.
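
As a check on the natural width above, here is a short worked conversion. It assumes that \verb+\zw+ is taken from the preloaded 13.5\,Q font, so that the full-width equals the font size; with another current Japanese font the result would differ.
\[
  0.25\,\hbox{\verb+\zw+}
  = 0.25\times 13.5\,\mathrm{Q}
  = 3.375\,\mathrm{Q}
  = 3.375\times 0.25\,\mathrm{mm}
  = \frac{27}{32}\,\mathrm{mm}
  \approx 0.84\,\mathrm{mm}.
\]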
250 \subsection{Using in \LaTeX}\label{ssec-ltx}
Usage in \LaTeXe\ is basically the same. To set up the minimal environment
for Japanese, you only have to load {\tt luatexja.sty}:
255 \usepackage{luatexja}
This also performs minimal setup (counterparts in p\LaTeX\ are {\tt
plfonts.dtx} and {\tt pldefs.ltx}):
261 \item {\tt JY3} is the font encoding for Japanese fonts (in horizontal direction).\\
262 When vertical typesetting is supported by Lua\TeX-ja in the future, {\tt JT3} will be used for vertical fonts.
263 \item Two font families {\tt mc} and {\tt gt} are defined:
265 \begin{tabular}{ccccc}
267 \textbf{classification}&\textbf{family}&\verb+\mdseries+&\verb+\bfseries+&\textbf{scale}\\\midrule
268 \emph{mincho}&\tt mc&Ryumin-Light &GothicBBB-Medium&0.960444\\
269 \emph{gothic}&\tt gt&GothicBBB-Medium&GothicBBB-Medium&0.960444\\
273 \textbf{Note on fonts in bold series}
275 \item Japanese characters in math mode are typeset by the font family {\tt mc}.
However, the above settings are not sufficient for Japanese-based
documents. To typeset Japanese-based documents, you had better use
class files other than {\tt article.cls}, {\tt book.cls}, and so on. At
present, BXjscls (\texttt{bxjsarticle.cls} and \texttt{bxjsbook.cls}, by
Takayuki Yato) is the better alternative. It has not been decided whether
Lua\TeX-ja will develop and contain counterparts of the major classes used
in p\TeX\ (including jsclasses by Haruhiko Okumura).
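
As a minimal sketch of the setup described in this subsection, a Japanese-based document could start as follows. The class option and the sample text are illustrative assumptions; depending on the installed version of BXjscls, further class options may be required, so please check its documentation.
\begin{verbatim}
% Sketch only: compile with lualatex; the source file must be UTF-8.
\documentclass[a4paper]{bxjsarticle}
\usepackage{luatexja}
\begin{document}
abcあいう   % ALchar and JAchar mixed in one line
\end{document}
\end{verbatim}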
286 \subsection{Changing Fonts}
287 \paragraph{Remark: Japanese Characters in Math Mode}
288 Since p\TeX\ supports Japanese characters in math mode, there are
289 sources like the following:
292 $f_{高温}$~($f_{\text{high temperature}}$).
293 \[ y=(x-1)^2+2\quad{}よって\quad y>0 \]
294 $5\in{}素:=\{\,p\in\mathbb N:\text{$p$ is a prime}\,\}$.
We (the project members of Lua\TeX-ja) think that using
Japanese characters in math mode is allowed if and only if they are used as identifiers.
From this point of view,
\item Lines 1~and~2 above are not correct, since `高温' is used as a textual label, and
`よって' is used as a conjunction.
\item However, line~3 is correct, since `素' is used as an identifier.
305 Hence, in our opinion, the above input should be corrected as:
308 ($f_{\text{high temperature}}$).
310 \mathrel{\text{よって}}\quad y>0 \]
311 $5\in{}素:=\{\,p\in\mathbb N:\text{$p$ is a prime}\,\}$.
%BUG?: Without \{\}, `素' is not output. `よって' in the paragraph above is not output either.
314 We also believe that using Japanese characters as identifiers is rare,
315 hence we don't describe how to change Japanese fonts in math mode in
316 this chapter. For the method, please see Part~\ref{part-ref}.
319 \paragraph{plain \TeX}
320 To change Japanese fonts in plain \TeX, you must use the primitive
321 \verb+\jfont+. So please see Part~\ref{part-ref}.
325 For \LaTeXe, Lua\TeX-ja simply adopted the font selection system from that
326 of p\LaTeXe\ (in {\tt plfonts.dtx}).
328 \item Two control sequences \verb+\mcdefault+ and \verb+\gtdefault+ are
329 used to specify the default font families for \emph{mincho} and
330 \emph{gothic}, respectively. Default values: \texttt{mc} for
331 \verb+\mcdefault+ and \texttt{gt} for \verb+\gtdefault+.
332 \item Commands \verb+\fontfamily+, \verb+\fontseries+,
333 \verb+\fontshape+ and \verb+\selectfont+ can be used to change
334 attributes of Japanese fonts.
336 \begin{tabular}{ccccc}
338 &\textbf{encoding}&\textbf{family}&\textbf{series}&\textbf{shape}\\\midrule
340 &\verb+\romanencoding+&\verb+\romanfamily+&\verb+\romanseries+&\verb+\romanshape+\\
342 &\verb+\kanjiencoding+&\verb+\kanjifamily+&\verb+\kanjiseries+&\verb+\kanjishape+\\
both&---&---&\verb+\fontseries+&\verb+\fontshape+\\
344 auto select&\verb+\fontencoding+&\verb+\fontfamily+&---&---\\
348 \item For defining a Japanese font family, use \verb+\DeclareKanjiFamily+
349 instead of \verb+\DeclareFontFamily+.
To use Lua\TeX-ja together with the \texttt{fontspec} package, you need to load the
\texttt{luatexja-fontspec} package in the preamble. This additional
package automatically loads \texttt{luatexja} and \texttt{fontspec}
358 In \texttt{luatexja-fontspec} package, the following 7~commands are defined as
359 counterparts of original commands in \texttt{fontspec}:
361 \begin{tabular}{ccccc}
364 &\verb+\jfontspec+&\verb+\setmainjfont+&\verb+\setsansjfont+&\verb+\newjfontfamily+\\
366 &\verb+\fontspec+&\verb+\setmainfont+&\verb+\setsansfont+&\verb+\newfontfamily+\\
369 &\verb+\newjfontface+&\verb+\defaultjfontfeatures+&\verb+\addjfontfeatures+\\
371 &\verb+\newfontface+&\verb+\defaultfontfeatures+&\verb+\addfontfeatures+\\
Note that there is no command named \verb+\setmonojfont+, since
in most Japanese fonts nearly all Japanese glyphs have the same width.
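
The commands listed in this subsection might be used as in the following sketch. The font names \texttt{IPAexMincho} and \texttt{IPAexGothic} are assumptions made for illustration; substitute any Japanese fonts installed on your system.
\begin{verbatim}
% Preamble sketch; the font names are assumptions.
\usepackage{luatexja-fontspec}   % also loads luatexja and fontspec
\setmainjfont{IPAexMincho}       % Japanese counterpart of \setmainfont
\setsansjfont{IPAexGothic}       % Japanese counterpart of \setsansfont
\end{verbatim}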
382 \section{Changing Parameters}
There are many parameters in Lua\TeX-ja. Because of the behavior of Lua\TeX,
most of them are not stored in internal registers of \TeX, but in
Lua\TeX-ja's own storage system. Hence, to assign or acquire those
parameters, you have to use the commands \verb+\ltjsetparameter+ and
\verb+\ltjgetparameter+.
389 \subsection{Editing the range of \textbf{JAchar}s}
390 As noted before, the default setting is:
392 A character in Unicode is treated as \textbf{JAchar},\\
code point is greater than or equal to U+0100.
396 $\uparrow$ TODO: CHANGE THIS!
To edit the range of \textbf{JAchar}s, you first have to assign a non-zero
natural number less than 217 to the character range. This
can be done with the \verb+\ltjdefcharrange+ primitive. For example, the
next line assigns all characters in the Supplementary Multilingual Plane
and the character `漢' to range number~4.
405 \ltjdefcharrange{4}{"10000-"1FFFF,`漢}
This assignment of numbers to ranges is always global, so you should
not do this in the middle of a document. (Overwriting)
410 After assigning numbers to ranges, ...
412 \subsection{\textsf{kanjiskip} and \textsf{xkanjiskip}}\label{subs-kskip}
413 \textbf{JAglue} is divided into the following three categories:
\item Glues/kerns specified in JFM. If \verb+\inhibitglue+ is issued around a Japanese character,
this glue will not be inserted at that place.
\item The default glue which is inserted between two \textbf{JAchar}s ({\sf
\item The default glue which is inserted between a \textbf{JAchar} and an
\textbf{ALchar} (\textsf{xkanjiskip}).
The value (a skip) of \textsf{kanjiskip} or \textsf{xkanjiskip} can be
changed as follows.
425 \ltjsetparameter{kanjiskip={0pt plus 0.4pt minus 0.4pt},
426 xkanjiskip={0.25\zw plus 1pt minus 1pt}}
A JFM may contain the data of the `ideal width of {\sf
kanjiskip}' and/or the `ideal width of \textsf{xkanjiskip}'.
To use these data from the JFM, set the value of \textsf{kanjiskip} or
\textsf{xkanjiskip} to \verb+\maxdimen+.
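
For instance, the following sketch asks Lua\TeX-ja to take both values from the JFM. It assumes that the JFM in use actually provides these data.
\begin{verbatim}
% Use the `ideal' widths stored in the JFM for both parameters.
\ltjsetparameter{kanjiskip=\maxdimen, xkanjiskip=\maxdimen}
\end{verbatim}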
435 \subsection{Insertion Setting of \textsf{xkanjiskip}}
It is not desirable for \textsf{xkanjiskip} to be inserted at every
boundary between \textbf{JAchar}s and \textbf{ALchar}s. For example,
\textsf{xkanjiskip} should not be inserted after an opening parenthesis
(\textit{e.g.}, compare `(あ' and `(\hskip\ltjgetparameter{xkanjiskip}あ').

Lua\TeX-ja can control whether \textsf{xkanjiskip} can be inserted
before/after a character, by changing the \textsf{jaxspmode} parameter for \textbf{JAchar}s and
the \textsf{alxspmode} parameter for \textbf{ALchar}s.
445 \ltjsetparameter{jaxspmode={`あ,preonly}, alxspmode={`\!,postonly}}
The second argument {\tt preonly} means `the insertion of
\textsf{xkanjiskip} is allowed before this character, but not after'.
The other possible values are {\tt postonly}, {\tt allow} and {\tt
inhibit}. For compatibility with p\TeX, natural numbers between
0~and~3 are also allowed as the second argument\footnote{But we don't
recommend this, since the numbers 1~and~2 have opposite meanings in
\textsf{jaxspmode} and \textsf{alxspmode}.}.
If you want to enable/disable all insertions of \textsf{kanjiskip} and
\textsf{xkanjiskip}, set the \textsf{autospacing} and \textsf{autoxspacing}
parameters to {\tt true}/{\tt false}, respectively.
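
As a sketch of how this might look in a document, the following turns off the insertion of \textsf{xkanjiskip} inside a group only. The sample text is an arbitrary illustration, and the sketch relies on \verb+\ltjsetparameter+ making a local assignment.
\begin{verbatim}
abcあいう                               % xkanjiskip between c and あ
{\ltjsetparameter{autoxspacing=false}%  no xkanjiskip inside this group
 abcあいう}
\end{verbatim}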
462 \subsection{Shifting Baseline}
To match a Japanese font with an alphabetic font, it is sometimes
necessary to shift the baseline of one of the pair. In p\TeX, this is achieved
by setting \verb+\ybaselineshift+ to a non-zero length (the
baseline of alphabetic fonts is shifted downward). However, for documents
whose main language is not Japanese, it is better to shift the baseline of
Japanese fonts, not that of alphabetic fonts.
469 Because of this, Lua\TeX-ja can independently set the shifting amount
470 of the baseline of alphabetic fonts (\textsf{yalbaselineshift}
471 parameter) and that of Japanese fonts (\textsf{yjabaselineshift}
475 \vrule width 150pt height 0.4pt depth 0pt\hskip-120pt
476 \ltjsetparameter{yjabaselineshift=0pt, yalbaselineshift=0pt}abcあいう
477 \ltjsetparameter{yjabaselineshift=5pt, yalbaselineshift=2pt}abcあいう
Here the horizontal line above is the baseline of the line.

There is an interesting side effect: characters of different sizes can be
vertically centered in a line by setting the two parameters appropriately.
The following is an example (beware that the values are not well tuned):
487 \ltjsetparameter{yjabaselineshift=-1pt,
488 yalbaselineshift=-1pt}
494 \subsection{`tombow'}
`tombow' is a mark for indicating the 4~corners and the horizontal/vertical
center of the paper. p\LaTeX\ and Lua\TeX-ja support `tombow' in
their kernels. The following steps are needed to typeset tombow:
500 \item First, define the banner which will be printed at the upper left
501 of the paper. This is done by assigning a token list to
502 \verb+\@bannertoken+.
For example, the following sets the banner to `{\tt filename (2012-01-01 17:01)}':
508 \hour\time \divide\hour by 60 \@tempcnta\hour \multiply\@tempcnta 60\relax
509 \minute\time \advance\minute-\@tempcnta
511 \jobname\space(\number\year-\two@digits\month-\two@digits\day
512 \space\two@digits\hour:\two@digits\minute)}%
519 \part{Reference}\label{part-ref}
520 \section{Font Metric and Japanese Font}
521 \subsection{\texttt{\char92jfont} primitive}
522 To load a font as a Japanese font, you must use the
523 \verb+\jfont+ primitive instead of~\verb+\font+, while
524 \verb+\jfont+ admits the same syntax used in~\verb+\font+.
Lua\TeX-ja automatically loads the \texttt{luaotfload} package,
526 so TrueType/OpenType fonts with features can be used for Japanese fonts:
528 \jfont\tradgt={file:ipaexg.ttf:script=latn;%
529 +trad;jfm=ujis} at 14pt
Note that the control sequence defined using \verb+\jfont+
(\verb+\tradgt+ in the example above) is not a
\textit{font\_def} token, hence input like
\verb+\fontname\tradgt+ causes an error. We denote control sequences which are defined by \verb+\jfont+
Besides the \texttt{file:}\ and \texttt{name:}\ prefixes, \texttt{psft:}\ can
be used as a prefix in the \verb+\jfont+ (and~\verb+\font+) primitive. Using
this prefix, you can specify a font that has a name only and is not
related to any real font.

This \texttt{psft:}\ prefix is mainly used for specifying non-embedded `standard' Japanese fonts (Ryumin-Light and GothicBBB-Medium).
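
A request using this prefix might look like the following sketch. The control sequence name \verb+\stdmin+ is hypothetical, and the JFM name \texttt{ujis} and the size are simply borrowed from other examples in this document.
\begin{verbatim}
% Non-embedded Ryumin-Light at 13.5 Q; \stdmin is a hypothetical name.
\jfont\stdmin={psft:Ryumin-Light:jfm=ujis} at 13.5\jQ
\end{verbatim}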
553 \subsection{Structure of JFM file}
554 A JFM file is a Lua script which has only one function call:
556 luatexja.jfont.define_jfm { ... }
The real data are stored in the table which is indicated above by
\verb+{ ... }+. So the rest of this subsection is devoted to describing the
structure of this table. Note that all lengths in a JFM file are
floating-point numbers in design-size unit.
563 \begin{list}{}{\def\makelabel{\ttfamily}\def\{{\char`\{}\def\}{\char`\}}}
564 \item[dir=<direction>] (required)
The direction of JFM. At present, only \texttt{'yoko'} is supported.
568 \item[zw=<length>] (required)
The length of the `full-width'.
572 \item[zh=<length>] (required)
574 \item[kanjiskip=\{<natural>, <stretch>, <shrink>\}] (optional)
576 This field specifies the `ideal' amount of \textsf{kanjiskip}. As noted
577 in Subsection~\ref{subs-kskip}, if the parameter
578 \textsf{kanjiskip} is \verb+\maxdimen+, the value specified
579 in this field is actually used (if this field is not specified in
580 JFM, it is regarded as 0\,pt). Note that <stretch> and <shrink>
581 fields are in design-size unit too.
584 \item[xkanjiskip=\{<natural>, <stretch>, <shrink>\}] (optional)
586 Like the \texttt{kanjiskip} field, this field specifies the `ideal'
587 amount of \textsf{xkanjiskip}.
Besides the above fields, a JFM file has several sub-tables whose
indices are natural numbers. The table indexed by~$i\in\omega$ stores
information about `character class'~$i$. At least the character class~0 is
always present, so each JFM file must have a sub-table whose index is
\texttt{[0]}. Each sub-table (its numerical index is denoted by $i$) has
the following fields:
598 \begin{list}{}{\def\makelabel{\ttfamily}\def\{{\char`\{}\def\}{\char`\}}}
599 \item[chars=\{<character>, ...\}] (required except character class~0)
This field is a list of characters which are in this character
class~$i$. This field is not required if $i=0$, since all
\textbf{JAchar}s which are not in any character class other
than 0 are in the character class~0 (hence, the character class~0 contains most
\textbf{JAchar}s). In the list, a character can be
specified by its code number, or by the character itself
(as a string of length~1).
609 In addition to those `real' characters, the following `imaginary
610 characters' can be specified in the list:
612 \item[width=<length>, height=<length>, depth=<length>, italic=<length>]\ (required)
Specify the width, height, depth and
the amount of italic correction of characters in character class~$i$. All characters in character class~$i$ are regarded as having
the width, height and depth given by these fields.
But there is one exception: if \texttt{'prop'} is specified in the \texttt{width} field, the width of a character becomes that of its `real' glyph
619 \item[left=<length>, down=<length>, align=<align>]\
These fields are for adjusting the position of the `real' glyph. Legal
values of the \texttt{align} field are \texttt{'left'},
\texttt{'middle'} and \texttt{'right'}. If one of these
3~fields is omitted, \texttt{left} and \texttt{down} are
treated as~0, and the \texttt{align} field is treated as
627 The effects of these 3~fields are indicated in Figure~\ref{fig-pos}.
In most cases, the \texttt{left} and \texttt{down} fields are~0, while
it is not uncommon that the \texttt{align} field is \texttt{'middle'} or \texttt{'right'}.
For example, setting the \texttt{align} field to \texttt{'right'} is practically needed
when the current character class is the class for `opening delimiters'.
634 \begin{minipage}{0.4\textwidth}%
635 \begin{center}\unitlength=10pt\small
636 \begin{picture}(15,12)(-1,-4)
637 \color{black!10!white}% real glyph :step1
638 \put(0,0){\vrule width 12\unitlength height 8\unitlength depth 3\unitlength}
640 \color{red!20!white}% real glyph :step1
641 \put(-1,-1.5){\vrule width 6\unitlength height 7\unitlength depth 2.5\unitlength}
643 \color{red}% real glyph
645 \put(-1,-1.5){\vector(0,1){7}\vector(0,-1){2.5}\vector(1,0){6}}
646 \put(5,-1.5){\line(0,1){7}\line(0,-1){2.5}}
647 \put(-1,5.5){\line(1,0){6}}
648 \put(-1,-4){\line(1,0){6}}
650 \color{green!20!white}% real glyph :step1
651 \put(3,0){\vrule width 6\unitlength height 7\unitlength depth 2.5\unitlength}
653 \color{black}% real glyph :step1
655 \put(0,0){\vector(0,1){8}\line(0,-1){3}\vector(1,0){12}}
656 \put(12,0){\line(0,1){8}\vector(0,-1){3}}
657 \put(0,8){\line(1,0){12}}
658 \put(0,-3){\line(1,0){12}}
659 \put(0.2,4){\makebox(0,0)[l]{\texttt{height}}}
660 \put(12.2,-1.5){\makebox(0,0)[l]{\texttt{depth}}}
661 \put(6,0.2){\makebox(0,0)[b]{\texttt{width}}}
663 \color{green!50!black}% real glyph :step1
665 \put(3,0){\vector(0,1){7}\vector(0,-1){2.5}\vector(1,0){6}}
666 \put(9,0){\line(0,1){7}\line(0,-1){2.5}}
667 \put(3,7){\line(1,0){6}}
668 \put(3,-2.5){\line(1,0){6}}
670 \savebox{\eqdist}(0,0)[b]{%
672 \put(-0.08,0.2){\line(0,-1){0.4}}%
673 \put(0.08,0.2){\line(0,-1){0.4}}}
674 \put(1.5,0){\usebox{\eqdist}}
675 \put(10.5,0){\usebox{\eqdist}}
677 \color{blue}% shifted
679 \put(3,-1.5){\vector(-1,0){4}}
680 \put(1,-1.7){\makebox(0,0)[t]{\texttt{left}}}
681 \put(3,0){\vector(0,-1){1.5}}
682 \put(3.2,-0.75){\makebox(0,0)[l]{\texttt{down}}}
686 \begin{minipage}{0.6\textwidth}%
Consider a node containing a Japanese character whose value of the \texttt{align}
688 field is \texttt{'middle'}.
690 \item The black rectangle is a frame of the node.
691 Its width, height and depth are specified by JFM.
692 \item Since the \texttt{align} field is \texttt{'middle'},
693 the `real' glyph is centered horizontally (the green rectangle).
694 \item Furthermore, the glyph is shifted according to values of fields
695 \texttt{left} and \texttt{down}. The ultimate position of the real
696 glyph is indicated by the red rectangle.
699 \caption{The position of the `real' glyph}
704 \item[kern={\{[$j$]=<kern>, ...\}}]
706 \item[glue={\{[$j$]=\{<width>, <stretch>, <shrink>\}, ...\}}]
709 \subsection{Math Font Family}
\TeX\ handles fonts in math formulas by 16~font families\footnote{Omega,
Aleph, Lua\TeX~and $\varepsilon$-(u)p\TeX\ can handle 256~families, but
an external package is needed to support this in plain \TeX\ and
\LaTeX.}, and each family has three fonts:
714 \verb+\textfont+, \verb+\scriptfont+ and \verb+\scriptscriptfont+.
716 Lua\TeX-ja's handling of Japanese fonts in math formulas is similar;
717 Table~\ref{tab-math} shows counterparts to \TeX's primitives for math
722 \caption{Primitives for Japanese math fonts}
723 \begin{center}\def\{{\char`\{}\def\}{\char`\}}
726 &Japanese fonts&alphabetic fonts\\
727 font family&\verb+\jfam+${}\in [0,256)$&\verb+\fam+\\
728 text size&\tt\textsf{jatextfont}\,=\{<jfam>,<jfont\_cs>\}&\tt\verb+\textfont+<fam>=<font\_cs>\\
729 script size&\tt\textsf{jascriptfont}\,=\{<jfam>,<jfont\_cs>\}&\tt\verb+\scriptfont+<fam>=<font\_cs>\\
730 scriptscript size&\tt\textsf{jascriptscriptfont}\,=\{<jfam>,<jfont\_cs>\}&\tt\verb+\scriptscriptfont+<fam>=<font\_cs>\\
738 \subsection{{\tt\char92 ltjsetparameter} primitive}
As noted before, \verb+\ltjsetparameter+ and \verb+\ltjgetparameter+ are
primitives for accessing most parameters of Lua\TeX-ja. One of the main
reasons that Lua\TeX-ja didn't adopt a syntax similar to that of p\TeX\
(\textit{e.g.},~\verb+\prebreakpenalty`)=10000+)
is the position of the \verb+hpack_filter+ callback in the source
of Lua\TeX; see Section~\ref{sec-para}.
746 \verb+\ltjsetparameter+ and \verb+\ltjglobalsetparameter+ are primitives
747 for assigning parameters. These take one argument which is a
748 \texttt{<key>=<value>} list. Allowed keys are described in the next
750 The difference between
751 \verb+\ltjsetparameter+ and \verb+\ltjglobalsetparameter+ is only the
753 \verb+\ltjsetparameter+ does a local assignment and
754 \verb+\ltjglobalsetparameter+ does a global one.
They also obey the value of \verb+\globaldefs+,
like other assignments.
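
The difference between the two primitives can be sketched as follows. The parameter and the values are chosen only for illustration.
\begin{verbatim}
{\ltjsetparameter{yjabaselineshift=2pt}%  local: undone at the end of the group
 あいう}
\ltjglobalsetparameter{yjabaselineshift=2pt}% global: survives enclosing groups
\end{verbatim}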
758 \verb+\ltjgetparameter+ is the primitive for acquiring parameters. It
759 always takes a parameter name as first argument, and also takes the
760 additional argument---a character code, for example---in some cases.
762 \ltjgetparameter{differentjfm},
763 \ltjgetparameter{autospacing},
764 \ltjgetparameter{prebreakpenalty}{`)}.
766 \emph{The return value of\/ {\normalfont\tt\char92ltjgetparameter} is
always a string}. This is output by \texttt{tex.write()}, so any
768 character other than space~`{\tt\char32}'~(U+0020) has the category code
769 12~(other), while the space has 10~(space).
771 \subsection{List of Parameters}
772 In the following list of parameters, [\verb+\cs+] indicates the counterpart in p\TeX, and each symbol has the following meaning:
774 \item No mark: values at the end of the paragraph or the hbox are
775 adopted in the whole paragraph/hbox.
\item `$\ast$': local parameters, which can be changed anywhere inside a paragraph/hbox.
\item `$\dagger$': assignments are always global.
780 \begin{list}{}{\def\makelabel{\ttfamily}\def\{{\char`\{}\def\}{\char`\}}}
781 \item[\textsf{jcharwidowpenalty}\,=<penalty>] [\verb+\jcharwidowpenalty+]
Penalty value for suppressing orphans. This penalty is inserted just
784 after the last \textbf{JAchar} which is not regarded as a
785 (Japanese) punctuation mark.
787 \item[\textsf{kcatcode}\,=\{<chr\_code>,<natural number>\}]\
An additional attribute of the character whose character code is <chr\_code>.
In the present version, the lowest bit of <natural number> indicates
whether the character is considered as a punctuation mark
(see the description of \textsf{jcharwidowpenalty} above).
795 \item[\textsf{prebreakpenalty}\,=\{<chr\_code>,<penalty>\}] [\verb+\prebreakpenalty+]
796 \item[\textsf{postbreakpenalty}\,=\{<chr\_code>,<penalty>\}] [\verb+\postbreakpenalty+]
797 \item[\textsf{jatextfont}\,=\{<jfam>,<jfont\_cs>\}] [\verb+\textfont+ in \TeX]
798 \item[\textsf{jascriptfont}\,=\{<jfam>,<jfont\_cs>\}] [\verb+\scriptfont+ in \TeX]
799 \item[\textsf{jascriptscriptfont}\,=\{<jfam>,<jfont\_cs>\}] [\verb+\scriptscriptfont+ in \TeX]
800 \item[\textsf{yjabaselineshift}\,=<dimen>$^\ast$]\
801 \item[\textsf{yalbaselineshift}\,=<dimen>$^\ast$] [\verb+\ybaselineshift+]
803 \item[\textsf{jaxspmode}\,=\{<chr\_code>,<mode>\}] [\verb+\inhibitxspcode+]
Setting whether inserting \textsf{xkanjiskip} is allowed before/after a \textbf{JAchar} whose character code is <chr\_code>.
The following values are allowed for <mode>:
\item[0, \texttt{inhibit}] Insertion of \textsf{xkanjiskip} is inhibited both before and after the character.
\item[2, \texttt{preonly}] Insertion of \textsf{xkanjiskip} is allowed before the character, but not after.
\item[1, \texttt{postonly}] Insertion of \textsf{xkanjiskip} is allowed after the character, but not before.
\item[3, \texttt{allow}] Insertion of \textsf{xkanjiskip} is allowed both before and after the character.
This is the default value.
815 \item[\textsf{alxspmode}\,=\{<chr\_code>,<mode>\}] [\verb+\xspcode+]
Setting whether inserting \textsf{xkanjiskip} is allowed before/after an \textbf{ALchar} whose character code is <chr\_code>.
The following values are allowed for <mode>:
\item[0, \texttt{inhibit}] Insertion of \textsf{xkanjiskip} is inhibited both before and after the character.
\item[1, \texttt{preonly}] Insertion of \textsf{xkanjiskip} is allowed before the character, but not after.
\item[2, \texttt{postonly}] Insertion of \textsf{xkanjiskip} is allowed after the character, but not before.
\item[3, \texttt{allow}] Insertion of \textsf{xkanjiskip} is allowed both before and after the character.
This is the default value.
826 Note that parameters \textsf{jaxspmode} and \textsf{alxspmode} use a common table.
828 \item[\textsf{autospacing}\,=<bool>$^\ast$] [\verb+\autospacing+]
829 \item[\textsf{autoxspacing}\,=<bool>$^\ast$] [\verb+\autoxspacing+]
830 \item[\textsf{kanjiskip}\,=<skip>] [\verb+\kanjiskip+]
831 \item[\textsf{xkanjiskip}\,=<skip>] [\verb+\xkanjiskip+]
833 \item[\textsf{differentjfm}\,=<mode>$^\dagger$]
Specify how glues/kerns between two \textbf{JAchar}s whose JFMs (or sizes) are different are determined.
The following arguments are allowed:
838 \item[\texttt{average}]
840 \item[\texttt{large}]
841 \item[\texttt{small}]
844 \item[\textsf{jacharrange}\,=<ranges>$^\ast$]
845 \item[\textsf{kansujichar}\,=\{<digit>, <chr\_code>\}] [\verb+\kansujichar+]
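As a minimal sketch of how some of the parameters above might be set with \verb+\ltjsetparameter+ (the characters and the skip values are arbitrary choices made only for illustration, not recommended settings):

\begin{lstlisting}
\ltjsetparameter{
  % allow xkanjiskip before this JAchar, but not after it
  jaxspmode={`あ,preonly},
  % allow xkanjiskip after this ALchar, but not before it
  alxspmode={`!,postonly},
  % illustrative skip values
  kanjiskip=0pt plus 0.4pt minus 0.4pt,
  xkanjiskip=2.5pt plus 1pt minus 1pt
}
\end{lstlisting}

The symbolic names are used for the modes here because the numeric values of \texttt{preonly} and \texttt{postonly} differ between \textsf{jaxspmode} and \textsf{alxspmode}.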
849 \section{Other Primitives}
850 \subsection{Compatibility with p\TeX}
851 \begin{list}{}{\def\makelabel{\ttfamily\char92 }}
860 \section{Control Sequences for \LaTeXe}
861 \subsection{Patch for NFSS2}
862 As described in Subsection~\ref{ssec-ltx}, Lua\TeX-ja simply adopts \texttt{plfonts.dtx} from p\LaTeXe\ as the Japanese patch for NFSS2.
864 \subsection{`tombow'}
866 \part{Implementations}\label{part-imp}
867 \section{Storing Parameters}\label{sec-para}
868 \subsection{Used Dimensions and Attributes}
869 The following is the list of dimensions and attributes which are used in Lua\TeX-ja.
871 \def\makelabel{\ttfamily}
872 \def\dim#1{\item[\char92 #1\ \textrm{(dimension)}]}
873 \def\attr#1{\item[\char92 #1\ \textrm{(attribute)}]}
877 As explained in Subsection~\ref{ssec-plain}, \verb+\jQ+ is equal to
878 $1\,\textrm{Q}=0.25\,\textrm{mm}$, where `Q'~(also called `級') is
879 a unit used in Japanese phototypesetting. So one should not change the value of this dimension.
881 There is also a unit called `歯' which equals $0.25\,\textrm{mm}$ and is
882 used in Japanese phototypesetting. The dimension
883 \verb+\jH+ stores this length, similar to \verb+\jQ+.
884 \dim{ltj@zw} A temporary register for the `full-width' of the current Japanese font.
885 \dim{ltj@zh} A temporary register for the `full-height' (usually the sum of the height of the imaginary body and its depth) of the current Japanese font.
886 \attr{jfam} The number of the current Japanese font family for math formulas.
887 \attr{ltj@curjfnt} The font index of the current Japanese font.
888 \attr{ltj@charclass} The character class of a Japanese \textit{glyph\_node}.
889 \attr{ltj@yablshift} The amount by which the baseline of alphabetic
890 fonts is shifted, in scaled points ($2^{-16}\,\textrm{pt}$).
891 \attr{ltj@ykblshift} The amount by which the baseline of Japanese
892 fonts is shifted, in scaled points ($2^{-16}\,\textrm{pt}$).
893 \attr{ltj@autospc} Whether the auto insertion of \textsf{kanjiskip} is allowed at the node.
894 \attr{ltj@autoxspc} Whether the auto insertion of \textsf{xkanjiskip} is allowed at the node.
895 \attr{ltj@icflag} For distinguishing `kinds' of the node. This
896 attribute takes one of the following values:
899 \item[ITALIC (1)] Glues from an italic correction
900 (\verb+\/+). This distinction of origins of glues
901 (from explicit \verb+\kern+, or from \verb+\/+)
902 is needed in the insertion process of \textsf{xkanjiskip}.
904 \item[KINSOKU (3)] Penalties inserted for the word-wrapping process of Japanese characters (\emph{kinsoku}).
905 \item[FROM\_JFM (4)] Glues/kerns from JFM.
906 \item[LINE\_END (5)] Kerns for ...
907 \item[KANJI\_SKIP (6)] Glues for \textsf{kanjiskip}.
908 \item[XKANJI\_SKIP (7)] Glues for \textsf{xkanjiskip}.
909 \item[PROCESSED (8)] Nodes which are already processed by ...
910 \item[IC\_PROCESSED (9)] Glues from an italic correction which are also already processed.
911 \item[BOXBDD (15)] Glues/kerns that are inserted just at the beginning or the end of an hbox or a paragraph.
913 \attr{ltj@kcat$i$} Here $i$~is a natural number less than~7.
914 These 7~attributes store bit~vectors indicating which character blocks are regarded as blocks of \textbf{JAchar}s.
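
As a worked example of the units used in this list (the sample values are chosen only for illustration), recall that $1\,\textrm{in}=72.27\,\textrm{pt}=25.4\,\textrm{mm}$ and that one scaled point is $2^{-16}\,\textrm{pt}$:
\begin{align*}
1\,\textrm{Q} &= 0.25\,\textrm{mm}
  = 0.25\times\frac{72.27}{25.4}\,\textrm{pt}
  \approx 0.7113\,\textrm{pt},\\
1\,\textrm{pt} &= 2^{16}\,\textrm{sp} = 65536\,\textrm{sp},\qquad
0.5\,\textrm{pt} = 32768\,\textrm{sp}.
\end{align*}
Hence a baseline shift of $0.5\,\textrm{pt}$ would be stored in \texttt{ltj@yablshift} or \texttt{ltj@ykblshift} as the integer 32768.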
917 \subsection{Stack System of Lua\TeX-ja}
918 \paragraph{Background}
919 Lua\TeX-ja has its own stack system, and most parameters of Lua\TeX-ja
920 are stored in it. To clarify the reason, imagine that the parameter
921 \textsf{kanjiskip} is stored in a skip register, and consider the following source:
924 \ltjsetparameter{kanjiskip=0pt}ふがふが.%
925 \setbox0=\hbox{\ltjsetparameter{kanjiskip=5pt}ほげほげ}
929 As described in Part~\ref{part-ref}, the only effective value of
930 \textsf{kanjiskip} in an hbox is the latest value, so the value of
931 \textsf{kanjiskip} which is applied to the entire hbox should be 5\,pt.
932 However, because of the way Lua\TeX\ is implemented, this `5\,pt' cannot be
933 known from any callback. In \texttt{tex/packaging.w} (which is a
934 file in the source of Lua\TeX), there is the following code:
938 scaled h; /* height of box */
939 halfword p; /* first node in a box */
940 scaled d; /* max depth */
946 if (cur_list.mode_field == -hmode) {
947 cur_box = filtered_hpack(cur_list.head_field,
948 cur_list.tail_field, saved_value(1),
949 saved_level(1), grp, saved_level(2));
950 subtype(cur_box) = HLIST_SUBTYPE_HBOX;
952 Notice that \verb+unsave+ is executed \emph{before}
953 \verb+filtered_hpack+ (this is where the \verb+hpack_filter+ callback is
954 executed): so `5\,pt' in the above source is orphaned at
955 \verb+unsave+, and hence it can't be accessed from \verb+hpack_filter+
958 \paragraph{The method}
959 The code of the stack system is based on that in a post to the Dev-luatex mailing list\footnote{%
960 \texttt{[Dev-luatex] tex.currentgrouplevel}, posted on 2008/8/19 by Jonathan Sauer.}.
962 There are two \TeX\ count registers for maintaining information:
963 \verb+\ltj@@stack+ for the stack level, and \verb+\ltj@@group@level+ for
964 \TeX's group level when the last assignment was done. Parameters
965 are stored in one big table named \texttt{charprop\_stack\_table}, where
966 \texttt{charprop\_stack\_table[$i$]} stores the data of stack level~$i$. If
967 a new stack level is created by \verb+\ltjsetparameter+, all data of the
968 previous level is copied.
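
The copy-on-new-level behavior can be sketched in Lua as follows. This is only an illustration under stated assumptions, not the actual code of Lua\TeX-ja: the names \texttt{demo\_stack\_table} and \texttt{demo\_new\_level} are hypothetical, and parameter values are kept as plain strings.

\begin{lstlisting}
\directlua{
  % TeX strips these comment lines before Lua sees the code.
  % Hypothetical table: demo_stack_table[i] holds the data of level i.
  demo_stack_table = { [0] = { kanjiskip = "0pt" } }
  % Create level s+1 as a copy of level s and return the new level.
  function demo_new_level(s)
    local copy = {}
    for key, value in pairs(demo_stack_table[s]) do
      copy[key] = value
    end
    demo_stack_table[s + 1] = copy
    return s + 1
  end
  % An assignment at the new level leaves the previous level untouched.
  local s = demo_new_level(0)
  demo_stack_table[s].kanjiskip = "5pt"
  texio.write_nl(demo_stack_table[0].kanjiskip)
}
\end{lstlisting}

Because each level holds its own copy, returning to an earlier stack level restores the earlier values without any explicit undo step.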
970 To resolve the problem mentioned in `Background' above, Lua\TeX-ja uses
971 one more device: when a new stack level is about to be created, a whatsit
972 node whose type, subtype and value are 44~(\textit{user\_defined}),
973 30112, and the current group level respectively is appended to the current
974 list (we refer to this node as \textit{stack\_flag}). This enables us to
975 know whether an assignment is done just inside an hbox. Suppose that the
976 stack level is~$s$ and \TeX's group level is~$t$ just after the hbox
979 \item If there is no \textit{stack\_flag} node in the list of the hbox, then
980 no assignment occurred inside the hbox. Hence values of
981 parameters at the end of the hbox are stored in the stack
983 \item If there is a \textit{stack\_flag} node whose value is~$t+1$, then
984 an assignment occurred just inside the hbox group. Hence
985 values of parameters at the end of the hbox are stored in the
987 \item If there are \textit{stack\_flag} nodes but all of their values
988 are more than~$t+1$, then an assignment occurred in the box,
989 but it was done in a `more internal' group. Hence values of
990 parameters at the end of the hbox are stored in the stack
994 Note that for this trick to work correctly, assignments to
995 \verb+\ltj@@stack+ and \verb+\ltj@@group@level+ always have to be local,
996 regardless of the value of \verb+\globaldefs+.
997 This problem is resolved by using
998 \hbox{\verb+\directlua{tex.globaldefs=0}+} (this assignment is local).
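
The three cases distinguished above can be illustrated by the following sketch, which varies the example from `Background' (the sample text and the values are arbitrary):

\begin{lstlisting}
\ltjsetparameter{kanjiskip=0pt}%
% (a) no assignment inside the hbox:
%     its list contains no stack_flag node
\setbox0=\hbox{ほげほげ}
% (b) assignment directly inside the hbox group
\setbox0=\hbox{\ltjsetparameter{kanjiskip=5pt}ほげほげ}
% (c) assignment only in a more internal group
\setbox0=\hbox{{\ltjsetparameter{kanjiskip=5pt}ほげ}ほげ}
\end{lstlisting}

In case~(c) the value 5\,pt is local to the inner group, so it is no longer in effect at the end of the hbox.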
1001 \section{Linebreak after Japanese Character}\label{sec-lbreak}
1002 \subsection{Reference: Behavior in p\TeX}
1005 In~p\TeX, a linebreak after a Japanese character doesn't emit a space,
1006 since words are not separated by spaces in Japanese writings. However,
1007 this feature isn't fully implemented in Lua\TeX-ja due to the
1008 specification of callbacks in~Lua\TeX. To clarify the difference between
1009 p\TeX~and~Lua\TeX, we briefly describe the handling of a linebreak in~p\TeX, in
1012 p\TeX's input processor can be described in terms of a finite state
1013 automaton, like that of~\TeX\ described in~Section~2.5 of~\cite{texbytopic}. The
1014 internal states are as follows:
1016 \item State~$N$: new line
1017 \item State~$S$: skipping spaces
1018 \item State~$M$: middle of line
1019 \item State~$K$: after a Japanese character
1021 The first three states ($N$, $S$~and~$M$) are the same as those of \TeX's input
1022 processor. State~$K$ is similar to state~$M$, and is entered after
1023 Japanese characters. The state transitions are shown in
1024 Figure~\ref{fig-ptexipro}. Note that p\TeX\ doesn't leave state~$K$
1025 after `beginning/ending of a group' characters.
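
For instance, under the behavior described above, p\TeX\ would treat the following input as noted in the comments (the text is an arbitrary sample):

\begin{lstlisting}
% The first line ends in state K: the linebreak emits no space.
日本語の文章は
改行しても続く.
% The closing brace does not leave state K: again no space.
{\bfseries 強調}
された部分.
% The first line ends in state M: the linebreak becomes a space.
English
text.
\end{lstlisting}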
1027 \label{fig-ptexipro}
1029 \def\sp{\text{\tt\char32}}
1031 {\text{scan a cs}}\ar@(r,ul)[dr]&\\
1033 *++[o][F-]{N}\ar[ur]^0\ar[dd]_{d,\ g}\ar[u]^{5\ (\text{\tt\char92par})}
1034 \ar@{->}@(d,l)[ddrr]_(0.45){j}&&
1035 *++[o][F-]{S}\ar@(l,dr)[ul]^0\ar@(l,ur)[ddll]_{d,\ g}\ar[u]_{5}
1036 \ar@{->}@(r,r)[dd]^{j}\\&\\&
1037 *++[o][F-]{M}\ar[uuur]^0\ar@(r,dl)[uurr]_(0.55){10\ (\sp)}
1038 \ar[d]_{5\ ({\sp})}\ar@{->}@(dr,dl)[rr]_{j}&&
1039 *++[o][F-]{K}\ar@{->}@(ul,d)[uuul]^0\ar@{->}[ll]^{d}
1040 \ar@{->}@(ur,dr)[uu]^{10\ (\sp)}\ar@{->}[d]_5\\
1043 d:=\{3,4,6,7,8,11,12,13\},\quad g:=\{1,2\},\quad j:=(\text{Japanese characters})
1046 \item Numbers represent category codes.
1047 \item Category codes 9~(ignored), 14~(comment)~and~15~(invalid) are omitted from the above diagram.
1049 \caption{State transitions of p\TeX's input processor}
1053 \subsection{Behavior in Lua\TeX-ja}
1054 The states of the input processor of Lua\TeX\ are the same as those of \TeX,
1055 and they can't be customized by any callbacks. Hence, we can only use
1056 the \verb+process_input_buffer+ and \verb+token_filter+ callbacks to
1057 suppress the space produced by a linebreak after a Japanese character.
1059 However, the \verb+token_filter+ callback cannot be used either, since a
1060 character of category code 5~(end-of-line) is converted into a space
1061 token \emph{in the input processor}. So we can use only the
1062 \verb+process_input_buffer+ callback. This means that suppressing a
1063 space must be done \emph{just before} an input line is read.
1065 Considering these circumstances, the handling of an end-of-line in Lua\TeX-ja is as follows:
1067 A character U+FFFFF (its category code is set to 14~(comment) by
1068 Lua\TeX-ja) is appended to an input line, before Lua\TeX\ actually
1069 processes it, if and only if the following two conditions are satisfied:
1071 \item The category code of the character $\langle${return}$\rangle$
1072 (whose character code is 13) is 5~(end-of-line).
1073 \item The input line matches the following `regular expression':
1075 (\text{any char})^*(\textbf{JAchar})
1076 \bigl(\{\text{catcode}=1\}\cup\{\text{catcode}=2\}\bigr)^*
1082 \section{Insertion of JFM glues, \textsf{kanjiskip} and \textsf{xkanjiskip}}
1083 This is the longest section of the document.
1085 The contents of jfmglue.tex are to be inserted here.