1 %#! lualatex -shell-escape manual.ins
4 \documentclass[a4paper,titlepage]{article}
5 \usepackage[margin=20mm]{geometry}
8 \documentclass[a4paper,titlepage]{bxjsarticle}
9 \setpagelayout*{margin=20mm}
10 \def\headfont{\normalfont\bfseries}
% \def\headfont{\sffamily\gtfamily} is needed in ordinary documents
14 \usepackage{amsmath,amssymb,xcolor,pict2e}
15 \usepackage{booktabs,listings,lltjlisting,showexpl,multicol}
17 \usepackage[unicode=true]{hyperref}
21 \DeclareRobustCommand\eTeX{\ensuremath{\varepsilon}-\kern-.125em\TeX}
22 \DeclareRobustCommand\LuaTeX{Lua\TeX}
23 \DeclareRobustCommand\pTeX{p\kern-.05em\TeX}
\DeclareRobustCommand\upTeX{up\kern-.05em\TeX}
25 \DeclareRobustCommand\pLaTeX{p\kern-.05em\LaTeX}
26 \DeclareRobustCommand\pLaTeXe{p\kern-.05em\LaTeXe}
27 \DeclareRobustCommand\epTeX{\ensuremath{\varepsilon}-\kern-.125em\pTeX}
31 \long\def\@makecaption#1#2{%
32 \vskip\abovecaptionskip
33 \sbox\@tempboxa{{\small #1. #2}}%
34 \ifdim \wd\@tempboxa >\hsize
37 \global \@minipagefalse
38 \hb@xt@\hsize{\hfil\box\@tempboxa\hfil}%
40 \vskip\belowcaptionskip}
44 \title{The \LuaTeX-ja package}
45 \author{The \LuaTeX-ja project team}
53 basicstyle=\ttfamily\small, pos=o, breaklines=true,
54 numbers=none, rframe={}, basewidth=0.5em
57 \parskip=\smallskipamount
60 \def<#1>{{\normalfont\rm\itshape$\langle$#1$\rangle$}}
67 {\Large\bf This documentation is far from complete. It may have many
68 grammatical (and contextual) errors.}
\textbf{\large This document is still far from complete.
Also, because the English and Japanese versions are generated together with the docstrip program,
79 \section{Introduction}
The \LuaTeX-ja package is a macro package for typesetting high-quality
Japanese documents with \LuaTeX.
It aims to achieve, on top of \LuaTeX, Japanese typesetting of quality equal to
or better than that of \pTeX.
\subsection{Background}
Traditionally, ASCII \pTeX, an extension of \TeX, and its derivatives
have been used to typeset Japanese documents in \TeX. Because \pTeX\ is an
engine extension of \TeX, it can produce high-quality Japanese documents
without very complicated macros. But this is a mixed
blessing: \pTeX\ has been left behind by other extensions of \TeX,
especially \eTeX\ and pdf\TeX, and by changes in
Japanese text processing on computers (\textit{e.g.}, the UTF-8 encoding).
Recently, extensions of \pTeX, namely \upTeX\ (a Unicode implementation
of \pTeX) and \epTeX\ (a merger of \pTeX\ and the
\eTeX\ extension), have been developed to fill those gaps to some
extent, but gaps still exist.
However, the appearance of \LuaTeX\ changed the whole situation. By
using Lua `callbacks', users can customize the internal processing of
\LuaTeX. So there is no need to modify the sources of the engine to
support Japanese typesetting: we only have to write Lua
scripts for the appropriate callbacks.
111 \subsection{Major Changes from \pTeX}
The \LuaTeX-ja package is strongly influenced by the \pTeX\ engine. The initial
target of development was to implement the features of \pTeX. However,
\emph{\LuaTeX-ja is not just a port of \pTeX; unnatural
specifications/behaviors of \pTeX\ were not adopted}.

The following are major changes from \pTeX:
119 \item A Japanese font is a tuple of a `real' font, a Japanese font
120 metric (\textbf{JFM}, for short), and an optional string called
\item In \pTeX, a line break after a Japanese character is ignored (and
doesn't yield a space), since line breaks (in source files) are
permitted almost everywhere in Japanese text. However, \LuaTeX-ja
does not fully provide this feature, because of a specification
\item The insertion process of glues/kerns between two Japanese
characters and between a Japanese character and other characters
(we refer to these glues/kerns as \textbf{JAglue}) is rewritten from
134 \item As \LuaTeX's internal character handling is `node-based'
135 (\textit{e.g.}, \verb+of{}fice+ doesn't prevent ligatures), the
136 insertion process of \textbf{JAglue} is now `node-based'.
\item Furthermore, nodes between two characters which have no effect on
line breaking (\textit{e.g.}, \verb+\special+ nodes) are ignored in the
\item In the process, two Japanese fonts which differ only in their `real'
fonts are treated as identical.
\item At present, vertical typesetting (\emph{tategaki}) is not
supported in \LuaTeX-ja.
147 For detailed information, see Part~\ref{part-imp}.
149 \subsection{Notations}
150 In this document, the following terms and notations are used:
152 \item Characters are divided into two types:
154 \item \textbf{JAchar}: standing for Japanese characters such as
155 Hiragana, Katakana, Kanji and other punctuation marks for
157 \item \textbf{ALchar}: standing for all other characters like alphabets.
159 We say `alphabetic fonts' for fonts used in \textbf{ALchar}, and `Japanese fonts' for fonts used in \textbf{JAchar}.
161 \item A word in a sans-serif font (like \textsf{prebreakpenalty})
162 represents an internal parameter for Japanese typesetting, and it
163 is used as a key in \verb+\ltjsetparameter+ command.
\item The word `primitive' is used not only for primitives in \LuaTeX,
but also for control sequences that are defined in the core module of
167 \item In this document, natural numbers start from~0.
170 \subsection{About the project}
171 \paragraph{Project Wiki} Project Wiki is under construction.
173 \item \url{http://sourceforge.jp/projects/luatex-ja/wiki/FrontPage%28en%29} (English)
174 \item \url{http://sourceforge.jp/projects/luatex-ja/wiki/FrontPage} (Japanese)
177 This project is hosted by SourceForge.JP.
180 % \begin{multicols}{2}
182 % \item Hironori KITAGAWA
184 % \item Takayuki YATO
185 % \item Yusuke KUROKI
187 % \item Munehiro YAMAMOTO
188 % \item Tomoaki HONDA
193 % \paragraph{Acknowledgments} -- 挿入するならここ
196 \section{Getting Started}
197 \subsection{Installation}
198 To install the \LuaTeX-ja\ package, you will need:
200 \item \LuaTeX\ (version 0.65.0-beta or later) and its supporting packages.\\
201 If you are using \TeX~Live\ 2011 or current W32\TeX, you don't have to worry.
202 \item The source archive of \LuaTeX-ja, of course{\tt:)}
205 The installation methods are as follows:
207 \item Download the source archive.
At present, \LuaTeX-ja has no official release, so you have to retrieve
210 the archive from the repository.
211 You can retrieve the Git repository via
213 $ git clone git://git.sourceforge.jp/gitroot/luatex-ja/luatexja.git
215 or download the archive of HEAD in \texttt{master} branch from
217 \url{http://git.sourceforge.jp/view?p=luatex-ja/luatexja.git;a=snapshot;h=HEAD;sf=tgz}.
219 \item Extract the archive. You will see {\tt src/} and several other sub-directories.
\item Copy all the contents of {\tt src/} into one of your \texttt{TEXMF} trees.
\item If the filename database needs to be updated, run {\tt mktexlsr}.
224 \subsection{Cautions}
226 \item The encoding of your source file must be UTF-8.
\item \LuaTeX-ja is not yet well tested. In particular, the default setting of the range
of \textbf{JAchar} in the present version does not coexist well with
other packages which use Unicode fonts.
232 \subsection{Using in plain \TeX}\label{ssec-plain}
233 To use \LuaTeX-ja in plain \TeX, simply put the following at the beginning of the document:
238 This does minimal settings (like {\tt ptex.tex}) for typesetting Japanese documents:
240 \item The following 6~Japanese fonts are preloaded:
242 \begin{tabular}{ccccc}
244 \textbf{classification}&\textbf{font name}&\textbf{13.5\,Q}&\textbf{9.5\,Q}&\textbf{7\,Q}\\\midrule
245 \emph{mincho}&Ryumin-Light &\verb+\tenmin+&\verb+\sevenmin+&\verb+\fivemin+\\
246 \emph{gothic}&GothicBBB-Medium&\verb+\tengt+ &\verb+\sevengt+ &\verb+\fivegt+\\
251 \item The `Q' is a unit used in Japanese phototypesetting, and
252 $1\,\textrm{Q}=0.25\,\textrm{mm}$. This length is stored in a
253 dimension \verb+\jQ+.
\item It is common practice that the fonts `Ryumin-Light' and
`GothicBBB-Medium' are not embedded into PDF files, and that PDF readers
substitute them with some external Japanese fonts (\textit{e.g.},
Kozuka Mincho is used for Ryumin-Light in Adobe Reader). We adopt this custom to
\item You may notice that the sizes of the above fonts are slightly smaller than
those of their alphabetic counterparts: for example, the size of
\verb+\tenmin+ is $13.5\,\textrm{Q}\simeq 9.60444\,\textrm{pt}$. This is intentional: ...
\item The amount of glue that is inserted between a \textbf{JAchar} and
an \textbf{ALchar} (the parameter \textsf{xkanjiskip}) is set to
267 0.25\,\hbox{\verb+\zw+}^{+1\,\text{pt}}_{-1\,\text{pt}} = \frac{27}{32}\,\mathrm{mm}^{+1\,\text{pt}}_{-1\,\text{pt}}.
Here \verb+\zw+ is the counterpart of \texttt{em} for Japanese fonts, that is, the width of a `full-width' character in the current Japanese font.
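
As a minimal sketch of the settings described above, the following plain \TeX\ source loads \LuaTeX-ja and uses two of the preloaded fonts together with \verb+\jQ+. The loading line \verb+\input luatexja.sty+ and the sample text are assumptions made for illustration; the file is assumed to be saved in UTF-8 and processed with \texttt{luatex}.

\begin{verbatim}
% Illustrative plain TeX document (a sketch, not the official sample).
\input luatexja.sty   % assumed loading line for plain TeX
{\tenmin 明朝体の文章} and {\tengt ゴシック体の文章}.
% \jQ holds the length 1Q = 0.25mm, so this rule is 10mm wide.
\hbox to 40\jQ{\hrulefill}
\bye
\end{verbatim}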
272 \subsection{Using in \LaTeX}\label{ssec-ltx}
Usage in \LaTeXe\ is basically the same. To set up the minimal environment
for Japanese, you only have to load {\tt luatexja.sty}:
277 \usepackage{luatexja}
279 It also does minimal settings (counterparts in \pLaTeX\ are {\tt
280 plfonts.dtx} and {\tt pldefs.ltx}):
283 \item {\tt JY3} is the font encoding for Japanese fonts (in horizontal direction).\\
284 When vertical typesetting is supported by \LuaTeX-ja in the future, {\tt JT3} will be used for vertical fonts.
285 \item Two font families {\tt mc} and {\tt gt} are defined:
287 \begin{tabular}{ccccc}
289 \textbf{classification}&\textbf{family}&\verb+\mdseries+&\verb+\bfseries+&\textbf{scale}\\\midrule
290 \emph{mincho}&\tt mc&Ryumin-Light &GothicBBB-Medium&0.960444\\
291 \emph{gothic}&\tt gt&GothicBBB-Medium&GothicBBB-Medium&0.960444\\
295 \textbf{Note on fonts in bold series}
297 \item Japanese characters in math mode are typeset by the font family {\tt mc}.
However, the above settings are not sufficient for Japanese-based
documents. To typeset Japanese-based documents, you should use
class files other than {\tt article.cls}, {\tt book.cls}, \ldots. At
present, BXjscls (\texttt{bxjsarticle.cls} and \texttt{bxjsbook.cls}, by
Takayuki Yato) is the better alternative. It has not been decided whether
\LuaTeX-ja will develop and include counterparts of the major classes used
in \pTeX\ (including jsclasses by Haruhiko Okumura).
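
For illustration, a minimal \LaTeXe\ document using only the settings listed above might look like the following sketch. It uses \texttt{article.cls} to stay self-contained; the sample text is an assumption, and the file is assumed to be saved in UTF-8 and processed with \texttt{lualatex}.

\begin{verbatim}
% Illustrative LaTeX document (a sketch under the assumptions above).
\documentclass{article}
\usepackage{luatexja}   % minimal Japanese settings (JY3, mc, gt)
\begin{document}
これは明朝体です. {\gtfamily これはゴシック体です.}
English text can be mixed freely with 日本語.
\end{document}
\end{verbatim}

For Japanese-based documents, the class in the first line would be replaced by one of the BXjscls classes mentioned above; any class options that such a class requires are not shown here.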
308 \subsection{Changing Fonts}
309 \paragraph{Remark: Japanese Characters in Math Mode}
310 Since \pTeX\ supports Japanese characters in math mode, there are
311 sources like the following:
314 $f_{高温}$~($f_{\text{high temperature}}$).
315 \[ y=(x-1)^2+2\quad{}よって\quad y>0 \]
316 $5\in{}素:=\{\,p\in\mathbb N:\text{$p$ is a prime}\,\}$.
We (the project members of \LuaTeX-ja) think that using
Japanese characters in math mode is allowed if and only if they are used as identifiers.
From this point of view,
\item Lines 1~and~2 above are not correct, since `高温' is used as a textual label, and
`よって' is used as a conjunction.
\item However, line~3 is correct, since `素' is used as an identifier.
327 Hence, in our opinion, the above input should be corrected as:
330 ($f_{\text{high temperature}}$).
332 \mathrel{\text{よって}}\quad y>0 \]
333 $5\in{}素:=\{\,p\in\mathbb N:\text{$p$ is a prime}\,\}$.
% BUG?: without \{\}, the character `素' is not output. The `よって' in the paragraph above is not output either.
336 We also believe that using Japanese characters as identifiers is rare,
337 hence we don't describe how to change Japanese fonts in math mode in
338 this chapter. For the method, please see Part~\ref{part-ref}.
341 \paragraph{plain \TeX}
342 To change Japanese fonts in plain \TeX, you must use the primitive
343 \verb+\jfont+. So please see Part~\ref{part-ref}.
347 For \LaTeXe, \LuaTeX-ja simply adopted the font selection system from that
348 of \pLaTeXe\ (in {\tt plfonts.dtx}).
350 \item Two control sequences \verb+\mcdefault+ and \verb+\gtdefault+ are
351 used to specify the default font families for \emph{mincho} and
352 \emph{gothic}, respectively. Default values: \texttt{mc} for
353 \verb+\mcdefault+ and \texttt{gt} for \verb+\gtdefault+.
354 \item Commands \verb+\fontfamily+, \verb+\fontseries+,
355 \verb+\fontshape+ and \verb+\selectfont+ can be used to change
356 attributes of Japanese fonts.
358 \begin{tabular}{ccccc}
360 &\textbf{encoding}&\textbf{family}&\textbf{series}&\textbf{shape}\\\midrule
362 &\verb+\romanencoding+&\verb+\romanfamily+&\verb+\romanseries+&\verb+\romanshape+\\
364 &\verb+\kanjiencoding+&\verb+\kanjifamily+&\verb+\kanjiseries+&\verb+\kanjishape+\\
both&---&---&\verb+\fontseries+&\verb+\fontshape+\\
366 auto select&\verb+\fontencoding+&\verb+\fontfamily+&---&---\\
370 \item For defining a Japanese font family, use \verb+\DeclareKanjiFamily+
371 instead of \verb+\DeclareFontFamily+.
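
The commands in the table above can be combined as in the following sketch, which changes the Japanese font and the alphabetic font independently. The sample text is illustrative only, and \verb+\sfdefault+ is the standard \LaTeX\ default for the sans-serif family rather than something defined by \LuaTeX-ja.

\begin{verbatim}
% Switch only the Japanese font to the gothic family.
{\kanjifamily{\gtdefault}\selectfont 日本語 and English}
% Switch only the alphabetic font to the sans-serif family.
{\romanfamily{\sfdefault}\selectfont 日本語 and English}
\end{verbatim}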
To coexist with the \texttt{fontspec} package, you need to load the
\texttt{luatexja-fontspec} package in the preamble. This additional
377 package automatically loads \texttt{luatexja} and \texttt{fontspec}
380 In \texttt{luatexja-fontspec} package, the following 7~commands are defined as
381 counterparts of original commands in \texttt{fontspec}:
383 \begin{tabular}{ccccc}
386 &\verb+\jfontspec+&\verb+\setmainjfont+&\verb+\setsansjfont+&\verb+\newjfontfamily+\\
388 &\verb+\fontspec+&\verb+\setmainfont+&\verb+\setsansfont+&\verb+\newfontfamily+\\
391 &\verb+\newjfontface+&\verb+\defaultjfontfeatures+&\verb+\addjfontfeatures+\\
393 &\verb+\newfontface+&\verb+\defaultfontfeatures+&\verb+\addfontfeatures+\\
Note that there is no command named \verb+\setmonojfont+, since in
most Japanese fonts nearly all Japanese glyphs have the same width.
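
The following preamble is a sketch of how these commands might be used together. The font names (TeX Gyre Termes, IPAexMincho, IPAexGothic) are assumptions and must be replaced by fonts that are actually installed; it is also assumed here that \verb+\setsansjfont+ sets the font selected by \verb+\gtfamily+.

\begin{verbatim}
% Illustrative document; font names are assumptions.
\documentclass{article}
\usepackage{luatexja-fontspec}   % loads luatexja and fontspec
\setmainfont{TeX Gyre Termes}    % alphabetic font
\setmainjfont{IPAexMincho}       % Japanese font for mincho
\setsansjfont{IPAexGothic}       % Japanese font for gothic
\begin{document}
日本語の本文 and English text. {\gtfamily ゴシック体}
\end{document}
\end{verbatim}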
404 \section{Changing Parameters}
There are many parameters in \LuaTeX-ja. Due to the behavior of \LuaTeX,
most of them are not stored in internal registers of \TeX, but in
\LuaTeX-ja's own storage system. Hence, to assign or acquire those
parameters, you have to use the commands \verb+\ltjsetparameter+ and
\verb+\ltjgetparameter+.
411 \subsection{Editing the range of \textbf{JAchar}s}
To edit the range of \textbf{JAchar}s, you first have to assign a non-zero
natural number which is less than 217 to the character range. This
can be done by using the \verb+\ltjdefcharrange+ primitive. For example, the
next line assigns all characters in the Supplementary Multilingual Plane
and the character `漢' to the range number~100.
420 \ltjdefcharrange{100}{"10000-"1FFFF,`漢}
This assignment of numbers to ranges is always global, so you should
not do this in the middle of a document.
% Author's note in the source (translated): `overwriting'.
425 After assigning numbers to ranges, ...
427 \paragraph{Default Setting}
\LuaTeX-ja predefines eight character ranges for convenience. They are
determined from the following data:
431 \item Blocks in Unicode~6.0.
\item The \texttt{Adobe-Japan1-UCS2} mapping between CIDs of Adobe-Japan1-6 and Unicode.
433 \item The \texttt{PXbase} bundle for \upTeX\ by Takayuki Yato.
Now we describe these eight ranges. The letter `J' or `A' after the
number shows whether characters in the range are treated as
\textbf{JAchar}s or not by default. These settings are similar to \texttt{prefercjk} ...
\item[Range~8${}^{\text{J}}$] Symbols in the intersection of the upper half of ISO~8859-1
(Latin-1 Supplement) and JIS~X~0208 (a basic character set for Japanese). This character range
consists of the following characters:
445 \def\ch#1#2{\item \char"#1\ ({\tt U+00#1}, #2)}%"
\ch{A7}{Section sign}
447 \ch{A8}{Umlaut or diaeresis}
449 \ch{B1}{Plus-minus sign}
450 \ch{B4}{Spacing acute}
451 \ch{B6}{Paragraph sign}
452 \ch{D7}{Multiplication sign}
\ch{F7}{Division sign}
\item[Range~1${}^{\text{A}}$] Latin characters, some of which are included in Adobe-Japan1-6.
This range consists of the following Unicode ranges, \emph{except characters in the range~8 above}:
460 \item {\tt U+0080}--{\tt U+00FF}: Latin-1 Supplement
461 \item {\tt U+0100}--{\tt U+017F}: Latin Extended-A
462 \item {\tt U+0180}--{\tt U+024F}: Latin Extended-B
463 \item {\tt U+0250}--{\tt U+02AF}: IPA Extensions
464 \item {\tt U+02B0}--{\tt U+02FF}: Spacing Modifier Letters
465 \item {\tt U+0300}--{\tt U+036F}: Combining Diacritical Marks
466 \item {\tt U+1E00}--{\tt U+1EFF}: Latin Extended Additional
\item[Range~2${}^{\text{J}}$] Greek and Cyrillic letters. JIS~X~0208 (hence most Japanese
fonts) has some of these characters.
474 \item {\tt U+0370}--{\tt U+03FF}: Greek and Coptic
475 \item {\tt U+0400}--{\tt U+04FF}: Cyrillic
476 \item {\tt U+1F00}--{\tt U+1FFF}: Greek Extended
\item[Range~3${}^{\text{J}}$] Punctuation and miscellaneous symbols. The block list is
shown in Table~\ref{table-rng3}.
483 \caption{Unicode blocks in predefined character range~3.}\label{table-rng3}
484 \catcode`\"=13\def"#1#2#3#4{{\tt U+#1#2#3#4}}%"
487 "2000--"206F&General Punctuation\\
488 "2070--"209F&Superscripts and Subscripts\\
489 "20A0--"20CF&Currency Symbols\\
490 "20D0--"20FF&Combining Diacritical Marks for Symbols\\
491 "2100--"214F&Letterlike Symbols\\
492 "2150--"218F&Number Forms\\
493 "2190--"21FF&Arrows\\
494 "2200--"22FF&Mathematical Operators\\
495 "2300--"23FF&Miscellaneous Technical\\
496 "2400--"243F&Control Pictures\\
497 "2500--"257F&Box Drawing\\
498 "2580--"259F&Block Elements\\
499 "25A0--"25FF&Geometric Shapes\\
500 "2600--"26FF&Miscellaneous Symbols\\
501 "2700--"27BF&Dingbats\\
502 "2900--"297F&Supplemental Arrows-B\\
503 "2980--"29FF&Miscellaneous Mathematical Symbols-B\\
504 "2B00--"2BFF&Miscellaneous Symbols and Arrows\\
505 "E000--"F8FF&Private Use Area\\
506 "FB00--"FB4F&Alphabetic Presentation Forms
510 \item[Range~4${}^{\text{A}}$] Characters usually not in Japanese fonts. This range consists
511 of almost all Unicode blocks which are not in other
512 predefined ranges. Hence, instead of showing the block list,
513 we put the definition of this range itself:
515 \ltjdefcharrange{4}{%
516 "500-"10FF, "1200-"1DFF, "2440-"245F, "27C0-"28FF, "2A00-"2AFF,
517 "2C00-"2E7F, "4DC0-"4DFF, "A4D0-"A82F, "A840-"ABFF, "FB50-"FE0F,
518 "FE20-"FE2F, "FE70-"FEFF, "10000-"1FFFF} % non-Japanese
520 \item[Range~5${}^{\text{A}}$] Surrogates and Supplementary Private Use Areas.
521 \item[Range~6${}^{\text{J}}$] Characters used in Japanese. The block list is indicated in Table~\ref{table-rng6}.
523 \caption{Unicode blocks in predefined character range~6.}\label{table-rng6}
524 \catcode`\"=13\def"#1#2#3#4{{\tt U+#1#2#3#4}}%"
527 "2460--"24FF&Enclosed Alphanumerics\\
528 "2E80--"2EFF&CJK Radicals Supplement\\
529 "3000--"303F&CJK Symbols and Punctuation\\
530 "3040--"309F&Hiragana\\
531 "30A0--"30FF&Katakana\\
532 "3190--"319F&Kanbun\\
533 "31F0--"31FF&Katakana Phonetic Extensions\\
534 "3200--"32FF&Enclosed CJK Letters and Months\\
535 "3300--"33FF&CJK Compatibility\\
536 "3400--"4DBF&CJK Unified Ideographs Extension A\\
537 "4E00--"9FFF&CJK Unified Ideographs\\
538 "F900--"FAFF&CJK Compatibility Ideographs\\
539 "FE10--"FE1F&Vertical Forms\\
540 "FE30--"FE4F&CJK Compatibility Forms\\
541 "FE50--"FE6F&Small Form Variants\\
542 "{20}000--"{2F}FFF&(Supplementary Ideographic Plane)
546 \item[Range~7${}^{\text{J}}$] Characters used in CJK languages, but not included in Adobe-Japan1-6.
547 The block list is indicated in Table~\ref{table-rng7}.
549 \caption{Unicode blocks in predefined character range~7.}\label{table-rng7}
550 \catcode`\"=13\def"#1#2#3#4{{\tt U+#1#2#3#4}}%"
553 "1100--"11FF&Hangul Jamo\\
554 "2F00--"2FDF&Kangxi Radicals\\
555 "2FF0--"2FFF&Ideographic Description Characters\\
556 "3100--"312F&Bopomofo\\
557 "3130--"318F&Hangul Compatibility Jamo\\
558 "31A0--"31BF&Bopomofo Extended\\
559 "31C0--"31EF&CJK Strokes\\
560 "A000--"A48F&Yi Syllables\\
561 "A490--"A4CF&Yi Radicals\\
562 "A830--"A83F&Common Indic Number Forms\\
563 "AC00--"D7AF&Hangul Syllables\\
564 "D7B0--"D7FF&Hangul Jamo Extended-B
571 \subsection{\textsf{kanjiskip} and \textsf{xkanjiskip}}\label{subs-kskip}
572 \textbf{JAglue} is divided into the following three categories:
\item Glues/kerns specified in JFM. If \verb+\inhibitglue+ is issued
around a Japanese character, this glue will not be inserted at the
\item The default glue which is inserted between two \textbf{JAchar}s ({\sf
\item The default glue which is inserted between a \textbf{JAchar} and an
\textbf{ALchar} (\textsf{xkanjiskip}).
The value (a skip) of \textsf{kanjiskip} or \textsf{xkanjiskip} can be
changed as follows.
585 \ltjsetparameter{kanjiskip={0pt plus 0.4pt minus 0.4pt},
586 xkanjiskip={0.25\zw plus 1pt minus 1pt}}
A JFM may contain the data of the `ideal width of {\sf
kanjiskip}' and/or the `ideal width of \textsf{xkanjiskip}'.
To use these data from the JFM, set the value of \textsf{kanjiskip} or
\textsf{xkanjiskip} to \verb+\maxdimen+.
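
For instance, a setting along the following lines would make both parameters follow the current JFM. It is only a sketch of the rule just stated; whether the JFM in use actually provides these values depends on the JFM.

\begin{verbatim}
% Use the ideal kanjiskip and xkanjiskip stored in the JFM.
\ltjsetparameter{kanjiskip=\maxdimen, xkanjiskip=\maxdimen}
\end{verbatim}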
595 \subsection{Insertion Setting of \textsf{xkanjiskip}}
It is not desirable that \textsf{xkanjiskip} is inserted at every
boundary between \textbf{JAchar}s and \textbf{ALchar}s. For example,
\textsf{xkanjiskip} should not be inserted after an opening parenthesis
(\textit{e.g.}, compare `(あ' and `(\hskip\ltjgetparameter{xkanjiskip}あ').
\LuaTeX-ja can control whether \textsf{xkanjiskip} can be inserted
before/after a character, by changing the \textsf{jaxspmode} parameter for \textbf{JAchar}s and
the \textsf{alxspmode} parameter for \textbf{ALchar}s.
605 \ltjsetparameter{jaxspmode={`あ,preonly}, alxspmode={`\!,postonly}}
The second argument {\tt preonly} means `the insertion of
\textsf{xkanjiskip} is allowed before this character, but not after'.
The other possible values are {\tt postonly}, {\tt allow} and {\tt
inhibit}. For compatibility with \pTeX, natural numbers between
0~and~3 are also allowed as the second argument\footnote{But we don't
recommend this, since the numbers 1~and~2 have opposite meanings in
\textsf{jaxspmode} and \textsf{alxspmode}.}.
If you want to enable/disable all insertions of \textsf{kanjiskip} and
\textsf{xkanjiskip}, set the \textsf{autospacing} and \textsf{autoxspacing}
parameters to {\tt true}/{\tt false}, respectively.
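
The following lines sketch how the parameters of this subsection might be combined. The choice of the character `!' and the sample text are illustrative, and the second example assumes that the local assignment made by \verb+\ltjsetparameter+ takes effect for the text inside the group.

\begin{verbatim}
% Never insert xkanjiskip before or after the ALchar `!'.
\ltjsetparameter{alxspmode={`\!,inhibit}}
% Disable all xkanjiskip insertion inside this group only.
{\ltjsetparameter{autoxspacing=false}日本語English日本語}
\end{verbatim}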
622 \subsection{Shifting Baseline}
To match a Japanese font with an alphabetic font, it is sometimes necessary
to shift the baseline of one of the pair. In \pTeX, this is achieved
by setting \verb+\ybaselineshift+ to a non-zero length (the
baseline of alphabetic fonts is shifted downward). However, for documents
whose main language is not Japanese, it is better to shift the baseline of
Japanese fonts rather than that of alphabetic fonts.
Because of this, \LuaTeX-ja can independently set the shifting amount
of the baseline of alphabetic fonts (the \textsf{yalbaselineshift}
parameter) and that of Japanese fonts (the \textsf{yjabaselineshift}
635 \vrule width 150pt height 0.4pt depth 0pt\hskip-120pt
636 \ltjsetparameter{yjabaselineshift=0pt, yalbaselineshift=0pt}abcあいう
637 \ltjsetparameter{yjabaselineshift=5pt, yalbaselineshift=2pt}abcあいう
Here the horizontal line above is the baseline of the line.
There is an interesting side effect: characters of different sizes can be
vertically centered in a line by setting the two parameters appropriately.
The following is an example (beware that the values are not well tuned):
647 \ltjsetparameter{yjabaselineshift=-1pt,
648 yalbaselineshift=-1pt}
654 \subsection{Cropmark}
A cropmark is a mark indicating the 4~corners and the horizontal/vertical
center of the paper. In Japanese, a cropmark is called a tombo(w).
\pLaTeX\ and \LuaTeX-ja support `tombow' in their kernels.
The following steps are needed to typeset cropmarks:
661 \item First, define the banner which will be printed at the upper left
662 of the paper. This is done by assigning a token list to
663 \verb+\@bannertoken+.
For example, the following sets the banner to `{\tt filename (2012-01-01 17:01)}':
669 \hour\time \divide\hour by 60 \@tempcnta\hour \multiply\@tempcnta 60\relax
670 \minute\time \advance\minute-\@tempcnta
672 \jobname\space(\number\year-\two@digits\month-\two@digits\day
673 \space\two@digits\hour:\two@digits\minute)}%
680 \part{Reference}\label{part-ref}
681 \section{Font Metric and Japanese Font}
682 \subsection{\texttt{\char92jfont} primitive}
683 To load a font as a Japanese font, you must use the
684 \verb+\jfont+ primitive instead of~\verb+\font+, while
685 \verb+\jfont+ admits the same syntax used in~\verb+\font+.
686 \LuaTeX-ja automatically loads \texttt{luaotfload} package,
687 so TrueType/OpenType fonts with features can be used for Japanese fonts:
689 \jfont\tradgt={file:ipaexg.ttf:script=latn;%
690 +trad;jfm=ujis} at 14pt
Note that the control sequence defined by \verb+\jfont+
(\verb+\tradgt+ in the example above) is not a
\textit{font\_def} token, hence input like
\verb+\fontname\tradgt+ causes an error. We denote control sequences which are defined by \verb+\jfont+
Besides the \texttt{file:}\ and \texttt{name:}\ prefixes, \texttt{psft:}\ can
be used as a prefix in the \verb+\jfont+ (and~\verb+\font+) primitive. Using
this prefix, you can specify a font that has only a name and is not
related to any real font.

The main use of the \texttt{psft:}\ prefix is for the non-embedded `standard' Japanese fonts (Ryumin-Light and GothicBBB-Medium).
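
As a sketch only, a non-embedded font might be declared as follows. The syntax after the prefix is an assumption made by analogy with the \texttt{file:}\ example above (the font name followed by the \texttt{jfm} key), and \verb+\mymin+ is a hypothetical control sequence name.

\begin{verbatim}
% Assumed syntax, by analogy with the file: prefix example.
\jfont\mymin={psft:Ryumin-Light:jfm=ujis} at 13.5\jQ
{\mymin 日本語の文章}
\end{verbatim}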
714 \subsection{Structure of JFM file}
715 A JFM file is a Lua script which has only one function call:
717 luatexja.jfont.define_jfm { ... }
The actual data are stored in the table indicated above by
\verb+{ ... }+. So, the rest of this subsection is devoted to describing the
structure of this table. Note that all lengths in a JFM file are
floating-point numbers in design-size units.
724 \begin{list}{}{\def\makelabel{\ttfamily}\def\{{\char`\{}\def\}{\char`\}}}
725 \item[dir=<direction>] (required)
727 The direction of JFM. At the present, only \texttt{'yoko'} is supported.
729 \item[zw=<length>] (required)
The width of a `full-width' character.
733 \item[zh=<length>] (required)
735 \item[kanjiskip=\{<natural>, <stretch>, <shrink>\}] (optional)
737 This field specifies the `ideal' amount of \textsf{kanjiskip}. As noted
738 in Subsection~\ref{subs-kskip}, if the parameter
739 \textsf{kanjiskip} is \verb+\maxdimen+, the value specified
740 in this field is actually used (if this field is not specified in
741 JFM, it is regarded as 0\,pt). Note that <stretch> and <shrink>
742 fields are in design-size unit too.
745 \item[xkanjiskip=\{<natural>, <stretch>, <shrink>\}] (optional)
747 Like the \texttt{kanjiskip} field, this field specifies the `ideal'
748 amount of \textsf{xkanjiskip}.
Besides the above fields, a JFM file has several sub-tables whose
indices are natural numbers. The table indexed by~$i\in\omega$ stores
information about `character class'~$i$. Character class~0 is
always present, so each JFM file must have a sub-table whose index is
\texttt{[0]}. Each sub-table (its numerical index is denoted by $i$) has
the following fields:
759 \begin{list}{}{\def\makelabel{\ttfamily}\def\{{\char`\{}\def\}{\char`\}}}
760 \item[chars=\{<character>, ...\}] (required except character class~0)
This field is a list of the characters which belong to
character class~$i$. This field is not required if $i=0$, since every
\textbf{JAchar} which is not in any character class other
than 0 belongs to character class~0 (hence, character class~0 contains most
\textbf{JAchar}s). In the list, a character can be
specified by its code number, or by the character itself
(as a string of length~1).
770 In addition to those `real' characters, the following `imaginary
771 characters' can be specified in the list:
773 \item[width=<length>, height=<length>, depth=<length>, italic=<length>]\ (required)
Specify the width, height, depth and
the amount of italic correction of characters in character class~$i$. All characters in character class~$i$ are regarded as having the width, height and depth
given by these fields.
But there is one exception: if \texttt{'prop'} is specified in the \texttt{width} field, the width of a character becomes that of its `real' glyph
780 \item[left=<length>, down=<length>, align=<align>]\
These fields are for adjusting the position of the `real' glyph. Legal
values of the \texttt{align} field are \texttt{'left'},
\texttt{'middle'} and \texttt{'right'}. If any of these
3~fields is omitted, \texttt{left} and \texttt{down} are
treated as~0, and the \texttt{align} field is treated as
788 The effects of these 3~fields are indicated in Figure~\ref{fig-pos}.
In most cases, the \texttt{left} and \texttt{down} fields are~0, while
it is not uncommon that the \texttt{align} field is \texttt{'middle'} or \texttt{'right'}.
For example, setting the \texttt{align} field to \texttt{'right'} is needed in practice
when the current character class is the class for `opening delimiters'.
795 \begin{minipage}{0.4\textwidth}%
796 \begin{center}\unitlength=10pt\small
797 \begin{picture}(15,12)(-1,-4)
798 \color{black!10!white}% real glyph :step1
799 \put(0,0){\vrule width 12\unitlength height 8\unitlength depth 3\unitlength}
801 \color{red!20!white}% real glyph :step1
802 \put(-1,-1.5){\vrule width 6\unitlength height 7\unitlength depth 2.5\unitlength}
804 \color{red}% real glyph
806 \put(-1,-1.5){\vector(0,1){7}\vector(0,-1){2.5}\vector(1,0){6}}
807 \put(5,-1.5){\line(0,1){7}\line(0,-1){2.5}}
808 \put(-1,5.5){\line(1,0){6}}
809 \put(-1,-4){\line(1,0){6}}
811 \color{green!20!white}% real glyph :step1
812 \put(3,0){\vrule width 6\unitlength height 7\unitlength depth 2.5\unitlength}
814 \color{black}% real glyph :step1
816 \put(0,0){\vector(0,1){8}\line(0,-1){3}\vector(1,0){12}}
817 \put(12,0){\line(0,1){8}\vector(0,-1){3}}
818 \put(0,8){\line(1,0){12}}
819 \put(0,-3){\line(1,0){12}}
820 \put(0.2,4){\makebox(0,0)[l]{\texttt{height}}}
821 \put(12.2,-1.5){\makebox(0,0)[l]{\texttt{depth}}}
822 \put(6,0.2){\makebox(0,0)[b]{\texttt{width}}}
824 \color{green!50!black}% real glyph :step1
826 \put(3,0){\vector(0,1){7}\vector(0,-1){2.5}\vector(1,0){6}}
827 \put(9,0){\line(0,1){7}\line(0,-1){2.5}}
828 \put(3,7){\line(1,0){6}}
829 \put(3,-2.5){\line(1,0){6}}
831 \savebox{\eqdist}(0,0)[b]{%
833 \put(-0.08,0.2){\line(0,-1){0.4}}%
834 \put(0.08,0.2){\line(0,-1){0.4}}}
835 \put(1.5,0){\usebox{\eqdist}}
836 \put(10.5,0){\usebox{\eqdist}}
838 \color{blue}% shifted
840 \put(3,-1.5){\vector(-1,0){4}}
841 \put(1,-1.7){\makebox(0,0)[t]{\texttt{left}}}
842 \put(3,0){\vector(0,-1){1.5}}
843 \put(3.2,-0.75){\makebox(0,0)[l]{\texttt{down}}}
847 \begin{minipage}{0.6\textwidth}%
Consider a node containing a Japanese character whose \texttt{align}
field is \texttt{'middle'}.
851 \item The black rectangle is a frame of the node.
852 Its width, height and depth are specified by JFM.
853 \item Since the \texttt{align} field is \texttt{'middle'},
854 the `real' glyph is centered horizontally (the green rectangle).
855 \item Furthermore, the glyph is shifted according to values of fields
856 \texttt{left} and \texttt{down}. The ultimate position of the real
857 glyph is indicated by the red rectangle.
860 \caption{The position of the `real' glyph.}
865 \item[kern={\{[$j$]=<kern>, ...\}}]
867 \item[glue={\{[$j$]=\{<width>, <stretch>, <shrink>\}, ...\}}]
870 \subsection{Math Font Family}
\TeX\ handles fonts in math formulas by 16~font families\footnote{Omega,
Aleph, \LuaTeX~and $\varepsilon$-\kern-.125em(u)\pTeX\ can handle 256~families, but
an external package is needed to support this in plain \TeX\ and
\LaTeX.}, and each family has three fonts:
\verb+\textfont+, \verb+\scriptfont+ and \verb+\scriptscriptfont+.
877 \LuaTeX-ja's handling of Japanese fonts in math formulas is similar;
878 Table~\ref{tab-math} shows counterparts to \TeX's primitives for math
883 \caption{Primitives for Japanese math fonts.}
884 \begin{center}\def\{{\char`\{}\def\}{\char`\}}
887 &Japanese fonts&alphabetic fonts\\
889 font family&\verb+\jfam+${}\in [0,256)$&\verb+\fam+\\
890 text size&\tt\textsf{jatextfont}\,=\{<jfam>,<jfont\_cs>\}&\tt\verb+\textfont+<fam>=<font\_cs>\\
891 script size&\tt\textsf{jascriptfont}\,=\{<jfam>,<jfont\_cs>\}&\tt\verb+\scriptfont+<fam>=<font\_cs>\\
892 scriptscript size&\tt\textsf{jascriptscriptfont}\,=\{<jfam>,<jfont\_cs>\}&\tt\verb+\scriptscriptfont+<fam>=<font\_cs>\\
900 \subsection{{\tt\char92 ltjsetparameter} primitive}
As noted before, \verb+\ltjsetparameter+ and \verb+\ltjgetparameter+ are
primitives for accessing most parameters of \LuaTeX-ja. One of the main
reasons that \LuaTeX-ja didn't adopt a syntax similar to that of \pTeX\
(\textit{e.g.},~\verb+\prebreakpenalty`)=10000+)
is the position of the \verb+hpack_filter+ callback in the source
of \LuaTeX; see Section~\ref{sec-para}.
\verb+\ltjsetparameter+ and \verb+\ltjglobalsetparameter+ are primitives
for assigning parameters. Each takes one argument, which is a
\texttt{<key>=<value>} list. Allowed keys are described in the next
subsection.
The only difference between
\verb+\ltjsetparameter+ and \verb+\ltjglobalsetparameter+ is the
scope of the assignment:
\verb+\ltjsetparameter+ does a local assignment and
\verb+\ltjglobalsetparameter+ does a global one.
They also obey the value of \verb+\globaldefs+,
like other assignments.
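
As a rough illustration of the difference in scope, the following sketch
sets two parameters locally inside a group and one globally. The keys and
values are chosen only for the sake of example, and the spelling
\texttt{false} for a <bool> value is an assumption.
\begin{verbatim}
{% local: these settings are discarded at the end of this group
  \ltjsetparameter{autoxspacing=false, prebreakpenalty={`),10000}}%
  % (text typeset with these settings goes here)
}
% global: this setting survives the end of any enclosing group
\ltjglobalsetparameter{kanjiskip=0pt plus 0.4pt minus 0.4pt}
\end{verbatim}
The braces around the value of \textsf{prebreakpenalty} keep the comma
inside the value from being read as a separator of the key-value list.
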
\verb+\ltjgetparameter+ is the primitive for acquiring parameters. It
always takes a parameter name as its first argument, and in some cases
also takes an additional argument (a character code, for example).
924 \ltjgetparameter{differentjfm},
925 \ltjgetparameter{autospacing},
926 \ltjgetparameter{prebreakpenalty}{`)}.
928 \emph{The return value of\/ {\normalfont\tt\char92ltjgetparameter} is
always a string}. It is output by \texttt{tex.write()}, so any
930 character other than space~`{\tt\char32}'~(U+0020) has the category code
931 12~(other), while the space has 10~(space).
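
Since the result is a string of characters, it can be used wherever
\TeX\ expects a number written out in digits. The following is a minimal
sketch. It assumes that \verb+\ltjgetparameter+ is expandable, which the
use of \texttt{tex.write()} suggests but which is not stated above, and
\verb+\mypenalty+ is a name made up for this example.
\begin{verbatim}
% store the current prebreakpenalty of `)' as a string of digits
\edef\mypenalty{\ltjgetparameter{prebreakpenalty}{`)}}
% the digits have category code 12, so they can be read back as a number
\ifnum\mypenalty>9999
  \message{no break is allowed before a closing parenthesis}%
\fi
\end{verbatim}
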
933 \subsection{List of Parameters}
934 In the following list of parameters, [\verb+\cs+] indicates the counterpart in \pTeX, and each symbol has the following meaning:
\item No mark: the values at the end of the paragraph or the hbox are
adopted for the whole paragraph/hbox.
\item `$\ast$': local parameters, which can be changed anywhere inside a paragraph/hbox.
\item `$\dagger$': assignments are always global.
942 \begin{list}{}{\def\makelabel{\ttfamily}\def\{{\char`\{}\def\}{\char`\}}}
943 \item[\textsf{jcharwidowpenalty}\,=<penalty>] [\verb+\jcharwidowpenalty+]
Penalty value for suppressing orphans. This penalty is inserted just
946 after the last \textbf{JAchar} which is not regarded as a
947 (Japanese) punctuation mark.
949 \item[\textsf{kcatcode}\,=\{<chr\_code>,<natural number>\}]\
An additional attribute attached to each character whose character code is <chr\_code>.
In the present version, the lowest bit of <natural number> indicates
whether the character is considered to be a punctuation mark
(see the description of \textsf{jcharwidowpenalty} above).
957 \item[\textsf{prebreakpenalty}\,=\{<chr\_code>,<penalty>\}] [\verb+\prebreakpenalty+]
958 \item[\textsf{postbreakpenalty}\,=\{<chr\_code>,<penalty>\}] [\verb+\postbreakpenalty+]
959 \item[\textsf{jatextfont}\,=\{<jfam>,<jfont\_cs>\}] [\verb+\textfont+ in \TeX]
960 \item[\textsf{jascriptfont}\,=\{<jfam>,<jfont\_cs>\}] [\verb+\scriptfont+ in \TeX]
961 \item[\textsf{jascriptscriptfont}\,=\{<jfam>,<jfont\_cs>\}] [\verb+\scriptscriptfont+ in \TeX]
962 \item[\textsf{yjabaselineshift}\,=<dimen>$^\ast$]\
963 \item[\textsf{yalbaselineshift}\,=<dimen>$^\ast$] [\verb+\ybaselineshift+]
965 \item[\textsf{jaxspmode}\,=\{<chr\_code>,<mode>\}] [\verb+\inhibitxspcode+]
Specifies whether insertion of \textsf{xkanjiskip} is allowed before/after a \textbf{JAchar} whose character code is <chr\_code>.
The following values are allowed for <mode>:
\item[0, \texttt{inhibit}] Insertion of \textsf{xkanjiskip} is inhibited both before and after the character.
\item[2, \texttt{preonly}] Insertion of \textsf{xkanjiskip} is allowed before the character, but not after.
\item[1, \texttt{postonly}] Insertion of \textsf{xkanjiskip} is allowed after the character, but not before.
\item[3, \texttt{allow}] Insertion of \textsf{xkanjiskip} is allowed both before and after the character.
974 This is the default value.
977 \item[\textsf{alxspmode}\,=\{<chr\_code>,<mode>\}] [\verb+\xspcode+]
Specifies whether insertion of \textsf{xkanjiskip} is allowed before/after an \textbf{ALchar} whose character code is <chr\_code>.
The following values are allowed for <mode>:
\item[0, \texttt{inhibit}] Insertion of \textsf{xkanjiskip} is inhibited both before and after the character.
\item[1, \texttt{preonly}] Insertion of \textsf{xkanjiskip} is allowed before the character, but not after.
\item[2, \texttt{postonly}] Insertion of \textsf{xkanjiskip} is allowed after the character, but not before.
\item[3, \texttt{allow}] Insertion of \textsf{xkanjiskip} is allowed both before and after the character.
986 This is the default value.
Note that the parameters \textsf{jaxspmode} and \textsf{alxspmode} share a common table.
990 \item[\textsf{autospacing}\,=<bool>$^\ast$] [\verb+\autospacing+]
991 \item[\textsf{autoxspacing}\,=<bool>$^\ast$] [\verb+\autoxspacing+]
992 \item[\textsf{kanjiskip}\,=<skip>] [\verb+\kanjiskip+]
993 \item[\textsf{xkanjiskip}\,=<skip>] [\verb+\xkanjiskip+]
995 \item[\textsf{differentjfm}\,=<mode>$^\dagger$]
Specifies how glues/kerns between two \textbf{JAchar}s whose JFMs (or sizes) differ are determined.
The allowed arguments are the following:
1000 \item[\texttt{average}]
1001 \item[\texttt{both}]
1002 \item[\texttt{large}]
1003 \item[\texttt{small}]
1006 \item[\textsf{jacharrange}\,=<ranges>$^\ast$]
1007 \item[\textsf{kansujichar}\,=\{<digit>, <chr\_code>\}] [\verb+\kansujichar+]
1011 \section{Other Primitives}
1012 \subsection{Compatibility with \pTeX}
1013 \begin{list}{}{\def\makelabel{\ttfamily\char92 }}
1022 \section{Control Sequences for \LaTeXe}
1023 \subsection{Patch for NFSS2}
As described in Subsection~\ref{ssec-ltx}, \LuaTeX-ja simply adopted \texttt{plfonts.dtx} from \pLaTeXe\ as the Japanese patch for NFSS2.
1026 \subsection{Cropmark/`tombow'}
1028 \part{Implementations}\label{part-imp}
1029 \section{Storing Parameters}\label{sec-para}
1030 \subsection{Used Dimensions and Attributes}
The following is the list of dimensions and attributes used in \LuaTeX-ja.
1033 \def\makelabel{\ttfamily}
1034 \def\dim#1{\item[\char92 #1\ \textrm{(dimension)}]}
1035 \def\attr#1{\item[\char92 #1\ \textrm{(attribute)}]}
1039 As explained in Subsection~\ref{ssec-plain}, \verb+\jQ+ is equal to
1040 $1\,\textrm{Q}=0.25\,\textrm{mm}$, where `Q'~(also called `級') is
1041 a unit used in Japanese phototypesetting. So one should not change the value of this dimension.
There is also a unit called `歯', which is equal to $0.25\,\textrm{mm}$ and is
used in Japanese phototypesetting. The dimension
1045 \verb+\jH+ stores this length, similar to \verb+\jQ+.
\dim{ltj@zw} A temporary register for the `full-width' of the current Japanese font.
\dim{ltj@zh} A temporary register for the `full-height' (usually the sum of the height of the imaginary body and its depth) of the current Japanese font.
\attr{jfam} The current Japanese font family number for math formulas.
\attr{ltj@curjfnt} The font index of the current Japanese font.
\attr{ltj@charclass} The character class of a Japanese \textit{glyph\_node}.
\attr{ltj@yablshift} The amount by which the baseline of alphabetic
fonts is shifted, in scaled points ($2^{-16}\,\textrm{pt}$).
\attr{ltj@ykblshift} The amount by which the baseline of Japanese
fonts is shifted, in scaled points ($2^{-16}\,\textrm{pt}$).
1055 \attr{ltj@autospc} Whether the auto insertion of \textsf{kanjiskip} is allowed at the node.
1056 \attr{ltj@autoxspc} Whether the auto insertion of \textsf{xkanjiskip} is allowed at the node.
\attr{ltj@icflag} For distinguishing `kinds' of nodes. One of the
following values is assigned to this attribute:
\item[ITALIC (1)] Kerns from an italic correction
(\verb+\/+). This distinction between the origins of kerns
(from explicit \verb+\kern+, or from \verb+\/+)
is needed in the insertion process of \textsf{xkanjiskip}.
1066 \item[KINSOKU (3)] Penalties inserted for the word-wrapping process of Japanese characters (\emph{kinsoku}).
1067 \item[FROM\_JFM (4)] Glues/kerns from JFM.
1068 \item[LINE\_END (5)] Kerns for ...
1069 \item[KANJI\_SKIP (6)] Glues for \textsf{kanjiskip}.
1070 \item[XKANJI\_SKIP (7)] Glues for \textsf{xkanjiskip}.
\item[PROCESSED (8)] Nodes which are already processed by ...
\item[IC\_PROCESSED (9)] Kerns from an italic correction which are also already processed.
\item[BOXBDD (15)] Glues/kerns that are inserted just at the beginning or the end of an hbox or a paragraph.
1075 \attr{ltj@kcat$i$} Where $i$~is a natural number which is less than~7.
1076 These 7~attributes store bit~vectors indicating which character block is regarded as a block of \textbf{JAchar}s.
1079 \subsection{Stack System of \LuaTeX-ja}
1080 \paragraph{Background}
\LuaTeX-ja has its own stack system, and most parameters of \LuaTeX-ja
are stored in it. To clarify the reason, imagine that the parameter
\textsf{kanjiskip} is stored in a skip register, and consider the following
source:
1086 \ltjsetparameter{kanjiskip=0pt}ふがふが.%
1087 \setbox0=\hbox{\ltjsetparameter{kanjiskip=5pt}ほげほげ}
As described in Part~\ref{part-ref}, the only effective value of
\textsf{kanjiskip} in an hbox is the latest value, so the value of
\textsf{kanjiskip} which is applied to the entire hbox should be 5\,pt.
However, because of the way \LuaTeX\ is implemented, this `5\,pt' cannot be
known from any callback. In \texttt{tex/packaging.w} (which is a
file in the source of \LuaTeX), there is the following code:
1100 scaled h; /* height of box */
1101 halfword p; /* first node in a box */
1102 scaled d; /* max depth */
1108 if (cur_list.mode_field == -hmode) {
1109 cur_box = filtered_hpack(cur_list.head_field,
1110 cur_list.tail_field, saved_value(1),
1111 saved_level(1), grp, saved_level(2));
1112 subtype(cur_box) = HLIST_SUBTYPE_HBOX;
Notice that \verb+unsave+ is executed \emph{before}
\verb+filtered_hpack+ (this is where the \verb+hpack_filter+ callback is
executed), so the `5\,pt' in the above source is lost at
\verb+unsave+, and hence it can't be accessed from the \verb+hpack_filter+
callback.
1120 \paragraph{The method}
The code of the stack system is based on code in a post to the Dev-luatex mailing list\footnote{%
\texttt{[Dev-luatex] tex.currentgrouplevel}, a post on 2008/8/19 by Jonathan Sauer.}.
There are two \TeX\ count registers for maintaining information:
\verb+\ltj@@stack+ for the stack level, and \verb+\ltj@@group@level+ for
\TeX's group level when the last assignment was done. Parameters
are stored in one big table named \texttt{charprop\_stack\_table}, where
\texttt{charprop\_stack\_table[$i$]} stores the data of stack level~$i$. If
a new stack level is created by \verb+\ltjsetparameter+, all data of the
previous level is copied.
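
The copying step can be pictured by the following sketch. It is only an
illustration of the idea, not the actual code of \LuaTeX-ja: the function
name \texttt{push\_level}, the field name and the shallow copy are all
assumptions made for this example, and the table is a local stand-in for
the real \texttt{charprop\_stack\_table}.
\begin{verbatim}
\directlua{
  % (these are TeX comments; Lua's own comment syntax is not
  %  safe here because the whole argument is read as one line)
  % stand-in table: stack level 0 holds one parameter
  local charprop_stack_table = { [0] = { kanjiskip = '0pt' } }
  % create level i+1 as a copy of level i
  local function push_level(i)
    local copy = {}
    for k, v in pairs(charprop_stack_table[i]) do copy[k] = v end
    charprop_stack_table[i + 1] = copy
  end
  push_level(0)
  % an assignment at level 1 leaves level 0 untouched
  charprop_stack_table[1].kanjiskip = '5pt'
  tex.print(charprop_stack_table[0].kanjiskip .. ', '
            .. charprop_stack_table[1].kanjiskip)
}
\end{verbatim}
Because each level holds its own copy, returning to a lower stack level
restores the earlier values without any explicit undo step.
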
To resolve the problem mentioned in `Background' above, \LuaTeX-ja uses
another device: when a new stack level is about to be created, a whatsit
node whose type, subtype and value are 44~(\textit{user\_defined}),
30112, and the current group level respectively is appended to the current
list (we refer to this node as \textit{stack\_flag}). This enables us to
know whether an assignment is done just inside an hbox. Suppose that the
stack level is~$s$ and \TeX's group level is~$t$ just after the hbox
\item If there is no \textit{stack\_flag} node in the list of the hbox, then
no assignment occurred inside the hbox. Hence values of
1143 parameters at the end of the hbox are stored in the stack
\item If there is a \textit{stack\_flag} node whose value is~$t+1$, then
an assignment occurred just inside the hbox group. Hence
1147 values of parameters at the end of the hbox are stored in the
\item If there are \textit{stack\_flag} nodes but all of their values
are more than~$t+1$, then an assignment occurred in the box,
but it was done in a `more internal' group. Hence values of
1152 parameters at the end of the hbox are stored in the stack
Note that for this trick to work correctly, assignments to
\verb+\ltj@@stack+ and \verb+\ltj@@group@level+ must always be local,
regardless of the value of \verb+\globaldefs+.
1159 This problem is resolved by using
1160 \hbox{\verb+\directlua{tex.globaldefs=0}+} (this assignment is local).
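
The three cases distinguished above can be told apart in source such as
the following sketch, which reuses the example from `Background'. The
comments only restate which case applies, under the assumption that no
earlier assignment was made at the same group level; the box register and
the texts are arbitrary.
\begin{verbatim}
% no assignment inside the hbox: no stack_flag node in its list
\setbox0=\hbox{ほげほげ}
% assignment just inside the hbox group: stack_flag with value t+1
\setbox0=\hbox{\ltjsetparameter{kanjiskip=5pt}ほげほげ}
% assignment in a more internal group: stack_flag with value t+2 > t+1
\setbox0=\hbox{{\ltjsetparameter{kanjiskip=5pt}ほげ}ほげ}
\end{verbatim}
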
1163 \section{Linebreak after Japanese Character}\label{sec-lbreak}
1164 \subsection{Reference: Behavior in \pTeX}
In~\pTeX, a linebreak after a Japanese character doesn't emit a space,
since words are not separated by spaces in Japanese writing. However,
this feature isn't fully implemented in \LuaTeX-ja due to the
specification of callbacks in~\LuaTeX. To clarify the difference between
\pTeX~and~\LuaTeX, we briefly describe the handling of a linebreak in~\pTeX\ in
this subsection.
1174 \pTeX's input processor can be described in terms of a finite state
1175 automaton, as that of~\TeX\ in~Section~2.5 of~\cite{texbytopic}. The
1176 internal states are as follows:
1178 \item State~$N$: new line
1179 \item State~$S$: skipping spaces
1180 \item State~$M$: middle of line
1181 \item State~$K$: after a Japanese character
The first three states ($N$, $S$~and~$M$) are the same as those of \TeX's input
processor. State~$K$ is similar to state~$M$, and is entered after
Japanese characters. The state transitions are shown in
Figure~\ref{fig-ptexipro}. Note that \pTeX\ doesn't leave state~$K$
after `beginning/ending of a group' characters.
1189 \label{fig-ptexipro}
1191 \def\sp{\text{\tt\char32}}
1193 {\text{scan a cs}}\ar@(r,ul)[dr]&\\
1195 *++[o][F-]{N}\ar[ur]^0\ar[dd]_{d,\ g}\ar[u]^{5\ (\text{\tt\char92par})}
1196 \ar@{->}@(d,l)[ddrr]_(0.45){j}&&
1197 *++[o][F-]{S}\ar@(l,dr)[ul]^0\ar@(l,ur)[ddll]_{d,\ g}\ar[u]_{5}
1198 \ar@{->}@(r,r)[dd]^{j}\\&\\&
1199 *++[o][F-]{M}\ar[uuur]^0\ar@(r,dl)[uurr]_(0.55){10\ (\sp)}
1200 \ar[d]_{5\ ({\sp})}\ar@{->}@(dr,dl)[rr]_{j}&&
1201 *++[o][F-]{K}\ar@{->}@(ul,d)[uuul]^0\ar@{->}[ll]^{d}
1202 \ar@{->}@(ur,dr)[uu]^{10\ (\sp)}\ar@{->}[d]_5\\
1205 d:=\{3,4,6,7,8,11,12,13\},\quad g:=\{1,2\},\quad j:=(\text{Japanese characters})
1208 \item Numbers represent category codes.
\item Category codes 9~(ignored), 14~(comment)~and~15~(invalid) are omitted in the above diagram.
1211 \caption{State transitions of \pTeX's input processor.}
1215 \subsection{Behavior in \LuaTeX-ja}
The states in the input processor of \LuaTeX\ are the same as those of \TeX,
and they can't be customized by any callback. Hence, we can only use the
\verb+process_input_buffer+ and \verb+token_filter+ callbacks to
suppress a space produced by a linebreak which follows Japanese characters.
However, the \verb+token_filter+ callback cannot be used either, since a
character of category code 5~(end-of-line) is converted into a space
1223 token \emph{in the input processor}. So we can use only the
1224 \verb+process_input_buffer+ callback. This means that suppressing a
1225 space must be done \emph{just before} an input line is read.
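
The intended effect can be seen in input such as the following sketch.
The first linebreak follows a Japanese character, so it should not
produce a space, whereas the last one follows an alphabetic character and
produces a space as in ordinary \TeX. The sample text is arbitrary.
\begin{verbatim}
% this line ends with a Japanese character:
% its linebreak should not produce a space
日本語の文章は
途中で改行しても空白が入らない.
% this line ends with an alphabetic character:
% its linebreak produces a space
English words
are separated by a space.
\end{verbatim}
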
Considering these situations, the handling of an end-of-line in \LuaTeX-ja is as follows:
1229 A character U+FFFFF (its category code is set to 14~(comment) by
1230 \LuaTeX-ja) is appended to an input line, before \LuaTeX\ actually
processes it, if and only if the following two conditions are satisfied:
1233 \item The category code of the character $\langle${return}$\rangle$
1234 (whose character code is 13) is 5~(end-of-line).
1235 \item The input line matches the following `regular expression':
1237 (\text{any char})^*(\textbf{JAchar})
1238 \bigl(\{\text{catcode}=1\}\cup\{\text{catcode}=2\}\bigr)^*
1244 \section{Insertion of JFM glues, \textsf{kanjiskip} and \textsf{xkanjiskip}}
1245 This is the longest section of the document.
The contents of \texttt{jfmglue.tex} are to be inserted here.