1 %#! lualatex -shell-escape manual.ins
3 %<en>\documentclass[a4paper,titlepage]{article}
4 %<ja>\documentclass[a4paper,titlepage]{ltjsarticle}
5 \usepackage[margin=20mm,footskip=5mm]{geometry}
7 \usepackage{amsmath,amssymb,xcolor,pict2e,multienum,amsthm,float}
8 \usepackage{booktabs,listings,lltjlisting,showexpl,multicol}
9 \usepackage{luatexja-otf}
10 \usepackage[unicode=false]{hyperref}
14 \DeclareRobustCommand\eTeX{\ensuremath{\varepsilon}-\kern-.125em\TeX}
15 \DeclareRobustCommand\LuaTeX{Lua\TeX}
16 \DeclareRobustCommand\pdfTeX{pdf\TeX}
17 \DeclareRobustCommand\pTeX{p\kern-.05em\TeX}
\DeclareRobustCommand\upTeX{u\pTeX}
19 \DeclareRobustCommand\pLaTeX{p\kern-.05em\LaTeX}
20 \DeclareRobustCommand\pLaTeXe{p\kern-.05em\LaTeXe}
21 \DeclareRobustCommand\epTeX{\ensuremath{\varepsilon}-\kern-.125em\pTeX}
23 \theoremstyle{definition}
24 \newtheorem{defn}{Definition}
26 \newenvironment{cslist}{%
27 \leftskip2em\parindent=0pt\def\makelabel##1{{\tt\char92##1}}
28 \def\{{\char`\{}\def\}{\char`\}}
30 \def\item[##1]{\par\smallskip\par\hskip-\leftskip\makelabel{##1}\par}
34 \long\def\@makecaption#1#2{%
35 \vskip\abovecaptionskip
36 \sbox\@tempboxa{{\small #1. #2}}%
37 \ifdim \wd\@tempboxa >\hsize
40 \global \@minipagefalse
41 \hb@xt@\hsize{\hfil\box\@tempboxa\hfil}%
43 \vskip\belowcaptionskip}
47 \title{The \LuaTeX-ja package}
48 \author{The \LuaTeX-ja project team}
51 \title{\LuaTeX-jaパッケージ}
52 \author{\LuaTeX-jaプロジェクトチーム}
56 basicstyle=\ttfamily\small, pos=o, breaklines=true,
57 numbers=none, rframe={}, basewidth=0.5em
60 \parskip=\smallskipamount
61 \protected\def\Param#1{\textsf{#1}} % parameter name
62 \protected\def\Pkg#1{\underline{\smash{\texttt{#1}}}} % packages/classes
66 \def<#1>{{\normalfont\rm\itshape$\langle$#1$\rangle$}}
73 {\Large\bf This documentation is far from complete. It may have many
74 grammatical (and contextual) errors.}
\textbf{\large This document is still far from complete.
Also, because the English and Japanese versions are generated together using the docstrip program,
85 \section{Introduction}
88 The \LuaTeX-ja package is a macro package for typesetting high-quality
89 Japanese documents when using \LuaTeX.
The \LuaTeX-ja package is a macro package that aims to achieve, on \LuaTeX\ (the next-generation standard \TeX),
Japanese typesetting of a quality equal to or better than that of \pTeX.
\subsection{Background}
Traditionally, ASCII \pTeX, an extension of \TeX, and its derivatives
have been used to typeset Japanese documents in \TeX. Because \pTeX\ is an
engine extension of \TeX, it can produce high-quality Japanese documents
without very complicated macros. But this is a mixed
blessing: \pTeX\ has been left behind by other extensions of \TeX,
especially \eTeX\ and \pdfTeX, and by changes in how computers
process Japanese (\textit{e.g.}, the UTF-8 encoding).
Recently, extensions of \pTeX, namely \upTeX\ (a Unicode implementation
of \pTeX) and \epTeX\ (a merger of \pTeX\ and the
\eTeX\ extension), have been developed to fill those gaps to some
extent, but gaps still exist.
However, the appearance of \LuaTeX\ changed the whole situation. By
using Lua `callbacks', users can customize the internal processing of
\LuaTeX. So there is no need to modify the sources of engines to
support Japanese typesetting: we only have to write Lua
scripts for the appropriate callbacks.
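Traditionally, `typesetting Japanese with \TeX' has generally meant using
ASCII \pTeX\ or its extensions as the engine. \pTeX\ is an engine extension
of \TeX, and (although its specification is somewhat inconvenient in places) it enables
Japanese typesetting of a quality high enough for commercial printing. However, this
also became a weakness: because \pTeX\ was satisfactory (as far as typesetting goes),
it failed to keep up with the many extensions of \TeX\ developed abroad, such as \eTeX\ and \pdfTeX,
and with changes in how computers handle Japanese, such as TrueType, OpenType and Unicode.

In the last few years, the situation has improved somewhat. Most \pTeX\ binaries available today
accept external UTF-8 input, and \upTeX, which pushes Unicode support further and makes even
the internal processing of \pTeX\ Unicode-based, has been developed. In addition, \epTeX, which
merges the \eTeX\ extension into \pTeX, has appeared, and in \TeX\ Live\ 2011 \pLaTeX\ runs
on top of \epTeX. However, there is no move to bring the \pdfTeX\ extensions (direct PDF output and
micro-typesetting) to \pTeX, and a gap with the rest of the world still remains.

However, the appearance of \LuaTeX\ changed the situation greatly. By writing
`callbacks' in Lua code, one can hook into the internal processing of \LuaTeX.
This means that, without resorting to an engine extension, one can achieve almost the same
things as an engine extension by writing Lua code and related \TeX\ macros.
\LuaTeX-ja is being developed with the aim of realizing Japanese typesetting on \LuaTeX\
by means of Lua code and \TeX\ macros, following this approach.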
144 \subsection{Major Changes from \pTeX}
The \LuaTeX-ja package is heavily influenced by the \pTeX\ engine. The initial
target of development was to implement the features of \pTeX. However,
\emph{\LuaTeX-ja is not just a port of \pTeX; unnatural
specifications/behaviors of \pTeX\ were not adopted}.
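\LuaTeX-ja is heavily influenced by \pTeX. The initial goal of development was to
implement the features of \pTeX\ in Lua code. However, as development progressed, it turned out
that a complete port of \pTeX\ was impossible, and cases were also found where the implementation
in \pTeX\ is somewhat puzzling. Therefore, \textbf{\LuaTeX-ja no longer aims to be
a complete port of \pTeX. Where \pTeX\ has unnatural specifications or behaviors,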
The following are major changes from \pTeX:
163 \item A Japanese font is a tuple of a `real' font, a Japanese font
164 metric (\textbf{JFM}, for short), and an optional string called
\item In \pTeX, a line break after a Japanese character is ignored (and
doesn't yield a space), since line breaks (in source files) are
permitted almost everywhere in Japanese texts. However, \LuaTeX-ja
doesn't fully implement this feature, because of a specification
\item The insertion process of glues/kerns between two Japanese
characters and between a Japanese character and other characters
(we refer to these glues/kerns as \textbf{JAglue}) is rewritten from
178 \item As \LuaTeX's internal character handling is `node-based'
179 (\textit{e.g.}, \verb+of{}fice+ doesn't prevent ligatures), the
180 insertion process of \textbf{JAglue} is now `node-based'.
\item Furthermore, nodes between two characters which have no effect on
line breaking (\textit{e.g.}, a \verb+\special+ node) are ignored in the
\item In the process, two Japanese fonts which differ only in their `real'
fonts are treated as identical.
\item At present, vertical typesetting (\emph{tategaki}) is not
supported in \LuaTeX-ja.
191 For detailed information, see Part~\ref{part-imp}.
193 \subsection{Notations}
194 In this document, the following terms and notations are used:
196 \item Characters are divided into two types:
198 \item \textbf{JAchar}: standing for Japanese characters such as
199 Hiragana, Katakana, Kanji and other punctuation marks for
\item \textbf{ALchar}: standing for all other characters, such as alphabetic letters.
We call fonts used for \textbf{ALchar} `alphabetic fonts', and fonts used for \textbf{JAchar} `Japanese fonts'.
\item A word in a sans-serif font (like \Param{prebreakpenalty})
means an internal parameter for Japanese typesetting, and it
is used as a key in the \verb+\ltjsetparameter+ command.
\item A word in a typewriter font with an underline (like \Pkg{fontspec})
means a package or a class of \LaTeX.
\item The word `primitive' is used not only for primitives in \LuaTeX,
but also for control sequences that are defined in the core module of
213 \item In this document, natural numbers start from~0.
216 \subsection{About the project}
217 \paragraph{Project Wiki} Project Wiki is under construction.
219 \item \url{http://sourceforge.jp/projects/luatex-ja/wiki/FrontPage%28en%29} (English)
220 \item \url{http://sourceforge.jp/projects/luatex-ja/wiki/FrontPage} (Japanese)
223 This project is hosted by SourceForge.JP.
227 \begin{multienumerate}
228 \def\labelenumi{$\bullet$}
229 \mitemxxx{Hironori KITAGAWA}{Kazuki MAEDA}{Takayuki YATO}
230 \mitemxxx{Yusuke KUROKI}{Noriyuki ABE}{Munehiro YAMAMOTO}
231 \mitemx{Tomoaki HONDA}
235 \begin{multienumerate}
236 \def\labelenumi{$\bullet$}
237 \mitemxxx{Hironori KITAGAWA}{Kazuki MAEDA}{Takayuki YATO}
238 \mitemxxx{Yusuke KUROKI}{Noriyuki ABE}{Munehiro YAMAMOTO}
239 \mitemx{Tomoaki HONDA}
% \paragraph{Acknowledgments} -- insert here, if any
247 \section{Getting Started}
248 \subsection{Installation}
249 To install the \LuaTeX-ja\ package, you will need:
251 \item \LuaTeX\ (version 0.65.0-beta or later) and its supporting packages.\\
252 If you are using \TeX~Live~2011 or current W32\TeX, you don't have to worry.
253 \item The source archive of \LuaTeX-ja, of course{\tt:)}
The installation steps are as follows:
258 \item Download the source archive.
At present, \LuaTeX-ja has no official release, so you have to retrieve
the archive from the repository.
You can clone the Git repository via
264 $ git clone git://git.sourceforge.jp/gitroot/luatex-ja/luatexja.git
266 or download the archive of HEAD in \texttt{master} branch from
268 \url{http://git.sourceforge.jp/view?p=luatex-ja/luatexja.git;a=snapshot;h=HEAD;sf=tgz}.
Note that the latest development may not be on the \texttt{master} branch.
\item Extract the archive. You will see {\tt src/} and several other sub-directories.
\item Copy all the contents of {\tt src/} into one of your \texttt{TEXMF} trees.
\item If {\tt mktexlsr} is needed to update the filename database, run it.
277 \subsection{Cautions}
\item The encoding of your source file must be UTF-8. Other
encodings, such as EUC-JP or Shift-JIS, are not supported.
\item \LuaTeX-ja may conflict with other packages.
For example, the default setting of \textbf{JAchar} in the present
version does not coexist with the \Pkg{unicode-math}
package. Putting the following line in the preamble makes
mathematical symbols typeset correctly, but several
Japanese characters will then be treated as an \textbf{ALchar} as
290 \ltjsetparameter{jacharrange={-3, -8}}
294 \subsection{Using in plain \TeX}\label{ssec-plain}
295 To use \LuaTeX-ja in plain \TeX, simply put the following at the beginning of the document:
This performs minimal settings (like {\tt ptex.tex}) for typesetting Japanese documents:
302 \item The following 6~Japanese fonts are preloaded:
304 \begin{tabular}{ccccc}
306 \textbf{classification}&\textbf{font name}&\bf `10\,pt'&\bf`7\,pt'&\bf`5\,pt'\\\midrule
307 \emph{mincho}&Ryumin-Light &\verb+\tenmin+&\verb+\sevenmin+&\verb+\fivemin+\\
308 \emph{gothic}&GothicBBB-Medium&\verb+\tengt+ &\verb+\sevengt+ &\verb+\fivegt+\\
313 \item The `Q' is a unit used in Japanese phototypesetting, and
314 $1\,\textrm{Q}=0.25\,\textrm{mm}$. This length is stored in a
315 dimension \verb+\jQ+.
\item It is a widely accepted practice that the fonts `Ryumin-Light' and
`GothicBBB-Medium' are not embedded in PDF files, and that PDF readers
substitute some external Japanese fonts for them (\textit{e.g.},
Kozuka Mincho is used for Ryumin-Light in Adobe Reader). We adopt this custom to
\item A character in an alphabetic font is generally smaller than a
character in a Japanese font of the same size. So the actual size of
these Japanese fonts is in fact smaller than that of alphabetic
fonts, namely scaled by 0.962216.
\item The amount of glue that is inserted between a \textbf{JAchar} and
an \textbf{ALchar} (the parameter \Param{xkanjiskip}) is set to
330 (0.25\cdot 13.5\,\textrm{Q})^{+1\,\text{pt}}_{-1\,\text{pt}}
331 = {27\over 32}\,\mathrm{mm}^{+1\,\text{pt}}_{-1\,\text{pt}}.
335 \subsection{Using in \LaTeX}\label{ssec-ltx}
Usage in \LaTeXe\ is basically the same. To set up the minimal environment
for Japanese, you only have to load {\tt luatexja.sty}:
340 \usepackage{luatexja}
342 It also does minimal settings (counterparts in \pLaTeX\ are {\tt
343 plfonts.dtx} and {\tt pldefs.ltx}):
346 \item {\tt JY3} is the font encoding for Japanese fonts (in horizontal direction).\\
347 When vertical typesetting is supported by \LuaTeX-ja in the future, {\tt JT3} will be used for vertical fonts.
348 \item Two font families {\tt mc} and {\tt gt} are defined:
350 \begin{tabular}{ccccc}
352 \textbf{classification}&\textbf{family}&\verb+\mdseries+&\verb+\bfseries+&\textbf{scale}\\\midrule
353 \emph{mincho}&\tt mc&Ryumin-Light &GothicBBB-Medium&0.962216\\
354 \emph{gothic}&\tt gt&GothicBBB-Medium&GothicBBB-Medium&0.962216\\
Note that the bold series in both families is the same as the medium series of the \emph{gothic} family.
This is a convention in \pLaTeX.
361 \item Japanese characters in math mode are typeset by the font family {\tt mc}.
However, the above settings are not sufficient for Japanese-based
documents. To typeset Japanese-based documents, you should use
class files other than {\tt article.cls}, {\tt book.cls}, and so on. At
present, we have the counterparts of \Pkg{jclasses} (the standard
classes in \pLaTeX) and \Pkg{jsclasses} (classes by Haruhiko
Okumura), namely \Pkg{ltjclasses} and \Pkg{ltjsclasses}.
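
As a minimal sketch of the advice above, a Japanese-based document might
start as follows. We assume here that the \Pkg{ltjsclasses} bundle provides a
class named {\tt ltjsarticle} and that this class loads \Pkg{luatexja} by
itself; the sample sentence is only illustrative.
\begin{verbatim}
% Compile with lualatex; the source file must be encoded in UTF-8.
\documentclass{ltjsarticle}% assumed class name from ltjsclasses
\begin{document}
日本語の文書をLuaLaTeXで組版する。
\end{document}
\end{verbatim}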
\paragraph{{\tt\char92 CID}, {\tt\char92 UTF} and macros in the OTF package}
Under \pTeX, the \Pkg{otf} package (developed by Shuzaburo Saito) is
used for typesetting characters which are in Adobe-Japan1-6 CID but not
in JIS~X~0208. Since this package is widely used, \LuaTeX-ja
supports some of the functions of the \Pkg{otf} package.
378 森\UTF{9DD7}外と内田百\UTF{9592}とが\UTF{9AD9}島屋に行く。
380 \CID{7652}飾区の\CID{13706}野家,
%lltjlisting.sty needs fixing?: a line break occurs right after `森' above.
386 \subsection{Changing Fonts}\label{ssub-chgfnt}
387 \paragraph{Remark: Japanese Characters in Math Mode}
388 Since \pTeX\ supports Japanese characters in math mode, there are
389 sources like the following:
392 $f_{高温}$~($f_{\text{high temperature}}$).
393 \[ y=(x-1)^2+2\quad{}よって\quad y>0 \]
394 $5\in{}素:=\{\,p\in\mathbb N:\text{$p$ is a prime}\,\}$.
We (the project members of \LuaTeX-ja) think that using
Japanese characters in math mode is allowed if and only if they are used as identifiers.
From this point of view,
\item Lines 1~and~2 above are not correct, since `高温' is used as a textual label, and
`よって' is used as a conjunction.
\item However, line~3 is correct, since `素' is used as an identifier.
405 Hence, in our opinion, the above input should be corrected as:
408 ($f_{\text{high temperature}}$).
410 \mathrel{\text{よって}}\quad y>0 \]
411 $5\in{}素:=\{\,p\in\mathbb N:\text{$p$ is a prime}\,\}$.
%BUG?: without \{\}, `素' is not printed. `よって' in the paragraph above is not printed either.
414 We also believe that using Japanese characters as identifiers is rare,
415 hence we don't describe how to change Japanese fonts in math mode in
416 this chapter. For the method, please see Part~\ref{part-ref}.
419 \paragraph{plain \TeX}
420 To change Japanese fonts in plain \TeX, you must use the primitive
421 \verb+\jfont+. So please see Part~\ref{part-ref}.
425 For \LaTeXe, \LuaTeX-ja simply adopted the font selection system from that
426 of \pLaTeXe\ (in {\tt plfonts.dtx}).
428 \item Two control sequences \verb+\mcdefault+ and \verb+\gtdefault+ are
429 used to specify the default font families for \emph{mincho} and
430 \emph{gothic}, respectively. Default values: \texttt{mc} for
431 \verb+\mcdefault+ and \texttt{gt} for \verb+\gtdefault+.
432 \item Commands \verb+\fontfamily+, \verb+\fontseries+,
433 \verb+\fontshape+ and \verb+\selectfont+ can be used to change
434 attributes of Japanese fonts.
436 \begin{tabular}{cccccc}
438 &\textbf{encoding}&\textbf{family}&\textbf{series}&\textbf{shape}&\textbf{selection}\\\midrule
440 &\verb+\romanencoding+&\verb+\romanfamily+&\verb+\romanseries+&\verb+\romanshape+
443 &\verb+\kanjiencoding+&\verb+\kanjifamily+&\verb+\kanjiseries+&\verb+\kanjishape+
both&---&---&\verb+\fontseries+&\verb+\fontshape+&---\\
446 auto select&\verb+\fontencoding+&\verb+\fontfamily+&---&---&\verb+\usefont+\\
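Here, \verb+\fontencoding{<encoding>}+ changes the encoding of either the Japanese fonts
or the alphabetic fonts, depending on its argument. For example, in the following input the first
call of \verb+\fontencoding+ changes the encoding of Japanese fonts to \texttt{JY3},
and the second call changes that of alphabetic fonts to \texttt{T1}.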
457 \fontencoding{JY3}\fontencoding{T1}
\verb+\fontfamily+ also changes the family of Japanese fonts, alphabetic fonts, \textbf{or both}, depending on its argument.
For details, see Subsection~\ref{ssub-nfsspat}.
464 \item For defining a Japanese font family, use
465 \verb+\DeclareKanjiFamily+ instead of
466 \verb+\DeclareFontFamily+. However, in the present implementation,
467 using \verb+\DeclareFontFamily+ doesn't cause any problem.
To use \LuaTeX-ja together with the \Pkg{fontspec} package, you need to load the
\Pkg{luatexja-fontspec} package in the preamble. This additional
package automatically loads \Pkg{luatexja} and \Pkg{fontspec}
476 In \Pkg{luatexja-fontspec} package, the following 7~commands are defined as
477 counterparts of original commands in the \Pkg{fontspec} package:
479 \begin{tabular}{ccccc}
482 &\verb+\jfontspec+&\verb+\setmainjfont+&\verb+\setsansjfont+&\verb+\newjfontfamily+\\
484 &\verb+\fontspec+&\verb+\setmainfont+&\verb+\setsansfont+&\verb+\newfontfamily+\\
487 &\verb+\newjfontface+&\verb+\defaultjfontfeatures+&\verb+\addjfontfeatures+\\
489 &\verb+\newfontface+&\verb+\defaultfontfeatures+&\verb+\addfontfeatures+\\
Note that there is no command named \verb+\setmonojfont+, since
in most Japanese fonts nearly all Japanese glyphs have the same
width. Also note that the kerning feature is turned off by default in
these 7~commands, since this feature and \textbf{JAglue} will clash (see
502 \section{Changing Parameters}
There are many parameters in \LuaTeX-ja. Due to the behavior of \LuaTeX,
most of them are not stored in internal registers of \TeX, but in a
storage system of \LuaTeX-ja's own. Hence, to assign or acquire those
parameters, you have to use the commands \verb+\ltjsetparameter+ and
\verb+\ltjgetparameter+.
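
For instance, the following sketch assigns a value to \Param{kanjiskip} and
then reads the current value of \Param{xkanjiskip} back. The values and the
sample characters are only illustrative.
\begin{verbatim}
% Assign: the argument is a key-value list of parameter names.
\ltjsetparameter{kanjiskip={0pt plus 0.4pt minus 0.4pt}}
% Acquire: insert a glue equal to the current value of xkanjiskip.
あ\hskip\ltjgetparameter{xkanjiskip}a
\end{verbatim}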
509 \subsection{Editing the range of \textbf{JAchar}s}
To edit the range of \textbf{JAchar}s, you first have to assign a non-zero
natural number which is less than 217 to the character range. This
can be done by using the \verb+\ltjdefcharrange+ primitive. For example, the
next line assigns all characters in the Supplementary Multilingual Plane
and the character `漢' to the range number~100.
518 \ltjdefcharrange{100}{"10000-"1FFFF,`漢}
This assignment of numbers to ranges is always global, so you should
not do this in the middle of a document.

If a character already belongs to some non-zero numbered range,
this will be overwritten by the new setting. For example, the whole SMP
belongs to the range~4 in the default setting of \LuaTeX-ja, and if you
specify the above line, then the SMP will belong to the range~100 and be
removed from the range~4.
After assigning numbers to ranges, the \Param{jacharrange} parameter can
be used to customize which character ranges will be treated as ranges of
\textbf{JAchar}s, as in the following line (this is just the default
setting of \LuaTeX-ja):
534 \ltjsetparameter{jacharrange={-1, +2, +3, -4, -5, +6, +7, +8}}
The argument to the \Param{jacharrange} parameter is a list of integers. A negative integer $-n$ in the list means that `the character range~$n$ is ...'.
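
Putting the two steps together, a preamble might contain something like the
following sketch. It assumes that a positive entry marks a range as
\textbf{JAchar}, as the signs in the default setting suggest; the range
number~100 is taken from the example above.
\begin{verbatim}
% In the preamble, since range assignments are always global.
% Move the SMP and the character `漢' into range 100 ...
\ltjdefcharrange{100}{"10000-"1FFFF,`漢}
% ... and treat the characters of range 100 as JAchar (assumed sign).
\ltjsetparameter{jacharrange={+100}}
\end{verbatim}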
538 \paragraph{Default Setting}
\LuaTeX-ja predefines eight character ranges for convenience. They are
determined from the following data:
542 \item Blocks in Unicode~6.0.
\item The \texttt{Adobe-Japan1-UCS2} mapping between CIDs of Adobe-Japan1-6 and Unicode.
544 \item The \texttt{PXbase} bundle for \upTeX\ by Takayuki Yato.
Now we describe these eight ranges. The letter `J' or `A' after the
number shows whether characters in the range are treated as
\textbf{JAchar}s or not by default. These settings are similar to \texttt{prefercjk} ...
\item[Range~8${}^{\text{J}}$] Symbols in the intersection of the upper half of ISO~8859-1
(Latin-1 Supplement) and JIS~X~0208 (a basic character set for Japanese). This character range
consists of the following characters:
556 \def\ch#1#2{\item \char"#1\ ({\tt U+00#1}, #2)}%"
557 \ch{A7}{Section Sign}
558 \ch{A8}{Umlaut or diaeresis}
560 \ch{B1}{Plus-minus sign}
561 \ch{B4}{Spacing acute}
562 \ch{B6}{Paragraph sign}
563 \ch{D7}{Multiplication sign}
564 \ch{F7}{Division Sign}
\item[Range~1${}^{\text{A}}$] Latin characters, some of which are included in Adobe-Japan1-6.
This range consists of the following Unicode ranges, \emph{except characters in the range~8 above}:
571 \item {\tt U+0080}--{\tt U+00FF}: Latin-1 Supplement
572 \item {\tt U+0100}--{\tt U+017F}: Latin Extended-A
573 \item {\tt U+0180}--{\tt U+024F}: Latin Extended-B
574 \item {\tt U+0250}--{\tt U+02AF}: IPA Extensions
575 \item {\tt U+02B0}--{\tt U+02FF}: Spacing Modifier Letters
576 \item {\tt U+0300}--{\tt U+036F}: Combining Diacritical Marks
577 \item {\tt U+1E00}--{\tt U+1EFF}: Latin Extended Additional
\item[Range~2${}^{\text{J}}$] Greek and Cyrillic letters. JIS~X~0208 (hence most Japanese
fonts) has some of these characters.
585 \item {\tt U+0370}--{\tt U+03FF}: Greek and Coptic
586 \item {\tt U+0400}--{\tt U+04FF}: Cyrillic
587 \item {\tt U+1F00}--{\tt U+1FFF}: Greek Extended
\item[Range~3${}^{\text{J}}$] Punctuation and miscellaneous symbols. The block list is
shown in Table~\ref{table-rng3}.
594 \caption{Unicode blocks in predefined character range~3.}\label{table-rng3}
595 \catcode`\"=13\def"#1#2#3#4{{\tt U+#1#2#3#4}}%"
597 \begin{tabular}{llll}
598 "2000--"206F&General Punctuation&
599 "2070--"209F&Superscripts and Subscripts\\
600 "20A0--"20CF&Currency Symbols&
601 "20D0--"20FF&Comb.\ Diacritical Marks for Symbols\\
602 "2100--"214F&Letterlike Symbols&
603 "2150--"218F&Number Forms\\
605 "2200--"22FF&Mathematical Operators\\
606 "2300--"23FF&Miscellaneous Technical&
607 "2400--"243F&Control Pictures\\
608 "2500--"257F&Box Drawing&
609 "2580--"259F&Block Elements\\
610 "25A0--"25FF&Geometric Shapes&
611 "2600--"26FF&Miscellaneous Symbols\\
612 "2700--"27BF&Dingbats&
613 "2900--"297F&Supplemental Arrows-B\\
614 "2980--"29FF&Misc.\ Mathematical Symbols-B&
615 "2B00--"2BFF&Miscellaneous Symbols and Arrows\\
616 "E000--"F8FF&Private Use Area&
620 \item[Range~4${}^{\text{A}}$] Characters usually not in Japanese fonts. This range consists
621 of almost all Unicode blocks which are not in other
622 predefined ranges. Hence, instead of showing the block list,
623 we put the definition of this range itself:
625 \ltjdefcharrange{4}{%
626 "500-"10FF, "1200-"1DFF, "2440-"245F, "27C0-"28FF, "2A00-"2AFF,
627 "2C00-"2E7F, "4DC0-"4DFF, "A4D0-"A82F, "A840-"ABFF, "FB50-"FE0F,
628 "FE20-"FE2F, "FE70-"FEFF, "FB00-"FB4F, "10000-"1FFFF} % non-Japanese
630 \item[Range~5${}^{\text{A}}$] Surrogates and Supplementary Private Use Areas.
631 \item[Range~6${}^{\text{J}}$] Characters used in Japanese. The block list is indicated in Table~\ref{table-rng6}.
633 \caption{Unicode blocks in predefined character range~6.}\label{table-rng6}
634 \catcode`\"=13\def"#1#2#3#4{{\tt U+#1#2#3#4}}%"
636 \begin{tabular}{llll}
637 "2460--"24FF&Enclosed Alphanumerics&
638 "2E80--"2EFF&CJK Radicals Supplement\\
639 "3000--"303F&CJK Symbols and Punctuation&
640 "3040--"309F&Hiragana\\
641 "30A0--"30FF&Katakana&
642 "3190--"319F&Kanbun\\
643 "31F0--"31FF&Katakana Phonetic Extensions&
644 "3200--"32FF&Enclosed CJK Letters and Months\\
645 "3300--"33FF&CJK Compatibility&
646 "3400--"4DBF&CJK Unified Ideographs Extension A\\
647 "4E00--"9FFF&CJK Unified Ideographs&
648 "F900--"FAFF&CJK Compatibility Ideographs\\
649 "FE10--"FE1F&Vertical Forms&
650 "FE30--"FE4F&CJK Compatibility Forms\\
651 "FE50--"FE6F&Small Form Variants&
652 "{20}000--"{2F}FFF&(Supplementary Ideographic Plane)
656 \item[Range~7${}^{\text{J}}$] Characters used in CJK languages, but not included in Adobe-Japan1-6.
657 The block list is indicated in Table~\ref{table-rng7}.
659 \caption{Unicode blocks in predefined character range~7.}\label{table-rng7}
660 \catcode`\"=13\def"#1#2#3#4{{\tt U+#1#2#3#4}}%"
662 \begin{tabular}{llll}
663 "1100--"11FF&Hangul Jamo&
664 "2F00--"2FDF&Kangxi Radicals\\
665 "2FF0--"2FFF&Ideographic Description Characters&
666 "3100--"312F&Bopomofo\\
667 "3130--"318F&Hangul Compatibility Jamo&
668 "31A0--"31BF&Bopomofo Extended\\
669 "31C0--"31EF&CJK Strokes&
670 "A000--"A48F&Yi Syllables\\
671 "A490--"A4CF&Yi Radicals&
672 "A830--"A83F&Common Indic Number Forms\\
673 "AC00--"D7AF&Hangul Syllables&
674 "D7B0--"D7FF&Hangul Jamo Extended-B
681 \subsection{\Param{kanjiskip} and \Param{xkanjiskip}}\label{subs-kskip}
682 \textbf{JAglue} is divided into the following three categories:
684 \item Glues/kerns specified in JFM. If \verb+\inhibitglue+ is issued
around a Japanese character, this glue will not be inserted at the
\item The default glue which is inserted between two \textbf{JAchar}s ({\sf
\item The default glue which is inserted between a \textbf{JAchar} and an
690 \textbf{ALchar} (\Param{xkanjiskip}).
The value (a skip) of \Param{kanjiskip} or \Param{xkanjiskip} can be
changed as follows.
695 \ltjsetparameter{kanjiskip={0pt plus 0.4pt minus 0.4pt},
696 xkanjiskip={0.25\zw plus 1pt minus 1pt}}
A JFM may contain the data of the `ideal width of
\Param{kanjiskip}' and/or the `ideal width of \Param{xkanjiskip}'.
To use these data from the JFM, set the value of \Param{kanjiskip} or
\Param{xkanjiskip} to \verb+\maxdimen+.
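
For example, the following line (a sketch only) asks \LuaTeX-ja to take both
values from the JFM in use, provided that the JFM carries such data:
\begin{verbatim}
% Use the `ideal' widths recorded in the JFM, if any.
\ltjsetparameter{kanjiskip=\maxdimen, xkanjiskip=\maxdimen}
\end{verbatim}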
705 \subsection{Insertion Setting of \Param{xkanjiskip}}
It is not desirable that \Param{xkanjiskip} is inserted at every
boundary between \textbf{JAchar}s and \textbf{ALchar}s. For example,
\Param{xkanjiskip} should not be inserted after an opening parenthesis
(\textit{e.g.}, compare `(あ' and `(\hskip\ltjgetparameter{xkanjiskip}あ').
\LuaTeX-ja can control whether \Param{xkanjiskip} can be inserted
before/after a character, by changing the \Param{jaxspmode} parameter for \textbf{JAchar}s and
the \Param{alxspmode} parameter for \textbf{ALchar}s, respectively.
715 \ltjsetparameter{jaxspmode={`あ,preonly}, alxspmode={`\!,postonly}}
The second argument {\tt preonly} means `the insertion of
\Param{xkanjiskip} is allowed before this character, but not after'.
The other possible values are {\tt postonly}, {\tt allow} and {\tt
inhibit}. For compatibility with \pTeX, natural numbers between
0~and~3 are also allowed as the second argument\footnote{But we don't
recommend this, since the numbers 1~and~2 have opposite meanings in
\Param{jaxspmode} and \Param{alxspmode}.}.
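
The opening-parenthesis case mentioned earlier could be handled as in the
following sketch. The choice of characters is only illustrative, and we
assume that {\tt inhibit} forbids the insertion on both sides while
{\tt allow} permits it on both sides.
\begin{verbatim}
% `(' is an ALchar: allow xkanjiskip before it, but not after it.
\ltjsetparameter{alxspmode={`\(,preonly}}
% `あ' is a JAchar: forbid xkanjiskip around it, then allow it again.
\ltjsetparameter{jaxspmode={`あ,inhibit}}
\ltjsetparameter{jaxspmode={`あ,allow}}
\end{verbatim}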
If you want to enable/disable all insertions of \Param{kanjiskip} and
\Param{xkanjiskip}, set the \Param{autospacing} and \Param{autoxspacing}
parameters to {\tt true}/{\tt false}, respectively.
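
A short sketch of this switch, assuming that the value {\tt true} turns the
insertion back on:
\begin{verbatim}
% Disable both kinds of default glue ...
\ltjsetparameter{autospacing=false, autoxspacing=false}
% ... and enable them again (assumed counterpart of `false').
\ltjsetparameter{autospacing=true, autoxspacing=true}
\end{verbatim}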
732 \subsection{Shifting Baseline}
To make a Japanese font and an alphabetic font match, sometimes
the baseline of one of the pair needs to be shifted. In \pTeX, this is achieved
by setting \verb+\ybaselineshift+ to a non-zero length (the
baseline of alphabetic fonts is shifted downward). However, for documents
whose main language is not Japanese, it is better to shift the baseline of
Japanese fonts rather than that of alphabetic fonts.
Because of this, \LuaTeX-ja can independently set the shifting amount
of the baseline of alphabetic fonts (the \Param{yalbaselineshift}
parameter) and that of Japanese fonts (the \Param{yjabaselineshift}
745 \vrule width 150pt height 0.4pt depth 0pt\hskip-120pt
746 \ltjsetparameter{yjabaselineshift=0pt, yalbaselineshift=0pt}abcあいう
747 \ltjsetparameter{yjabaselineshift=5pt, yalbaselineshift=2pt}abcあいう
Here the horizontal line above is the baseline of the line.
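
For a document whose main language is not Japanese, the idea might be
sketched as follows. The amount 1pt is arbitrary, and we assume that a
positive length shifts the baseline downward, as with
\verb+\ybaselineshift+ in \pTeX.
\begin{verbatim}
% Leave alphabetic fonts on the baseline; shift only Japanese fonts.
\ltjsetparameter{yalbaselineshift=0pt, yjabaselineshift=1pt}
\end{verbatim}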
There is an interesting side effect: characters of different sizes can be
vertically centered in a line, by setting the two parameters appropriately.
The following is an example (beware that the values are not well tuned):
757 \ltjsetparameter{yjabaselineshift=-1pt,
758 yalbaselineshift=-1pt}
764 \subsection{Cropmark}
A cropmark is a mark indicating the 4~corners and the horizontal/vertical
center of the paper. In Japanese, a cropmark is called a tombo(w).
\pLaTeX\ and \LuaTeX-ja support `tombow' in their kernels.
The following steps are needed to typeset cropmarks:
771 \item First, define the banner which will be printed at the upper left
772 of the paper. This is done by assigning a token list to
773 \verb+\@bannertoken+.
For example, the following sets the banner to `{\tt filename (2012-01-01 17:01)}':
779 \hour\time \divide\hour by 60 \@tempcnta\hour \multiply\@tempcnta 60\relax
780 \minute\time \advance\minute-\@tempcnta
782 \jobname\space(\number\year-\two@digits\month-\two@digits\day
783 \space\two@digits\hour:\two@digits\minute)}%
790 \part{Reference}\label{part-ref}
791 \section{Font Metric and Japanese Font}
792 \subsection{\texttt{\char92jfont} primitive}
To load a font as a Japanese font, you must use the
\verb+\jfont+ primitive instead of~\verb+\font+;
\verb+\jfont+ accepts the same syntax as~\verb+\font+.
\LuaTeX-ja automatically loads the \Pkg{luaotfload} package,
so TrueType/OpenType fonts with features can be used as Japanese fonts:
799 \jfont\tradgt={file:ipaexg.ttf:script=latn;%
800 +trad;-kern;jfm=ujis} at 14pt
Note that the control sequence defined by \verb+\jfont+
(\verb+\tradgt+ in the example above) is not a
\textit{font\_def} token, hence input like \verb+\fontname\tradgt+
causes an error. We denote control sequences which are defined by
\verb+\jfont+ by <jfont\_cs>.
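
As a small sketch, a <jfont\_cs> can be defined and then used to switch the
current Japanese font. The name \verb+\exgt+ is hypothetical, and the font
file and JFM are those of the example above.
\begin{verbatim}
% Define the <jfont_cs> \exgt (a hypothetical name) ...
\jfont\exgt={file:ipaexg.ttf:jfm=ujis} at 12pt
% ... and select it for the Japanese text that follows.
\exgt 日本語の文章
\end{verbatim}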
810 \paragraph{Prefix \texttt{psft}}
Besides the \texttt{file:}\ and \texttt{name:}\ prefixes, \texttt{psft:}\
can be used as a prefix in the \verb+\jfont+ (and~\verb+\font+) primitive.
Using this prefix, you can specify a `name-only' Japanese font which
will not be embedded in the PDF. A typical use of this prefix is to specify
the `standard' Japanese fonts, namely, `Ryumin-Light' and
`GothicBBB-Medium'. For kerning and other information, that of Kozuka
Mincho Pr6N Regular (this is a font by Adobe Inc., and included in
Japanese Font Packs for Adobe Reader) will be used.
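
For example, the two `standard' fonts might be declared as in the following
sketch. The control sequence names \verb+\psmin+ and \verb+\psgt+ are
hypothetical, and the key {\tt jfm=ujis} follows the \verb+\jfont+ example
above.
\begin{verbatim}
% Name-only fonts: nothing is embedded in the PDF.
\jfont\psmin=psft:Ryumin-Light:jfm=ujis at 10pt
\jfont\psgt=psft:GothicBBB-Medium:jfm=ujis at 10pt
\end{verbatim}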
As noted in the Introduction, a JFM has measurements of characters and
glues/kerns that are automatically inserted for Japanese
typesetting. The structure of a JFM will be described in the next
subsection. When calling the \verb+\jfont+ primitive, you must specify
which JFM will be used for this font by the following keys:
828 \begin{list}{}{\def\makelabel{\ttfamily}\def\{{\char`\{}\def\}{\char`\}}}
830 Specify the name of JFM. A file named \texttt{jfm-<name>.lua} will be searched and/or loaded.
The following JFMs are shipped with \LuaTeX-ja:
\item[\tt jfm-ujis.lua] A standard JFM in \LuaTeX-ja. This JFM is
based on \verb+upnmlminr-h.tfm+, a metric for the UTF/OTF package that
is used in \upTeX. When you use the \Pkg{luatexja-otf} package, please use this JFM.
\item[\tt jfm-jis.lua] A counterpart for \verb+jis.tfm+, the `JIS font
metric' which is widely used in \pTeX. A major difference between
\texttt{jfm-ujis.lua} and \texttt{jfm-jis.lua} is that
most characters under \texttt{jfm-ujis.lua} are square-shaped,
while those under \texttt{jfm-jis.lua} are horizontal
\item[\tt jfm-min.lua] A counterpart for \verb+min10.tfm+, which is one
of the default Japanese font metrics shipped with \pTeX. There
are notable differences between this JFM and the other 2~JFMs, as
shown in Table~\ref{tab-difjfm}.
850 \item[jfmvar=<string>] Sometimes there is a need that
854 \caption{Differences between JFMs shipped with \LuaTeX-ja}
857 \def\r#1{{\jfont\g=psft:Ryumin-Light:jfm=#1 at 14.43324pt \g
858 \setbox0=\vtop{\hsize=7\zw\noindent ◆◆◆◆◆◆◆
859 ある日モモちゃんがお使いで迷子になって泣きました.}\copy0
860 \vrule height 0pt depth \dp0}}
861 \def\s#1{{\jfont\g=psft:Ryumin-Light:jfm=#1 at 14.43324pt \g
862 \setbox0=\vtop{\hsize=7\zw\noindent ちょっと!何}\copy0}}
863 \def\t#1{{\jfont\g=psft:Ryumin-Light:jfm=#1 at 19.24432pt \g
865 \vrule width 0.4pt height\ht0 depth\dp0\kern-.2pt\copy0
866 \kern-\wd0\vrule width\wd0height .2pt depth .2pt
867 \kern-\wd0\raise\ht0\hbox{\vrule width\wd0height .2pt depth .2pt}%
868 \kern-\wd0\lower\dp0\hbox{\vrule width\wd0height .2pt depth .2pt}%
869 \kern-.2pt\vrule width 0.4pt height\ht0 depth \dp0}}
870 \begin{tabular}{rccc}
872 &\tt jfm-ujis.lua&\tt jfm-jis.lua&\tt jfm-min.lua\\
874 Example~1&\r{ujis}&\r{jis}&\r{min}\\
875 Example~2&\s{ujis}&\s{jis}&\s{min}\\
876 Bounding Box&\t{ujis}&\t{jis}&\t{min}\\
882 \paragraph{Note: kern feature}\label{para-kern}
Some fonts have information for inter-glyph spacing. However, this
information is not well-compatible with \LuaTeX-ja. More concretely,
the kerning space from this information is inserted \emph{before} the
insertion process of \textbf{JAglue}, and this causes incorrect spacing
between two characters when both a glue/kern from the data in the font
and one from the JFM are present.
\item You should specify {\tt -kern} in the
{\tt\char92jfont} primitive when you want to use other font features,
such as {\tt script=...}\,.
\item If you want to use Japanese fonts in proportional width, and use
the information from the font, use \texttt{jfm-prop.lua} for its JFM, and ...
901 \subsection{Structure of JFM file}
902 A JFM file is a Lua script which has only one function call:
904 luatexja.jfont.define_jfm { ... }
The actual data are stored in the table indicated above by
\verb+{ ... }+. The rest of this subsection is therefore devoted to describing the
structure of this table. Note that all lengths in a JFM file are
floating-point numbers in units of the design size.
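As a rough orientation before the field-by-field description, the following
is a minimal sketch of a JFM that is written out from a document. It is not one
of the JFMs shipped with \LuaTeX-ja: the name \texttt{sketch}, the two
character classes and all numerical values are assumptions chosen for
illustration, and it is also assumed that a file \texttt{jfm-sketch.lua} in the
current directory is found when the font is loaded.

\begin{lstlisting}
\begin{filecontents*}{jfm-sketch.lua}
luatexja.jfont.define_jfm {
  dir = 'yoko', zw = 1.0, zh = 1.0,      -- required top-level fields
  kanjiskip = { 0.0, 0.25, 0.0 },         -- optional; assumed values
  [0] = {                                 -- class 0: all remaining JAchars
    align = 'left', left = 0.0, down = 0.0,
    width = 1.0, height = 0.88, depth = 0.12, italic = 0.0,
    glue = { [1] = { 0.5, 0.0, 0.5 } },   -- glue towards class 1
  },
  [1] = {                                 -- class 1: two opening brackets
    chars = { 0x3008, 0x300C },
    align = 'right', left = 0.0, down = 0.0,
    width = 0.5, height = 0.88, depth = 0.12, italic = 0.0,
  },
}
\end{filecontents*}
\jfont\sketch=psft:Ryumin-Light:jfm=sketch at 10pt
\end{lstlisting}

Every length in the sketch is a multiple of the design size, so with the font
loaded at 10\,pt the width \texttt{0.5} of class~1 corresponds to 5\,pt.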
911 \begin{list}{}{\def\makelabel{\ttfamily}\def\{{\char`\{}\def\}{\char`\}}}
912 \item[dir=<direction>] (required)
The direction of the JFM. At present, only \texttt{'yoko'} is supported.
\item[zw=<length>] (required)
The length of the `full-width'.
920 \item[zh=<length>] (required)
922 \item[kanjiskip=\{<natural>, <stretch>, <shrink>\}] (optional)
924 This field specifies the `ideal' amount of \Param{kanjiskip}. As noted
925 in Subsection~\ref{subs-kskip}, if the parameter
926 \Param{kanjiskip} is \verb+\maxdimen+, the value specified
927 in this field is actually used (if this field is not specified in
928 JFM, it is regarded as 0\,pt). Note that <stretch> and <shrink>
929 fields are in design-size unit too.
932 \item[xkanjiskip=\{<natural>, <stretch>, <shrink>\}] (optional)
934 Like the \Param{kanjiskip} field, this field specifies the `ideal'
935 amount of \Param{xkanjiskip}.
Besides the fields above, a JFM file has several sub-tables whose
indices are natural numbers. The table indexed by~$i\in\omega$ stores
information on `character class'~$i$. Character class~0 is
always present, so each JFM file must have a sub-table whose index is
\texttt{[0]}. Each sub-table (its numerical index is denoted by $i$) has
the following fields:
946 \begin{list}{}{\def\makelabel{\ttfamily}\def\{{\char`\{}\def\}{\char`\}}}
947 \item[chars=\{<character>, ...\}] (required except character class~0)
This field is a list of the characters which belong to character
class~$i$. This field is not required if $i=0$, since every
\textbf{JAchar} which is not in any character class other
than 0 belongs to character class~0 (hence, character class~0 contains most
\textbf{JAchar}s). In the list, a character can be
specified by its code number, or by the character itself
(as a string of length~1). Moreover, there are `imaginary
characters' which can be specified in the list. We will describe these later.
958 \item[width=<length>, height=<length>, depth=<length>, italic=<length>]\ (required)
Specify the width, height, depth and
the amount of italic correction of characters in character class~$i$. All characters in character class~$i$ are regarded as having the width, height and depth
given by these fields.
There is one exception: if \texttt{'prop'} is specified in the \texttt{width} field, the width of a character becomes that of its `real' glyph.
965 \item[left=<length>, down=<length>, align=<align>]\
These fields are for adjusting the position of the `real' glyph. Legal
values of the \texttt{align} field are \texttt{'left'},
\texttt{'middle'} and \texttt{'right'}. If any of these
three~fields is omitted, \texttt{left} and \texttt{down} are
treated as~0, and the \texttt{align} field is treated as
973 The effects of these 3~fields are indicated in Figure~\ref{fig-pos}.
In most cases, the \texttt{left} and \texttt{down} fields are~0, while
it is not uncommon that the \texttt{align} field is \texttt{'middle'} or \texttt{'right'}.
For example, setting the \texttt{align} field to \texttt{'right'} is needed in practice
when the current character class is the class for `opening delimiters'.
980 \begin{minipage}{0.4\textwidth}%
981 \begin{center}\unitlength=10pt\small
982 \begin{picture}(15,12)(-1,-4)
983 \color{black!10!white}% real glyph :step1
984 \put(0,0){\vrule width 12\unitlength height 8\unitlength depth 3\unitlength}
986 \color{red!20!white}% real glyph :step1
987 \put(-1,-1.5){\vrule width 6\unitlength height 7\unitlength depth 2.5\unitlength}
989 \color{red}% real glyph
991 \put(-1,-1.5){\vector(0,1){7}\vector(0,-1){2.5}\vector(1,0){6}}
992 \put(5,-1.5){\line(0,1){7}\line(0,-1){2.5}}
993 \put(-1,5.5){\line(1,0){6}}
994 \put(-1,-4){\line(1,0){6}}
996 \color{green!20!white}% real glyph :step1
997 \put(3,0){\vrule width 6\unitlength height 7\unitlength depth 2.5\unitlength}
999 \color{black}% real glyph :step1
1001 \put(0,0){\vector(0,1){8}\line(0,-1){3}\vector(1,0){12}}
1002 \put(12,0){\line(0,1){8}\vector(0,-1){3}}
1003 \put(0,8){\line(1,0){12}}
1004 \put(0,-3){\line(1,0){12}}
1005 \put(0.2,4){\makebox(0,0)[l]{\texttt{height}}}
1006 \put(12.2,-1.5){\makebox(0,0)[l]{\texttt{depth}}}
1007 \put(6,0.2){\makebox(0,0)[b]{\texttt{width}}}
1009 \color{green!50!black}% real glyph :step1
1011 \put(3,0){\vector(0,1){7}\vector(0,-1){2.5}\vector(1,0){6}}
1012 \put(9,0){\line(0,1){7}\line(0,-1){2.5}}
1013 \put(3,7){\line(1,0){6}}
1014 \put(3,-2.5){\line(1,0){6}}
1015 \newsavebox{\eqdist}
1016 \savebox{\eqdist}(0,0)[b]{%
1018 \put(-0.08,0.2){\line(0,-1){0.4}}%
1019 \put(0.08,0.2){\line(0,-1){0.4}}}
1020 \put(1.5,0){\usebox{\eqdist}}
1021 \put(10.5,0){\usebox{\eqdist}}
1023 \color{blue}% shifted
1025 \put(3,-1.5){\vector(-1,0){4}}
1026 \put(1,-1.7){\makebox(0,0)[t]{\texttt{left}}}
1027 \put(3,0){\vector(0,-1){1.5}}
1028 \put(3.2,-0.75){\makebox(0,0)[l]{\texttt{down}}}
1032 \begin{minipage}{0.6\textwidth}%
Consider a node containing a Japanese character for which the value of the \texttt{align}
field is \texttt{'middle'}.
1036 \item The black rectangle is a frame of the node.
1037 Its width, height and depth are specified by JFM.
1038 \item Since the \texttt{align} field is \texttt{'middle'},
1039 the `real' glyph is centered horizontally (the green rectangle).
1040 \item Furthermore, the glyph is shifted according to values of fields
1041 \texttt{left} and \texttt{down}. The ultimate position of the real
1042 glyph is indicated by the red rectangle.
1045 \caption{The position of the `real' glyph.}
1050 \item[kern={\{[$j$]=<kern>, ...\}}]
1052 \item[glue={\{[$j$]=\{<width>, <stretch>, <shrink>\}, ...\}}]
1056 \begin{list}{}{\def\makelabel{\ttfamily}\def\{{\char`\{}\def\}{\char`\}}}
1057 \item['lineend'] An ending of a line.
1058 \item['diffmet'] Used at a boundary between two \textbf{JAchar}s whose JFM or size is different.
\item['boxbdd'] The beginning/ending of a horizontal box, and the beginning of a non-indented paragraph.
1060 \item['parbdd'] The beginning of an (indented) paragraph.
1061 \item['jcharbdd'] A boundary between \textbf{JAchar} and anything else
1062 (such as \textbf{ALchar}, kern, glue, ...).
1063 \item[$-1$] The left/right boundary of an inline math formula.
1068 上で説明した通り,\texttt{chars}フィールド中にはいくつかの「特殊文字」も
1069 指定可能である.これらは,大半が\pTeX のJFMグルーの挿入処理ではみな「文字
1070 クラス0の文字」として扱われていた文字であり,その結果として\pTeX より細か
1071 い組版調整ができるようになっている.以下のその一覧を述べる:
1072 \begin{list}{}{\def\makelabel{\ttfamily}\def\{{\char`\{}\def\}{\char`\}}}
1073 \item['lineend'] 行の終端を表す.
1076 \item['boxbdd'] hboxの先頭と末尾,及びインデントされていない
1077 (\verb+\noindent+で開始された)段落の先頭を表す.
1078 \item['parbdd'] 通常の(\verb+\noindent+で開始されていない)段落の先頭.
1079 \item['jcharbdd'] 和文文字と「その他のもの」(欧文文字,glue,kern等)との境界.
1080 \item[$-1$] 行中数式と地の文との境界.
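\paragraph{Porting Japanese font metrics for \pTeX}
The following are points to note when porting a Japanese font metric for \pTeX\ to \LuaTeX-ja.
\item The size of the Japanese font which is actually output becomes the design size.
Hence, for example, for a JIS font metric, in which $1\,\textrm{zw}$ is 0.962216~times the design size,
\item Multiply all numbers in the JFM by $1/0.962216$.
\item Multiply the size specification by 0.962216 at the place where the font is used in the \TeX\ source.
For a font declaration in \LaTeX, for example: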
1094 \DeclareFontShape{JY3}{mc}{m}{n}{<-> s*[0.962216] psft:Ryumin-Light:jfm=jis}{}
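\item Put all of the special characters described above, except \texttt{'boxbdd'}, in character class~0
\item Let \texttt{'boxbdd'} form a character class by itself, and do not specify
any glue/kern for that character class.
This is to realize the specification that no JFM glue is inserted at the beginning/ending of an hbox, or at the beginning of a non-indented
paragraph (one started by \verb+\noindent+).
\item If the aim is to reproduce the typesetting of \pTeX, it is sufficient to follow the notes above.
Incidentally, in \pTeX\ the JFM glue at the beginning of an ordinary paragraph remains,
so an opening bracket at the beginning of a paragraph is indented by one and a half full-widths. To obtain an indent of exactly one full-width,
one had to add \verb+\inhibitglue+ manually at the beginning of the paragraph, or
automate this by hacking \verb+\everypar+.
In \LuaTeX-ja, on the other hand, this can be adjusted on the JFM side by means of \texttt{'parbdd'}.
For example, if \texttt{'parbdd'} is put in the same character class as \texttt{'boxbdd'}, as in the JFMs shipped with \LuaTeX-ja,
the indent becomes one full-width.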
1117 \jfont\g=psft:Ryumin-Light:jfm=test \g
1118 \parindent1\zw\noindent{}◆◆◆◆◆
1127 \subsection{Math Font Family}
1128 \TeX\ handles fonts in math formulas by 16~font families\footnote{Omega,
Aleph, \LuaTeX~and $\varepsilon$-\kern-.125em(u)\pTeX\ can handle 256~families, but
1130 an external package is needed to support this in plain \TeX\ and
1131 \LaTeX.}, and each family has three fonts:
1132 \verb+\textfont+, \verb+\scriptfont+ and \verb+\scriptscriptfont+.
1134 \LuaTeX-ja's handling of Japanese fonts in math formulas is similar;
1135 Table~\ref{tab-math} shows counterparts to \TeX's primitives for math
1136 font families. There is no relation between the value of
\verb+\fam+ and that of \verb+\jfam+; with appropriate settings,
1138 you can set both \verb+\fam+ and \verb+\jfam+ to~the same value.
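The correspondence can be sketched as follows. This is only an illustration
under assumptions: the control sequence names \verb+\sketchtext+,
\verb+\sketchscript+ and \verb+\sketchsscript+, the family number~1 and the
three sizes are chosen arbitrarily and are not settings made by \LuaTeX-ja
itself.

\begin{lstlisting}
\jfont\sketchtext=psft:Ryumin-Light:jfm=ujis at 10pt
\jfont\sketchscript=psft:Ryumin-Light:jfm=ujis at 7pt
\jfont\sketchsscript=psft:Ryumin-Light:jfm=ujis at 5pt
\ltjsetparameter{jatextfont={1,\sketchtext},
  jascriptfont={1,\sketchscript},
  jascriptscriptfont={1,\sketchsscript}}
$\jfam=1 あ_{い}$
\end{lstlisting}

The three keys play the roles of \verb+\textfont+, \verb+\scriptfont+ and
\verb+\scriptscriptfont+ for the Japanese family selected by \verb+\jfam+.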
1141 \caption{Primitives for Japanese math fonts.}
1143 \begin{center}\def\{{\char`\{}\def\}{\char`\}}
1144 \begin{tabular}{lll}
1146 &Japanese fonts&alphabetic fonts\\
1148 font family&\verb+\jfam+${}\in [0,256)$&\verb+\fam+\\
1149 text size&\tt\Param{jatextfont}\,=\{<jfam>,<jfont\_cs>\}&\tt\verb+\textfont+<fam>=<font\_cs>\\
1150 script size&\tt\Param{jascriptfont}\,=\{<jfam>,<jfont\_cs>\}&\tt\verb+\scriptfont+<fam>=<font\_cs>\\
1151 scriptscript size&\tt\Param{jascriptscriptfont}\,=\{<jfam>,<jfont\_cs>\}&\tt\verb+\scriptscriptfont+<fam>=<font\_cs>\\
1157 \subsection{Callbacks}
Like \LuaTeX\ itself, \LuaTeX-ja also has callbacks. These callbacks can
be accessed via the \verb+luatexbase.add_to_callback+ function and so on, like other callbacks.
1161 {\def\makelabel#1{\bfseries#1}}
1162 \item[\texttt{luatexja.load\_jfm} callback]
1163 With this callback you can overwrite JFMs.
1166 function (<table> jfm_info, <string> jfm_name)
1167 return <table> new_jfm_info
The argument \verb+jfm_info+ contains a table similar to the table in a JFM file, except
that it has a \texttt{chars} field which contains the character codes
whose character class is not~0.
An example of the use of this callback is the \texttt{ltjarticle} class, which
forcibly assigns character class~0 to \texttt{'parbdd'}
in the JFM \texttt{jfm-min.lua}. This callback doesn't
replace any code of \LuaTeX-ja.
1180 \item[\texttt{luatexja.define\_font} callback]
This callback and the next callback form a pair; with them you can assign letters which don't have
fixed code points in Unicode to non-zero character classes.
The \texttt{luatexja.define\_font} callback is called just when a new Japanese font is loaded.
1185 function (<table> jfont_info, <number> font_number)
1186 return <table> new_jfont_info
1190 You may assume that \verb+jfont_info+ has the following fields:
1192 \item[\tt jfm] The index number of JFM.
1193 \item[\tt size] Font size in a scaled point (${}=2^{-16}\,\textrm{pt}$).
1194 \item[\tt var] The value specified in \texttt{jfmvar=...} at a call of \verb+\jfont+.
1197 The returned table \verb+new_jfont_info+ also should include these three fields.
1198 The \verb+font_number+ is a font number.
1200 A good example of this and the next callbacks is the \Pkg{luatexja-otf}
1201 package, supporting \verb+"AJ1-xxx"+ form for Adobe-Japan1
1202 CID characters in a JFM. This callback doesn't replace any
1206 \item[\texttt{luatexja.find\_char\_class} callback]
This callback is called just when \LuaTeX-ja is ready to determine which
character class a character \verb+chr_code+ belongs to.
1209 A function used in this callback should be in the following form:
1210 \begin{lstlisting}[numbers=left]
1211 function (<number> char_class, <table> jfont_info, <number> chr_code)
1212 if char_class~=0 then return char_class
1215 return (<number> new_char_class or 0)
The argument \verb+char_class+ is the result of \LuaTeX-ja's default
routine or of previous function calls in this callback, hence
this argument is not necessarily~0. Moreover, the returned
\verb+new_char_class+ should be the same as \verb+char_class+ when \verb+char_class+
is not~0, otherwise you will overwrite the \LuaTeX-ja's
1227 This callback doesn't replace any code of \LuaTeX-ja.
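Registration of a function for one of these callbacks can be sketched as
follows, taking \texttt{luatexja.load\_jfm} as the example. The function moves
\texttt{'parbdd'} to character class~0, as described above for that callback.
The description string \texttt{sketch.parbdd\_to\_class0} is an assumption made
for this illustration, and unlike the class file mentioned above, the sketch
does not check \verb+jfm_name+, so it affects every JFM loaded afterwards.

\begin{lstlisting}
\directlua{
  luatexbase.add_to_callback('luatexja.load_jfm',
    function (jfm_info, jfm_name)
      jfm_info.chars['parbdd'] = 0
      return jfm_info
    end,
    'sketch.parbdd_to_class0')
}
\end{lstlisting}

Note that the Lua code contains no Lua comments: inside \verb+\directlua+ line
ends are turned into spaces, so a \texttt{--} comment would hide the rest of
the code.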
1235 \section{Parameters}
1236 \subsection{{\tt\char92 ltjsetparameter} primitive}
As noted before, \verb+\ltjsetparameter+ and \verb+\ltjgetparameter+ are
primitives for accessing most parameters of \LuaTeX-ja. One of the main
reasons that \LuaTeX-ja did not adopt a syntax similar to that of \pTeX\
(\textit{e.g.},~\verb+\prebreakpenalty`)=10000+)
is the position of the \verb+hpack_filter+ callback in the source
of \LuaTeX; see Section~\ref{sec-para}.
1244 \verb+\ltjsetparameter+ and \verb+\ltjglobalsetparameter+ are primitives
1245 for assigning parameters. These take one argument which is a
1246 \texttt{<key>=<value>} list. Allowed keys are described in the next
1248 The difference between
1249 \verb+\ltjsetparameter+ and \verb+\ltjglobalsetparameter+ is only the
1250 scope of assignment;
1251 \verb+\ltjsetparameter+ does a local assignment and
1252 \verb+\ltjglobalsetparameter+ does a global one.
They also obey the value of \verb+\globaldefs+,
like other assignments.
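The difference in scope can be sketched as follows. The penalty value 5000
and the two characters are arbitrary choices for this illustration.

\begin{lstlisting}
\begingroup
  \ltjsetparameter{prebreakpenalty={`),5000}}        % local assignment
  \ltjglobalsetparameter{postbreakpenalty={`(,5000}} % global assignment
\endgroup
\ltjgetparameter{prebreakpenalty}{`)},  % value from before the group
\ltjgetparameter{postbreakpenalty}{`(}  % value set inside the group
\end{lstlisting}

After \verb+\endgroup+, only the assignment made by
\verb+\ltjglobalsetparameter+ is still in effect.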
\verb+\ltjgetparameter+ is the primitive for acquiring parameters. It
always takes a parameter name as its first argument, and in some cases also takes an
additional argument (a character code, for example).
1260 \ltjgetparameter{differentjfm},
1261 \ltjgetparameter{autospacing},
1262 \ltjgetparameter{prebreakpenalty}{`)}.
1264 \emph{The return value of\/ {\normalfont\tt\char92ltjgetparameter} is
always a string}. It is output by \texttt{tex.write()}, so any
1266 character other than space~`{\tt\char32}'~(U+0020) has the category code
1267 12~(other), while the space has 10~(space).
1269 \subsection{List of Parameters}
The following is the list of parameters which can be specified by the
\verb+\ltjsetparameter+ command. [\verb+\cs+] indicates the counterpart
in \pTeX, and symbols beside each parameter have the following meanings:
1274 \item No mark: values at the end of the paragraph or the hbox are
1275 adopted in the whole paragraph/hbox.
\item `$\ast$' : local parameters, which can be changed anywhere inside a paragraph/hbox.
\item `$\dagger$': assignments are always global.
1280 \begin{list}{}{\def\makelabel{\ttfamily}\def\{{\char`\{}\def\}{\char`\}}}
1281 \item[\Param{jcharwidowpenalty}\,=<penalty>] [\verb+\jcharwidowpenalty+]
Penalty value for suppressing orphans. This penalty is inserted just
1284 after the last \textbf{JAchar} which is not regarded as a
1285 (Japanese) punctuation mark.
1287 \item[\Param{kcatcode}\,=\{<chr\_code>,<natural number>\}]\
An additional attribute of each character whose character code is <chr\_code>.
In the present version, the lowest bit of <natural number> indicates
whether the character is considered as a punctuation mark
(see the description of \Param{jcharwidowpenalty} above).
1295 \item[\Param{prebreakpenalty}\,=\{<chr\_code>,<penalty>\}] [\verb+\prebreakpenalty+]\
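Specify the amount of penalty which is inserted/added before
a \textbf{JAchar} whose character code is <chr\_code>, in order to prevent this character from appearing at the beginning of a line.
For example, a closing bracket `〗' must never appear at the beginning of a line, so in
\texttt{luatexja-kinsoku.tex}, which is loaded by default, there is the setting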
1303 \ltjsetparameter{prebreakpenalty={`〙,10000}}
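which specifies the maximum value, 10000. It may also be useful to specify a value between 0 and
10000 for characters, such as small kana, which are not strictly prohibited from appearing
at the beginning of a line but should preferably not do so.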
1309 \ltjsetparameter{prebreakpenalty={`ゕ,150}}
1313 \item[\Param{postbreakpenalty}\,=\{<chr\_code>,<penalty>\}] [\verb+\postbreakpenalty+]
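Specify the amount of penalty which is inserted/added after
a \textbf{JAchar} whose character code is <chr\_code>, in order to prevent this character from appearing at the end of a line.
In \pTeX, \verb+\prebreakpenalty+ and \verb+\postbreakpenalty+ had the following restrictions,
\item For a single character, only one of pre and post could be specified.
\item Information for only 256~characters in total, pre and post together, could be stored.
but these restrictions are lifted in \LuaTeX-ja.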
1327 \item[\Param{jatextfont}\,=\{<jfam>,<jfont\_cs>\}] [\verb+\textfont+ in \TeX]
1328 \item[\Param{jascriptfont}\,=\{<jfam>,<jfont\_cs>\}] [\verb+\scriptfont+ in \TeX]
1329 \item[\Param{jascriptscriptfont}\,=\{<jfam>,<jfont\_cs>\}] [\verb+\scriptscriptfont+ in \TeX]
1330 \item[\Param{yjabaselineshift}\,=<dimen>$^\ast$]\
1331 \item[\Param{yalbaselineshift}\,=<dimen>$^\ast$] [\verb+\ybaselineshift+]
1333 \item[\Param{jaxspmode}\,=\{<chr\_code>,<mode>\}] [\verb+\inhibitxspcode+]
Specify whether insertion of \Param{xkanjiskip} is allowed before/after a \textbf{JAchar} whose character code is <chr\_code>.
The following values are allowed for <mode>:
\item[0, \texttt{inhibit}] Insertion of \Param{xkanjiskip} is inhibited both before and after the character.
\item[2, \texttt{preonly}] Insertion of \Param{xkanjiskip} is allowed before the character, but not after.
\item[1, \texttt{postonly}] Insertion of \Param{xkanjiskip} is allowed after the character, but not before.
\item[3, \texttt{allow}] Insertion of \Param{xkanjiskip} is allowed both before and after the character.
This is the default value.
1345 \item[\Param{alxspmode}\,=\{<chr\_code>,<mode>\}] [\verb+\xspcode+]
Specify whether insertion of \Param{xkanjiskip} is allowed before/after an
\textbf{ALchar} whose character code is <chr\_code>.
The following values are allowed for <mode>:
\item[0, \texttt{inhibit}] Insertion of \Param{xkanjiskip} is inhibited
both before and after the character.
\item[1, \texttt{preonly}] Insertion of \Param{xkanjiskip} is allowed
before the character, but not after.
\item[2, \texttt{postonly}] Insertion of \Param{xkanjiskip} is allowed
after the character, but not before.
\item[3, \texttt{allow}] Insertion of \Param{xkanjiskip} is allowed both
before and after the character.
This is the default value.
Note that the parameters \Param{jaxspmode} and \Param{alxspmode} use a common table.
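A short sketch of these two parameters, using mode names instead of numbers.
The characters `あ' and `!' are arbitrary choices for this illustration.

\begin{lstlisting}
\ltjsetparameter{jaxspmode={`あ,preonly}, alxspmode={`\!,postonly}}
pあq い!う
\end{lstlisting}

With these settings, \Param{xkanjiskip} may be inserted between `p' and `あ'
but not between `あ' and `q', and it may be inserted between `!' and `う' but
not between `い' and `!'.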
1363 \item[\Param{autospacing}\,=<bool>$^\ast$] [\verb+\autospacing+]
1364 \item[\Param{autoxspacing}\,=<bool>$^\ast$] [\verb+\autoxspacing+]
1365 \item[\Param{kanjiskip}\,=<skip>] [\verb+\kanjiskip+]
1366 \item[\Param{xkanjiskip}\,=<skip>] [\verb+\xkanjiskip+]
1368 \item[\Param{differentjfm}\,=<mode>$^\dagger$]
Specify how glues/kerns between two \textbf{JAchar}s whose JFMs (or sizes) differ are determined.
The allowed arguments are the following:
1373 \item[\texttt{average}]
1374 \item[\texttt{both}]
1375 \item[\texttt{large}]
1376 \item[\texttt{small}]
1379 \item[\Param{jacharrange}\,=<ranges>$^\ast$]
1380 \item[\Param{kansujichar}\,=\{<digit>, <chr\_code>\}] [\verb+\kansujichar+]
1384 \section{Other Primitives}
1385 \subsection{Primitives for Compatibility}
The following primitives are implemented for compatibility with \pTeX:
1387 \begin{list}{}{\def\makelabel{\ttfamily\char92 }}
1395 \subsection{{\tt\char92 inhibitglue}}
1396 The primitive \verb+\inhibitglue+ suppresses the insertion of \textbf{JAglue}.
The following is an example, using a special JFM in which there is a glue between
the beginning of a box and `あ', and also between `あ' and `ウ'.
1401 \jfont\g=psft:Ryumin-Light:jfm=test \g
1402 あウあ\inhibitglue{}ウ\inhibitglue\par
1403 あ\par\inhibitglue{}あ
1404 \par\inhibitglue\hrule{}あoff\inhibitglue ice
With the help of this example, we note the following points about the specification of \verb+\inhibitglue+:
1409 \item The call of \verb+\inhibitglue+ in the (internal) vertical mode is
1410 effective at the beginning of the next paragraph. This is realized
1411 by hacking \verb+\everypar+.
\item The call of \verb+\inhibitglue+ in the (restricted) horizontal
mode is only effective on the spot; it does not carry over the boundary of
paragraphs. Moreover, \verb+\inhibitglue+ cancels ligatures and
kernings, as shown in line~4 of the above example.
1416 \item The call of \verb+\inhibitglue+ in math mode is just ignored.
1419 \section{Control Sequences for \LaTeXe}
1420 \subsection{Patch for NFSS2}\label{ssub-nfsspat}
1421 As described in Subsection~\ref{ssec-ltx}, \LuaTeX-ja simply adopted
1422 \texttt{plfonts.dtx} in \pLaTeXe\ for the Japanese patch for NFSS2.
For convenience, we describe here the
commands which are not described in Subsection~\ref{ssub-chgfnt}.
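To give an idea of how these commands fit together, the following sketch
declares a Japanese font family and its accompanying alphabetic family. The
family name \texttt{sketchfam} is hypothetical, and the encoding \texttt{JY3}
is assumed to be already declared as a Japanese encoding.

\begin{lstlisting}
\DeclareKanjiFamily{JY3}{sketchfam}{}
\DeclareFontShape{JY3}{sketchfam}{m}{n}{<->
  s*[0.962216] psft:Ryumin-Light:jfm=jis}{}
\DeclareRelationFont{JY3}{sketchfam}{m}{n}{OT1}{pag}{m}{n}
{\fontfamily{sketchfam}\selectfont あいうabc}
\end{lstlisting}

Since \texttt{sketchfam} is declared only under a Japanese encoding,
\verb+\fontfamily+ is expected to change only the Japanese font family here.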
1427 \item[DeclareYokoKanjiEncoding\{<encoding>\}\{<text-settings>\}\{<math-settings>\}]
1428 In NFSS2 under \LuaTeX-ja, distinction between alphabetic font families
1429 and Japanese font families is only made by its
1430 encoding. For example, encodings OT1 and T1 are for
1431 alphabetic font families, and a Japanese font family cannot
1432 have these encodings. This command defines a new encoding
1433 scheme for Japanese font family (in horizontal direction).
1435 \item[DeclareKanjiEncodingDefaults\{<text-settings>\}\{<math-settings>\}]
1436 \item[DeclareKanjiSubstitution\{<encoding>\}\{<family>\}\{<series>\}\{<shape>\}]
1437 \item[DeclareErrorKanjiFont\{<encoding>\}\{<family>\}\{<series>\}\{<shape>\}\{<size>\}]
The above 3~commands are just the counterparts of \verb+\DeclareFontEncodingDefaults+ and~others.
1441 \item[reDeclareMathAlphabet\{<unified-cmd>\}\{<al-cmd>\}\{<ja-cmd>\}]
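Create a command which changes the math font families for Japanese and alphabetic fonts at once.
More precisely, <unified-cmd> is (re)defined as a command which performs both
<al-cmd>, the command for changing the alphabetic math font family, and
<ja-cmd>, the command for changing the Japanese math font family.
In actual use, <unified-cmd> and <al-cmd> are usually the same, that is,
<al-cmd> is redefined so that it changes the Japanese side as well.
The points to note when using this command are described in detail in \texttt{plfonts.dtx} in the \pLaTeX\ distribution,
so please refer to that file.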
1452 \item[DeclareRelationFont\{<ja-encoding>\}\{<ja-family>\}\{<ja-series>\}\{<ja-shape>\}\\
1453 \hfill\{<al-encoding>\}\{<al-family>\}\{<al-series>\}\{<al-shape>\}]
1455 This command sets the `accompanied' alphabetic font family (given by the latter 4~arguments)
1456 with respect to a Japanese font family given by the former 4~arguments.
1459 いわゆる「従属欧文」を設定するための命令である.前半の4引数で表される和文フォントファミリに対して,
1460 そのフォントに対応する「従属欧文」フォントファミリを後半の4引数により与える.
1462 \item[SetRelationFont]
This command is almost the same as \verb+\DeclareRelationFont+, except that this command does a local
assignment, whereas \verb+\DeclareRelationFont+ does a global assignment.
1466 Change current alphabetic font encoding/family/\dots\ to the `accompanied' alphabetic
1467 font family with respect to current Japanese font family,
\verb+\DeclareRelationFont+ or \verb+\SetRelationFont+.
Like \verb+\fontfamily+, \verb+\selectfont+ is required for this to take effect.
1472 \item[adjustbaseline]
1475 \item[fontfamily\{<family>\}]
1477 As in \LaTeXe, this command changes current font family (alphabetic, Japanese,~\emph{or both})
1478 to <family>. Which family will be changed is determined as follows:
\item Let the current encoding scheme for Japanese fonts be
<ja-enc>. The current Japanese font family will be changed to
<family>, if one of the following two conditions is met:
\item The family <family> under the encoding <ja-enc> is already defined by
\verb+\DeclareKanjiFamily+.
\item A font definition file named \texttt{<ja-enc><family>.fd} (the filename is
all lowercase) exists.
\item Let the current encoding scheme for alphabetic fonts be
<al-enc>. For the alphabetic font family, the same criterion as above is used.
\item There is a case in which none of the above applies, that is, the font
family named <family> doesn't seem to be defined either under the
encoding <ja-enc> or under <al-enc>.
In this case, the default family for font substitution is used for
alphabetic and Japanese fonts. Note that the current encoding will not
be set to <family>, unlike the original implementation in \LaTeX.
To close this subsection, we give an example of
\verb+\SetRelationFont+ and \verb+\userelfont+:
1506 \SetRelationFont{JY3}{gt}{m}{n}{OT1}{pag}{m}{n}
1507 \userelfont\selectfont{}あいうabc
1511 \subsection{Cropmark/`tombow'}
1513 \section{Extensions}
1514 \subsection{{\tt luatexja-fontspec.sty}}
1516 \subsection{{\tt luatexja-otf.sty}}
This optional package supports typesetting characters in
Adobe-Japan1. {\tt luatexja-otf.sty} offers the following 2~low-level
1520 \begin{list}{}{\def\makelabel{\ttfamily}\def\{{\char`\{}\def\}{\char`\}}}
1521 \item[\char92CID\{<number>\}]
1522 Typeset a character whose CID number is <number>.
1523 \item[\char92UTF\{<hex\_number>\}]
1524 Typeset a character whose character code is <hex\_number> (in hexadecimal).
1525 This command is similar to \verb+\char"+<hex\_number>,\ %"
but please note the remarks below.
Characters typeset by the \verb+\CID+ and \verb+\UTF+ commands differ from
ordinary characters in the following respects:
\item They are always treated as \textbf{JAchar}s.
\item Processing for supporting OpenType features (\textit{e.g.},
glyph replacement and kerning) by the \Pkg{luaotfload} package
is not applied to these characters.
\paragraph{Additional Syntax of JFM}
{\tt luatexja-otf.sty} extends the syntax of JFM; the entries of the {\tt
chars} table in a JFM may now be strings of the form
\verb+'AJ1-xxx'+, which stands for the character
whose CID number in Adobe-Japan1 is \verb+xxx+.
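The two commands can be used as in the following sketch. The code points and
CID numbers are merely examples of characters that are awkward to input
directly, and the output depends on the glyphs available in the current
Japanese font.

\begin{lstlisting}
\usepackage{luatexja-otf}   % in the preamble
森\UTF{9DD7}外と内田百\UTF{9592}とが\UTF{9AD9}島屋に行く。
\CID{7652}飾区の\CID{13706}野家
\end{lstlisting}

In the same spirit, a character class in a JFM could list such a character in
its \texttt{chars} table as the string \verb+'AJ1-7652'+.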
1546 \part{Implementations}\label{part-imp}
1547 \section{Storing Parameters}\label{sec-para}
1548 \subsection{Used Dimensions, Attributes and whatsit nodes}
The following is the list of dimensions and attributes which are used in \LuaTeX-ja.
1551 \def\makelabel{\ttfamily}
1552 \def\dim#1{\item[\char92 #1\ \textrm{(dimension)}]}
1553 \def\attr#1{\item[\char92 #1\ \textrm{(attribute)}]}
1557 As explained in Subsection~\ref{ssec-plain}, \verb+\jQ+ is equal to
1558 $1\,\textrm{Q}=0.25\,\textrm{mm}$, where `Q'~(also called `級') is
1559 a unit used in Japanese phototypesetting. So one should not change the value of this dimension.
There is also a unit called `歯' which equals $0.25\,\textrm{mm}$ and is
1562 used in Japanese phototypesetting. The dimension
1563 \verb+\jH+ stores this length, similar to \verb+\jQ+.
\dim{ltj@zw} A temporary register for the `full-width' of the current Japanese font.
\dim{ltj@zh} A temporary register for the `full-height' (usually the sum of the height of the imaginary body and its depth) of the current Japanese font.
1566 \attr{jfam} Current number of Japanese font family for math formulas.
1567 \attr{ltj@curjfnt} The font index of current Japanese font.
1568 \attr{ltj@charclass} The character class of Japanese \textit{glyph\_node}.
1569 \attr{ltj@yablshift} The amount of shifting the baseline of alphabetic
1570 fonts in scaled point ($2^{-16}\,\textrm{pt}$).
1571 \attr{ltj@ykblshift} The amount of shifting the baseline of Japanese
1572 fonts in scaled point ($2^{-16}\,\textrm{pt}$).
1573 \attr{ltj@autospc} Whether the auto insertion of \Param{kanjiskip} is allowed at the node.
1574 \attr{ltj@autoxspc} Whether the auto insertion of \Param{xkanjiskip} is allowed at the node.
1575 \attr{ltj@icflag} An attribute for distinguishing `kinds' of a node. One of the following value is
1576 assigned to this attribute:
\item[\textit{italic} (1)] Glues from an italic correction
1579 (\verb+\/+). This distinction of origins of glues
1580 (from explicit \verb+\kern+, or from \verb+\/+)
1581 is needed in the insertion process of \Param{xkanjiskip}.
1582 \item[\textit{packed} (2)]
1583 \item[\textit{kinsoku} (3)] Penalties inserted for the word-wrapping process of Japanese characters (\emph{kinsoku}).
1584 \item[\textit{from\_jfm} (4)] Glues/kerns from JFM.
1585 \item[\textit{line\_end} (5)] Kerns for ...
1586 \item[\textit{kanji\_skip} (6)] Glues for \Param{kanjiskip}.
1587 \item[\textit{xkanji\_skip} (7)] Glues for \Param{xkanjiskip}.
\item[\textit{processed} (8)] Nodes which are already processed by ...
\item[\textit{ic\_processed} (9)] Glues from an italic correction, but also already processed.
\item[\textit{boxbdd} (15)] Glues/kerns that are inserted just at the beginning or the ending of an hbox or a paragraph.
1592 \attr{ltj@kcat$i$} Where $i$~is a natural number which is less than~7.
1593 These 7~attributes store bit~vectors indicating which character block is regarded as a block of \textbf{JAchar}s.
1596 Furthermore, \LuaTeX-ja uses several `user-defined' whatsit nodes for
1597 typesetting. All those nodes store a natural number (hence the node's
1598 \texttt{type} is 100).
1600 \item[30111] Nodes for indicating that \verb+\inhibitglue+ is
1601 specified. The \texttt{value} field of these nodes doesn't matter.
1602 \item[30112] Nodes for \LuaTeX-ja's stack system (see the next
1603 subsection). The \texttt{value} field of these nodes is
\item[30113] Nodes for Japanese characters to which the callback process of
luaotfload won't be applied; the character code is
stored in the \texttt{value} field. Each node having this
\verb+user_id+ is converted to a `glyph\_node' \emph{after}
the callback process of luaotfload.
1611 These whatsits will be removed during the process of inserting \textbf{JAglue}s.
1613 \subsection{Stack System of \LuaTeX-ja}\label{ssec-stack}
1614 \paragraph{Background}
\LuaTeX-ja has its own stack system, and most parameters of \LuaTeX-ja
are stored in it. To clarify the reason, imagine that the parameter
\Param{kanjiskip} is stored in a skip register, and consider the following
1620 \ltjsetparameter{kanjiskip=0pt}ふがふが.%
1621 \setbox0=\hbox{\ltjsetparameter{kanjiskip=5pt}ほげほげ}
As described in Part~\ref{part-ref}, the only effective value of
\Param{kanjiskip} in an hbox is the latest value, so the value of
\Param{kanjiskip} which applies to the entire hbox should be 5\,pt.
However, because of the way \LuaTeX\ is implemented, this `5\,pt' cannot be
known from any callback. In \texttt{tex/packaging.w} (which is a
file in the source of \LuaTeX), there is the following code:
1634 scaled h; /* height of box */
1635 halfword p; /* first node in a box */
1636 scaled d; /* max depth */
1642 if (cur_list.mode_field == -hmode) {
1643 cur_box = filtered_hpack(cur_list.head_field,
1644 cur_list.tail_field, saved_value(1),
1645 saved_level(1), grp, saved_level(2));
1646 subtype(cur_box) = HLIST_SUBTYPE_HBOX;
Notice that \verb+unsave+ is executed \emph{before}
\verb+filtered_hpack+ (this is where the \verb+hpack_filter+ callback is
executed): so `5\,pt' in the above source is orphaned at
\verb+unsave+, and hence it can't be accessed from \verb+hpack_filter+
1654 \paragraph{The method}
The code of the stack system is based on that in a post on the Dev-luatex mailing list\footnote{%
1656 \texttt{[Dev-luatex] tex.currentgrouplevel}, a post at 2008/8/19 by Jonathan Sauer.}.
There are two \TeX\ count registers for maintaining information:
\verb+\ltj@@stack+ for the stack level, and \verb+\ltj@@group@level+ for
\TeX's group level when the last assignment was done. Parameters
are stored in one big table named \texttt{charprop\_stack\_table}, where
\texttt{charprop\_stack\_table[$i$]} stores the data of stack level~$i$. If
a new stack level is created by \verb+\ltjsetparameter+, all data of the
previous level is copied.
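The idea of copying the previous level can be sketched in a few lines of Lua.
This is not the code of \LuaTeX-ja: the names \verb+sketch_stack+,
\verb+sketch_level+ and \verb+sketch_new_level+ are hypothetical, and
parameter values are simplified to strings.

\begin{lstlisting}
\directlua{
  sketch_stack = { [0] = { kanjiskip = '0pt' } }
  sketch_level = 0
  function sketch_new_level()
    local copy = {}
    for key, value in pairs(sketch_stack[sketch_level]) do
      copy[key] = value
    end
    sketch_level = sketch_level + 1
    sketch_stack[sketch_level] = copy
  end
  sketch_new_level()
  sketch_stack[1].kanjiskip = '5pt'
  tex.print(sketch_stack[0].kanjiskip, sketch_stack[1].kanjiskip)
}
\end{lstlisting}

Because each level holds its own copy, an assignment at level~1 leaves the
data of level~0 untouched, which is what allows the values of an outer group
to be recovered later.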
To resolve the problem mentioned in `Background' above, \LuaTeX-ja uses
another device: when a new stack level is about to be created, a whatsit
node whose type, subtype and value are 44~(\textit{user\_defined}),
30112, and the current group level respectively is appended to the current
list (we refer to this node as \textit{stack\_flag}). This enables us to
know whether an assignment is done just inside an hbox. Suppose that the
stack level is~$s$ and \TeX's group level is~$t$ just after the hbox
\item If there is no \textit{stack\_flag} node in the list of the hbox, then
  no assignment occurred inside the hbox. Hence the values of
  parameters at the end of the hbox are stored in stack
  level~$s$.
\item If there is a \textit{stack\_flag} node whose value is~$t+1$, then
  an assignment occurred just inside the hbox group. Hence the
  values of parameters at the end of the hbox are stored in
  stack level~$s+1$.
\item If there are \textit{stack\_flag} nodes but all of their values
  are greater than~$t+1$, then an assignment occurred in the box,
  but only in a `more internal' group. Hence the values of
  parameters at the end of the hbox are stored in stack
  level~$s$.
Note that for this trick to work correctly, assignments to
\verb+\ltj@@stack+ and \verb+\ltj@@group@level+ must always be local,
regardless of the value of \verb+\globaldefs+.
This is ensured by using
\hbox{\verb+\directlua{tex.globaldefs=0}+} (this assignment is itself local).
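
The three cases listed above can be illustrated by the following sketch. It
is an illustration only: we assume that each \verb+\ltjsetparameter+ shown
creates a new stack level, so that a \textit{stack\_flag} node is appended,
and $t$ denotes \TeX's group level just after each hbox group.
\begin{verbatim}
% illustrative sketch; t is the group level just after each hbox group
% (1) no assignment inside the hbox: no stack_flag node in its list
\setbox0=\hbox{ほげほげ}
% (2) assignment just inside the hbox group: stack_flag with value t+1
\setbox0=\hbox{\ltjsetparameter{kanjiskip=5pt}ほげほげ}
% (3) assignment only in an inner group: stack_flag with value t+2
\setbox0=\hbox{{\ltjsetparameter{kanjiskip=5pt}ほげ}ほげ}
\end{verbatim}
In case~(3) the inner braces do not start a new list, so the
\textit{stack\_flag} node still ends up in the list of the hbox, but its value
shows that the assignment belongs to a deeper group.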
1697 \section{Linebreak after Japanese Character}\label{sec-lbreak}
1698 \subsection{Reference: Behavior in \pTeX}
In~\pTeX, a linebreak after a Japanese character doesn't emit a space,
since words are not separated by spaces in Japanese writing. However,
this feature isn't fully implemented in \LuaTeX-ja due to the
specification of callbacks in~\LuaTeX. To clarify the difference between
\pTeX~and~\LuaTeX, we briefly describe the handling of a linebreak in~\pTeX\ in
this subsection.
\pTeX's input processor can be described in terms of a finite state
automaton, like that of~\TeX\ described in~Section~2.5 of~\cite{texbytopic}. The
internal states are as follows:
1711 \item State~$N$: new line
1712 \item State~$S$: skipping spaces
1713 \item State~$M$: middle of line
1714 \item State~$K$: after a Japanese character
The first three states, $N$, $S$~and~$M$, are the same as in \TeX's input
processor. State~$K$ is similar to state~$M$, and is entered after
Japanese characters. The state transitions are shown in
Figure~\ref{fig-ptexipro}. Note that \pTeX\ doesn't leave state~$K$
after `beginning/ending of a group' characters.
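
For instance, assume that the following lines are processed by \pTeX\ (the
text itself is arbitrary and serves only as an illustration). The line ending
with `は' is left in state~$K$, so the linebreak after it produces no space.
The line `English words' is left in state~$M$, so the linebreak after it is
converted into a space, as in \TeX.
\begin{verbatim}
% illustrative input only
日本語の文章は
途中で改行できる.
English words
need spaces.
\end{verbatim}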
欧文では文章の改行は単語間でしか行わない.そのため,\TeX では,(文字の直後の)改行は
空白文字と同じものとして扱われる.一方,和文ではほとんどどこでも改行が可能なため,
\pTeX では和文文字の直後の改行は単純に無視されるようになっている.
1728 このような動作は,\pTeX が\TeX からエンジンとして拡張されたことによって可能になったことである.
1729 \pTeX の入力処理部は,\TeX におけるそれと同じように,有限オートマトンとして記述することができ,
1733 \item State~$N$: 行の開始.
1734 \item State~$S$: 空白読み飛ばし.
1735 \item State~$M$: 行中.
1736 \item State~$K$: 行中(和文文字の後).
また,状態遷移は,図\ref{fig-ptexipro}のようになっており,図中の数字は
1739 カテゴリーコードを表している.最初の3状態は\TeX の入力処理部と同じであり,
1740 図中から状態$K$と「$j$」と書かれた矢印を取り除けば,\TeX の入力処理部と同
1745 行が和文文字(とグループ境界文字)で終わっていれば,改行は無視される
1752 \def\sp{\text{\tt\char32}}
1754 {\text{scan a cs}}\ar@(r,ul)[dr]&\\
1756 *++[o][F-]{N}\ar[ur]^0\ar[dd]_{d,\ g}\ar[u]^{5\ (\text{\tt\char92par})}
1757 \ar@{->}@(d,l)[ddrr]_(0.45){j}&&
1758 *++[o][F-]{S}\ar@(l,dr)[ul]^0\ar@(l,ur)[ddll]_{d,\ g}\ar[u]_{5}
1759 \ar@{->}@(r,r)[dd]^{j}\\&\\&
1760 *++[o][F-]{M}\ar[uuur]^0\ar@(r,dl)[uurr]_(0.55){10\ (\sp)}
1761 \ar[d]_{5\ ({\sp})}\ar@{->}@(dr,dl)[rr]_{j}&&
1762 *++[o][F-]{K}\ar@{->}@(ul,d)[uuul]^0\ar@{->}[ll]^{d}
1763 \ar@{->}@(ur,dr)[uu]^{10\ (\sp)}\ar@{->}[d]_5\\
1766 d:=\{3,4,6,7,8,11,12,13\},\quad g:=\{1,2\},\quad j:=(\text{Japanese characters})
1769 \item Numbers represent category codes.
\item Category codes 9~(ignored), 14~(comment)~and~15~(invalid) are omitted from the above diagram.
1772 \caption{State transitions of \pTeX's input processor.}
1773 \label{fig-ptexipro}
1777 \subsection{Behavior in \LuaTeX-ja}
The states of the input processor of \LuaTeX\ are the same as those of \TeX,
and they can't be customized by any callback. Hence, the only candidates for
suppressing the space produced by a linebreak after a Japanese character are the
\verb+process_input_buffer+ and \verb+token_filter+ callbacks.
However, the \verb+token_filter+ callback cannot be used, since a
character with category code 5~(end-of-line) is converted into a space
token \emph{in the input processor}. So we can use only the
\verb+process_input_buffer+ callback. This means that suppressing a
space must be done \emph{just before} an input line is read.
With these constraints in mind, \LuaTeX-ja handles an end-of-line as follows:
A character U+FFFFF (its category code is set to 14~(comment) by
\LuaTeX-ja) is appended to an input line, \emph{before \LuaTeX\ actually
processes it}, if and only if the following two conditions are satisfied:
1796 \item The category code of the character $\langle${return}$\rangle$
1797 (whose character code is 13) is 5~(end-of-line).
1798 \item The input line matches the following `regular expression':
1800 (\text{any char})^*(\textbf{JAchar})
1801 \bigl(\{\text{catcode}=1\}\cup\{\text{catcode}=2\}\bigr)^*
1807 The following example shows the major difference from the behavior of \pTeX:
\ltjsetparameter{autoxspacing=false}
\ltjsetparameter{jacharrange={-6}}xあ
y\ltjsetparameter{jacharrange={+6}}zあ
u
\item There is no space between `x' and `y', since line~2 ends with a \textbf{JAchar} `あ'
  (this `あ' is considered a \textbf{JAchar} at the end of line~1, that is, when line~2 is about to be read).
\item There is a space between `あ' (in line~3) and `u', since
  line~3 ends with an \textbf{ALchar}
  (the letter `あ' is considered an \textbf{ALchar} at the end of line~2, that is, when line~3 is about to be read).
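
As a further sketch of the two conditions above, assume the default settings,
under which `あ' and `い' are \textbf{JAchar}s and the category code of
$\langle$return$\rangle$ is~5 (the input itself is only an illustration). The
lines \verb+{\bfseries あ}+ and \verb+い+ match the `regular expression', the
former because its \textbf{JAchar} is followed only by a character of category
code~2. U+FFFFF is therefore appended to them, and their linebreaks produce no
space. The line \verb+abc+ ends with an \textbf{ALchar}, so its linebreak
yields a space as usual.
\begin{verbatim}
% illustrative input only
{\bfseries あ}
い
abc
def
\end{verbatim}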
1824 \LuaTeX の入力処理部は\TeX のそれと全く同じであり,callbackによりユーザが
1825 カスタマイズすることはできない.このため,改行抑制の目的でユーザが利用で
1826 きそうなcallbackとしては,\verb+process_input_buffer+や
1827 \verb+token_filter+に限られてしまう.しかし,\TeX の入力処理部をよく見る
と,後者も役には立たないことが分かる:改行文字は,入力処理部によってトー
1829 クン化される時に,カテゴリーコード10の32番文字へと置き換えられてしまうた
1830 め,\verb+token_filter+で非標準なトークン読み出しを行おうとしても,空白文
1831 字由来のトークンと,改行文字由来のトークンは区別できないのだ.
1833 すると,我々のとれる道は,\verb+process_input_buffer+を用いて
1834 \LuaTeX の入力処理部に引き渡される前に入力文字列を編集するというものしかない.
1835 以上を踏まえ,\LuaTeX-jaにおける「和文文字直後の改行抑制」の処理は,次のようになっている:
1838 各入力行に対し,\textbf{その入力行が読まれる前の内部状態で}
1839 以下の2条件が満たされている場合,\LuaTeX-jaはU+FFFFF番の文字
1840 \footnote{この文字はコメント文字として扱われるように\LuaTeX-ja内部で設定をしている.}
1841 を末尾に追加する.よって,その場合に改行は空白とは見做されないこととなる.
1843 \item 改行文字(文字コード13番)のカテゴリーコードが5~(end-of-line)である.
1844 \item 入力行は次の「正規表現」にマッチしている:
1846 (\text{any char})^*(\textbf{JAchar})
1847 \bigl(\{\text{catcode}=1\}\cup\{\text{catcode}=2\}\bigr)^*
1852 この仕様は,前節で述べた\pTeX の仕様にできるだけ近づけたものとなっている.最初の条件は,
1853 \texttt{verbatim}系環境などの日本語対応マクロを書かなくてすませるためのものである.
1854 しかしながら,完全に同じ挙動が実現できたわけではない.
1855 差異は,次の例が示すように,和文文字の範囲を変更した行の改行において見られる:
\ltjsetparameter{autoxspacing=false}
\ltjsetparameter{jacharrange={-6}}xあ
y\ltjsetparameter{jacharrange={+6}}zあ
u
1862 もし\pTeX とまったく同じ挙動を示すならば,出力は
1863 「\hbox{\ltjsetparameter{autoxspacing=false}x yzあu}」となるべきである.しかし,実際には
1866 \item 2行目は「あ」という和文文字で終わる(2行目を処理する前の時点では,
1867 「あ」は和文文字扱いである)ため,直後の改行文字は無視される.
\item 3行目は「あ」という欧文文字で終わる(3行目を処理する前の時点では,
1869 「あ」は欧文文字扱いである)ため,直後の改行文字は空白に置き換わる.
1871 このため,トラブルを避けるために,和文文字の範囲を\verb+\ltjsetparameter+で編集した場合,
1872 その行はそこで改行するようにした方がいいだろう.
1876 \section{Insertion of JFM glues, \Param{kanjiskip} and \Param{xkanjiskip}}
1877 \subsection{Overview}
1883 \LuaTeX-ja における和文処理グルーの挿入方法は,\pTeX のそれとは全く異なる.
1884 \pTeX では次のような仕様であった:
1886 \item JFMグルーの挿入は,和文文字を表すトークンを元に水平リストに(文字を表す)<char\_node>を
1888 \item \Param{xkanjiskip}の挿入は,hboxへのパッケージングや行分割前に行われる.
1889 \item \Param{kanjiskip}はノードとしては挿入されない.パッケージングや行分割の計算時に
1890 「和文文字を表す2つの<char\_node>の間には\Param{kanjiskip}がある」ものとみなされる.
1892 しかし,\LuaTeX-jaでは,hboxへのパッケージングや行分割前に全ての
1893 \textbf{JAglue},即ちJFMグルー・\Param{xkanjiskip}・\Param{kanjiskip}の
1894 3種類を一度に挿入することになっている.これは,\LuaTeX において欧文の合字・
1895 カーニング処理がノードベースになったことに対応する変更である.
1897 \LuaTeX-jaにおける\textbf{JAglue}挿入処理では,下の図\ref{fig-clu}のよう
1898 に「塊」を単位にして行われる.大雑把にいうと,「塊」は文字とそれに付随す
1899 るノード達(アクセント位置補正用のkernや,イタリック補正)をまとめたもの
1900 であり,2つの塊の間には,ペナルティ,\verb+\vadjust+,whatsitなど,行組版
1901 には関係しないものがある.そのため,……
1904 % \begin{figure}[!tb]
1908 \subsection{Definition of a `cluster'}
A \emph{cluster} is a list of nodes in one of the following forms, together with its \textit{id}:
\item Nodes whose \verb+\ltj@icflag+ value is in $[3,15)$. These
  nodes come from an hbox which was already packaged and has then been
  unpackaged.
  The \textit{id} is \textit{id\_pbox}.
\item An inline math formula, including the two \textit{math\_node}s at its boundaries.
  The \textit{id} is \textit{id\_math}.
\item A \textit{glyph\_node} together with the nodes related to it.
  The \textit{id} is \textit{id\_jglyph} or
  \textit{id\_glyph}, according to whether the \textit{glyph\_node}
  represents a Japanese character or not.
\item A box-like node, that is, an hbox, a vbox or a rule (\verb+\vrule+).
  The \textit{id} is \textit{id\_hlist} if the node is an
  hbox which is not shifted vertically, or \textit{id\_box\_like}
  otherwise.
\item A glue, a kern whose subtype is not 2~(\textit{accent}), or a discretionary break.
  The \textit{id} is \textit{id\_glue}, \textit{id\_kern}
  or \textit{id\_disc}, respectively.
1932 %Just a node which will \dots, \textit{i.e.}, a node which is \emph{not} one of the following:
1933 %\textit{ins\_node}, \textit{mark\_node}, \textit{adjust\_node}, \textit{whatsit\_node}
1934 %and \textit{penalty\_node}.
1936 We denote a cluster by \textit{Np}, \textit{Nq} and \textit{Nr}.
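
To make the definition concrete, here is a hedged sketch of a single
horizontal list, annotated with the \textit{id} that each piece would receive
according to the list above. It assumes the default settings, under which
`あ', `い' and `う' are \textbf{JAchar}s; the lengths are arbitrary, and the
\textit{id\_pbox} case is not shown. Each line ends with a comment so that no
inter-word glue is added between the pieces.
\begin{verbatim}
あ% id_jglyph
a% id_glyph
$x+y$% id_math (the formula with its two math_nodes)
\hbox{い}% id_hlist (an hbox that is not shifted)
\raise1pt\hbox{う}% id_box_like (a vertically shifted hbox)
\vrule% id_box_like (a rule)
\hskip3pt% id_glue
\kern2pt% id_kern
\-% id_disc
\end{verbatim}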
1939 Internally, a cluster is represented by a table $\textit{Np}$ with the following fields.
1942 \def\makelabel#1{\textbf{\textit{#1}}}
1943 \item[first, last] The first/last node of the cluster.
\item[id] The \textit{id} in the above definition.
1948 \item[auto\_kspc, auto\_xspc]
1949 \item[xspc\_before, xspc\_after]