Discussion Papers


2025

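SSE-DP-2025-1 "Movable Holidays and the Seasonality of Economic Indicators", Naoto Kunitomo (ed.) (revised 2025/03/12)
SSE-DP-2025-1 "Movable Holidays and the Seasonality of Economic Indicators": data
SSE-DP-2025-2 "Statistical Multivariate Analysis: Implementation in Python", Masako Okamoto
SSE-DP-2025-3 "Introduction to Statistical Learning with R and Python: ISLR and ISLP Labs (Japanese Edition)", Part 1: R, Naoto Kunitomo and Ryota Yuasa
SSE-DP-2025-3 "Introduction to Statistical Learning with R and Python: ISLR and ISLP Labs (Japanese Edition)", Part 2: Python, Naoto Kunitomo and Ryota Yuasa
SSE-DP-2025-4 "What Does Owning Books at Age 15 Mean? A Social Stratification Perspective", Hideki Sanada and Hiroko Nakanishi
SSE-DP-2025-5 "Akaike's Relative Power Contribution: A Revisit", Naoto Kunitomo and Xue Yujie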

SSE-DP-2025-1 "Movable Holidays and the Seasonality of Economic Indicators" (移動型休日と経済指標の季節性), Naoto Kunitomo (国友直人) (ed.)

As part of the statistical expert training program promoted by the Institute of Statistical Mathematics, a statistical consultation class titled “Movable Holidays and the Seasonality of Economic Indicators” was conducted in the 2024 academic year. During the class, an official from the Cabinet Office (Economic and Fiscal Analysis and Overseas Affairs), who is actively engaged in analyzing trends in the Chinese economy, explained the challenges that seasonality poses for the analysis of China’s macroeconomic data. The class addressed the seasonal adjustment of publicly available Chinese macroeconomic data and examined issues in the use of statistical tools such as X-13ARIMA-SEATS (published by the U.S. Census Bureau), Kitagawa’s DECOMP, and Sato’s S-SIML. This report first discusses issues in the processing of macroeconomic data, particularly in relation to the Gregorian (solar) calendar, the lunar calendar, the Spring Festival effect, and holiday effects. Among these, the application of seasonal adjustment methods such as X-12-ARIMA and X-13ARIMA-SEATS is of particular importance. Regarding China’s seasonality and calendar effects, a Zhao dummy variable was developed to analyze the seasonality of China’s monthly trade statistics. The report also examines the filtering of non-stationary time series that contain seasonal components and missing values, and the construction of seasonally adjusted series. Furthermore, a new frequency decomposition-based seasonal adjustment method, SarSIML (S-SIML), was developed and was found to handle missing values with ease.
SSE-DP-2025-2 "Statistical Multivariate Analysis: Implementation in Python" (統計的多変量解析－Pythonによる実装－), Masako Okamoto (岡本雅子)

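Statistical multivariate analysis requires very complex computations, so apart from multiple regression it is very difficult to carry out by hand or in Excel. In practice, R, Python, or commercial statistical software is generally used. Aimed at beginners in statistics and programming, this paper presents Python implementations of the main methods of statistical multivariate analysis. Readers can learn how to implement the methods in Python by working through the programs and their output presented in the paper. Where the Python output calls for caution, comments are included as an aid to interpreting the results.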
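SSE-DP-2025-3 "Introduction to Statistical Learning with R and Python: ISLR and ISLP Labs (Japanese Edition)" (RとPythonによる統計的学習入門：ISLR・ISLP実習（日本語版）), Naoto Kunitomo (国友直人) and Ryota Yuasa (湯浅良太)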

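As part of the statistical expert training program promoted by the Institute of Statistical Mathematics, a teaching-material development exercise, "Introduction to Statistical Learning with R and Python," was organized for 2024-2025. In the exercise, trainees interested in the theory and application of statistical learning studied two textbooks published by a group led by Professor Trevor Hastie of the Department of Statistics at Stanford University (G. James, D. Witten, T. Hastie and R. Tibshirani, with J. Taylor): ISLR (An Introduction to Statistical Learning with Applications in R, 2nd edition, 2021) and ISLP (An Introduction to Statistical Learning with Applications in Python, 2023). Because lab materials accompanying both books are publicly available on the Web, the participants translated them to produce teaching materials usable in Japanese. In ISLR and ISLP, each chapter explains in English how to analyze data by statistical learning in R and Python, respectively. These explanations were translated with the help of AI translation such as ChatGPT and of newly available tools such as Google Colaboratory. The participants then reviewed the content, selected what they judged useful for education in statistical science, including the statistical expert training program, and compiled it as Japanese teaching material.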
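SSE-DP-2025-4 "What Does Owning Books at Age 15 Mean? A Social Stratification Perspective" (１５歳時に本を持っていることは何を意味するのか：社会階層論の観点から), Hideki Sanada (眞田英毅) and Hiroko Nakanishi (中西寛子)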

This study examines whether the number of books in the home at age 15 (NBH) serves as a reliable indicator of social class. Although NBH is often considered a form of cultural capital, its relevance as a measure of social class remains uncertain. Using Japanese data, this study explores the relationship between NBH and key determinants of social class, including parental education, occupation, economic conditions, and cultural resources. Findings indicate that NBH is strongly associated with parental education and cultural capital but has weaker links to economic status. This suggests that NBH primarily reflects cultural rather than economic aspects of social class. Although it provides insight into social class distinctions, its limitations require careful interpretation. Given potential biases in self-reported data, future research should incorporate more comprehensive indicators, such as parental income and broader educational resources, to improve the accuracy of measuring social class. Understanding the role of NBH in social stratification contributes to advancing research on educational inequality and intergenerational mobility.
SSE-DP-2025-5 "Akaike's Relative Power Contribution: A Revisit", Naoto Kunitomo and Xue Yujie

Statistical analysis of inter-variable relationships in multivariate time series was initiated by Akaike (1968, 1971) at the Institute of Statistical Mathematics, where methods such as RPC (Relative Power Contribution) were developed for engineering applications. In the field of econometrics, vector autoregressive (VAR) analysis has evolved since the seminal works of Granger (1969) and Sims (1980), with further developments including the decompositions proposed by Pesaran and Shin (1998), Diebold and Yilmaz (2012, 2014), and Barunik and Krehlik (2018). This paper sheds new light on the limitations of existing methods regarding correlations among innovation variables, and proposes decomposing the predictive spectral density matrix with a finite prediction horizon. The practical utility of this approach is discussed.

2024

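SSE-DP-2024-1 "Statistical Expert Exercises 2023", Naoto Kunitomo, Ryota Yuasa, Hayato Nishi, Yu Zhao, Tadashi Nakanishi (replaced with a version correcting typographical errors on 2024/01/17)
SSE-DP-2024-2 "Issues Concerning Japan's Consumer Price Index", Naoto Kunitomo (ed.)
SSE-DP-2024-3 "When the Distribution of the t-Statistic Becomes Bimodal", Naoto Kunitomo, Hayato Nishi, Xue Yujie (revised 2024/09/26)
SSE-DP-2024-4 "On SarSIML (A Seasonal Adjustment Method)", Seisho Sato and Naoto Kunitomo
SSE-DP-2024-4 "On SarSIML (A Seasonal Adjustment Method)": example data for running the program
SSE-DP-2024-5 "Forward and Backward Smoothing for Noisy Nonstationary Time Series with an Application of Detecting Recent Change Points", Seisho Sato and Naoto Kunitomo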

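SSE-DP-2024-1 "Statistical Expert Exercises 2023" (統計エキスパート演習2023), Naoto Kunitomo (国友直人), Ryota Yuasa (湯浅良太), Hayato Nishi (西颯人), Yu Zhao (趙宇), Tadashi Nakanishi (中西正)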

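In the statistical expert training program promoted by the Institute of Statistical Mathematics, statistical expert exercises are carried out by young researchers from various fields, who do not necessarily specialize in statistics, together with statisticians serving as mentors. One group exercise conducted in fiscal 2023 took up several topics in the foundations and applications of statistics that are elementary but often overlooked, along with several topics in applied statistics. Although these do not amount to specialized research in statistics, this paper reports on five topics in basic statistics and three topics in applied statistics related to lectures, all considered useful to statistical experts, who often have occasion to teach statistics at universities and graduate schools. The R and Python programs newly written while examining these topics are also included.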
SSE-DP-2024-2 "Issues Concerning Japan's Consumer Price Index" (日本の消費者物価指数を巡る課題), Naoto Kunitomo (国友直人) (ed.)

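On February 6, 2024, one of the top stories in the mass media was that real wage growth in 2023 had been negative. Real wages are the nominal wages that workers actually receive divided by the consumer price index (CPI). This is only one example, but the CPI is widely used as important data for judging economic movements and people's living standards.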
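As a small illustration of the definition above, real wages can be computed by deflating a nominal wage index with the CPI. The index values below are hypothetical, chosen only to show how a rise in nominal wages can coincide with a fall in real wages; the function name is likewise an assumption made for this sketch.

```python
def real_wage_index(nominal_index: float, cpi: float) -> float:
    """Deflate a nominal wage index by the CPI (both on a base of 100)."""
    return nominal_index / cpi * 100.0


# Hypothetical index values for two consecutive years (not actual statistics).
nominal_prev, nominal_curr = 100.0, 101.2
cpi_prev, cpi_curr = 100.0, 103.8

real_prev = real_wage_index(nominal_prev, cpi_prev)
real_curr = real_wage_index(nominal_curr, cpi_curr)

# Year-on-year growth of real wages, in percent.
real_growth_pct = (real_curr / real_prev - 1.0) * 100.0
print(f"real wage growth: {real_growth_pct:.2f}%")
```

Nominal wages rise in this example, but prices rise faster, so the real wage index falls.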

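This report summarizes the discussions in the consultation exercise "Issues Concerning Japan's Consumer Price Index," held from April 2023 to February 2024 as part of the "Statistical Expert Human Resource Development Project" promoted by the Institute of Statistical Mathematics. In the exercise, staff of the Statistics Bureau of the Ministry of Internal Affairs and Communications who actually compile and publish Japan's CPI explained its current state and future challenges, after which several points concerning ways to improve the CPI were discussed freely. This paper collects essays based on those discussions. Because several issues surrounding the CPI currently published in Japan appear not to be well understood by the general public, the paper discusses, from a researcher's standpoint and in light of recent developments in the foundational literature on consumer prices, the practical issues in compiling Japan's CPI, and sets out several possible improvements.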
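SSE-DP-2024-3 "When the Distribution of the t-Statistic Becomes Bimodal" (ｔ統計量の分布が双峰型となる場合), Naoto Kunitomo (国友直人), Hayato Nishi (西颯人), Xue Yujie (薛玉傑)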

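On the discussion site (Slack) of the statistical expert training program promoted by the Institute of Statistical Mathematics, a lively discussion of the sampling distribution of the t-statistic took place around an R program titled "A simulation in which the sampling distribution of the t-statistic becomes bimodal" (posted by Specially Appointed Professor Tetsuhisa Miwa). This paper reports the considerations of participants who became interested in the simulation results for stable distributions, including the Cauchy distribution, in their mathematical foundations, and in related statistical problems. Matters that may aid understanding of the discussion are given as notes. Appendix A lists the computer programs used, and Appendices B and C present figures, quote basic facts, and explain topics considered useful to applied statisticians, for readers who have had little opportunity to study complex function theory or probability theory.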
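The kind of simulation discussed above can be sketched in Python. This is not the R program posted in the discussion; the sample size, number of replications, seed, and function name are all assumptions made for illustration. The sketch compares one-sample t-statistics computed from normal data with those computed from Cauchy data. In the heavy-tailed case a single extreme observation tends to dominate both the sample mean and the sample standard deviation, which pushes the statistic toward +1 or -1.

```python
import numpy as np


def simulate_t_statistics(sampler, n: int, reps: int) -> np.ndarray:
    """Return `reps` one-sample t-statistics (null mean 0) from samples of size n."""
    x = sampler((reps, n))                  # one sample per row
    mean = x.mean(axis=1)
    sd = x.std(axis=1, ddof=1)              # unbiased sample standard deviation
    return np.sqrt(n) * mean / sd


rng = np.random.default_rng(0)              # arbitrary seed
n, reps = 5, 100_000                        # illustrative settings only

t_normal = simulate_t_statistics(rng.standard_normal, n, reps)
t_cauchy = simulate_t_statistics(rng.standard_cauchy, n, reps)

# Histogram densities on [-3, 3]; values outside this range are ignored,
# so each density is normalized over the in-range draws only.
bins = np.linspace(-3, 3, 25)
hist_normal, _ = np.histogram(t_normal, bins=bins, density=True)
hist_cauchy, _ = np.histogram(t_cauchy, bins=bins, density=True)

for left, h_norm, h_cauchy in zip(bins[:-1], hist_normal, hist_cauchy):
    print(f"bin from {left:5.2f}: normal={h_norm:.3f}  cauchy={h_cauchy:.3f}")
```

Comparing the two printed columns is one way to see where the histogram for the Cauchy case departs from the single peak at zero obtained with normal data.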
SSE-DP-2024-4 "On SarSIML (A Seasonal Adjustment Method)", Seisho Sato and Naoto Kunitomo

We explain a new seasonal adjustment program called SarSIML (or S-SIML). It is based on the (real-valued) spectral decomposition of non-stationary time series, which is an application of the SIML filtering method developed by Kunitomo and Sato (The SIML Filtering Method for Noisy Non-stationary Economic Time Series, 2024, JSS-Springer Series, Springer, forthcoming).
SSE-DP-2024-5 "Forward and Backward Smoothing for Noisy Nonstationary Time Series with an Application of Detecting Recent Change Points", Seisho Sato and Naoto Kunitomo

We propose a novel smoothing (or filtering) approach for time series analysis to estimate the hidden states of random variables and handle noisy, nonstationary time series data. The method is applicable even when the sample size is small, as is often the case with major macroeconomic time series data. Our approach is based on the frequency decomposition of nonstationary time series, and we address the smoothing and filtering challenges specific to such data. In particular, we introduce two methods: forward and backward SIML smoothing, designed to resolve the initial value problem in nonstationary time series analysis. The proposed smoothing methods offer interpretations in both the time and frequency domains. To demonstrate the effectiveness of our approach, we provide an illustrative empirical example using U.S. manufacturers’ new order data and apply the filtering method to the problem of detecting recent breaks in macroeconomic consumption trends.

2023

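SSE-DP-2023-1 "Official Statistics in Japan and Seasonal Adjustment: X-13ARIMA-SEATS and the Labour Force Survey as Case Material", Naoto Kunitomo (ed.) (Institute of Statistical Mathematics)
SSE-DP-2023-2 "Frequency Regression and Smoothing for Noisy Nonstationary Multivariate Time Series", N. Kunitomo, S. Sato (this paper has become a book chapter; see the detailed entry below for the book information)
SSE-DP-2023-3 "Statistical Learning (Lecture Slides)", Naoto Kunitomo, Yu Zhao, Ryota Yuasa (translators); Trevor Hastie, Robert Tibshirani (original authors)
SSE-DP-2023-4 "An Asymptotically Optimal Two-Sample IV Estimation with Many Instruments", N. Kunitomo and R. Yuasa (revised version published in JJSD on 2024/12/29)
SSE-DP-2023-4: external link to the revised JJSD version (Open Access)
SSE-DP-2023-5 "A Hybrid Algorithm of BP and MCMC for Group Testing with Two Types of Positives", Hiroyasu Matsushima, Yusuke Tajima, Xiao-Nan Lu, Masakazu Jimbo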

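SSE-DP-2023-1 "Official Statistics in Japan and Seasonal Adjustment: X-13ARIMA-SEATS and the Labour Force Survey as Case Material" (日本の公的統計と季節調整 - X-13ARIMA-SEATS と労働力調査を題材に -), Naoto Kunitomo (国友直人) (ed.) (Institute of Statistical Mathematics)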

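As part of the statistical expert training program promoted by the Institute of Statistical Mathematics, the consultation exercise "Official Statistics and Seasonal Adjustment" was held in fiscal 2022. In this exercise, staff who actually handle the Labour Force Survey at the Statistics Bureau of the Ministry of Internal Affairs and Communications explained the issues in seasonal adjustment, and the participants examined, as case material, the "Labour Force Survey: employed persons by industry" series and X-13ARIMA-SEATS, published by the U.S. Census Bureau.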
SSE-DP-2023-2 "Frequency Regression and Smoothing for Noisy Nonstationary Multivariate Time Series", N. Kunitomo, S. Sato

We develop a new method called frequency regression and smoothing (or the SIML-frequency method) based on the nonstationary errors-in-variables model. It is designed for estimating the relationships among hidden states of random variables and for handling noisy, nonstationary economic time series, whose sample sizes are small in comparison with data in engineering and the natural sciences. Many economic time series include not only trend, cycle, seasonal, and measurement error components, but also factors such as abrupt changes, trading-day effects, and institutional changes. The frequency regression and smoothing method can be applied to handle such factors in nonstationary time series. The proposed method is simple and applicable to analyzing nonstationary economic time series and handling seasonal adjustments. Our formulation yields, as a consequence, the asymptotic results on the low frequency method proposed by Muller and Watson (2018). An illustrative empirical analysis of macro-consumption in Japan is provided.

A revised version of this paper has become Chapter 5 of the following book:

The SIML Filtering Method for Noisy Non-stationary Economic Time Series | SpringerLink
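SSE-DP-2023-3 "Statistical Learning (Lecture Slides)" (統計的学習（講義スライド） - Statistical Learning -), Naoto Kunitomo (国友直人), Yu Zhao (趙宇), Ryota Yuasa (湯浅良太) (translators); Trevor Hastie, Robert Tibshirani (original authors)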

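These (Japanese) lecture slides are a Japanese translation of the English slides originally prepared by Professors Hastie and Tibshirani of the Department of Statistics at Stanford University for their lectures in the university's undergraduate and master's programs (translated with the permission reproduced on the next page, by courtesy of Professor Hastie). In this Japanese edition, typographical errors in the original slides have been corrected and supplementary notes considered helpful for teaching have been added in several places. (The translation was divided by chapter: Kunitomo, chapters 1, 2, 3, 7, 11 and the notes to the Japanese edition; Zhao, chapters 4, 5, 8, 9; Yuasa, chapters 6, 10, 12, 13. The content was then harmonized.)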

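To develop statistical experts, the Institute of Statistical Mathematics has newly established the Center for Training Professors in Statistics (大学統計教員育成センター) and is developing teaching materials to strengthen statistics education in undergraduate specialized courses and master's programs at Japanese universities. This translation was carried out as part of that effort and is made public here. We hope it will contribute to statistics education at universities and graduate schools.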

May 2023, Naoto Kunitomo (representative of the authors of the Japanese edition, Institute of Statistical Mathematics)
SSE-DP-2023-4 "An Asymptotically Optimal Two-Sample IV Estimation with Many Instruments", N. Kunitomo and R. Yuasa

Note: a revised version was published online on 2024/12/29 in the Japanese Journal of Statistics and Data Science (JJSD, Springer): https://rdcu.be/d5bhL

We consider the statistical estimation of the coefficients of a linear structural equation in a simultaneous equation system when we use two-sample data and there are many instrumental variables. We derive some asymptotic properties of the Two-Sample Least Variance Ratio (2SLVR) estimator, which extends the one-sample limited information maximum likelihood (LIML) estimator to two-sample data with many instrumental variables. It is known that the one-sample two stage least squares (TSLS) estimator and the generalized method of moments (GMM) estimator, which are widely used in practice, have a non-negligible bias. They often lose even consistency when there are many instruments. We have found that the variance-covariance matrix of the limiting distribution of the 2SLVR estimator and its modifications often attains the asymptotic lower bound when the number of instruments is large and the disturbance terms are not necessarily normally distributed. The results would be useful for applications in econometrics and biometrics, including Mendelian Randomization (MR) using DNA data analysis.
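SSE-DP-2023-5 "A Hybrid Algorithm of BP and MCMC for Group Testing with Two Types of Positives" (２種類の陽性に対するグループテストのためのBPとMCMCのhybridアルゴリズム), Hiroyasu Matsushima (松島裕康), Yusuke Tajima (田島友祐), Xiao-Nan Lu (盧暁南), Masakazu Jimbo (神保雅一)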

Testing n objects one by one requires n tests, but the number of defective objects is often small. When multiple objects can be tested together as a pool, a negative result establishes at once that every object in the pool is negative, whereas a positive result means that at least one object in the pool is positive. Testing pools formed by combining multiple objects in this way is called group testing. With a group test, the posterior probability that each specimen is defective can be calculated from the test results of far fewer pools than the total number of objects. When making a positive/negative determination, however, the probability of false positives and false negatives in each test must be taken into account. Algorithms such as Belief Propagation (BP) and Markov Chain Monte Carlo (MCMC) are employed for this purpose.
In this report, we develop and evaluate BP and MCMC algorithms for a combinatorial group test design that reduces the number of tests when there are two types of defectives.
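The posterior calculation described above can be sketched for a toy case. The example below is not the BP or MCMC algorithm of the report and handles only a single type of defective: it enumerates every possible state by brute force, which is feasible only for a handful of items. The pooling design, prevalence, sensitivity, specificity, and function name are all hypothetical values chosen for illustration.

```python
from itertools import product


def posterior_defective(n_items, pools, results, prevalence,
                        sensitivity, specificity):
    """Marginal posterior P(item defective | pool results), by full enumeration.

    pools   : list of tuples of item indices that are tested together
    results : list of booleans, True if the corresponding pool tested positive
    """
    weights = [0.0] * n_items
    total = 0.0
    for state in product([0, 1], repeat=n_items):
        # Prior: each item is independently defective with probability `prevalence`.
        prob = 1.0
        for s in state:
            prob *= prevalence if s else 1.0 - prevalence
        # Likelihood of every observed pool result given this true state,
        # allowing for false negatives (sensitivity) and false positives (specificity).
        for pool, observed in zip(pools, results):
            truly_positive = any(state[i] for i in pool)
            p_pos = sensitivity if truly_positive else 1.0 - specificity
            prob *= p_pos if observed else 1.0 - p_pos
        total += prob
        for i, s in enumerate(state):
            if s:
                weights[i] += prob
    return [w / total for w in weights]


# Hypothetical design: 4 items, 3 pools, only the first pool tests positive.
pools = [(0, 1), (2, 3), (0, 2)]
results = [True, False, False]
post = posterior_defective(4, pools, results, prevalence=0.05,
                           sensitivity=0.95, specificity=0.98)
print([round(p, 4) for p in post])
```

The number of states doubles with every additional item, which is why approximate algorithms such as BP and MCMC are needed at realistic scale.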

2022

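SSE-DP-2022-1 "Square-Root Transformation of the Odds Ratio", Manabu Iwasaki (Institute of Statistical Mathematics / Juntendo University Graduate School)
SSE-DP-2022-2 "Efficacy Rate and Effective-Person Rate of Vaccines", Manabu Iwasaki (Institute of Statistical Mathematics / Juntendo University Graduate School)
SSE-DP-2022-3 "Toward an Understanding of the Instrumental Variables Method: An Encounter between Biometrics and Econometrics", Naoto Kunitomo (Institute of Statistical Mathematics) (a revised version of this paper was published in the Journal of the Japan Statistical Society, Japanese Issue)
SSE-DP-2022-4 “A Statistical Data Envelopment Analysis”, N. Kunitomo, Y. Zhao (Revised on June 5, 2023)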

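SSE-DP-2022-1 "Square-Root Transformation of the Odds Ratio" (オッズ比の平方根変換), Manabu Iwasaki (岩崎学) (Institute of Statistical Mathematics / Juntendo University Graduate School)

Summary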

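In medical statistics, the risk ratio and the odds ratio play important roles. The risk ratio is easy to interpret whereas the odds ratio is not, yet study results are often reported as odds ratios.
Recently, papers have appeared showing that the square root of the odds ratio approximates the risk ratio (VanderWeele, 2017, 2020).
This paper evaluates the accuracy of that square-root approximation in terms of the coverage probability of confidence intervals.
The paper is organized as follows.
Section 1 reviews the definitions of probability, odds, the risk ratio, and the odds ratio.
Section 2 shows the relationship between the odds ratio and the risk ratio.
Section 3 describes the square-root transformation of the odds ratio, some of its simple properties, and the coverage probability of confidence intervals.
Section 4 presents the simulation procedure and its results.
Section 5 concludes with a brief summary.
The references include several related papers that are not explicitly cited in the text.
For reference, the appendix gives Japanese translations of VanderWeele (2017) and of a related paper, Bland and Altman (2000).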
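The quantities named above can be compared numerically. The sketch below computes the risk ratio, the odds ratio, and the square root of the odds ratio for a few hypothetical pairs of outcome probabilities; it illustrates only the definitions, not the coverage-probability simulation of the paper, and the function names are assumptions.

```python
import math


def risk_ratio(p1: float, p0: float) -> float:
    """Ratio of outcome probabilities: exposed (p1) over unexposed (p0)."""
    return p1 / p0


def odds_ratio(p1: float, p0: float) -> float:
    """Ratio of outcome odds: exposed over unexposed."""
    return (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))


# Hypothetical outcome probabilities (exposed, unexposed).
for p1, p0 in [(0.6, 0.4), (0.5, 0.3), (0.02, 0.01)]:
    rr = risk_ratio(p1, p0)
    or_value = odds_ratio(p1, p0)
    print(f"p1={p1}, p0={p0}: RR={rr:.3f}, OR={or_value:.3f}, "
          f"sqrt(OR)={math.sqrt(or_value):.3f}")
```

When p0 = 1 - p1 the square root of the odds ratio equals the risk ratio exactly. For rare outcomes, as in the last pair, the odds ratio itself is already close to the risk ratio, so taking the square root moves away from it.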
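SSE-DP-2022-2 "Efficacy Rate and Effective-Person Rate of Vaccines" (ワクチンの有効率と有効者率), Manabu Iwasaki (岩崎学) (Institute of Statistical Mathematics / Juntendo University Graduate School)

Summary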

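In response to COVID-19, the government strongly encourages vaccination. COVID-19 vaccines are said to have "95% efficacy," but presumably not many people understand what this means. As a result, debate has arisen over whether the vaccines actually work (and if no such debate has arisen, that may be a problem in itself).
This paper therefore defines, in addition to the efficacy rate (有効率) of a vaccine, a quantity called the effective-person rate (有効者率), and examines both from the viewpoint of potential outcomes in statistical causal inference. It gives the definitions of the two rates, shows how they differ, and applies them to published clinical trial data for COVID-19 vaccines. It also discusses problems in interpreting several research reports on what is called vaccine efficacy.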
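To make the "95% efficacy" figure concrete, the sketch below uses the standard definition of vaccine efficacy as one minus the ratio of attack rates in the vaccinated and placebo groups. The trial counts are hypothetical, and the effective-person rate defined in the paper is not reproduced here.

```python
def vaccine_efficacy(cases_vaccinated: int, n_vaccinated: int,
                     cases_placebo: int, n_placebo: int) -> float:
    """Efficacy = 1 - (attack rate among vaccinated / attack rate among placebo)."""
    risk_vaccinated = cases_vaccinated / n_vaccinated
    risk_placebo = cases_placebo / n_placebo
    return 1.0 - risk_vaccinated / risk_placebo


# Hypothetical trial: 10,000 participants per arm (not actual trial data).
efficacy = vaccine_efficacy(cases_vaccinated=10, n_vaccinated=10_000,
                            cases_placebo=200, n_placebo=10_000)
share_placebo_uninfected = 1.0 - 200 / 10_000

print(f"efficacy: {efficacy:.1%}")
print(f"placebo participants who stayed uninfected: {share_placebo_uninfected:.1%}")
```

In this example the efficacy is a relative reduction in risk. It does not say that 95% of vaccinated people were protected, since most participants in the placebo arm were not infected either.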
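SSE-DP-2022-3 "Toward an Understanding of the Instrumental Variables Method: An Encounter between Biometrics and Econometrics" (操作変数法の理解へ：計量生物と計量経済の邂逅), Naoto Kunitomo (国友直人) (Institute of Statistical Mathematics)

Summary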

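Causality is a fundamental and important object of analysis for the sciences, including statistical science. In recent years, statistical causal inference has been actively applied in biometrics and econometrics.
This paper first explains the counterfactual model originating with Rubin (1974) and the instrumental variables method of Angrist, Imbens and Rubin (1996, AIR for short).
Next, simultaneous equations and structural equations in econometrics are explained using a simple demand-function example. Statistical causal relationships are interpreted using general structural equations, and statistical estimation methods for structural equations, including the instrumental variables method, are discussed. Because OLS (ordinary least squares) is not consistent in the estimation of structural equations, the paper explains the strengths and weaknesses of instrumental variables (IV) methods: the Wald method, LIML (limited information maximum likelihood, or the least variance ratio method), TSLS (two-stage least squares), and GMM (generalized method of moments). It then describes the historical development of structural equations and finally surveys further challenges for statistical causal analysis in biometrics, econometrics, and related fields.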
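The inconsistency of OLS in a structural equation, and its repair by an instrument, can be illustrated with a small simulation. The data-generating process below is an assumption made for this sketch (a single endogenous regressor, a single instrument, true coefficient 2), and the estimator shown is the simple ratio-of-covariances IV estimator, not LIML, TSLS, or GMM.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
n = 100_000
beta = 2.0                       # true structural coefficient (assumed)

z = rng.standard_normal(n)       # instrument: affects x, unrelated to u
u = rng.standard_normal(n)       # structural disturbance
e = rng.standard_normal(n)       # independent noise in x

x = z + 0.8 * u + e              # x is endogenous because it depends on u
y = beta * x + u                 # structural equation


def cov(a: np.ndarray, b: np.ndarray) -> float:
    """Sample covariance (divisor n)."""
    return float(np.mean((a - a.mean()) * (b - b.mean())))


beta_ols = cov(x, y) / cov(x, x)   # biased, since cov(x, u) != 0
beta_iv = cov(z, y) / cov(z, x)    # consistent, since cov(z, u) = 0

print(f"true beta = {beta}, OLS = {beta_ols:.3f}, IV = {beta_iv:.3f}")
```

Under this design the OLS estimate converges to beta + cov(x, u) / var(x) = 2 + 0.8 / 2.64, while the IV estimate converges to beta.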

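Revised version published in the Journal of the Japan Statistical Society, Japanese Issue: 操作変数法の理解へ：計量生物と計量経済の邂逅 (Toward an Understanding of the Instrumental Variables Method: An Encounter between Biometrics and Econometrics)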
SSE-DP-2022-4 “A Statistical Data Envelopment Analysis”, N. Kunitomo, Y. Zhao

Summary

In operations research and management science, data envelopment analysis (DEA) is known as an important tool. We develop a statistical data envelopment analysis (SDEA), which seems to be new to the operations research literature as well as to the statistical community. We first consider the basic statistical DEA model, in which the observed data are the sum of an increasing concave function of inputs and a random noise (or inefficiency) term taking only non-positive values. The purpose of the data analysis is to estimate the unknown function, called the efficiency frontier, nonparametrically from the observed inputs and outputs. The key idea is to use nonparametric statistical analysis, linear regression analysis, and statistical extreme value theory. We report an empirical analysis of the life-insurance industry in Japan as an application.
