Miscellaneous

Deep Generative Models for Medical Diagnosis Using Brain Functional Imaging

Brain functional imaging is expected to supplement the diagnosis of mental disorders based on clinical interviews by providing objective criteria. However, data collection is extremely costly, and existing datasets are very small at the scale required for standard deep learning. Furthermore, individual variability such as age and gender often hinders detecting disease-related activity patterns. Additionally, discrepancies in imaging equipment across hospitals can cause further complications. To address these issues, we construct a Bayesian network using a deep generative model that simultaneously captures both the disease-related activity patterns and individual/environmental variability separately. As a result, multiple small datasets can be pooled and analyzed as if they were a single large dataset, accelerating robust, high-precision diagnosis robust to gender and age variations, as well as identifying of the disease-related regions.

Takashi Matsubara, Koki Kusano, Tetsuo Tashiro, Ken'ya Ukai, and Kuniaki Uehara, "Deep Generative Model of Individual Variability in fMRI Images of Psychiatric Patients," IEEE Transactions on Biomedical Engineering, vol. 68, no. 2, pp. 592-605, 2021.
IEEE
Koki Kusano, Tetsuo Tashiro, Takashi Matsubara, and Kuniaki Uehara, "Deep Generative State-Space Modeling of FMRI Images for Psychiatric Disorder Diagnosis," The 2019 International Joint Conference on Neural Networks (IJCNN2019), Budapest, Jul. 2019.
Paper
Takashi Matsubara, Tetsuo Tashiro, and Kuniaki Uehara, "Deep Neural Generative Model of Functional MRI Images for Psychiatric Disorder Diagnosis," IEEE Transactions on Biomedical Engineering, vol. 66, no. 10, pp. 2768-2779, 2019.
IEEE arXiv
Takashi Matsubara, Tetsuo Tashiro, and Kuniaki Uehara, "Structured Deep Generative Model of FMRI Signals for Mental Disorder Diagnosis," The 21st International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI2018), Granada, Sep. 2018, pp. 258-266.
Paper
Tetsuo Tashiro, Takashi Matsubara, and Kuniaki Uehara, "Deep Neural Generative Model for fMRI Image Based Diagnosis of Mental Disorder," The 2017 International Symposium on Nonlinear Theory and its Applications (NOLTA2017), Cancun, Dec. 2017, pp. 700-703, 5169.

Stock Price Prediction Using Deep Generative Models of Language Information

In this study, we propose a method for daily stock price trend prediction from news articles using deep generative models. We consider the impact of each news article. First, we use a method called Paragraph Vector to represent the information in news articles as fixed-length vectors, which sufficiently captures the information in the language. Next, we represent the relationship between stock price information and language information using deep generative models and learn the parameters based on the distributed representation. By using generative models, we can represent latent variables and probabilistic processes that generate news articles, and suppress overfitting of the parameters required for that representation. We demonstrated the effectiveness of this method by performing binary classification of stock price trends for both the Japanese and American markets.

Takashi Matsubara, Ryo Akita, and Kuniaki Uehara, "Stock Price Prediction by Deep Neural Generative Model of News Articles," IEICE Transactions on Information and Systems, Vol.E101-D, No.4, pp.901-908, 2018.
J-STAGE
Ryo Akita, Akira Yoshihara, Takashi Matsubara, and Kuniaki Uehara, "Deep Learning for Stock Prediction Using Numerical and Textual Information," the 15th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2016), Okayama, Jun. 2016, pp. 945-950.
IEEE

Image Modality Conversion for Constructing Diverse Virtual

Deep learning has been intensively investigated for autonomous driving and robotic control. These real-world problems require massive and diverse datasets, but collecting data for a wide range of conditions (such as nighttime and rainy weather) incurs significant costs. Although one approach is to develop environmental simulators to generate artificial data, discrepancies from real environments can degrade performance, and building a high-fidelity simulator itself is expensive. To overcome these challenges, this study proposes a method to augment training data by transforming real data modalities (that is, converting daytime data into nighttime data). This research was conducted as a joint research project with Toyota Central R&D Labs., Inc.

Shinta Masuda, Takashi Matsubara, Kuniaki Uehara, "Image Modality Conversion for Constructing Diverse Virtual Spaces," 2018 Annual Conference of the Japanese Society for Artificial Intelligence (JSAI2018) , 4M1-04, Kagoshima, June, 2018.
Slide

Few-shot Anomaly Detection Using Grouped Data Generative Models

"Anomaly detection" is an important task in image analysis, with applications ranging from defective product inspection to medical imaging. Deep generative models enable estimation of the likelihood of high-dimensional real data, such as images, and rare samples (e.g., defective items) tend to have lower likelihood, making them detectable as anomalies. However, normal yet unseen products (e.g., newly developed products not included in the training set) also exhibit lower likelihood and are thus falsely detected as anomalies. To address this issue, we propose a deep generative model that separates features unique to each product group from those unique to individual items. Leveraging a model trained on existing products, our approach achieves "few-shot anomaly detection," thereby enabling the identification of defective items even among a small number of new product samples. This research was conducted as a joint research project with The KAITEKI Institute, Inc.

Kazuki Sato, Satoshi Nakata, Takashi Matsubara, and Kuniaki Uehara, "Few-shot Anomaly Detection using Deep Generative Models for Grouped Data," IEICE Transactions on Information and Systems, vol.E105-D, no.2, pp.436-440, 2022.
J-STAGE

Anomaly Detection Using Unregularized Anomaly Score with Deep Generative Models

In general, anomaly detection refers to identifying rare instances within large datasets as "anomalies." Deep generative models, which learn to compress and reconstruct samples such as images, primarily learn typical samples and regard samples that cannot be reconstructed as anomalies. However, reconstruction failure may stem from "epistemic uncertainty" (due to insufficient training of rare samples) or "aleatoric uncertainty" (due to noise or complex shapes). Regions with high aleatoric uncertainty, such as screw holes, are frequently misdetected as anomalies despite being normal. To mitigate this problem, we decompose the likelihood in a deep generative model and uses only the component corresponding to epistemic uncertainty, termed the non-regularized anomaly score, for anomaly detection. This approach avoids being misled by visually complex regions, thereby improving detection accuracy. This research was conducted as a joint research project with AISIN AW Co., Ltd.

Takashi Matsubara, Kazuki Sato, Kenta Hama, Ryosuke Tachibana, and Kuniaki Uehara, "Deep Generative Model using Unregularized Score for Anomaly Detection with Heterogeneous Complexity," IEEE Transactions on Cybernetics, vol. 52, no. 6, pp. 5161-5173, 2022.
arXiv IEEE
Takashi Matsubara, Ryosuke Tachibana, and Kuniaki Uehara, "Anomaly Machine Component Detection by Deep Generative Model with Unregularized Score," The 2018 International Joint Conference on Neural Networks (IJCNN2018), Rio de Janeiro, Jul. 2018, pp.4067-4074.
IEEE

Reliability Evaluation in Image-Text Retrieval Using Bayesian Deep Learning

Evaluating the reliability of decision making of machine learning algorithms remains a major challenge. While uncertainty-based methods in Bayesian neural networks have been proposed for assessing reliability in classification and regression tasks, these methods cannot be directly applied to image-text retrieval. In this work, we define two types of uncertainty by interpreting image-text retrieval as a classification problem (posterior uncertainty) and as a regression problem (uncertainty in the embedding). Through experimentation, we found that treating image-text retrieval as a classification problem provides a more appropriate evaluation of reliability.

Kenta Hama, Takashi Matsubara, Kuniaki Uehara, and Jianfei Cai, "Exploring Uncertainty Measures for Image-Caption Embedding-and-Retrieval Task," ACM Transactions on Multimedia Computing, Communications, and Applications , vol. 17, no. 2, article no. 46, 2021.
arXiv
Kenta Hama, Takashi Matsubara, and Kuniaki Uehara, "Image-Caption Retrieval with Evaluating Uncertainties," The 7th Japan-Korea Joint Workshop on Complex Communication Sciences (JKCCS), Pyengonchang, Jan. 2019.

Bayesian Estimation and Model Averaging of Convolutional Neural Networks by Hypernetwork

Neural networks can learn complex representations and show high performance in various tasks. However, since the data available for training is limited, they are prone to overfitting. Regularizing the training of neural networks to prevent overfitting is one of the most important challenges. In this study, we target large-scale convolutional neural networks and use hypernetworks to implicitly estimate the posterior distribution of parameters to regularize training. Additionally, since the distribution of parameters is learned, classification accuracy can be improved through model averaging.

Kenya Ukai, Takashi Matsubara, and Kuniaki Uehara, "Bayesian Estimation and Model Averaging of Convolutional Neural Networks by Hypernetwork," Nonlinear Theory and Its Applications, IEICE, Vol.E10-N, No.1, 2019.
J-STAGE
Kenya Ukai, Takashi Matsubara, and Kuniaki Uehara, "Hypernetwork-based Implicit Posterior Estimation and Model Averaging of Convolutional Neural Networks," The 10th Asian Conference on Machine Learning (ACML2018), Beijing, Nov. 2018, pp. 176-191.
Paper

Gradient Search for Deep Neural Networks with Attention Mechanism

Recent advances in Neural Architecture Search (NAS) have made it possible to automatically design efficient architectures for image classification tasks. Convolutional Neural Networks (CNNs) are commonly used in image classification, which rely primarily on convolution and pooling operations. While traditional NAS methods have focused on selecting among these operations, recent work has shown that incorporating attention mechanisms into CNNs can increase representational power, improving accuracy while limiting the growth in parameters. In this study, we propose a method for automatically designing CNNs that integrate attention mechanisms.

Kohei Nakai, Takashi Matsubara, and Kuniaki Uehara, "Neural Architecture Search for Convolutional Neural Networks with Attention," IEICE Transactions on Information and Systems, vol. E104.D, no. 2, 2021. J-STAGE
Kohei Nakai, Takashi Matsubara, and Kuniaki Uehara, "Att-DARTS: Differentiable Neural Architecture Search for Attention," The 2020 International Joint Conference on Neural Networks (IJCNN2020), Glasgow (Online), Jul. 2020.
Slide Code

Data Augmentation Using Random Image Cropping and Patching

Deep convolutional neural networks (CNNs) with a very large number of parameters have achieved remarkable success in image processing. However, an excessively large number of parameters is increases the risk of overfitting. To mitigate this, various data augmentation methods have been proposed, such as flipping, cropping, scaling, and color transformations. Building on these techniques, our study introduces a new technique in which we randomly crop four different images and patch them together to form a new training sample, thereby achieving even higher accuracy in image processing tasks.

Ryo Takahashi, Takashi Matsubara, and Kuniaki Uehara, "Data Augmentation using Random Image Cropping and Patching for Deep CNNs," IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 9, pp. 2917-2931, 2020.
IEEE arXiv
Ryo Takahashi, Takashi Matsubara, and Kuniaki Uehara, "RICAP: Random Image Cropping and Patching Data Augmentation for Deep CNNs," The 10th Asian Conference on Machine Learning (ACML2018), Beijing, Nov. 2018, pp. 786-798. (acceptance rate 57/230=0.248)
Paper Code

Human-Like Agents by Combining Reinforcement Learning and Imitation Learning

Reinforcement learning (RL) agents are capable of solving a wide range of tasks, including the board game Go, autonomous driving, and video games. Although RL trains agents to maximize rewards, practical applications demand considerations beyond pure performance. For instance, an agent that is too strong can diminish user enjoyment in video games, while in autonomous driving, excessive acceleration and deceleration may cause passenger anxiety. As a result, there is growing interest in designing agents with more human-like behavior. Imitation learning, which trains agents on expert human policies, can yield human-like actions but cannot surpass the expert's performance. In this study, we propose a method that integrates reinforcement learning with imitation learning, thereby combining the strengths of both approaches. We applied our model to Atari games and the driving simulator TORCS, and experimental evaluation demonstrated that our method outperforms imitation-only agents while exhibiting human-like behavior compared to RL-only agent. This research was conducted as a joint research project with Equos Research Co., Ltd.

Rousslan Fernand Julien Dossa, Xinyu Lian, Hirokazu Nomoto, Takashi Matsubara, and Kuniaki Uehara, "Hybrid of Reinforcement and Imitation Learning for Human-Like Agents," IEICE Transactions on Information and Systems, vol. E103.D, no. 9, pp. 1960-1970, 2020.
J-STAGE
Rousslan Fernand Julien Dossa, Xinyu Lian, Hirokazu Nomoto, Takashi Matsubara, and Kuniaki Uehara, "A Human-Like Agent Based on a Hybrid of Reinforcement and Imitation Learning," The 2019 International Joint Conference on Neural Networks (IJCNN2019) , Budapest, Jul. 2019.
IEEE Slide

Collection of All-Sky Images and Cloud Type Classification

Marine weather observations are essential for safe navigation, and in Japan, even commercial vessels are required to report their findings to the Japan Meteorological Agency. However, conventional weather instruments cannot determine cloud type or cloud cover, requiring observers to rely on visual inspection. Existing approaches developed in other countries are not fully adapted to Japanese weather conditions or reporting standards, highlighting the need for a Japan-specific system. In this study, we developed a device capable of capturing full-sky images, installed it on vessels to collect medium-scale datasets, and labeled cloud types and conditions in the lower, middle, and upper layers of the atmosphere. After training a deep convolutional neural network on these datasets, we confirmed that both cloud type and condition classification achieved accuracies above 0.9. This work was conducted as a joint research project with SKY Perfect JSAT Corporation, Banyan Partners, Kobe Digital Labo Inc., and Professor Osawa of the Graduate School of Maritime Sciences at Kobe University.

Part of the research results has been released as the iOS and Android app "Kumolog." It can be downloaded from the App Store and Google Play. It has also been featured in the Kobe University Press Release, NHK WEB NEWS, Asahi Shimbun, and Nikkan Kogyo Shimbun.

森川優, 中西波瑠, 稲村直樹, 近藤伸明, 小渕浩希, 大澤輝夫, 松原崇, 申吉浩, 大島裕明, 上原邦昭, "船舶における全天球画像のデータ収集と雲形・状態・全雲量の自動判定の試み," 日本気象学会機関誌「天気」, vol. 70, no. 12, pp. 577-692, 2023.
Naoki Inamura, Kota Fujiwara, Takahisa Amakata, Fumio Tsuri, Haru Nakanishi, Hiroki Obuchi, Teruo Osawa, Takashi Matsubara, Kuniaki Uehara, "Solar Power Generation Prediction Using All-Sky Images and Solar Radiation Data," 2020 Annual Conference of the Japanese Society for Artificial Intelligence (JSAI2020), Kumamoto, June, 2020.
Yu Morikawa, Haru Nakanishi, Naoki Inamura, Nobuaki Kondo, Hiroki Obuchi, Teruo Osawa, Takashi Matsubara, Kuniaki Uehara, "Data Collection of All-Sky Images and Cloud Type and State Judgment," 2018 Annual Conference of the Japanese Society for Artificial Intelligence (JSAI2018), 2A4-01, Kagoshima, June, 2018.

Research

Deep Generative Models