Updated on 2025/05/01


 
Leow Chee Siang
 
Organization
Graduate Faculty of Interdisciplinary Research, Faculty of Engineering, Department of Mechanical Engineering (Mechatronics)
Title
Assistant Professor
Contact information
Email address
Profile
Born in Kuala Lumpur, Malaysia, he entered the University of Yamanashi in 2014 and received his PhD there in March 2024. He has been working at the University of Yamanashi since April 2024. He specializes in spoken language processing, character detection and recognition (OCR), and character generation using deep learning. More recently, he has become interested in robot control for smart agriculture using deep learning.
External link

Name(s) appearing in print

  • レオチーシャン

  • Chee Siang LEOW

  • LEOW Chee Siang

  • チーシャン レオ

  • レオ チー シャン

  • Leow Chee Siang

  • Chee Siang Leow


Research History

  • University of Yamanashi   Department of Mechanical Engineering, Faculty of Engineering, Graduate School of Science and Technology   Assistant Professor

    2024.4

      More details

    Country: Japan

    Job classification: Assistant professor

Education

  • University of Yamanashi   Integrated Graduate School of Medicine, Engineering, and Agricultural Sciences, Doctoral Program   System Integration Engineering Course, System Design Major

    2020.4 - 2024.3

      More details

    Country: Japan

    researchmap

  • University of Yamanashi   Integrated Graduate School of Medicine, Engineering, and Agricultural Sciences, Master's Program   Mechatronics

    2018.4 - 2020.3

      More details

    Country: Japan

    researchmap

  • University of Yamanashi   Faculty of Engineering   Department of Mechatronics

    2014.4 - 2018.3

      More details

    Country: Japan

    researchmap

Degree

  • Doctor of Engineering ( 2024.3   University of Yamanashi )

  • Master of Engineering ( 2020.3   University of Yamanashi )

  • Bachelor of Engineering ( 2018.3   University of Yamanashi )

Research Areas

  • Informatics / Intelligent informatics  / Handwritten Character Generation

  • Informatics / Perceptual information processing  / Handwritten Text Recognition

  • Informatics / Perceptual information processing  / Speech Recognition

  • Informatics / Intelligent robotics  / Smart Agriculture

  • Informatics / Perceptual information processing  / Optical Character Recognition

Research Interests

  • Speech Recognition

  • Noise Reduction

  • Deep Learning

  • Machine Learning

  • Text Recognition

  • Character Generation

  • Text Detection

  • Technical Transfer

  • Handwritten Character Generation

  • Smart Agriculture

  • Handwritten Character Recognition

  • Character Image Generation

  • Handwritten Text Recognition

  • Optical Character Recognition

Research Projects

  • Realization of highly efficient Shine Muscat cultivation through multifunctional robot development and cultivation system innovation  Major achievement

    Grant number: 24036033  2024.10 - 2026.3

    National Agriculture and Food Research Organization (NARO)  Development and Improvement of Strategic Smart Agriculture Technologies  FY2024 (Reiwa 6) initial budget: Development, Improvement, and Commercialization of Next-Generation Smart Agriculture Technologies

    Leow Chee Siang

      More details

    Authorship: Coinvestigator(s)  Grant type: Competitive  Type of fund: Funded research

  • Automatic generation modeling of diverse handwritten character images for building high-accuracy AI-OCR models  Major achievement

    Grant number: 24K23888  2024.8 - 2026.3

    Ministry of Education, Culture, Sports, Science and Technology (MEXT)  Grants-in-Aid for Scientific Research (KAKENHI)  Research Activity Start-up

    LEOW Chee Siang

      More details

    Authorship: Principal investigator  Type of fund: Science research expense

  • Development and large-scale verification of an improved sound-AI-driven fruit theft detection and reporting system

    2024.4 - 2026.3

    JKA Foundation  FY2024 Machinery Promotion Subsidy Program

    Kazuyoshi Ishida, Hiromitsu Nishizaki, Leow Chee Siang

      More details

    Authorship: Coinvestigator(s)  Type of fund: Others

Papers

  • Development of QR Code-Guided Autonomous Navigation System for Grape Cultivation Robot in Overhead Trellis Vineyard Reviewed

    Proceedings of the 2025 IEEE International Conference on Industrial Technology   1 - 6   2025.3

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

  • Development of a Fruit Theft Reporting System Using a Compact Microcontroller with Deep Learning Based on Suspicious Sounds Reviewed

    Leow Chee Siang, Tsutomu Tanzawa, Bong Tze Yaw, Koji Makino, Kazuyoshi Ishida, Hiromitsu Nishizaki

    Proceedings of the 50th Annual Conference of the IEEE Industrial Electronics Society (IECON 2024)   1 - 6   2024.11

     More details

    Authorship:Lead author   Language:English  

    DOI: 10.1109/IECON55916.2024.10905809

  • Evaluation of LoRa-based Long-Range Communication in a Fruit Theft Prevention Device Reviewed

    Kazuyoshi Ishida, Chee Siang Leow, Tsutomu Tanzawa, Tze Yaw Bong, Hiromitsu Nishizaki, Koji Makino

    2024 IEEE 13th Global Conference on Consumer Electronics, GCCE 2024   276 - 279   2024.11

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1109/GCCE62371.2024.10760408

  • Text Detection and Style Classification from Images Using Vision Transformer and Transformer Decoder Reviewed

    Hideaki Yajima, Chee Siang Leow, Hiromitsu Nishizaki

    2024 IEEE 13th Global Conference on Consumer Electronics, GCCE 2024   619 - 623   2024.11

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1109/GCCE62371.2024.10761042

  • Evaluation of Speech Translation Subtitles Generated by ASR with Unnecessary Word Detection Reviewed

    Makoto Hotta, Chee Siang Leow, Norihide Kitaoka, Hiromitsu Nishizaki

    2024 IEEE 13th Global Conference on Consumer Electronics, GCCE 2024   815 - 819   2024.11

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1109/GCCE62371.2024.10760522

  • High Quality Color Estimation of Shine Muscat Grape Using Vision Transformer Reviewed

    Ryosuke Shimazu, Chee Siang Leow, Prawit Buayai, Koji Makino, Xiayang Mao, Hiromitsu Nishizaki

    Proceedings of the 23rd International Conference on Cyberworlds (Cyberworlds 2024)   195 - 202   2024.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1109/CW64301.2024.00028

    Other Link: https://www.computer.org/csdl/proceedings-article/cw/2024/271700a195/259PFR330Eo

  • Analysis of Classroom Processes Based on Deep Learning With Video and Audio Features. Reviewed

    Chuo Hiang Heng, Masahiro Toyoura, Chee Siang Leow, Hiromitsu Nishizaki

    IEEE Access   12   110705 - 110712   2024( ISSN:2169-3536 )

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    DOI: 10.1109/ACCESS.2024.3434742

    researchmap

  • AI Self-Study Drill (Extra Edition): Judging the Harvest Time of Shine Muscat with AI

    Koji Makino, Hiromitsu Nishizaki, Chee Siang Leow, Prawit Buayai, Xiaoyang Mao

    Interface   50 ( 1 )   218 - 219   2024( ISSN:0387-9569 )

     More details

    Language:Japanese   Publishing type:(MISC) Lecture materials etc. (seminar, tutorial, course, lecture and others)  

    J-GLOBAL

    researchmap

  • Assessment and Improvement of Customer Service Speech with Multiple Large Language Models Reviewed

    So Watanabe, Chee Siang Leow, Junichi Hoshino, Takehito Utsuro, Hiromitsu Nishizaki

    Proceedings of the 2024 APSIPA Annual Summit and Conference   1 - 6   2024

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1109/APSIPAASC63619.2025.10849072

  • Single-Line Text Detection in Multi-Line Text with Narrow Spacing for Line-Based Character Recognition Reviewed

    Chee Siang Leow, Hideaki Yajima, Tomoki Kitagawa, Hiromitsu Nishizaki

    IEICE Transactions on Information and Systems   E106.D ( 12 )   2097 - 2106   2023.12( ISSN:0916-8532  eISSN:1745-1361 )

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (scientific journal)  

    Text detection is a crucial pre-processing step in optical character recognition (OCR) for the accurate recognition of text, including both fonts and handwritten characters, in documents. While current deep learning-based text detection tools can detect text regions with high accuracy, they often treat multiple lines of text as a single region. To perform line-based character recognition, it is necessary to divide the text into individual lines, which requires a line detection technique. This paper focuses on the development of a new approach to single-line detection in OCR that is based on the existing Character Region Awareness For Text detection (CRAFT) model and incorporates a deep neural network specialized in line segmentation. However, this new method may still detect multiple lines as a single text region when multi-line text with narrow spacing is present. To address this, we also introduce a post-processing algorithm to detect single text regions using the output of the single-line segmentation. Our proposed method successfully detects single lines, even in multi-line text with narrow line spacing, and hence improves the accuracy of OCR.

    DOI: 10.1587/transinf.2023EDP7070

    Scopus

    researchmap
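The splitting step described in the abstract above can be illustrated with a classic horizontal projection profile. This is a simplified stand-in for the paper's learned line-segmentation network and post-processing algorithm, with illustrative function names:

```python
import numpy as np

def split_lines(binary_region: np.ndarray) -> list[tuple[int, int]]:
    """Split a binarized text region (H x W, text pixels = 1) into
    single-line row spans using a horizontal projection profile."""
    profile = binary_region.sum(axis=1)          # amount of ink per row
    in_line, spans, start = False, [], 0
    for y, ink in enumerate(profile):
        if ink > 0 and not in_line:
            in_line, start = True, y             # a text line begins
        elif ink == 0 and in_line:
            in_line = False
            spans.append((start, y))             # line ends at first blank row
    if in_line:
        spans.append((start, len(profile)))
    return spans

# Two text lines separated by blank rows
region = np.zeros((7, 10), dtype=int)
region[1:3, :] = 1   # line 1 occupies rows 1-2
region[4:6, :] = 1   # line 2 occupies rows 4-5
print(split_lines(region))   # [(1, 3), (4, 6)]
```

Note that a profile-based split fails precisely in the narrow-spacing case the paper targets, where no blank rows separate lines; that is why a learned segmentation model plus post-processing is used there.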

  • A Color Estimation Model for Shine Muscat Grapes Considering Image Color Space

    Tatsuyoshi Amemiya, Chee Siang Leow, Prawit Buayai, Koji Makino, Xiaoyang Mao, Hiromitsu Nishizaki

    Proceedings of the 85th National Convention of IPSJ   2023 ( 1 )   221 - 222   2023.2

     More details

    Language:Japanese   Publishing type:Research paper (research society, symposium materials, etc.)  

    For the Shine Muscat grape variety, color is one of the criteria used to decide whether a bunch is ready for harvest. The Shine Muscat color chart developed by the Yamanashi Prefecture Fruit Tree Experiment Station defines five stages of color change. In this paper, we examine a deep learning model that estimates the chart-based color of a grape bunch from an actual photograph. The proposed model adopts the LAB color space in addition to RGB, and we found that combining the G and B channels of RGB with the L channel of LAB improves color estimation accuracy.

    CiNii Books

    CiNii Research

    researchmap

  • Metric Learning Approach for End-to-End Multilingual Automatic Speech Recognition Model Reviewed

    Akihiro Dobashi, Chee Siang Leow, Hiromitsu Nishizaki

    GCCE 2023 - 2023 IEEE 12th Global Conference on Consumer Electronics   446 - 450   2023(  ISBN:9798350340181 )

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    This study explores the application of metric learning in an end-to-end multilingual automatic speech recognition (ASR) model, employing the wav2vec 2.0 framework. In the proposed method, the E2E ASR model implements metric learning by obtaining acoustic features corresponding to character labels through forced alignment. When metric learning was applied to a six-language E2E ASR model during training, the model incorporating metric learning demonstrated a 0.7-point improvement in the character error rate (from 8.4% to 7.7%) over the baseline model, which was trained without metric learning. Additionally, the visualization of feature vectors indicated a decrease in both the variation of acoustic feature vectors for individual characters and inter-character interference, further underscoring the effectiveness of our approach.

    DOI: 10.1109/GCCE59613.2023.10315608

    Scopus

    researchmap

  • Image Remapping Data Augmentation Approach for Improving Fisheye Face Recognition Reviewed

    Yinghao He, Chee Siang Leow, Hiromitsu Nishizaki

    GCCE 2023 - 2023 IEEE 12th Global Conference on Consumer Electronics   742 - 746   2023(  ISBN:9798350340181 )

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    Fisheye cameras present a challenge for face recognition due to their wide-angle perspective and image distortion. This paper introduces a novel approach to enhancing training data for fisheye-based face recognition without requiring image calibration. We employ five image-remapping transformations to diversify and expand the training dataset and evaluate the effectiveness of this approach using deep learning networks: HRNetV2 and ResNet50. The results demonstrate significant improvements in classification accuracy when utilizing authentic fisheye facial data. Specifically, HRNetV2 exhibits an increase of 30.2%, while ResNet50's performance improves by 11.8% compared to their respective baseline performances. This study presents a fresh method for refining face recognition in fisheye camera scenarios, thereby extending its potential for real-world applications.

    DOI: 10.1109/GCCE59613.2023.10315437

    Scopus

    researchmap
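The data augmentation idea above, remapping undistorted training faces so they resemble fisheye imagery, can be sketched with a simple radial inverse mapping. This is one hypothetical remapping transformation under stated assumptions, not one of the paper's five:

```python
import numpy as np

def radial_remap(img: np.ndarray, k: float) -> np.ndarray:
    """Barrel-style radial remapping of a square grayscale image using
    nearest-neighbour inverse mapping: r_src = r_dst * (1 + k * r_dst^2)."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            dy, dx = (y - cy) / cy, (x - cx) / cx      # normalised offsets
            r2 = dy * dy + dx * dx
            sy = int(round(cy + dy * (1 + k * r2) * cy))  # source row
            sx = int(round(cx + dx * (1 + k * r2) * cx))  # source column
            if 0 <= sy < h and 0 <= sx < w:
                out[y, x] = img[sy, sx]
    return out

img = np.arange(25.0).reshape(5, 5)
warped = radial_remap(img, k=0.3)
assert warped.shape == img.shape
assert warped[2, 2] == img[2, 2]   # the centre pixel is a fixed point
```

Applying several such remaps with different `k` to each training face yields a distortion-diverse dataset without requiring camera calibration, which is the spirit of the augmentation evaluated here.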

  • Harvest-Appropriate Timing Decision System for an Assistive Robot of Shine-Muscat Cultivation

    NISHIZAKI Hiromitsu, AMEMIYA Tatsuyoshi, LEOW Chee Siang, BUAYAI Prawit, MAKINO Koji, MAO Xiaoyang

    The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec)   2023   2A1-B03   2023(  eISSN:2424-3124 )

     More details

    Language:Japanese   Publishing type:Research paper (research society, symposium materials, etc.)   Publisher:The Japan Society of Mechanical Engineers  

    Our group is researching a support system for cultivating Shine-Muscat grapes. The ripeness of the grapes is determined by their color and is compared against a Shine-Muscat color chart. In this paper, we introduce a color estimation model that can determine the optimal time for harvesting by using a deep learning model to estimate the color of the grape bunch. We incorporate metric learning during model training and use the specific color space representation of images to achieve high accuracy in our color estimation model. In addition, we propose a method to incorporate an appropriate reference color into the color estimation model. Our evaluation experiment indicated that our proposed model correctly identified the optimal time to harvest for 85.9% of the grapes.

    DOI: 10.1299/jsmermd.2023.2a1-b03

    CiNii Research

    researchmap

  • Estimation of Non-Invasive Grape Ripeness and Sweetness from Images Captured by a General-Purpose Camera Reviewed

    Chee Siang Leow, Ryosuke Shimazu, Tomoki Kitagawa, Hideaki Yajima, Prawit Buayai, Koji Makino, Xiaoyang Mao, Hiromitsu Nishizaki

    2023 IEEE International Workshop on Metrology for Agriculture and Forestry, MetroAgriFor 2023 - Proceedings   295 - 300   2023(  ISBN:9798350312720 )

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

    This paper examines the potential of deep learning methods for grape quality evaluation, particularly determining ripeness suitable for harvest and estimating grape 'sweetness,' using imagery. The uniqueness of this study lies in its exclusive use of images captured with standard cameras to predict the quality of an edible green grape variety known as 'Shine Muscat.' This approach anticipates future scenarios where farmers can easily capture images using smart glasses or smartphones. The method we developed leverages three channels from the RGB color space and CIELAB color space of the grape image, utilizes a feature extractor employing a vision transformer-based model, and introduces a deep learning model training method using a reference color. These innovations have successfully elevated the accuracy of the harvest determination task to 87.7%. Additionally, though the accuracy is not yet high, the paper showcases the potential of estimating grape 'sweetness' directly from images by using the color estimation model as a pre-training model.

    DOI: 10.1109/MetroAgriFor58484.2023.10424087

    Scopus

    researchmap

  • Effectiveness of Metric Learning for End-to-End Multi Language Speech Recognition Model Training.

    Akihiro Dobashi, Chee Siang Leow, Hiromitsu Nishizaki

    Proceedings of the Meeting of the Acoustical Society of Japan (CD-ROM)   2023   2023( ISSN:1880-7658 )

     More details

    Language:Japanese   Publishing type:Research paper (research society, symposium materials, etc.)  

    J-GLOBAL

    researchmap

  • Data Augmentation with Automatically Generated Images for Character Classifier Model Training Reviewed

    Chee Siang Leow, Tomoki Kitagawa, Hideaki Yajima, Hiromitsu Nishizaki

    GCCE 2023 - 2023 IEEE 12th Global Conference on Consumer Electronics   845 - 849   2023(  ISBN:9798350340181 )

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

    This paper presents a novel data augmentation technique crucial for training AI-OCR systems for handwritten character classification. Using a Y-autoencoder (Y-AE) enhanced with Adaptive Instance Normalization, diverse handwriting styles are generated to improve the breadth of handwriting representations. A filtering mechanism is introduced to include only valid character images for training. The method was tested on a subset of the ETL Character Database, featuring 92 unique Japanese Hiragana and Katakana characters. The baseline classifier achieved an accuracy of 0.9061. However, when using the augmented dataset, which included Y-AE model-generated and filtered images, the accuracy improved to a maximum of 0.9555 with the data augmentation technique. These results showcase the potential of this data augmentation technique in consumer electronics, particularly in AI-OCR software. Despite needing some noise removal, the approach significantly boosts classifier accuracy, suggesting an efficient way forward for document processing in various sectors.

    DOI: 10.1109/GCCE59613.2023.10315447

    Scopus

    researchmap

  • Appropriate grape color estimation based on metric learning for judging harvest timing Reviewed

    Tatsuyoshi Amemiya, Chee Siang Leow, Prawit Buayai, Koji Makino, Xiaoyang Mao, Hiromitsu Nishizaki

    Visual Computer   38 ( 12 )   4083 - 4094   2022.12( ISSN:0178-2789 )

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    The color of a bunch of grapes is a very important factor when determining the appropriate time for harvesting. However, judging whether the color of the bunch is appropriate for harvesting requires experience and the result can vary by individuals. In this paper, we describe a system to support grape harvesting based on color estimation using deep learning. To estimate the color of a bunch of grapes, bunch detection, grain detection, removal of pest grains, and color estimation are required, for which deep learning-based approaches are adopted. In this study, YOLOv5, an object detection model that considers both accuracy and processing speed, is adopted for bunch detection and grain detection. For the detection of diseased grains, an autoencoder-based anomaly detection model is also employed. Since color is strongly affected by brightness, a color estimation model that is less affected by this factor is required. Accordingly, we propose multitask learning that uses metric learning. The color estimation model in this study is based on AlexNet. Metric learning was applied to train this model. Brightness is an important factor affecting the perception of color. In a practical experiment using actual grapes, we empirically selected the best three image channels from RGB and CIELAB (L*a*b*) color spaces and we found that the color estimation accuracy of the proposed multi-task model, the combination with “L” channel from L*a*b color space and “GB” from RGB color space for the grape image (represented as “LGB” color space), was 72.1%, compared to 21.1% for the model which used the normal RGB image. In addition, it was found that the proposed system was able to determine the suitability of grapes for harvesting with an accuracy of 81.6%, demonstrating the effectiveness of the proposed system.

    DOI: 10.1007/s00371-022-02666-0

    Scopus

    researchmap
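The "LGB" input described in the abstract above (the L channel of CIELAB combined with the G and B channels of RGB) can be assembled as follows. The sRGB-to-L* conversion is the standard D65 formula; this is a sketch of the input construction, not the authors' code:

```python
import numpy as np

def srgb_to_lstar(rgb: np.ndarray) -> np.ndarray:
    """CIELAB L* from an sRGB image with values in [0, 1] (D65 white)."""
    lin = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    y = lin @ np.array([0.2126, 0.7152, 0.0722])      # relative luminance Y
    f = np.where(y > 0.008856, np.cbrt(y), 7.787 * y + 16.0 / 116.0)
    return 116.0 * f - 16.0                           # L* in [0, 100]

def to_lgb(rgb: np.ndarray) -> np.ndarray:
    """Stack [L*, G, B] channels as the model input (L* rescaled to [0, 1])."""
    l = srgb_to_lstar(rgb) / 100.0
    return np.stack([l, rgb[..., 1], rgb[..., 2]], axis=-1)

img = np.random.rand(4, 4, 3)                         # toy sRGB grape patch
lgb = to_lgb(img)
assert lgb.shape == (4, 4, 3)
assert np.allclose(lgb[..., 1:], img[..., 1:])        # G and B pass through
```

The motivation, per the abstract, is that L* isolates brightness so the remaining chromatic channels carry color information less confounded by illumination.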

  • Frequency-Directional Attention Model for Multilingual Automatic Speech Recognition

    Akihiro Dobashi, Chee Siang Leow, Hiromitsu Nishizaki

    2022.3

     More details

    Language:English   Publishing type:(MISC) Institution technical report and pre-print, etc.  

    This paper proposes a model for transforming speech features using the frequency-directional attention model for End-to-End (E2E) automatic speech recognition. The idea is based on the hypothesis that in the phoneme system of each language, the characteristics of the frequency bands of speech when uttering them are different. By transforming the input Mel filter bank features with an attention model that characterizes the frequency direction, a feature transformation suitable for ASR in each language can be expected. This paper introduces a Transformer-encoder as a frequency-directional attention model. We evaluated the proposed method on a multilingual E2E ASR system for six different languages and found that the proposed method could achieve, on average, 5.3 points higher accuracy than the ASR model for each language by introducing the frequency-directional attention mechanism. Furthermore, visualization of the attention weights based on the proposed method suggested that it is possible to transform acoustic features considering the frequency characteristics of each language.

    arXiv

    researchmap

    Other Link: http://arxiv.org/pdf/2203.15473v1
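The core idea above, attention applied along the frequency axis rather than the time axis, can be sketched as single-head scaled dot-product attention in which the Mel bins are the tokens. This is a simplified numpy illustration, not the paper's Transformer encoder; all names are illustrative:

```python
import numpy as np

def frequency_attention(feats: np.ndarray, w_q, w_k, w_v) -> np.ndarray:
    """Self-attention across the frequency axis of (time, freq) features:
    each of the F frequency bins attends to every other bin."""
    x = feats.T                                   # (freq, time): bins as tokens
    q, k, v = x @ w_q, x @ w_k, x @ w_v           # project each bin's time series
    scores = q @ k.T / np.sqrt(k.shape[1])        # (freq, freq) bin affinities
    scores -= scores.max(axis=1, keepdims=True)   # stabilise the softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # rows sum to 1 over bins
    return attn @ v                               # (freq, d) transformed bins

rng = np.random.default_rng(0)
T, F, D = 6, 8, 4                                 # frames, Mel bins, head dim
feats = rng.standard_normal((T, F))
w_q, w_k, w_v = (rng.standard_normal((T, D)) for _ in range(3))
out = frequency_attention(feats, w_q, w_k, w_v)
assert out.shape == (F, D)
```

Because the attention weights form an F-by-F map over frequency bins, visualizing them shows which frequency bands interact, matching the per-language analysis described in the abstract.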

  • Handwritten Character Generation for Training Character Recognition Models

    Tomoki Kitagawa, Chee Siang Leow, Hiromitsu Nishizaki

    Proceedings of the 84th National Convention of IPSJ   2022 ( 1 )   275 - 276   2022.2

     More details

    Language:Japanese   Publishing type:Research paper (research society, symposium materials, etc.)  

    It is well known that character recognition systems based on deep learning require large amounts of data to train a high-performance recognizer. In this paper, we introduce a Y-Autoencoder (Y-AE)-based handwritten character generator that produces multiple Japanese characters from a single image, increasing the amount of training data for a handwritten character recognizer. Experiments showed that the Y-AE can generate Japanese character images that are usable for training the recognizer, improving handwritten character recognition accuracy (F1-score).

    CiNii Books

    CiNii Research

    researchmap

  • Handwritten Character Generation using Y-Autoencoder for Character Recognition Model Training Reviewed

    Tomoki Kitagawa, Chee Siang Leow, Hiromitsu Nishizaki

    2022 Language Resources and Evaluation Conference, LREC 2022   7344 - 7351   2022(  ISBN:9791095546726 )

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    It is well-known that the deep learning-based optical character recognition (OCR) system needs a large amount of data to train a high-performance character recognizer. However, it is costly to collect a large amount of realistic handwritten characters. This paper introduces a Y-Autoencoder (Y-AE)-based handwritten character generator to generate multiple Japanese Hiragana characters from a single image to increase the amount of data for training a handwritten character recognizer. The adaptive instance normalization (AdaIN) layer allows the generator to be trained and generate handwritten character images without paired-character image labels. The experiment showed that the Y-AE could generate Japanese character images that were then used to train the handwritten character recognizer, improving the F1-score from 0.8664 to 0.9281. We further analyzed the usefulness of the Y-AE-based generator with shape images and out-of-character (OOC) images, which have different character image styles in model training. The result showed that the generator could generate a handwritten image with a similar style to that of the input character.

    Scopus

    researchmap
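The AdaIN layer mentioned in the abstract has a standard, compact definition: renormalize each channel of the content features to the per-channel statistics of the style features. A minimal numpy sketch of that operation (not the paper's full Y-AE):

```python
import numpy as np

def adain(content: np.ndarray, style: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Adaptive Instance Normalization on (C, H, W) feature maps:
    shift/scale each content channel to the style channel's mean/std."""
    c_mu = content.mean(axis=(1, 2), keepdims=True)
    c_sd = content.std(axis=(1, 2), keepdims=True)
    s_mu = style.mean(axis=(1, 2), keepdims=True)
    s_sd = style.std(axis=(1, 2), keepdims=True)
    return s_sd * (content - c_mu) / (c_sd + eps) + s_mu

rng = np.random.default_rng(1)
content = rng.standard_normal((3, 8, 8))             # character-content features
style = 2 * rng.standard_normal((3, 8, 8)) + 5       # handwriting-style features
out = adain(content, style)
# per-channel statistics of the output now match the style features
assert np.allclose(out.mean(axis=(1, 2)), style.mean(axis=(1, 2)), atol=1e-3)
```

Because the style enters only through channel means and standard deviations, no paired content/style images are needed, which is what enables the label-free training described above.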

  • Multi-Lingual Speech Recognition based on Feature Transform Model using Frequency Axis Attention.

    Akihiro Dobashi, Chee Siang Leow, Hiromitsu Nishizaki

    Proceedings of the Meeting of the Acoustical Society of Japan (CD-ROM)   2022   2022( ISSN:1880-7658 )

     More details

    Language:Japanese   Publishing type:Research paper (research society, symposium materials, etc.)  

    J-GLOBAL

    researchmap

  • Development of a Streaming ASR system based on Kaldi

    Leow Chee Siang, Yu Wang, Akio Kobayashi, Takehito Utsuro, Hiromitsu Nishizaki

    Proceedings of the Meeting of the Acoustical Society of Japan (CD-ROM)   2021   2021( ISSN:1880-7658 )

     More details

    Authorship:Lead author   Language:Japanese   Publishing type:Research paper (research society, symposium materials, etc.)  

    J-GLOBAL

    researchmap

  • Language and speaker-independent feature transformation for end-to-end multilingual speech recognition Reviewed

    Tomoaki Hayakawa, Chee Siang Leow, Akio Kobayashi, Takehito Utsuro, Hiromitsu Nishizaki

    Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH   1   396 - 400   2021( ISSN:2308-457X  ISBN:9781713836902  eISSN:1990-9772 )

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    This paper proposes a method to improve the performance of multilingual automatic speech recognition (ASR) systems through language- and speaker-independent feature transformation in a framework of end-to-end (E2E) ASR. Specifically, we propose a multi-task training method that combines a language recognizer and a speaker recognizer with an E2E ASR system based on connectionist temporal classification (CTC) loss functions. We introduce the language and speaker recognition subtasks into the E2E ASR network and introduce a gradient reversal layer (GRL) for each sub-task to achieve language- and speaker-independent feature transformation. The evaluation results of the proposed method in the multilingual ASR system in six languages show that the proposed method achieves higher accuracy than the ASR models for each language by introducing multi-tasking and GRL.

    DOI: 10.21437/Interspeech.2021-390

    Scopus

    researchmap
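The gradient reversal layer (GRL) used above has a simple contract: it is the identity on the forward pass, and on the backward pass it multiplies the incoming gradient by a negative factor, so the shared encoder is pushed to *remove* language/speaker cues. A minimal sketch of that contract (framework-agnostic; real implementations hook into an autograd engine):

```python
import numpy as np

class GradientReversal:
    """Gradient Reversal Layer: identity forward, gradient scaled by
    -lambda backward, so features are trained to confuse the subtask."""
    def __init__(self, lam: float = 1.0):
        self.lam = lam
    def forward(self, x: np.ndarray) -> np.ndarray:
        return x                              # features pass through unchanged
    def backward(self, grad_out: np.ndarray) -> np.ndarray:
        return -self.lam * grad_out           # reversed gradient to the encoder

grl = GradientReversal(lam=0.5)
x = np.array([1.0, -2.0, 3.0])
assert np.array_equal(grl.forward(x), x)
assert np.array_equal(grl.backward(np.ones(3)), -0.5 * np.ones(3))
```

In the multi-task setup described in the abstract, the subtask classifier still minimizes its own loss, but the reversed gradient means the encoder is simultaneously maximizing it, yielding language- and speaker-independent features.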

  • Development of a Support System for Judging the Appropriate Timing for Grape Harvesting Reviewed

    Tatsuyoshi Amemiya, Kodai Akiyama, Chee Siang Leow, Prawit Buayai, Koji Makino, Xiaoyang Mao, Hiromitsu Nishizaki

    Proceedings - 2021 International Conference on Cyberworlds, CW 2021   194 - 200   2021(  ISBN:9781665440653 )

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    The color of grape bunches is a significant factor when harvesting grapes at the appropriate timing. Judging the suitable color for shipment requires experience and varies from one person to another. We herein describe a support system for grape harvesting based on color estimation. To estimate the color of a bunch of grapes, bunch detection, grain detection, removal of diseased grains, and color estimation should be performed. Models based on deep learning are employed for this series of processes. Since color is strongly affected by sunlight, we propose a multitask model that considers sunlight exposure to achieve a robust color estimation model that exhibits decreased sensitivity to sunlight. Our results show that the color estimation accuracy of the model is 76% when sunlight exposure is not considered and 81% when sunlight exposure is considered. In addition, we performed a practical field test of the developed harvest support system in an actual grape field. The results show that our support system can determine the appropriateness of grape harvest with an accuracy of 90%, demonstrating the effectiveness of the system.

    DOI: 10.1109/CW52790.2021.00040

    Scopus

    researchmap

  • Customer service training VR system that can train how to speak

    西尾瞳希, Soichiro Iida, Yuta Sano, Chee Siang Leow, Hiromitsu Nishizaki, Takehito Utsuro, Junichi Hoshino

    IEICE Technical Report (Web)   120 ( 319(MVE2020 30-41) )   2021( ISSN:2432-6380 )

     More details

    Language:Japanese   Publishing type:Research paper (research society, symposium materials, etc.)  

    J-GLOBAL

    researchmap

  • Voice activity detection for live speech of baseball game based on tandem connection with speech/noise separation model Reviewed

    Yuto Nonaka, Chee Siang Leow, Akio Kobayashi, Takehito Utsuro, Hiromitsu Nishizaki

    Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH   6   4595 - 4599   2021( ISSN:2308-457X  ISBN:9781713836902  eISSN:1990-9772 )

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    When applying voice activity detection (VAD) to a noisy sound, in general, noise reduction (speech separation) and VAD are performed separately. In this case, the noise reduction may suppress the speech, and the VAD may not work well for the speech after the noise reduction. This study proposes a VAD model through the tandem connection of neural network-based noise separation and a VAD model. By training the two models simultaneously, the noise separation model is expected to be trained to consider the VAD results, and thus effective noise separation can be achieved. Moreover, the improved speech/noise separation model will improve the accuracy of the VAD model. In this research, we deal with real-live speeches from baseball games, which have a very poor signal-to-noise ratio. The VAD experiments showed that the VAD performance at the frame level achieved 4.2 points improvement in F1-score by tandemly connecting the speech/noise separation model and the VAD model.

    DOI: 10.21437/Interspeech.2021-792

    Scopus

    researchmap

  • ExKaldi-RT: A Real-Time Automatic Speech Recognition Extension Toolkit of Kaldi Reviewed

    Yu Wang, Chee Siang Leow, Akio Kobayashi, Takehito Utsuro, Hiromitsu Nishizaki

    2021 IEEE 10th Global Conference on Consumer Electronics, GCCE 2021   320 - 324   2021(  ISBN:9781665436762 )

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    This paper describes the ExKaldi-RT online automatic speech recognition (ASR) toolkit, implemented based on the Kaldi ASR toolkit and the Python language. ExKaldi-RT provides tools for building online recognition pipelines. While similar tools built on Kaldi are available, a key feature of ExKaldi-RT is that it works in Python, whose easy-to-use interface allows online ASR system developers to conduct original research, such as applying neural network-based signal processing and decoding with models trained in deep learning frameworks. We performed benchmark experiments on the minimum LibriSpeech corpus, which showed that ExKaldi-RT could achieve competitive ASR performance in real-time recognition.

    DOI: 10.1109/GCCE53005.2021.9621992

    Scopus

    researchmap

  • End-To-End Inflorescence Measurement for Supporting Table Grape Trimming with Augmented Reality Reviewed

    Prawit Buayai, Kabin Yok-In, Daisuke Inoue, Chee Siang Leow, Hiromitsu Nishizaki, Koji Makino, Xiaoyang Mao

    Proceedings - 2021 International Conference on Cyberworlds, CW 2021   101 - 108   2021(  ISBN:9781665440653 )

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    Inflorescence trimming is a crucial process to produce high-quality table grapes. It can eliminate nutrient competition in a bunch and makes it less vulnerable to disease development. After trimming, the remaining part of the inflorescence should have a target length decided by the grape variety. This is challenging for novice farmers because of the time constraint. The farmer needs to finish trimming the inflorescence before the berries develop. This paper proposes a novel end-to-end inflorescence length measurement method for supporting a trimming process with augmented reality technology. The proposed technique makes use of the state-of-the-art deep neural network model for detecting the inflorescence area, as well as the scissors, from the images captured with a camera installed on an optical see-through head-mounted display. A new algorithm is designed to estimate the length of the remaining inflorescence with the screw of the scissors loop as the calibrator. The estimated length is then visualized on the head-mounted display to support the farmer in performing the trimming correctly and efficiently. The experiment, conducted with real inflorescence trimming tasks, shows that the mean absolute error of the length estimation is only 0.19 cm, which is small enough for use in real applications.

    DOI: 10.1109/CW52790.2021.00022

    Scopus

    researchmap

  • Spoken Dialog Training System for Customer Service Improvement Reviewed

    Yuta Sano, Chee Siang Leow, Soichiro Iida, Takehito Utsuro, Junichi Hoshino, Akio Kobayashi, Hiromitsu Nishizaki

    2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020 - Proceedings   403 - 408   2020.12(  ISBN:9789881476883 )

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    In the hospitality industry, including convenience stores and airport service counters, operational staff must be trained to serve customers satisfactorily and to avoid problems with them. This study investigates a spoken dialog training system for improving customer service by operational staff. In a conventional spoken dialog system, the system user acts as a customer, and the dialog agent assists the user in fulfilling his or her requirements. In our system, in contrast, the dialog agent plays the role of the customer. Consequently, the roles of the human and the agent are reversed compared with a traditional dialog system, and there has thus far been no research on such systems. This paper introduces a prototype of such a system that we have developed and describes a simple experiment. The results of the experiment confirm the usefulness of our system for hospitality training.

    Scopus

    researchmap

  • ExKaldi: a python-based extension tool of kaldi Reviewed

    Yu Wang, Chee Siang Leow, Hiromitsu Nishizaki, Akio Kobayashi, Takehito Utsuro

    2020 IEEE 9th Global Conference on Consumer Electronics, GCCE 2020   929 - 932   2020.10(  ISBN:9781728198026 )

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    We present ExKaldi, an automatic speech recognition (ASR) toolkit implemented in Python on top of the Kaldi toolkit. While similar Kaldi wrappers are available, a key feature of ExKaldi is an integrated strategy for building ASR systems, covering feature and alignment processing, acoustic model training, N-gram language model training and querying, decoding, and scoring. Primarily, ExKaldi builds a bridge between Kaldi and deep learning frameworks to help users customize a hybrid hidden Markov model-deep neural network ASR system. We performed benchmark experiments on the TIMIT corpus, which showed that ExKaldi can build a system from scratch in Python and achieve reasonable recognition accuracy. The toolkit is open-source and released under the Apache license.

    DOI: 10.1109/GCCE50665.2020.9291717

    Scopus

    researchmap

  • Development of a low-latency and real-time automatic speech recognition system Reviewed

    Chee Siang Leow, Tomoaki Hayakawa, Hiromitsu Nishizaki, Norihide Kitaoka

    2020 IEEE 9th Global Conference on Consumer Electronics, GCCE 2020   925 - 928   2020.10(  ISBN:9781728198026 )

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

    In this study, we developed a real-time automatic speech recognition (ASR) system based on the Kaldi ASR toolkit, with low latency and customizable models, that runs without any internet connection. The proposed ASR system includes a voice activity detection (VAD) module and an audio transmitter as front-end speech processing, and a decoder for the received audio signals. The system was evaluated in terms of ASR accuracy and speech processing speed, and achieved high ASR accuracy on the CSJ (Corpus of Spontaneous Japanese) test set with very low latency.

    DOI: 10.1109/GCCE50665.2020.9291818

    Scopus

    researchmap

  • Development and Evaluation of a Low-Latency Real-Time ASR System Based on Kaldi

    LEOW Chee Siang, 早川友瑛, 西崎博光, 北岡教英

    日本音響学会研究発表会講演論文集(CD-ROM)   2020   2020( ISSN:1880-7658 )

     More details

    Authorship:Lead author   Language:Japanese   Publishing type:Research paper (research society, symposium materials, etc.)  

    J-GLOBAL

    researchmap

  • Prototype Development of a Spoken Dialog System for Customer Service Training

    佐野祐太, LEOW Chee Siang, 飯田宗一郎, 西崎博光, 星野准一, 宇津呂武仁

    日本音響学会研究発表会講演論文集(CD-ROM)   2020   2020( ISSN:1880-7658 )

     More details

    Language:Japanese   Publishing type:Research paper (research society, symposium materials, etc.)  

    J-GLOBAL

    researchmap

  • Autonomous Gas Source Localization in an Outdoor Environment Using Deep Learning: Investigation of the Length of Sensor Data to Feed to the Network

    山本 晃史, Christian Bilgera, Chee-Siang Leow, 澤野 真樹, 松倉 悠, 澤田 直輝, 西崎 博光, 石田 寛

    「センサ・マイクロマシンと応用システム」シンポジウム論文集 電気学会センサ・マイクロマシン部門 [編]   36   4p   2019.11

     More details

    Language:Japanese   Publishing type:Research paper (research society, symposium materials, etc.)   Publisher:[Tokyo]: Institute of Electrical Engineers of Japan  

    CiNii Research

    researchmap

    Other Link: http://id.ndl.go.jp/bib/031476283

  • Application of Sequence Input and Output Long Short-Term Memory Neural Networks for Autonomous Gas Source Localization in an Outdoor Environment Reviewed

    Akifumi Yamamoto, Christian Bilgera, Maki Sawano, Haruka Matsukura, Naoki Sawada, Chee Siang Leow, Hiromitsu Nishizaki, Hiroshi Ishida

    ISOEN 2019 - 18th International Symposium on Olfaction and Electronic Nose, Proceedings   2019.5(  ISBN:9781538683279 )

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    In outdoor environments, fluctuating airflow and gas distribution make gas source localization (GSL) tasks difficult. In our research, we use neural networks (NNs) to overcome these difficulties by applying long short-term memory deep neural networks (LSTM-DNNs) to time-series data from a gas sensor array and an anemometer to estimate the position of a gas source. In this paper, we present NNs for GSL that can accept variable-length input data and estimate the gas source location at each time step. In doing so, we were able to estimate the location of a gas source within 40 time steps (20 s) and, using 300 time steps, achieved an estimation accuracy of 95%.

    DOI: 10.1109/ISOEN.2019.8823160

    Scopus

    researchmap

  • A Study on Deep Learning-Based Machine Tool Noise Reduction Using Known Factory Environmental Sounds

    LEOW Chee Siang, 西崎博光

    日本音響学会研究発表会講演論文集(CD-ROM)   2019   2019( ISSN:1880-7658 )

     More details

    Authorship:Lead author   Language:Japanese   Publishing type:Research paper (research society, symposium materials, etc.)  

    J-GLOBAL

    researchmap

  • Speech Recognition-based Evaluation of a Noise Reduction Method in Known-Noise Environment

    LEOW Chee Siang, NISHIZAKI Hiromitsu, KOBAYASHI Akio, UTSURO Takehito

    情報処理学会研究報告(Web)   2019 ( SLP-130 )   2019

     More details

    Authorship:Lead author   Language:Japanese   Publishing type:Research paper (research society, symposium materials, etc.)  

    J-GLOBAL

    researchmap

  • Operation verification of deep learning applications on small computers Reviewed

    Hiromitsu Nishizaki, Chee Siang Leow, Koji Makino

    IEEJ Transactions on Electronics, Information and Systems   138 ( 9 )   1108 - 1115   2018( ISSN:0385-4221  eISSN:1348-8155 )

     More details

    Language:Japanese   Publishing type:Research paper (scientific journal)  

    Recently, deep learning technologies have been in the spotlight. Deep learning is a powerful technology for classifying and recognizing objects captured by a camera, and such applications have a high affinity with Internet-of-Things (IoT) devices. It is therefore expected that these technologies will be used in embedded systems and IoT devices. In this paper, we verify that deep learning applications such as image classification can work well on small computers such as the Raspberry Pi. We develop three deep learning applications using two deep learning frameworks (libraries), prepare four types of small computers, and test the applications on them. In addition, we investigate the relationship between processing time, memory consumption, and the number of parameters of the deep neural network model. The verification experiments show that a program based on a deep learning library implemented in C++ runs fast, and that simple neural network models can work in real time on small computers. A further experiment also shows that processing time and memory consumption increase in proportion to the number of parameters, independent of the deep learning library and the small computer used.

    DOI: 10.1541/ieejeiss.138.1108

    Scopus

    researchmap

▼display all

Presentations

  • A Sound Classification Model Running on a Small Edge Device for Preventing Crop Theft

    遠藤陽季,矢島英明,レオ チーシャン,丹沢勉,牧野浩二,石田和義,西崎博光

    87th National Convention of IPSJ  2025.3  Information Processing Society of Japan

     More details

    Event date: 2025.3

    Language:Japanese   Presentation type:Oral presentation(general)  

    Venue:Suita City   Country:Japan  

  • A Model for Extracting Relationships between Items in Document Images Considering the Relevance of Frames

    佐藤創哉,矢島英明,レオ チーシャン,西崎博光

    87th National Convention of IPSJ  2025.3  Information Processing Society of Japan

     More details

    Event date: 2025.3

    Language:Japanese   Presentation type:Oral presentation(general)  

    Venue:Suita City   Country:Japan  

  • Development of a QR Code-Guided Autonomous Navigation System for a Grape Cultivation Support Robot

    藤本 蓮,レオ チーシャン,矢島英明,牧野浩二,茅 暁陽,西崎博光

    87th National Convention of IPSJ  2025.3  Information Processing Society of Japan

     More details

    Event date: 2025.3

    Language:Japanese   Presentation type:Oral presentation(general)  

    Venue:Suita City   Country:Japan  

  • 2D-CAD Drawing Retrieval Using Feature Point Matching and Dimensions

    西尾直樹,レオ チーシャン,西崎博光

    87th National Convention of IPSJ  2025.3  Information Processing Society of Japan

     More details

    Event date: 2025.3

    Language:Japanese   Presentation type:Oral presentation(general)  

    Venue:Suita City   Country:Japan  

  • YAMANASHI AI Hackathon 2024: Let's Build an App Using Generative AI!

    レオ チーシャン

    2024.11  Hosted by the Yamanashi Prefecture DX and Information Policy Promotion Officer (in cooperation with the University of Yamanashi)

     More details

    Event date: 2024.11 - 2024.12

    Language:Japanese   Presentation type:Public discourse, seminar, tutorial, course, lecture and others  

    Venue:Yamanashi Prefecture, Kofu City, University of Yamanashi   Country:Japan  

  • Customer Service Evaluation by Round-Table Discussion among Multiple Large Language Models

    渡辺蒼,レオ チーシャン,西崎博光,星野准一,宇津呂 武仁

    38th Annual Conference of JSAI  2024.5  The Japanese Society for Artificial Intelligence

     More details

    Event date: 2024.5

    Language:Japanese   Presentation type:Oral presentation(general)  

    Venue:Hamamatsu City   Country:Japan  

▼display all

Awards

  • Best Paper Award Major achievement

    2024.10   23rd International Conference on Cyberworlds (CW2024)   High Quality Color Estimation of Shine Muscat Grape Using Vision Transformer

    Ryosuke Shimazu, Chee Siang Leow, Prawit Buayai, Koji Makino, Xiaoyang Mao, Hiromitsu Nishizaki

     More details

    Award type:Award from international society, conference, symposium, etc.  Country:Japan

    Currently, skilled farmers judge the ripeness of the Shine Muscat grape variety by looking at the color of the grapes' surface. However, the color of Shine Muscat grapes does not change much as they grow, and there are individual differences in how the color is perceived. Furthermore, the same color can look very different depending on exposure to sunlight and shadows. Therefore, a system that can quantitatively determine the color of Shine Muscat grapes is needed to pass on the harvesting techniques of experienced farmers to amateurs and inexperienced farmers. This research aims to improve the accuracy of color estimation of Shine Muscat grapes using deep learning. We propose a method that estimates the color of individual grapes using a color estimation model with a self-attention mechanism, from which the color of the whole bunch is estimated. A Vision Transformer model with a self-attention mechanism was found to improve the color estimation accuracy to 96.9%. Furthermore, by eliminating outliers using the interquartile range, a color estimation accuracy of 97.2% could be achieved, demonstrating the effectiveness of the new color estimation model.

  • Excellent Speaker Award

    2023.12   SI2023  

     More details

    Award type:Award from Japanese society, conference, symposium, etc.  Country:Japan

    researchmap

  • Excellent Paper Award

    2020.10   IEEE GCCE2020   ExKaldi: A Python-Based Extension Tool of Kaldi

    Yu Wang, Chee Siang Leow, Hiromitsu Nishizaki, Akio Kobayashi, Takehito Utsuro

     More details

    Award type:Award from international society, conference, symposium, etc. 

    researchmap

Teaching Experience (On-campus)

  • Embedded Programming I (Exercise)

    2025 academic year

  • Multimedia Engineering

    2025 academic year

  • C Language Programming

    2024 academic year

  • Multimedia Engineering

    2024 academic year

Guidance results

  • 2024

    Type:Undergraduate (Major A course) graduation thesis guidance  Period:1 month  Guidance time per year:200 hours

    Number of people receiving guidance:4 people 

    Graduation / pass / number of people awarded degrees:4 people 

    Number of teachers:2 people

  • 2024

    Type:Master's (Major B course) dissertation guidance  Period:1 month  Guidance time per year:120 hours

    Number of people receiving guidance:8 people  (Overseas students:2 people)

    Graduation / pass / number of people awarded degrees:6 people 

    Number of teachers:2 people