
Prof. Chi Man Pun,
IEEE Senior Member
University of Macau, Macau, China
BIO: Prof. Pun received his Ph.D. degree in Computer Science and Engineering from the Chinese University of Hong Kong in 2002, and his M.Sc. and B.Sc. degrees from the University of Macau. He served as the Head of the Department of Computer and Information Science, University of Macau, from 2014 to 2019, where he is currently a Professor in charge of the Image Processing and Pattern Recognition Laboratory. He has led many externally funded research projects as PI and has authored or co-authored more than 200 refereed papers in top-tier journals (including T-PAMI, T-IFS, T-IP, T-DSC, T-KDE, and T-MM) and conferences (including CVPR, ICCV, ECCV, AAAI, ICDE, IJCAI, MM, and VR). He has also co-invented several China/US patents, and is the recipient of the Macao Science and Technology Award 2014 and the Best Paper Award at the 6th Chinese Conference on Pattern Recognition and Computer Vision (PRCV2023). Prof. Pun has served as the General Chair for the 10th and 11th International Conference on Computer Graphics, Imaging and Visualization (CGIV2013, CGIV2014) and the 13th IEEE International Conference on e-Business Engineering (ICEBE2016), as the General Co-Chair for the IEEE International Conference on Visual Communications and Image Processing (VCIP2020) and the International Workshop on Advanced Image Technology (IWAIT2022), and as the Program/Local Chair for several other international conferences. He has also served as an SPC/PC member for many top CS conferences such as AAAI, CVPR, ICCV, ECCV, and MM. He is currently serving as an editorial board member of the journal Artificial Intelligence (AIJ). He has also been listed among the World's Top 2% Scientists by Stanford University since 2020. His research interests include Image Processing and Pattern Recognition; Multimedia Information Security, Forensics and Privacy; and Adversarial Machine Learning and AI Security. He is also a senior member of the IEEE.
Speech Title: Object Segmentation with Test-Time Training and Promptable Models
Abstract: Object segmentation is an active topic in computer vision and pattern recognition. It aims to localize the desired object in images and videos, which is of great importance in image understanding, video editing, human-computer interaction, and related applications. Despite advances, existing methods still fall short in robustness, versatility, and label efficiency. This talk aims to advance object segmentation in images and videos by leveraging the power of deep learning. (1) Mainstream solutions mainly focus on learning a single model on large-scale video datasets, and such a model struggles to generalize to unseen videos. This talk introduces a test-time training (TTT) strategy to address the problem. (2) Most existing methods are limited to specific prompt types and lack universality. This talk introduces the first universal Promptable Video Object Segmentation (PVOS) model, capable of handling both geometric and multimodal prompts simultaneously. It is built on the powerful Segment Anything Model 2 (SAM2), which has learned robust representations from large-scale datasets. We enhance SAM2 with additional trainable parameters to support multimodal prompts, including text and audio. (3) Object segmentation based on supervised learning is time-consuming and labor-intensive because it requires manual data annotation. This talk presents an unsupervised object segmentation method for images and videos. We design a simple yet effective pipeline to learn object segmentation models without human annotations. Overall, the proposed methods overcome the limitations of existing methods and improve performance.

Prof. Yen-Wei Chen
Ritsumeikan University, Japan
BIO: Yen-Wei Chen received the B.E. degree in 1985 from Kobe Univ., Kobe, Japan, and the M.E. degree in 1987 and the D.E. degree in 1990, both from Osaka Univ., Osaka, Japan. He was a research fellow with the Institute for Laser Technology, Osaka, from 1991 to 1994. From Oct. 1994 to Mar. 2004, he was an associate professor and a professor with the Department of Electrical and Electronic Engineering, Univ. of the Ryukyus, Okinawa, Japan. He is currently a professor with the College of Information Science and Engineering, Ritsumeikan University, Japan. He is the founder and the first director of the Center of Advanced ICT for Medicine and Healthcare, Ritsumeikan University, Japan.
Speech Title: Artificial Intelligence and Virtual Reality in Bio-Medical Applications
Abstract: Artificial Intelligence (AI) and Virtual Reality (VR) have become key technologies and now play important roles in many academic and industrial areas. Applications of AI and VR in medicine and healthcare have received increasing attention in recent years. AI techniques have been widely used in computer-aided diagnosis (CAD), and VR techniques have been used in computer-aided surgery (CAS), for example in surgery simulators. In this talk, I will introduce the fundamentals of AI for CAD and some applications of AI to the diagnosis of hepatic disease. I will also introduce some applications of VR techniques to the support and simulation of hepatic surgery.

Prof. Seokwon Yeom
Daegu University, Gyeongsan, South Korea
BIO: Seokwon Yeom has been a faculty member of Daegu University since 2007. He received his Ph.D. in Electrical and Computer Engineering from the University of Connecticut in 2006. He has been a guest editor of the MDPI journals Drones and Applied Sciences since 2019. He has served as a board member of the Korean Institute of Intelligent Systems since 2016, and as a member of the board of directors of the Korean Institute of Convergence Signal Processing since 2014. He has been program chair of several international conferences. He was a vice director of the AI homecare center and head of the department of IT convergence engineering at Daegu University from 2020 to 2023, a visiting scholar at the University of Maryland in 2014, and a director of the Gyeongbuk techno-park specialization center in 2013. He has been a keynote or invited speaker at several international conferences. His research interests are intelligent image and optical information processing, deep and machine learning, target tracking, and drone localization.
Speech Title: The Localization of a Flying Drone with Multiple Optimal Windows
Abstract: This keynote speech addresses the localization of a flying drone based on the video frames captured by its camera. A novel frame-to-frame template-matching technique is presented. The velocity of the drone is computed through frame-to-frame template matching using optimal windows. Multiple templates are defined by their corresponding windows in a frame. The size and location of the windows are obtained by minimizing the sum of the least-squares errors between the piecewise linear regression model and the nonlinear image-to-position conversion function. The displacement between two consecutive frames is obtained via frame-to-frame template matching that minimizes the sum of normalized squared differences. In the experiments, various scenarios, including short- and medium-range flights in urban and rural areas, were tested. The drone starts from a hovering state, reaches top speed, and then continues to fly along a designated path. It will be shown that the proposed method achieves average drift errors and RMSE within a few meters.
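
As a rough illustration of the frame-to-frame matching step described in the abstract, the sketch below takes a template from one window in the previous frame and searches the next frame for the position that minimizes a normalized squared difference. It is only a minimal sketch under stated assumptions, not the speaker's implementation: the function names (normalized_sq_diff, crop, window_displacement) are hypothetical, the normalization shown (squared difference divided by the geometric mean of the two patch energies) is one common choice and may differ from the one used in the talk, the search is exhaustive, and the selection of optimal windows is not shown.

```python
import math


def normalized_sq_diff(template, patch):
    """Sum of squared differences, normalized by the patch energies.

    Assumed normalization: sum((T - P)^2) / sqrt(sum(T^2) * sum(P^2)).
    Returns infinity when either patch has zero energy.
    """
    num = 0.0
    t_energy = 0.0
    p_energy = 0.0
    for t_row, p_row in zip(template, patch):
        for t, p in zip(t_row, p_row):
            num += (t - p) ** 2
            t_energy += t * t
            p_energy += p * p
    denom = math.sqrt(t_energy * p_energy)
    return num / denom if denom > 0 else float("inf")


def crop(frame, top, left, height, width):
    """Return the height-by-width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in frame[top:top + height]]


def window_displacement(prev_frame, next_frame, top, left, height, width):
    """Estimate the (row, column) shift of one window between two frames.

    The template is cut from prev_frame at the given window; every valid
    position in next_frame is scored, and the first lowest score wins.
    """
    template = crop(prev_frame, top, left, height, width)
    rows, cols = len(next_frame), len(next_frame[0])
    best_score, best_r, best_c = float("inf"), top, left
    for r in range(rows - height + 1):
        for c in range(cols - width + 1):
            score = normalized_sq_diff(
                template, crop(next_frame, r, c, height, width)
            )
            if score < best_score:
                best_score, best_r, best_c = score, r, c
    return best_r - top, best_c - left
```

With several windows per frame, one displacement per window could be estimated this way and then combined; converting pixel displacement to ground velocity would additionally require the image-to-position conversion mentioned in the abstract, which this sketch does not model.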