July 23-27, 2018 @ San Diego, CA, USA
In conjunction with the 2018 IEEE International Conference on Multimedia and Expo (ICME)
All accepted papers that are registered by the author registration deadline and presented at the conference will be included in IEEE Xplore.
Time | Title | Presenter/Author |
---|---|---|
8:30 – 8:40 | Opening remarks | Dr. Sijia Liu |
8:40 – 9:20 | Keynote Talk: A Multi-task Learning framework for Head Pose Estimation and Actor-Action Semantic Video Segmentation | Prof. Yan Yan |
9:21 – 9:38 | Paper #46 Video Super Resolution Based on Deep Convolution Neural Network with Two-stage Motion Compensation | Haoyu Ren, Mostafa El Khamy, Jungwon Lee |
9:39 – 9:56 | Paper #55 A Fast No-reference Screen Content Image Quality Prediction using Convolutional Neural Networks | Zhengxue Cheng, Masaru Takeuchi, Kenji Kanai, Jiro Katto |
9:57 – 10:14 | Paper #57 An Enhanced Deep Convolutional Neural Network for Person Re-identification | Tiansheng Guo, Dongfei Wang, Zhuqing Jiang, Aidong Men, Yun Zhou |
10:15 – 10:32 | Paper #71 Single Image Haze Removal via Joint Estimation of Detail and Transmission | Shengdong Zhang, Yao Jian, Wenqi Ren |
Coffee Break (10:33 – 10:45) | ||
10:46 – 11:03 | Paper #82 Deep Global and Local Saliency Learning with New Re-ranking for Person Re-Identification | Wei Fei, Zhicheng Zhao, Fei Su |
11:04 – 11:21 | Paper #95 Hierarchical Learning of Sparse Image Representations using Steered Mixture of Experts | Rolf Jongebloed, Ruben Verhack, Lieven Lange, Thomas Sikora |
11:22 – 11:39 | Paper #123 HDR Image Reconstruction Using Locally Weighted Linear Regression | Xiaofen Li, Yongqing Huo |
11:40 – 11:57 | Paper #124 Supporting Collaboration Among Cyber Security Analysts Through Visualizing their Analytical Reasoning Processes | Lindsey Thomas, Adam Vaughan, Zachary Courtney, Chen Zhong, Awny Alnusair |
11:58 – 12:15 | Paper #146 Robust Weighted Regression for Ultrasound Image Super-Resolution | Walid Sharabati, Bowei Xi |
12:16 – 12:33 | Paper #150 A Two-Layer Pairwise Framework to Approximate Superpixel-based Higher-order Conditional Random Field for Semantic Segmentation | Li Sulimowicz, Ishfaq Ahmad, Alexander Aved |
This workshop focuses on the emerging field of multimedia creation using machine learning (ML) and artificial intelligence (AI) approaches. It aims to bring together researchers in ML and AI and practitioners from the multimedia industry to foster multimedia creation. Multimedia creation, including style transfer and image synthesis, has been a major focus of the machine learning and AI communities, owing to recent technological breakthroughs such as generative adversarial networks (GANs). This workshop seeks to reinforce these implications for multimedia creation. It solicits papers on all emerging areas of content understanding and multimedia creation, all traditional areas of computer vision and data mining, and selected areas of artificial intelligence, with a particular emphasis on machine learning for pattern recognition. Applied fields such as art content creation, medical image and signal analysis, massive video/image sequence analysis, facial emotion analysis, control systems for automation, content-based video and image retrieval, and object recognition are also covered. The workshop is expected to provide an interactive platform for researchers, scientists, professors, and students to exchange innovative ideas and experiences in multimedia, spanning underlying cutting-edge technologies through to applications.
We intend to have a half-day workshop with four to five regular talks.
Potential topics of interest include, but are not limited to, applications of ML and AI to multimedia.
All submitted papers will be reviewed by three program committee members.
Prof. Yan Yan, Assistant Professor at Texas State University
Abstract: Multi-task learning, an important branch of machine learning, has developed rapidly over the past decade. Multi-task learning methods aim to simultaneously learn classification or regression models for a set of related tasks. This typically leads to better models than a learner that does not account for task relationships. In this talk, we will investigate a multi-task learning framework for head pose estimation and actor-action segmentation. (1) Head pose estimation from low-resolution surveillance data has gained in importance. However, monocular and multi-view head pose estimation approaches still work poorly under target motion, as facial appearance distorts owing to camera perspective and scale changes when a person moves around. We propose FEGA-MTL, a novel framework based on multi-task learning for classifying the head pose of a person who moves freely in an environment monitored by multiple large field-of-view surveillance cameras. Upon partitioning the monitored scene into a dense uniform spatial grid, FEGA-MTL simultaneously clusters grid partitions into regions with similar facial appearance while learning region-specific head pose classifiers. (2) Fine-grained activity understanding in videos has attracted considerable recent attention, with a shift from action classification to detailed actor and action understanding that provides compelling results for the perceptual needs of cutting-edge autonomous systems. However, current methods for detailed understanding of actor and action have significant limitations: they require large amounts of finely labeled data, and they fail to capture any internal relationship among actors and actions. To address these issues, we propose a novel, robust multi-task ranking model for weakly-supervised actor-action segmentation, where only video-level tags are given for training samples.
Our model is able to share useful information among different actors and actions while learning a ranking matrix to select representative supervoxels for actors and actions respectively.
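The core idea the abstract builds on, that jointly trained tasks share parameters and thereby regularize one another, can be illustrated with a minimal hard-parameter-sharing sketch. This is not the FEGA-MTL model or the ranking model from the talk; it is a generic toy example (all variable names and dimensions are hypothetical) in which two related regression tasks share a linear trunk `W_shared` and keep task-specific heads.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two related tasks: y_t = head_t(shared(x)).
# Joint gradient descent couples the tasks through W_shared.
n, d, h = 200, 10, 5
X = rng.normal(size=(n, d))
W_true = rng.normal(size=(d, 1))
y1 = X @ W_true + 0.1 * rng.normal(size=(n, 1))        # task 1
y2 = 0.5 * X @ W_true + 0.1 * rng.normal(size=(n, 1))  # related task 2

W_shared = 0.3 * rng.normal(size=(d, h))               # shared trunk
heads = [0.1 * rng.normal(size=(h, 1)) for _ in range(2)]
lr = 0.02

for step in range(2000):
    Z = X @ W_shared                       # shared features for all tasks
    grad_shared = np.zeros_like(W_shared)
    for t, y in enumerate((y1, y2)):
        err = Z @ heads[t] - y             # per-task residual
        grad_shared += (X.T @ (err @ heads[t].T)) / n
        heads[t] = heads[t] - lr * (Z.T @ err) / n
    W_shared -= lr * grad_shared           # both tasks shape the trunk

mse1 = float(np.mean((X @ W_shared @ heads[0] - y1) ** 2))
mse2 = float(np.mean((X @ W_shared @ heads[1] - y2) ** 2))
print(mse1, mse2)
```

Because the trunk's gradient sums contributions from both tasks, information learned for one task transfers to the other, which is the sense in which multi-task learning "shares useful information among different actors and actions" in the talk's models.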
Yan Yan is currently an Assistant Professor at Texas State University. He was a research fellow at the University of Michigan and at the University of Trento. He received his Ph.D. in computer science from the University of Trento, Italy, and his M.S. degree from the Georgia Institute of Technology. He was a visiting scholar at Carnegie Mellon University in 2013 and a visiting research fellow at the Advanced Digital Sciences Center (ADSC), UIUC, Singapore, in 2015. His research interests include computer vision, machine learning, and multimedia. He received the Best Student Paper Award at ICPR 2014 and the Best Paper Award at ACM Multimedia 2015. He has published papers in CVPR / ICCV / ECCV / TPAMI / AAAI / IJCAI / ACM Multimedia. He has served as a PC member for several major conferences and as a reviewer for refereed journals in computer vision and multimedia. He has served as a guest editor for TPAMI, CVIU, and TOMM. He is a member of the IEEE and the ACM.