July 23-27, 2018 @ San Diego, CA, USA
In conjunction with the 2018 IEEE International Conference on Multimedia and Expo (ICME)
All accepted papers that are registered by the author registration deadline and presented at the conference will be included in IEEE Xplore.
Time | Title | Presenter/Author |
---|---|---|
8:30 – 8:40 | Opening remarks | Dr. Sijia Liu |
8:40 – 9:20 | Keynote Talk: A Multi-task Learning framework for Head Pose Estimation and Actor-Action Semantic Video Segmentation | Prof. Yan Yan |
9:21 – 9:38 | Paper #46 Video Super Resolution Based on Deep Convolution Neural Network with Two-stage Motion Compensation | Haoyu Ren, Mostafa El Khamy, Jungwon Lee |
9:39 – 9:56 | Paper #55 A Fast No-reference Screen Content Image Quality Prediction using Convolutional Neural Networks | Zhengxue Cheng, Masaru Takeuchi, Kenji Kanai, Jiro Katto |
9:57 – 10:14 | Paper #57 An Enhanced Deep Convolutional Neural Network for Person Re-identification | Tiansheng Guo, Dongfei Wang, Zhuqing Jiang, Aidong Men, Yun Zhou |
10:15 – 10:32 | Paper #71 Single Image Haze Removal via Joint Estimation of Detail and Transmission | Shengdong Zhang, Yao Jian, Wenqi Ren |
Coffee Break (10:33 – 10:45) | ||
10:46 – 11:03 | Paper #82 Deep Global and Local Saliency Learning with New Re-ranking for Person Re-Identification | Wei Fei, Zhicheng Zhao, Fei Su |
11:04 – 11:21 | Paper #95 Hierarchical Learning of Sparse Image Representations using Steered Mixture of Experts | Rolf Jongebloed, Ruben Verhack, Lieven Lange, Thomas Sikora |
11:22 – 11:39 | Paper #123 HDR Image Reconstruction Using Locally Weighted Linear Regression | Xiaofen Li, Yongqing Huo |
11:40 – 11:57 | Paper #124 Supporting Collaboration Among Cyber Security Analysts Through Visualizing their Analytical Reasoning Processes | Lindsey Thomas, Adam Vaughan, Zachary Courtney, Chen Zhong, Awny Alnusair |
11:58 – 12:15 | Paper #146 Robust Weighted Regression for Ultrasound Image Super-Resolution | Walid Sharabati, Bowei Xi |
12:16 – 12:33 | Paper #150 A Two-Layer Pairwise Framework to Approximate Superpixel-based Higher-order Conditional Random Field for Semantic Segmentation | Li Sulimowicz, Ishfaq Ahmad, Alexander Aved |
This workshop focuses on the emerging field of multimedia creation using machine learning (ML) and artificial intelligence (AI) approaches. It aims to bring together researchers in ML and AI and practitioners from the multimedia industry to foster multimedia creation. Multimedia creation, including style transfer and image synthesis, has been a major focus of the machine learning and AI communities, owing to recent technological breakthroughs such as generative adversarial networks (GANs). This workshop seeks to reinforce these implications for multimedia creation. It solicits papers on all emerging areas of content understanding and multimedia creation, all traditional areas of computer vision and data mining, and selected areas of artificial intelligence, with a particular emphasis on machine learning for pattern recognition. Applied fields such as art content creation, medical image and signal analysis, massive video/image sequence analysis, facial emotion analysis, control systems for automation, content-based video and image retrieval, and object recognition are also covered. The workshop is expected to provide an interactive platform for researchers, scientists, professors, and students to exchange innovative ideas and experiences in multimedia, spanning underlying cutting-edge technologies through to applications.
We intend to have a half-day workshop with four to five regular talks.
Potential topics of interest include, but are not limited to, applications of ML and AI to multimedia.
All submitted papers will be reviewed by three program committee members.
Prof. Yan Yan, Assistant Professor at Texas State University
Abstract: Multi-task learning, an important branch of machine learning, has developed rapidly over the past decade. Multi-task learning methods aim to simultaneously learn classification or regression models for a set of related tasks. This typically leads to better models than a learner that does not account for task relationships. In this talk, we will investigate a multi-task learning framework for head pose estimation and actor-action segmentation. (1) Head pose estimation from low-resolution surveillance data has gained in importance. However, monocular and multi-view head pose estimation approaches still work poorly under target motion, as facial appearance distorts owing to camera perspective and scale changes when a person moves around. We propose FEGA-MTL, a novel framework based on multi-task learning for classifying the head pose of a person who moves freely in an environment monitored by multiple large field-of-view surveillance cameras. Upon partitioning the monitored scene into a dense uniform spatial grid, FEGA-MTL simultaneously clusters grid partitions into regions with similar facial appearance while learning region-specific head pose classifiers. (2) Fine-grained activity understanding in videos has attracted considerable recent attention, with a shift from action classification to detailed actor and action understanding that provides compelling results for the perceptual needs of cutting-edge autonomous systems. However, current methods for detailed understanding of actor and action have significant limitations: they require large amounts of finely labeled data, and they fail to capture any internal relationship among actors and actions. To address these issues, we propose a novel, robust multi-task ranking model for weakly-supervised actor-action segmentation, where only video-level tags are given for training samples.
Our model is able to share useful information among different actors and actions while learning a ranking matrix to select representative supervoxels for actors and actions respectively.
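The core idea the abstract builds on, that jointly trained tasks share parameters and thereby regularize one another, can be illustrated with a minimal hard-parameter-sharing sketch. This is not the FEGA-MTL model or the ranking model from the talk; it is a generic toy example (all variable names and dimensions are hypothetical) in which two related regression tasks share a linear trunk `W_shared` and keep task-specific heads.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two related tasks: y_t = head_t(shared(x)).
# Joint gradient descent couples the tasks through W_shared.
n, d, h = 200, 10, 5
X = rng.normal(size=(n, d))
W_true = rng.normal(size=(d, 1))
y1 = X @ W_true + 0.1 * rng.normal(size=(n, 1))        # task 1
y2 = 0.5 * X @ W_true + 0.1 * rng.normal(size=(n, 1))  # related task 2

W_shared = 0.3 * rng.normal(size=(d, h))               # shared trunk
heads = [0.1 * rng.normal(size=(h, 1)) for _ in range(2)]
lr = 0.02

for step in range(2000):
    Z = X @ W_shared                       # shared features for all tasks
    grad_shared = np.zeros_like(W_shared)
    for t, y in enumerate((y1, y2)):
        err = Z @ heads[t] - y             # per-task residual
        grad_shared += (X.T @ (err @ heads[t].T)) / n
        heads[t] = heads[t] - lr * (Z.T @ err) / n
    W_shared -= lr * grad_shared           # both tasks shape the trunk

mse1 = float(np.mean((X @ W_shared @ heads[0] - y1) ** 2))
mse2 = float(np.mean((X @ W_shared @ heads[1] - y2) ** 2))
print(mse1, mse2)
```

Because the trunk's gradient sums contributions from both tasks, information learned for one task transfers to the other, which is the sense in which multi-task learning "shares useful information among different actors and actions" in the talk's models.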
Yan Yan is currently an Assistant Professor at Texas State University. He was a research fellow at the University of Michigan and at the University of Trento. He received his Ph.D. in computer science from the University of Trento, Italy, and his M.S. degree from the Georgia Institute of Technology. He was a visiting scholar at Carnegie Mellon University in 2013 and a visiting research fellow at the Advanced Digital Sciences Center (ADSC), UIUC, Singapore, in 2015. His research interests include computer vision, machine learning, and multimedia. He received the Best Student Paper Award at ICPR 2014 and the Best Paper Award at ACM Multimedia 2015. He has published papers in CVPR / ICCV / ECCV / TPAMI / AAAI / IJCAI / ACM Multimedia. He has served as a PC member for several major conferences and as a reviewer for refereed journals in computer vision and multimedia. He has served as a guest editor for TPAMI, CVIU, and TOMM. He is a member of the IEEE and the ACM.