Tutorials

Tutorial 1: Objects, Relationships, and Context in Visual Data

June 11, 10:00-13:00, Hall

Abstract: For decades, we have been interested in detecting objects and classifying them into a fixed vocabulary. As these low-level vision solutions mature, we hunger for a higher-level representation of visual data, so as to extract visual knowledge rather than mere bags of visual entities, allowing machines to reason about human-level decision-making and even manipulate visual data at the pixel level. In this tutorial, we will introduce a variety of machine learning techniques for modeling visual relationships (e.g., subject-predicate-object triplet detection) and contextual generative models (e.g., generating photo-realistic images using conditional generative adversarial networks). In particular, we plan to start from the fundamental theories of object detection, relationship detection, and generative adversarial networks, and then move to more advanced topics: referring expression visual grounding, pose-guided person image generation, and context-based image inpainting.

Hanwang Zhang (Nanyang Technological University, Singapore)
Dr. Hanwang Zhang is an Assistant Professor at Nanyang Technological University, Singapore. He was a research scientist at the Department of Computer Science, Columbia University, USA, and a senior research fellow at the School of Computing, National University of Singapore, Singapore. He received the B.Eng. (Hons.) degree in computer science from Zhejiang University, Hangzhou, China, in 2009, and the Ph.D. degree in computer science from the National University of Singapore in 2014. His research interests include computer vision, multimedia, and social media. Dr. Zhang is the recipient of the Best Demo runner-up award at ACM MM 2012, the Best Student Paper award at ACM MM 2013, and the Best Paper Honorable Mention at ACM SIGIR 2016. He also won the Best Ph.D. Thesis Award of the School of Computing, National University of Singapore, in 2014.

Qianru Sun (Max-Planck Institute for Informatics, Germany)
Dr. Qianru Sun is a postdoctoral researcher in the Department of Computer Vision and Multimodal Computing, Max-Planck Institute for Informatics, Germany. She received her PhD degree from the School of Electronics Engineering and Computer Science, Peking University, in January 2016. Her research interests include computer vision and pattern recognition. She has worked on human action recognition, anomalous event detection in videos, social relation recognition and head image inpainting in social media photos, and person image generation for both low-resolution re-identification images and high-resolution fashion photos.

Tutorial 2: Recommendation Technologies for Multimedia Content

June 11, 14:30-17:30, Hall

Abstract: Recommendation systems play a vital role in online information systems and have become a major monetization tool for user-oriented platforms. In recent years, research interest in recommendation technologies has grown in the information retrieval and data mining communities, and significant progress has been made owing to the fast development of deep learning. In the multimedia community, however, the development of multimedia recommendation technologies has received relatively little attention. In this tutorial, we summarize existing research efforts on multimedia recommendation. We first provide an overview of fundamental techniques and recent advances in personalized recommendation for general items. We then summarize existing developments in recommendation technologies for multimedia content. Lastly, we present insights into the challenges and future directions in this emerging and promising area.

Xiangnan He (National University of Singapore, Singapore)
Dr. Xiangnan He is a senior research fellow with the School of Computing, National University of Singapore (NUS). He received his Ph.D. in Computer Science from NUS. His research interests span recommender systems, information retrieval, and multimedia processing. He has over 30 publications in top conferences such as SIGIR, WWW, MM, CIKM, and IJCAI, and in journals including TKDE, TOIS, and TMM. His work on recommender systems received the Best Paper Award Honorable Mention at ACM SIGIR 2016. He has served as a PC member for conferences including SIGIR, WWW, MM, KDD, WSDM, CIKM, IJCAI, AAAI, and ACL, and as a regular reviewer for journals including TKDE, TOIS, TKDD, and TMM.

Tat-Seng Chua (National University of Singapore, Singapore)
Dr. Tat-Seng Chua is the KITHCT Chair Professor at the School of Computing, National University of Singapore. He holds a PhD from the University of Leeds, UK. He was the Acting and Founding Dean of the School from 1998 to 2000. Dr. Chua’s main research interests are multimedia information retrieval and social media analytics. In particular, his research focuses on the extraction, retrieval and question-answering (QA) of text and rich media arising from the Web and multiple social networks. He is the co-Director of NExT, a joint Center between NUS and Tsinghua University to develop technologies for live social media search. Dr. Chua is the 2015 winner of the prestigious ACM SIGMM award for Outstanding Technical Contributions to Multimedia Computing, Communications and Applications. He is the Chair of the steering committees of the ACM International Conference on Multimedia Retrieval (ICMR) and the Multimedia Modeling (MMM) conference series. Dr. Chua was also the General Co-Chair of ACM Multimedia 2005, ACM CIVR (now ACM ICMR) 2005, ACM SIGIR 2008, and ACM Web Science 2015. He serves on the editorial boards of four international journals. Dr. Chua is the co-Founder of two technology startup companies in Singapore.

Tutorial 3: Multimedia Content Understanding by Learning from Very Few Examples: Recent Progress on Unsupervised, Semi-Supervised and Supervised Deep Learning Approaches

June 11, 16:00-17:30, Room A

Abstract: In this tutorial, the speaker will present several parallel efforts on building deep learning models with very little supervision, with or without unlabeled data available. In particular, we will discuss the following topics in detail:

Generative Adversarial Nets (GANs) and their applications to unsupervised feature extraction and to semi-supervised learning with few labeled examples and a large amount of unlabeled data. We will discuss the state-of-the-art results achieved by semi-supervised GANs.
Low-shot learning algorithms that train and test models on disjoint sets of tasks. We will discuss how to efficiently adapt models to tasks with very few examples, including several paradigms of learning-to-learn approaches.
Cross-modal transfer: how to leverage abundant labels from one modality to train a model for other modalities with few labels. We will discuss the cross-modal label transfer approach in detail.

Guo-Jun Qi (University of Central Florida)
Dr. Qi is a faculty member in the Department of Computer Science at the University of Central Florida. His research interests include knowledge discovery, analysis and aggregation of big data from a variety of modalities and sources in order to build smart and reliable information and decision-making systems. He aspires to apply his research to solve practical problems through high-quality data processing and analysis in healthcare, sensor and social networks, financial systems and so forth. He was the recipient of a Microsoft Fellowship and two IBM Fellowships. His research has been sponsored by grants and projects from government agencies and industry collaborators, including NSF, IARPA, Microsoft, IBM, and Adobe. Dr. Qi has published more than 100 papers in a broad range of venues, such as Proceedings of IEEE, IEEE T PAMI, IEEE T KDE, IEEE T Image Processing, ACM SIGKDD, WWW, ICML, ACM MM, CVPR, ICDM, SDM and ICDE. Among them are the best student paper of ICDM 2014, “the best ICDE 2013 paper” by IEEE Transactions on Knowledge and Data Engineering, as well as the best paper (finalist) of ACM Multimedia 2007 (2015). He has served or will serve as a technical program co-chair for MMM 2016 and ACM Multimedia 2020, and as an area chair (a senior program committee member) for ICCV, ICPR, ACM SIGKDD, ACM CIKM, and ACM Multimedia. He is also serving or has served on the program committees of several academic conferences, including CVPR, ICCV, KDD, WSDM, CIKM, IJCAI, ICMR, ACM Multimedia, ACM/IEEE ASONAM, ICDM, ICIP, and ACL. He is an associate editor for IEEE Transactions on Circuits and Systems for Video Technology (CSVT), as well as a guest/lead editor for the special issues on “Big Media Data: Understanding, Search, and Mining” in IEEE Transactions on Big Data, “Deep Learning for Multimedia Computing” in IEEE Transactions on Multimedia, and “Social Media Mining and Knowledge Discovery” in Multimedia Systems, Springer. He was also a panelist for the NSF and the United States Department of Energy.

Information on ICMR2018

Venue Information

Yokohama Media and Communication Center
11 Nihon-Odori, Naka-ku, Yokohama, 231-0023, Japan
