Program at a glance

Day 1: Tutorial & Workshop (June 11)

Tutorial 1: Objects, Relationships, and Context in Visual Data (10:00-13:00, Hall)

Tutorial 2: Recommendation Technologies for Multimedia Content (14:30-17:30, Hall)

Tutorial 3: Multimedia Content Understanding by Learning from Very Few Examples: Recent Progress on Unsupervised, Semi-Supervised and Supervised Deep Learning Approaches (16:00-17:30, Room A)

Workshop 1: Workshop on Lifelog Search Challenge (10:00-15:30, Room A)

Workshop 2: Joint Workshop on Multimedia Artworks Analysis and Attractiveness Computing in Multimedia (10:00-12:55, Room B)

Workshop 3: Workshop on Multimedia for RETech’18 (14:30-17:30, Room B)

Reception (19:00-, Fisherman’s Market)

  • Fisherman’s Market is located inside Yokohama Red Brick Warehouse.
  • An 11-minute walk from the conference location.

Day 2: Main conference (June 12)

Keynote 1 (10:00-11:00, Hall, Chair: Kiyoharu Aizawa)

The Ongoing Evolution of Broadcast Technology by Kohji Mitani (Science & Technology Research Laboratories NHK)

Best Paper Session (11:30-13:00, Hall, Chair: Benoit Huet)

[BS-1] Goncalo Marcelino, Ricardo Pinto and Joao Magalhaes: Ranking News-Quality Multimedia
[BS-2] Niluthpol Mithun, Juncheng Li, Florian Metze and Amit Roy-Chowdhury: Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval
[BS-3] Shizhe Chen, Jia Chen, Qin Jin and Alex Hauptmann: Class-aware Self-Attention for Audio Event Recognition
[BS-4] Andrea Ceroni, Ma Chenyang and Ralph Ewerth: Mining Exoticism from Visual Content with Fusion-based Deep Neural Networks

Special Session 1: Predicting User Perceptions of Multimedia Content (Chair: Claire-Hélène Demarty)

Oral (14:00-14:45, Hall)
[SS1-1] Dmitry Kuzovkin, Tania Pouli, Remi Cozot, Olivier Le Meur, Jonathan Kervec and Kadi Bouatouch: Image Selection in Photo Albums
[SS1-2] Yasemin Timar, Nihan Karslioglu, Heysem Kaya and Albert Ali Salah: Feature Selection and Multimodal Fusion for Estimating Emotions Evoked by Movie Clips
[SS1-3] Sarath Sivaprasad, Tanmayee Joshi, Rishabh Agrawal and Niranjan Pedanekar: Multimodal Continuous Prediction of Emotions in Movies using Long Short-Term Memory Networks

Spotlight (14:45-14:55, Hall)
[SS1-4] Yang Liu, Zhonglei Gu, Tobey H. Ko and Kien A. Hua: Learning Perceptual Embeddings with Two Related Tasks for Joint Predictions of Media Interestingness and Emotions
[SS1-5] Jayneel Parekh, Harshvardhan Tibrewal and Sanjeel Parekh: Deep Pairwise Classification and Ranking for Predicting Media Interestingness
[SS1-6] Ivan Gonzalez Diaz, Jenny Benois-Pineau, Jean-Philippe Domenger and Aymar de Rugy: Perceptually-guided Understanding of Egocentric Video Content: Recognition of Objects to Grasp
[SS1-7] Wenlu Yang, Maria Rifqi, Christophe Marsala and Andrea Pinna: Towards Better Understanding of Player’s Game Experience

Special Session 2: Social-Media Visual Summarization / Large-Scale 3D Multimedia Analysis and Applications (Chair: Joao Magalhaes, Rongrong Ji)

Oral (14:55-15:25, Hall)
[SS2-1] Po-Yao Huang, Junwei Liang, Jean-Baptiste Lamare and Alexander Hauptmann: Multimodal Filtering of Social Media for Temporal Monitoring and Event Analysis
[SS2-2] Xiangyu Yue, Bichen Wu, Sanjit Seshia, Kurt Keutzer and Alberto Sangiovanni-Vincentelli: A LiDAR Point Cloud Generator: from a Virtual World to Autonomous Driving

Spotlight (15:25-15:30, Hall)
[SS2-3] Guoyu Lu and Jingkuan Song: 3D Image-based Indoor Localization Joint With WiFi Positioning
[SS2-4] Zhiwei Li and Lei Yu: Compare Stereo Patches Using Atrous Convolutional Neural Networks

Special Session Posters (15:30-16:30, Foyer)

Demo (14:00-16:30, Room A, Chair: Koichi Shinoda, Zhipeng Wu)

[DE-1] Longhui Wei, Xiaobin Liu, Jianing Li and Shiliang Zhang: VP-ReID: Vehicle and Person Re-Identification System
[DE-2] Maguell Sandifort, Jianquan Liu, Shoji Nishimura and Wolfgang Hürst: VisLoiter+: An Entropy Model-Based Loiterer Retrieval System with User-friendly Interfaces
[DE-3] Wenjie Duan, Kengo Makino, Rui Ishiyama, Toru Takahashi, Yuta Kudo and Pieter Jonker: Automated Scanning and Individual Identification System for Parts without Marking or Tagging
[DE-4] Nico Hezel and Kai Uwe Barthel: Dynamic construction and manipulation of hierarchical quartic image graphs
[DE-5] Jonas Krause, Gavin Sugita, Kyungim Baek and Lipyeow Lim: WTPlant (What’s That Plant?): a Deep Learning System for Identifying Plants in Natural Images
[DE-6] Matthew Cooper, Jian Zhao, Chidansh Bhatt and David Shamma: MOOCex: Exploring Educational Video via Recommendation
[DE-7] Yangbangyan Jiang, Qianqian Xu, Xiaochun Cao and Qingming Huang: Who to Ask: An Intelligent Fashion Consultant
[DE-8] Chou Po-Wen, Lin Fu-Neng, Chang Keh-Ning and Chen Herng-Yow: A Simple Score Following System for Music Ensembles Using Chroma and Dynamic Time Warping

Industrial Exhibition (14:00-16:30, Foyer)

[IE-1] NEC Corporation
[IE-2] NVIDIA
[IE-3] CyberAgent, Inc.
[IE-4] LIFULL Co., Ltd.
[IE-5] Mercari

Oral Session 1

Multimedia Retrieval (16:30-18:30, Hall, Chair: Chong-Wah Ngo)
[OS1-1] Xing Xu, Jingkuan Song, Huimin Lu, Yang Yang, Fumin Shen and Zi Huang: Modal-adversarial Semantic Learning Network for Extendable Cross-modal Retrieval
[OS1-2] Kevin Joslyn, Kai Li and Kien Hua: Cross-Modal Retrieval Using Deep De-correlated Subspace Ranking Hashing
[OS1-3] Ge Song and Xiaoyang Tan: Learning multilevel semantic similarity for large-scale multi-label image retrieval
[OS1-4] Limeng Cui, Zhensong Chen, Jiawei Zhang, Philip S. Yu, Yong Shi and Lifang He: Multi-view Collective Tensor Decomposition for Cross-modal Hashing
[OS1-5] Lei Zhou, Xiao Bai, Xianglong Liu and Jun Zhou: Binary Coding by Matrix Classifier for Effective Subspace Retrieval
[OS1-6] Zhongyan Zhang, Lei Wang, Yang Wang, Luping Zhou, Jianjia Zhang and Fang Chen: Instance Image Retrieval by Aggregating Sample-based Discriminative Characteristics

Day 3: Main conference (June 13)

Oral Session 2

Multimedia Content Analysis (9:30-11:00, Hall, Chair: Wei-Ta Chu)
[OS2-1] Wenjie Zhang, Junchi Yan, Xiangfeng Wang and Hongyuan Zha: Deep eXtreme Multi-label Learning
[OS2-2] Feiran Huang, Xiaoming Zhang, Chaozhuo Li, Zhonghua Zhao, Yueying He and Zhoujun Li: Multimodal Network Embedding via Attention based Multi-view Variational Autoencoder
[OS2-3] Devanshu Arya and Marcel Worring: Exploiting Relational Information in Social Networks using Geometric Deep Learning on Hypergraphs
[OS2-4] Matthias Zeppelzauer, Miroslav Despotovic, Muntaha Sakeena, David Koch and Mario Doller: Automatic Prediction of Building Age from Photographs
[OS2-5] Kejun Zhang, Hui Zhang, Simeng Li, Changyuan Yang and Lingyun Sun: The PMEmo Dataset for Music Emotion Recognition

Poster Spotlight Session (12:30-13:00, Hall, Chair: Keiji Yanai)

[PS-1] Hanjiang Lai: Transductive Zero-Shot Hashing via Coarse-to-Fine Similarity Mining
[PS-2] Xin Luo, Peng-Fei Zhang, Ye Wu, Zhen-Duo Chen, Hua-Junjie Huang and Xin-Shun Xu: Asymmetric Discrete Cross-Modal Hashing
[PS-3] Xiang Zhang, Guohua Dong, Yimo Du, Chengkun Wu, Zhigang Luo and Canqun Yang: Collaborative Subspace Graph Hashing for Cross-modal Retrieval
[PS-4] Ye Wu, Xin Luo, Xin-Shun Xu, Shanqing Guo and Yuliang Shi: Dictionary Learning based Supervised Discrete Hashing for Cross-Media Retrieval
[PS-5] Bingqing Ke, Jie Shao, Zi Huang and Heng Tao Shen: Feature Reconstruction by Laplacian Eigenmaps for Efficient Instance Search
[PS-6] Zachary Seymour and Zhongfei Zhang: Image Annotation Retrieval with Text-Domain Label Denoising
[PS-7] Zachary Seymour and Zhongfei Zhang: Multi-label Triplet Embeddings for Image Annotation from User-Generated Tags
[PS-8] Chandramani Chaudhary, Poonam Goyal, Joel R A Moniz, Navneet Goyal and Yi-Ping Phoebe Chen: Linguistic Patterns and Cross Modality-based Image Retrieval for Complex Queries
[PS-9] Minh-Son Dao, Quang-Nhat-Minh Pham, Asem Kasem and Mohamed Saleem Haja Nazmudeen: A Context-Aware Late-Fusion Approach for Disaster Image Retrieval from Social Media
[PS-10] Yugo Sato, Tsukasa Fukusato and Shigeo Morishima: Face Retrieval Framework Relying on User’s Visual Memory
[PS-11] Xueping Wang, Weixin Li, Guodong Mu, Di Huang and Yunhong Wang: Facial Expression Synthesis by U-Net Conditional Generative Adversarial Networks
[PS-12] Hongzhi Li, Joseph Ellis, Lei Zhang and Shih-Fu Chang: PatternNet: Visual Pattern Mining with Deep Neural Network
[PS-13] Mingjie Zheng, Sheng-Hua Zhong, Songtao Wu and Jianmin Jiang: Steganographer Detection based on Multiclass Dilated Residual Networks
[PS-14] Maguell L.T.L. Sandifort, Jianquan Liu, Shoji Nishimura and Wolfgang Hurst: An Entropy Model for Loiterer Retrieval across Multiple Surveillance Cameras
[PS-15] Philipp Harzig, Christian Eggert and Rainer Lienhart: Visual Question Answering With a Hybrid Convolution Recurrent Model
[PS-16] Shuai Liao, Efstratios Gavves and Cees Snoek: Searching and Matching Texture-free 3D Shapes in Images
[PS-17] Duc Tien Dang Nguyen, Michael Riegler, Liting Zhou and Cathal Gurrin: Challenges and Opportunities within Personal Life Archives
[PS-18] Xu Sun, Yuantian Wang, Tongwei Ren, Zhi Liu, Zheng-Jun Zha and Gangshan Wu: Object Trajectory Proposal via Hierarchical Volume Grouping
[PS-19] Sungeun Hong, Woobin Im and Hyun Seung Yang: CBVMR: Content-Based Video-Music Retrieval Using Soft Intra-Modal Structure Constraint
[PS-20] Yi Tang, Zhi Jin, Wenbin Zou and Xia Li: Multi-Scale Spatiotemporal Conv-LSTM Network for Video Saliency Detection
[PS-21] Jianfei Xue and Koji Eguchi: Supervised Nonparametric Multimodal Topic Modeling Methods for Multi-class Video Classification
[PS-22] Baohan Xu, Hao Ye, Yingbin Zheng, Heng Wang, Tianyu Luwang and Yu-Gang Jiang: Dense Dilated Network for Few Shot Action Recognition
[PS-23] Haonan Qiu, Yingbin Zheng, Hao Ye, Yao Lu, Feng Wang and Liang He: Precise Temporal Action Localization by Evolving Temporal Proposals

Poster Session (14:00-16:00, Foyer, Chair: Keiji Yanai)

Posters for all Best Paper Session, Oral Session, and Poster Session papers will be presented. (Core time: 14:00-15:00 for odd-numbered IDs, 15:00-16:00 for even-numbered IDs.)

Doctoral Symposium (14:00-16:00, Hall, Chair: Martha Larson, Takahiro Ogawa)

[DS-1] Wan-Lun Tsai: Personal Basketball Coach: Tactic Training through Wireless Virtual Reality
[DS-2] Andreas Leibetseder and Klaus Schoeffmann: Extracting and Using Medical Expert Knowledge to Advance in Video Processing for Gynecologic Endoscopy
[DS-3] Noa Garcia: Temporal Aggregation of Visual Features for Large-Scale Image-to-Video Retrieval
[DS-4] Naoki Saito, Takahiro Ogawa, Satoshi Asamizu, and Miki Haseyama: Tourism Category Classification on Image Sharing Services Through Estimation of Existence of Reliable Results
[DS-5] Rashmi Gupta and Cathal Gurrin: Considering Documents in Lifelog Information Retrieval?

Keynote 2 (16:30-17:30, Hall, Chair: Shin’ichi Satoh)

Prototyping for Envisioning the Future by Shunji Yamanaka (The Univ. of Tokyo)

Oral Session 4

Video Analysis (17:30-18:30, Hall, Chair: Koichi Shinoda)
[OS4-1] Yang Mi, Kang Zheng and Song Wang: Recognizing Actions in Wearable-Camera Videos by Training Classifiers on Fixed-Camera Videos
[OS4-2] Romain Cohendet, Karthik Yadati, Ngoc Q. K. Duong and Claire-Helene Demarty: Annotating, understanding, and predicting long-term video memorability
[OS4-3] Daniel Rotman, Dror Porat, Gal Ashour and Udi Barzelay: Optimally Grouped Deep Features Using Normalized Cost for Video Scene Detection

Banquet (19:00-, Hotel New Grand)

  • Hotel New Grand.
  • A 9-minute walk from the conference location.

Day 4: Industrial day & ACMMM TPC Workshop (June 14)

Panel (9:30-10:30, Hall)

Title: Top-5 problems in multimedia retrieval
Panelists: Tat-Seng Chua, Michael Houle, Ramesh Jain, Nicu Sebe, Rainer Lienhart
Facilitators: Chong-Wah Ngo, Vincent Oria

Industrial Talks (11:00-13:00, Hall, Chair: Go Irie, Tao Mei)

[IT-1] NEC Corporation, NEC’s Object recognition technologies and their industrial applications by Kota Iwamoto
[IT-2] CyberAgent, Inc., Orion: An Integrated Multimedia Content Moderation System for Web Services by Yusuke Fujisaka
[IT-3] LIFULL Co., Ltd., Promoting Open Innovations in Real Estate Tech: Provision of the LIFULL HOME’S Data Set and Collaborative Studies by Yoji Kiyota
[IT-4] Hitachi, Ltd., Industrial applications of image recognition and retrieval technologies for public safety and IT services by Tomokazu Murakami

ACMMM TPC Workshop

14:30-16:30, Hall, Chair: Nicu Sebe
[MT-1] Yu-Gang Jiang, Brain-inspired Deep Models for Visual Recognition
[MT-2] Masataka Goto, Frontiers of Music Technologies
[MT-3] Jia Jia, Mental Health Computing via Harvesting Social Media Data
[MT-4] Qi Tian, Person Re-Identification: Recent Advances and Challenges
[MT-5] Qin Jin, Multi-level Multi-aspect Multimedia Analysis

17:00-19:00, Hall, Chair: Nicu Sebe
[MT-7] Benoit Huet, Affective Multimodal Analysis for the Media Industry
[MT-8] Xin Yang, Deep Neural Networks for Automated Prostate Cancer Detection and Diagnosis in Multi-parametric MRI
[MT-9] Heng Tao Shen, Cross-Media Retrieval: State of the Art
[MT-10] Rongrong Ji, Towards Compact Visual Analysis Systems
[MT-11] Max Mühlhäuser, Multimedia Research: There’s life in the old dog yet

Information on ICMR2018


Venue Information

Yokohama Media and Communication Center
11 Nihon-Odori, Naka-ku, Yokohama, 231-0023, Japan