Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Future Blog Post
Published:
This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.
Blog Post number 4
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 3
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 2
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 1
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Portfolio
Portfolio item number 1
Short description of portfolio item number 1
Portfolio item number 2
Short description of portfolio item number 2 
Publications
Feaco: Reaching robust feature-level consensus in noisy pose conditions
Published in Proceedings of the 31st ACM International Conference on Multimedia, 2023
Recommended citation: J Gu, J Zhang, M Zhang, W Meng, S Xu, J Zhang, X Zhang. "Feaco: Reaching robust feature-level consensus in noisy pose conditions." ACM MM 2023, 3628-3636.
HTCViT: an effective network for image classification and segmentation based on natural disaster datasets
Published in The Visual Computer, 2023
Recommended citation: Z Ma, W Li, M Zhang, W Meng, S Xu, X Zhang. "HTCViT: an effective network for image classification and segmentation based on natural disaster datasets." The Visual Computer, 39(8), 3285-3297.
AG-SDM: Aquascape generation based on stable diffusion model with low-rank adaptation
Published in Computer Animation and Virtual Worlds, 2024
Recommended citation: M Zhang, J Yang, Y Xian, W Li, J Gu, W Meng, J Zhang, X Zhang. "AG-SDM: Aquascape generation based on stable diffusion model with low-rank adaptation." Computer Animation and Virtual Worlds, 35(3), e2252.
Eventvad: Training-free event-aware video anomaly detection
Published in Proceedings of the 33rd ACM International Conference on Multimedia, 2025
Recommended citation: Y Shao, H He, S Li, S Chen, X Long, F Zeng, Y Fan, M Zhang, Z Yan, et al. "Eventvad: Training-free event-aware video anomaly detection." ACM MM 2025, 2586-2595.
Humandreamer: Generating controllable human-motion videos via decoupled generation
Published in Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), 2025
Recommended citation: B Wang, X Wang, C Ni, G Zhao, Z Yang, Z Zhu, M Zhang, Y Zhou, X Chen, et al. "Humandreamer: Generating controllable human-motion videos via decoupled generation." CVPR 2025, 12391.
PanoDit: Panoramic videos generation with diffusion transformer
Published in Proceedings of the AAAI Conference on Artificial Intelligence, 2025
Recommended citation: M Zhang, Y Chen, R Xu, C Wang, JM Yang, W Meng, J Guo, H Zhao, et al. "PanoDit: Panoramic videos generation with diffusion transformer." AAAI 2025, 39(10), 10040.
PDFT: parameter-diminish fine-tuning for transformer-based models
Published in The Visual Computer, 2025
Recommended citation: M Zhang, W Meng, M Jia, J Gu, Y Shao, C Wang, R Xu, Z Ma, X Zhang. "PDFT: parameter-diminish fine-tuning for transformer-based models." The Visual Computer, 41(9), 6745-6755.
AASD: Accelerate Inference by Aligning Speculative Decoding in Multimodal Large Language Models
Published in Design Automation Conference (DAC), 2025
Recommended citation: M Zhang (3rd author) et al. "AASD: Accelerate Inference by Aligning Speculative Decoding in Multimodal Large Language Models." DAC 2025.
Controllable Panoramic Video Generation with 360-Degree Motion Consistency for Multiple Control Tasks using a Unified Framework
Submitted to International Conference on Computer Vision (ICCV), 2025 (under review)
Recommended citation: M Zhang et al. "Controllable Panoramic Video Generation with 360-Degree Motion Consistency for Multiple Control Tasks using a Unified Framework." ICCV 2025 (under review).
AccidentX: A Large-Scale Multimodal BEV Dataset for Traffic Accident Analysis and Prevention
Submitted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025 (under review)
Recommended citation: M Zhang et al. "AccidentX: A Large-Scale Multimodal BEV Dataset for Traffic Accident Analysis and Prevention." IROS 2025 (under review).
Vision Also You Need: Navigating Out-of-Distribution Detection with Multimodal Large Language Model
Submitted to International Conference on Computer Vision (ICCV), 2025 (under review)
Recommended citation: M Zhang (4th author) et al. "Vision Also You Need: Navigating Out-of-Distribution Detection with Multimodal Large Language Model." ICCV 2025 (under review).
Memory Efficient Point Cloud Segmentation with Spatial Group Attention
Submitted to ACM International Conference on Multimedia (ACM MM), 2025 (under review)
Recommended citation: M Zhang (6th author) et al. "Memory Efficient Point Cloud Segmentation with Spatial Group Attention." ACM MM 2025 (under review).
Robust Detection in Complex Construction Sites: HiPA-DETR with Weather-Aware and Cross-Domain Generalization
Submitted to ACM International Conference on Multimedia (ACM MM), 2025 (under review)
Recommended citation: M Zhang (5th author) et al. "Robust Detection in Complex Construction Sites: HiPA-DETR with Weather-Aware and Cross-Domain Generalization." ACM MM 2025 (under review).
Tr-dq: Time-rotation diffusion quantization
Published in Proceedings of the AAAI Conference on Artificial Intelligence, 2026
Recommended citation: Y Shao, D Lin, M Yan, S Chen, F Zeng, M Liao, A Ma, Z Yan, H Wang, et al. "Tr-dq: Time-rotation diffusion quantization." AAAI 2026, 40(11), 8869-8877.
Talks
Talk 1 on Relevant Topic in Your Field
Published:
This is a description of your talk, which is a markdown file that can be all markdown-ified like any other post. Yay markdown!
Conference Proceeding talk 3 on Relevant Topic in Your Field
Published:
This is a description of your conference proceedings talk, note the different field in type. You can put anything in this field.
Teaching
Teaching experience 1
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Teaching experience 2
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.
