cv | Yuantian Huang

Basics

Name	Yuantian Huang
Research interest	Image Editing and Generation, Computer Graphics, Computer Vision
Url	https://sky24h.github.io/
Email	huang_yuantian (at) cyberagent.co.jp

Publications

● International Journal (Peer-reviewed)
1. Y. Huang, S. Iizuka, E. Simo-Serra, and K. Fukui, ''Controllable Multi-domain Semantic Artwork Synthesis'', Computational Visual Media (IF: 6.9), 2024. PDF, Website
● International Conference (Peer-reviewed)
1. H. Liu, X. Yang, T. Akiyama, Y. Huang, Q. Li, S. Kuriyama, and T. Taketomi, ''TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio-Motion Embedding and Diffusion Interpolation'', International Conference on Learning Representations (ICLR, Oral), 2025. PDF, Website
2. Y. Huang, S. Iizuka, and K. Fukui, ''Training-Free Zero-Shot Semantic Segmentation with LLM Refinement'', The British Machine Vision Conference (BMVC) 2024, 2024. PDF, Website
3. Y. Huang, S. Iizuka, and K. Fukui, ''Diffusion-based Semantic Image Synthesis from Sparse Layouts'', Computer Graphics International (CGI) 2023, 2023. PDF, Website
4. T. Okada, Y. Huang, G. Hao, S. Iizuka, and K. Fukui, ''Low-Level Feature Aggregation Networks for Disease Severity Estimation of Coffee Leaves'', 18th International Conference on Machine Vision and Applications (MVA), 2023. Website
5. Y. Huang, S. Iizuka, and K. Fukui, ''Free-View Expressive Talking Head Video Editing'', IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP, h5-index: 123, top 3% paper), 2023. PDF, Website
● Domestic Conference
1. Y. Huang, S. Iizuka, and K. Fukui, ''Free-View Expressive Talking Head Video Editing'', Visual Computing 2023, 2023. (Invited talk)
2. Y. Huang, S. Iizuka, and K. Fukui, ''Diffusion-based Semantic Image Synthesis from Sparse Layouts'', The 26th Meeting on Image Recognition and Understanding, 2023. (Non-peer-reviewed, poster)
3. Y. Huang, S. Iizuka, E. Simo-Serra, and K. Fukui, ''High-quality Multi-domain Artwork Generation from Semantic Layouts'', The 24th Meeting on Image Recognition and Understanding, 2021. (Peer-reviewed, short oral)
4. Y. Huang, S. Iizuka, and K. Fukui, ''Controllable Artwork Synthesis via Two-stage Adversarial Networks'', The 23rd Meeting on Image Recognition and Understanding, 2020. (Peer-reviewed, short oral)

Work

2024.04 - now
Research Engineer

CyberAgent AI Lab

R&D on image editing and generation, computer graphics, and computer vision.

Education

2021.04 - 2024.03
PhD

Computer Science,

University of Tsukuba, Japan

Doctoral Thesis: Controllable Visual Content Synthesis with Deep Generative Models

Supervisors: Assoc. Prof. Satoshi Iizuka, Prof. Kazuhiro Fukui
2017.10 - 2021.03
Research Student -> Master

Computer Science,

University of Tsukuba, Japan

Master's Thesis: Controllable Multi-domain Semantic Artwork Synthesis

Supervisors: Assoc. Prof. Satoshi Iizuka, Prof. Kazuhiro Fukui
2013.09 - 2017.06
Bachelor

Electronic Information Engineering,

Guangzhou University, China

Awards

2024.03.25

Department Chair's Award

University of Tsukuba

Awarded by the Department of Computer Science Chair at the University of Tsukuba for outstanding performance during the Ph.D. program.
2023.06.04

Top 3% Paper Recognition

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023

Recognized as one of the top 3% of all accepted papers at ICASSP 2023.
2021 - 2024

Scholarship: "SPRING: Support for Pioneering Research Initiated by the Next Generation"

Japan Science and Technology Agency (JST)

A program that provides financial support (living and research allowances) to selected Ph.D. students.

Skills

	Languages
	Chinese : Native speaker
	Japanese : Fluent
	English : Fluent

	Programming Languages
	Python : ★★★★★
	C++ : ★★★☆☆

	Deep Learning Frameworks
	PyTorch : ★★★★★
	TensorFlow : ★★★☆☆

	Tools & Platforms
	Linux : ★★★★★
	Docker : ★★★★☆
	GCP : ★★★★☆

Interests

	Video Games
	Simulation
	Strategy
	RPG

	History

	Coding
	Implementing new ideas
	Automation of daily work
	Implementing SOTA models

	Sports
	Roller Skating
	Hiking

Projects

URL: https://sky24h.github.io/projects/

	● Research
	1. Online Demo for the paper "Free-View Expressive Talking Head Video Editing"
	2. Online Demo for the paper "Controllable Multi-domain Semantic Artwork Synthesis"
	3. Online Demo for the paper "Training-Free Zero-Shot Semantic Segmentation with LLM Refinement"

	● Fun
	1. A serverless GPU application that uses AnimateDiff to run a Text-to-Video task
	2. A serverless GPU application that uses Stable Diffusion XL to run a Text-to-Image task
	3. One-shot face animation using a webcam, capable of running in real time
	4. A simple implementation using the ChatGPT (and GPT-4) API, deployed as a Telegram bot
