CV
Basics
Name | Yuantian Huang |
Research interest | Image Editing and Generation, Computer Graphics, Computer Vision |
Url | https://sky24h.github.io/ |
Email | huang_yuantian (at) cyberagent.co.jp |
Publications
-
● International Conference (Peer-reviewed)
1. Y. Huang, S. Iizuka, and K. Fukui, ''Training-Free Zero-Shot Semantic Segmentation with LLM Refinement'', The British Machine Vision Conference (BMVC) 2024, 2024. PDF, Website
2. Y. Huang, S. Iizuka, and K. Fukui, ''Diffusion-based Semantic Image Synthesis from Sparse Layouts'', Computer Graphics International (CGI) 2023, 2023. PDF, Website
3. T. Okada, Y. Huang, G. Hao, S. Iizuka, and K. Fukui, ''Low-Level Feature Aggregation Networks for Disease Severity Estimation of Coffee Leaves'', 18th International Conference on Machine Vision and Applications (MVA), 2023. Website
4. Y. Huang, S. Iizuka, and K. Fukui, ''Free-View Expressive Talking Head Video Editing'', IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP, h5-index: 123, top 3% paper), 2023. PDF, Website
-
● Domestic Conference
1. Y. Huang, S. Iizuka, and K. Fukui, ''Free-View Expressive Talking Head Video Editing'', Visual Computing 2023, 2023. (Invited talk)
2. Y. Huang, S. Iizuka, and K. Fukui, ''Diffusion-based Semantic Image Synthesis from Sparse Layouts'', The 26th Meeting on Image Recognition and Understanding, 2023. (Non-peer-reviewed, poster)
3. Y. Huang, S. Iizuka, E. Simo-Serra, and K. Fukui, ''High-quality Multi-domain Artwork Generation from Semantic Layouts'', The 24th Meeting on Image Recognition and Understanding, 2021. (Peer-reviewed, short oral)
4. Y. Huang, S. Iizuka, and K. Fukui, ''Controllable Artwork Synthesis via Two-stage Adversarial Networks'', The 23rd Meeting on Image Recognition and Understanding, 2020. (Peer-reviewed, short oral)
Work
- 2024.04 - now
Research Engineer
CyberAgent AI Lab
R&D on image editing and generation, computer graphics, and computer vision.
Education
-
2021.04 - 2024.03 PhD
Computer Science,
University of Tsukuba, Japan
Doctoral Thesis: Controllable Visual Content Synthesis with Deep Generative Models
Supervisors: Assoc. Prof. Satoshi Iizuka, Prof. Kazuhiro Fukui
-
2017.10 - 2021.03 Research Student -> Master
Computer Science,
University of Tsukuba, Japan
Master's Thesis: Controllable Multi-domain Semantic Artwork Synthesis
Supervisors: Assoc. Prof. Satoshi Iizuka, Prof. Kazuhiro Fukui
-
2013.09 - 2017.06
Awards
- 2024.03.25
Department Chair's Award
University of Tsukuba
Awarded by the Chair of the Department of Computer Science at the University of Tsukuba for outstanding performance during the Ph.D. program.
- 2023.06.04
Top 3% Paper Recognition
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023
Recognized as one of the top 3% of all accepted papers at ICASSP 2023.
- 2021 - 2024
Scholarship: "SPRING: Support for Pioneering Research Initiated by the Next Generation"
Japan Science and Technology Agency (JST)
A program that provides financial support (living and research allowances) to selected Ph.D. students.
Skills
Languages | |
Chinese : Native speaker | |
Japanese : Fluent | |
English : Fluent |
Programming Languages | |
Python : ★★★★★ | |
C++ : ★★★☆☆ |
Deep Learning Frameworks | |
PyTorch : ★★★★★ | |
TensorFlow : ★★★☆☆ |
Tools & Platforms | |
Linux : ★★★★★ | |
Docker : ★★★★☆ | |
GCP : ★★★★☆ |
Interests
Video Games | |
Simulation | |
Strategy | |
RPG |
History |
Coding | |
Implementing new ideas | |
Automation of daily work | |
Implementing SOTA models |
Sports | |
Roller Skating | |
Hiking |
Projects
URL: https://sky24h.github.io/projects/
● Research | |
1. Online Demo for the paper "Free-View Expressive Talking Head Video Editing" | |
2. Online Demo for the paper "Controllable Multi-domain Semantic Artwork Synthesis" | |
3. Online Demo for the paper "Training-Free Zero-Shot Semantic Segmentation with LLM Refinement" |
● Fun | |
1. A serverless GPU application that uses AnimateDiff to run a Text-to-Video task | |
2. A serverless GPU application that uses Stable Diffusion XL to run a Text-to-Image task | |
3. One-shot face animation using a webcam, capable of running in real time. | |
4. A simple implementation using the ChatGPT (and GPT-4) API, deployed as a Telegram bot. |