cv

Basics

Name Yuantian Huang
Research interest Image Editing and Generation, Computer Graphics, Computer Vision
URL https://sky24h.github.io/
Email huang_yuantian (at) cyberagent.co.jp

Publications

  • ● International Journal (Peer-reviewed)
    1. Y. Huang, S. Iizuka, E. Simo-Serra, and K. Fukui, ''Controllable Multi-domain Semantic Artwork Synthesis'', Computational Visual Media (IF: 6.9), 2024. PDF, Website
  • ● International Conference (Peer-reviewed)
    1. Y. Huang, S. Iizuka, and K. Fukui, ''Training-Free Zero-Shot Semantic Segmentation with LLM Refinement'', The British Machine Vision Conference (BMVC) 2024, 2024. PDF, Website
    2. Y. Huang, S. Iizuka, and K. Fukui, ''Diffusion-based Semantic Image Synthesis from Sparse Layouts'', Computer Graphics International (CGI) 2023, 2023. PDF, Website
    3. T. Okada, Y. Huang, G. Hao, S. Iizuka, and K. Fukui, ''Low-Level Feature Aggregation Networks for Disease Severity Estimation of Coffee Leaves'', 18th International Conference on Machine Vision and Applications (MVA), 2023. Website
    4. Y. Huang, S. Iizuka, and K. Fukui, ''Free-View Expressive Talking Head Video Editing'', IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP, h5-index: 123, top 3% paper), 2023. PDF, Website
  • ● Domestic Conference
    1. Y. Huang, S. Iizuka, and K. Fukui, ''Free-View Expressive Talking Head Video Editing'', Visual Computing 2023, 2023. (Invited talk)
    2. Y. Huang, S. Iizuka, and K. Fukui, ''Diffusion-based Semantic Image Synthesis from Sparse Layouts'', The 26th Meeting on Image Recognition and Understanding, 2023. (Non-peer-reviewed, poster)
    3. Y. Huang, S. Iizuka, E. Simo-Serra, and K. Fukui, ''High-quality Multi-domain Artwork Generation from Semantic Layouts'', The 24th Meeting on Image Recognition and Understanding, 2021. (Peer-reviewed, short oral)
    4. Y. Huang, S. Iizuka, and K. Fukui, ''Controllable Artwork Synthesis via Two-stage Adversarial Networks'', The 23rd Meeting on Image Recognition and Understanding, 2020. (Peer-reviewed, short oral)

Work

  • 2024.04 - now
    Research Engineer
    CyberAgent AI Lab
    R&D on image editing and generation, computer graphics, and computer vision.

Education

  • 2021.04 - 2024.03
    PhD
    Computer Science,
    University of Tsukuba, Japan
    Doctoral Thesis: Controllable Visual Content Synthesis with Deep Generative Models
    Supervisors: Assoc. Prof. Satoshi Iizuka, Prof. Kazuhiro Fukui
  • 2017.10 - 2021.03
    Research Student -> Master
    Computer Science,
    University of Tsukuba, Japan
    Master's Thesis: Controllable Multi-domain Semantic Artwork Synthesis
    Supervisors: Assoc. Prof. Satoshi Iizuka, Prof. Kazuhiro Fukui
  • 2013.09 - 2017.06
    Bachelor
    Electronic Information Engineering,
    Guangzhou University, China

Awards

Skills

Languages
Chinese    : Native speaker
Japanese : Fluent
English     : Fluent
Programming Languages
Python : ★★★★★
C++      : ★★★☆☆
Deep Learning Frameworks
PyTorch       : ★★★★★
TensorFlow : ★★★☆☆
Tools & Platforms
Linux    : ★★★★★
Docker : ★★★★☆
GCP      : ★★★★☆

Interests

Video Games
Simulation
Strategy
RPG
History
Coding
Implementing new ideas
Automation of daily work
Implementing SOTA models
Sports
Roller Skating
Hiking

Projects

URL: https://sky24h.github.io/projects/
● Research
1. Online Demo for the paper "Free-View Expressive Talking Head Video Editing"
2. Online Demo for the paper "Controllable Multi-domain Semantic Artwork Synthesis"
3. Online Demo for the paper "Training-Free Zero-Shot Semantic Segmentation with LLM Refinement"
● Fun
1. A serverless GPU application that uses AnimateDiff to run a Text-to-Video task
2. A serverless GPU application that uses Stable Diffusion XL to run a Text-to-Image task
3. One-shot face animation from a webcam feed, capable of running in real time
4. A simple implementation using the ChatGPT (and GPT-4) API, deployed as a Telegram bot