🌊HiFlow: Training-free High-Resolution Image Generation
with Flow-Aligned Guidance

Jiazi Bu1,5* Pengyang Ling2,5* Yujie Zhou1,5* Pan Zhang5† Tong Wu4

Xiaoyi Dong3,5 Yuhang Zang5 Yuhang Cao5 Dahua Lin3,5,7 Jiaqi Wang5,6†

1 Shanghai Jiao Tong University    2 University of Science and Technology of China
3 The Chinese University of Hong Kong    4 Stanford University    5 Shanghai AI Laboratory
6 Shanghai Innovation Institute    7 CPII under InnoHK
(* Equal Contribution   † Corresponding Author)

[Paper]      [Code]


[Figure: side-by-side samples, Flux (1024×1024) vs. HiFlow (4096×4096)]

Abstract

Text-to-image (T2I) diffusion/flow models have drawn considerable attention recently due to their remarkable ability to deliver flexible visual creations. Still, high-resolution image synthesis presents formidable challenges due to the scarcity and complexity of high-resolution content. To this end, we present HiFlow, a training-free and model-agnostic framework to unlock the resolution potential of pre-trained flow models. Specifically, HiFlow establishes a virtual reference flow within the high-resolution space that effectively captures the characteristics of low-resolution flow information, offering guidance for high-resolution generation through three key aspects: initialization alignment for low-frequency consistency, direction alignment for structure preservation, and acceleration alignment for detail fidelity. By leveraging this flow-aligned guidance, HiFlow substantially elevates the quality of high-resolution image synthesis by T2I models and demonstrates versatility across their personalized variants. Extensive experiments validate that HiFlow achieves higher high-resolution image quality than current state-of-the-art methods.

Methodology

HiFlow constructs a reference flow from the low-resolution sampling trajectory to offer initialization alignment, direction alignment, and acceleration alignment, enabling flow-aligned high-resolution image generation. Specifically, HiFlow follows a cascade generation paradigm. First, a virtual reference flow is constructed in the high-resolution space based on the step-wise estimated clean samples of the low-resolution sampling flow. Then, during high-resolution synthesis, the reference flow offers guidance on three aspects: sampling initialization, denoising direction, and moving acceleration. These respectively help achieve consistent low-frequency patterns, preserve structural features, and maintain high-fidelity details. Such flow-aligned guidance from the sampling trajectory facilitates better merging of the structure synthesized at the low-resolution scale with the details synthesized at the high-resolution scale, enabling superior generation.
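The idea of a reference built from step-wise clean-sample estimates can be illustrated with a small NumPy sketch. It is not the HiFlow implementation. It assumes the common rectified-flow convention x_t = (1 - t) * x0 + t * noise with velocity v = noise - x0, uses nearest-neighbour upsampling to lift a low-resolution estimate into the high-resolution space, and uses a simple Fourier low-pass mask to make the low-frequency content of a high-resolution estimate agree with the reference. All function names and the `cutoff` parameter are hypothetical, and acceleration alignment is not covered.

```python
import numpy as np

def estimate_clean(x_t, v, t):
    """Clean-sample estimate, assuming x_t = (1 - t) * x0 + t * noise and v = noise - x0."""
    return x_t - t * v

def upsample_nearest(x, scale):
    """Lift a 2D low-resolution array to the high-resolution grid (nearest neighbour)."""
    return np.repeat(np.repeat(x, scale, axis=0), scale, axis=1)

def low_pass(x, cutoff):
    """Keep only Fourier components whose normalized frequency is <= cutoff on both axes."""
    fy = np.abs(np.fft.fftfreq(x.shape[0]))[:, None]
    fx = np.abs(np.fft.fftfreq(x.shape[1]))[None, :]
    mask = (fy <= cutoff) & (fx <= cutoff)
    return np.fft.ifft2(np.fft.fft2(x) * mask).real

def init_from_reference(x_ref, t, rng):
    """Hypothetical initialization: re-noise the reference to time t instead of starting from pure noise."""
    noise = rng.standard_normal(x_ref.shape)
    return (1.0 - t) * x_ref + t * noise

def fuse_low_frequency(x_high, x_ref, cutoff):
    """Replace the low-frequency band of x_high with that of x_ref, keeping x_high's high frequencies."""
    return x_high - low_pass(x_high, cutoff) + low_pass(x_ref, cutoff)
```

In this sketch, a reference at each step would be `upsample_nearest(estimate_clean(x_t_low, v_low, t), scale)`, and `fuse_low_frequency` would be applied to the high-resolution clean-sample estimate at the same step. The cutoff controls how much structure is inherited from the low-resolution trajectory versus left to the high-resolution pass.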

Qualitative Comparison

Qualitative comparison with other baselines. HiFlow yields high-resolution images characterized by high-fidelity details and coherent structure.


Qualitative comparison with training-based methods. HiFlow generates high-resolution images with quality comparable to leading training-based models (UltraPixel, DALL·E 3, and Flux-Pro).

Versatile Applications of HiFlow

Here we show results for further applications, including LoRA, ControlNet, and quantization. We also show results on SDXL, a U-Net-based T2I diffusion model.

BibTeX

If you find this work helpful, please cite the following paper:

  @article{bu2025hiflow,
    title={HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance},
    author={Bu, Jiazi and Ling, Pengyang and Zhou, Yujie and Zhang, Pan and Wu, Tong and Dong, Xiaoyi and Zang, Yuhang and Cao, Yuhang and Lin, Dahua and Wang, Jiaqi},
    journal={arXiv preprint arXiv:2504.06232},
    year={2025}
  }
  

Project page template is borrowed from FreeScale.