🌊HiFlow: Training-free High-Resolution Image Generation
with Flow-Aligned Guidance

Text-to-image (T2I) diffusion/flow models have drawn considerable attention recently for their remarkable ability to deliver flexible visual creations. Still, high-resolution image synthesis presents formidable challenges because high-resolution content is both scarce and complex. To this end, we present HiFlow, a training-free and model-agnostic framework to unlock the resolution potential of pre-trained flow models. Specifically, HiFlow establishes a virtual reference flow within the high-resolution space that effectively captures the characteristics of low-resolution flow information, offering guidance for high-resolution generation through three key aspects: initialization alignment for low-frequency consistency, direction alignment for structure preservation, and acceleration alignment for detail fidelity. By leveraging this flow-aligned guidance, HiFlow substantially elevates the quality of high-resolution image synthesis of T2I models and demonstrates versatility across their personalized variants. Extensive experiments validate that HiFlow achieves higher high-resolution image quality than current state-of-the-art methods.
HiFlow constructs a reference flow from the low-resolution sampling trajectory to offer initialization alignment, direction alignment, and acceleration alignment, enabling flow-aligned high-resolution image generation. Specifically, HiFlow follows a cascade generation paradigm. First, a virtual reference flow is constructed in the high-resolution space from the step-wise estimated clean samples of the low-resolution sampling flow. Then, during high-resolution synthesis, the reference flow guides the sampling initialization, the denoising direction, and the acceleration, which helps achieve consistent low-frequency patterns, preserve structural features, and maintain high-fidelity details. Such flow-aligned guidance from the sampling trajectory facilitates better merging of the structure synthesized at the low-resolution scale with the details synthesized at the high-resolution scale, enabling superior generation.
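The cascade described above can be illustrated with a small NumPy sketch. It is not the authors' implementation. It assumes the common rectified-flow convention x_t = (1 - t) * x0 + t * noise with velocity v = noise - x0, uses nearest-neighbour upsampling as a stand-in for whatever interpolation the method actually applies, and reads "low-frequency consistency" as a simple Fourier low-pass swap. All function names and the `cutoff` parameter are hypothetical. Acceleration alignment is left out because the text above does not specify its form.

```python
import numpy as np


def estimate_clean(x_t, v, t):
    # Assumed convention: x_t = (1 - t) * x0 + t * noise and v = noise - x0,
    # so the one-step estimate of the clean sample is x_t - t * v.
    return x_t - t * v


def upsample_nearest(x, scale):
    # Nearest-neighbour upsampling of a 2-D array (illustrative stand-in).
    return np.kron(x, np.ones((scale, scale)))


def build_reference_flow(states, velocities, times, scale):
    # One upsampled clean-sample estimate per low-resolution sampling step.
    return [
        upsample_nearest(estimate_clean(x_t, v, t), scale)
        for x_t, v, t in zip(states, velocities, times)
    ]


def renoise(x_ref, noise, t):
    # Initialization-style step: start high-resolution sampling from a
    # noised copy of a reference sample instead of from pure noise.
    return (1.0 - t) * x_ref + t * noise


def lowpass(x, cutoff):
    # Keep Fourier coefficients whose frequency (cycles per sample) is at
    # most `cutoff` along both axes; zero out everything else.
    fy = np.fft.fftfreq(x.shape[0])[:, None]
    fx = np.fft.fftfreq(x.shape[1])[None, :]
    mask = (np.abs(fy) <= cutoff) & (np.abs(fx) <= cutoff)
    return np.real(np.fft.ifft2(np.fft.fft2(x) * mask))


def fuse_low_frequency(x_high, x_ref, cutoff):
    # Low frequencies come from the reference, high frequencies from the
    # high-resolution estimate.
    return lowpass(x_ref, cutoff) + (x_high - lowpass(x_high, cutoff))
```

In a real pipeline the `states` and `velocities` would come from running the pre-trained flow model at low resolution, and the fused estimate would be converted back into a velocity for the next high-resolution step; both parts are omitted here.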
Qualitative comparisons with other baselines. HiFlow yields high-resolution images characterized by high-fidelity details and coherent structure.
Qualitative comparisons with training-based methods. HiFlow generates high-resolution images with quality comparable to leading training-based models (UltraPixel, DALL·E 3, and Flux-Pro).
Here we show results for further applications, including LoRA, ControlNet, and quantization. We also showcase results on SDXL, a U-Net-based T2I diffusion model.
If you find this work helpful, please cite the following paper:
@article{bu2025hiflow,
  title={HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance},
  author={Bu, Jiazi and Ling, Pengyang and Zhou, Yujie and Zhang, Pan and Wu, Tong and Dong, Xiaoyi and Zang, Yuhang and Cao, Yuhang and Lin, Dahua and Wang, Jiaqi},
  journal={arXiv preprint arXiv:2504.06232},
  year={2025}
}