Why do diffusion- and denoising-based generative methods not suffer from the curse of dimensionality, even when the data lie in an extremely high-dimensional space? Our new paper, recently accepted by JMLR: http://t.cn/A6RlLaEQ reveals the not-so-surprising secret: as long as the intrinsic dimension of the data distribution is low, the generative process can be extremely efficient and effective! It also seems that a mixture of low-rank, low-dimensional Gaussians can closely approximate almost all informative real-world data, which is the view we put forward in my earlier textbook, Generalized Principal Component Analysis: http://t.cn/AXS2PuAz (also linked as http://t.cn/RUox5jx), published exactly ten years ago!
Posted from Hong Kong, China
