super hub Canonical reference

GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

Aditya Ramesh, Alex Nichol, Bob McGrew, Pamela Mishkin, Prafulla Dhariwal, Pranav Shyam · 2021 · cs.CV · arXiv 2112.10741

Canonical reference. 81% of citing Pith papers cite this work as background.

134 Pith papers citing it

Background 81% of classified citations

open full Pith review browse 134 citing papers more from Aditya Ramesh arXiv PDF

abstract

Diffusion models have recently been shown to generate high-quality synthetic images, especially when paired with a guidance technique to trade off diversity for fidelity. We explore diffusion models for the problem of text-conditional image synthesis and compare two different guidance strategies: CLIP guidance and classifier-free guidance. We find that the latter is preferred by human evaluators for both photorealism and caption similarity, and often produces photorealistic samples. Samples from a 3.5 billion parameter text-conditional diffusion model using classifier-free guidance are favored by human evaluators to those from DALL-E, even when the latter uses expensive CLIP reranking. Additionally, we find that our models can be fine-tuned to perform image inpainting, enabling powerful text-driven image editing. We train a smaller model on a filtered dataset and release the code and weights at https://github.com/openai/glide-text2im.

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 23 baseline 2 method 2

citation-polarity summary

background 22 baseline 2 use method 2 support 1

claims ledger

abstract Diffusion models have recently been shown to generate high-quality synthetic images, especially when paired with a guidance technique to trade off diversity for fidelity. We explore diffusion models for the problem of text-conditional image synthesis and compare two different guidance strategies: CLIP guidance and classifier-free guidance. We find that the latter is preferred by human evaluators for both photorealism and caption similarity, and often produces photorealistic samples. Samples from a 3.5 billion parameter text-conditional diffusion model using classifier-free guidance are favored

authors

Aditya Ramesh Alex Nichol Bob McGrew Pamela Mishkin Prafulla Dhariwal Pranav Shyam

co-cited works

representative citing papers

Consistency Models

cs.LG · 2023-03-02 · conditional · novelty 8.0

Consistency models achieve fast one-step generation with SOTA FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 by directly mapping noise to data, outperforming prior distillation techniques.

Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow

cs.LG · 2022-09-07 · unverdicted · novelty 8.0

Rectified flow learns straight-path neural ODEs for distribution transport, yielding efficient generative models and domain transfers that work well even with a single simulation step.

Prompt-to-Prompt Image Editing with Cross Attention Control

cs.CV · 2022-08-02 · unverdicted · novelty 8.0

Cross-attention control in text-conditioned models enables localized and global image edits by editing only the input text prompt.

An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion

cs.CV · 2022-08-02 · unverdicted · novelty 8.0

Textual Inversion learns a single embedding vector from a few images to represent personal concepts inside the text embedding space of a frozen text-to-image model, enabling their composition in natural language prompts.

Adversarial Domain Prompt Tuning and Generation for Single Domain Generalization

cs.CV · 2026-06-19 · unverdicted · novelty 7.0

PAPT uses adversarial prompt tuning on diffusion models to generate domain-style images while preserving category features, claiming superior single-domain generalization performance.

FlowBender: Feedback-Aware Training for Self-Correcting Conditional Flows

cs.CV · 2026-06-18 · unverdicted · novelty 7.0

FlowBender introduces closed-loop training that lets conditional flow models learn correction policies from their own task-specific alignment errors, outperforming supervised and guidance baselines on fidelity and plausibility.

Towards More General Control of Diffusion Models Using Jeffrey Guidance

cs.LG · 2026-06-11 · unverdicted · novelty 7.0

Jeffrey guidance applies Jeffrey's rule of conditioning to diffusion models to target prescribed marginal distributions while preserving conditional structure, demonstrated via embedding matching and fairness enforcement.

Adv-TGD: Adversarial Text-Guided Diffusion for Face Recognition Impersonation Attacks

cs.CV · 2026-06-10 · unverdicted · novelty 7.0

Adv-TGD is a text-guided diffusion attack that achieves 85.9% black-box ASR on four face recognition models while preserving PSNR 28.18 dB and SSIM 0.981.

MaskAlign: Token-Subset Representation Alignment for Efficient Diffusion Training

cs.CV · 2026-06-07 · unverdicted · novelty 7.0

MaskAlign uses random token-subset alignment and pre-mask mixing to reduce diffusion models' reliance on complete clean-image token sets during representation alignment.

AsyncPatch Diffusion: spatially-flexible image generation

cs.CV · 2026-06-05 · unverdicted · novelty 7.0

AsyncPatch Diffusion introduces asynchronous per-region noise levels in diffusion models, proves a valid ELBO, and uses a controlled sampler to support spatially adaptive generation and native inpainting.

DRIFT: From Robustness Gaps to Invariance Manifolds for AI-Generated Image Detection

cs.CV · 2026-06-05 · unverdicted · novelty 7.0

DRIFT learns a structured invariance manifold from real images via one-class supervision on decomposed robust and fragile subspaces of a frozen VFM to detect AI-generated images through margin violations.

Reflection Separation from a Single Image via Joint Latent Diffusion

cs.CV · 2026-06-02 · unverdicted · novelty 7.0

A joint latent diffusion model with cross-layer self-attention and disjoint sampling separates reflection and transmission layers from single images more effectively than prior methods on real-world benchmarks.

GLENS: Global Search via Learning from Solver Iterates with Diffusion Models

cs.LG · 2026-05-29 · unverdicted · novelty 7.0

GLENS uses diffusion models on solver iterates to generate high-quality and diverse initial guesses for multimodal non-convex optimization, leading to faster solver convergence.

DRM: Diffusion-based Reward Model With Step-wise Guidance

cs.CV · 2026-05-25 · unverdicted · novelty 7.0

DRM turns a pre-trained diffusion model into a step-wise reward model and uses it for dense RL training (Step-wise GRPO) and guided sampling to improve final image quality.

VDE: Training-Free Accelerating Rectified Flow Model via Velocity Decomposition and Estimation

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

VDE accelerates rectified flow models like Flux by 3.22x with LPIPS of 0.069 via velocity decomposition into parallel/orthogonal components plus periodic full-pass anchoring.

Probability-Conserving Flow Guidance

cs.CV · 2026-05-19 · unverdicted · novelty 7.0

AdaMaG is a guidance rule for generative models derived from decomposing continuity-equation effects into divergence and score-parallel terms, with a proof that divergence diverges near the manifold and a time-dependent bound that improves realism at no extra cost.

Generating HDR Video from SDR Video

cs.CV · 2026-05-14 · unverdicted · novelty 7.0

A multi-exposure video model predicts bracketed linear SDR sequences from single nonlinear SDR input, which a merging model combines into HDR video preserving shadow and highlight detail.

ImageAttributionBench: How Far Are We from Generalizable Attribution?

cs.CV · 2026-05-13 · unverdicted · novelty 7.0

ImageAttributionBench is a benchmark dataset demonstrating that state-of-the-art image attribution methods lack robustness to image degradation and fail to generalize to semantically disjoint domains.

From Diffusion to Rectified Flow: Rethinking Text-Based Segmentation

cs.CV · 2026-05-06 · unverdicted · novelty 7.0

RLFSeg repurposes pretrained generative models via Rectified Flow for direct latent-space image-to-mask mapping in text-based segmentation, outperforming diffusion-based methods especially in zero-shot cases.

Learning to Theorize the World from Observation

cs.LG · 2026-05-05 · unverdicted · novelty 7.0

NEO is a probabilistic neural model that induces compositional programs as a learned Language of Thought from non-textual observations and executes them via a shared transition model to enable explanation-driven generalization.

Benchmarking Layout-Guided Diffusion Models through Unified Semantic-Spatial Evaluation in Closed and Open Settings

cs.CV · 2026-04-28 · conditional · novelty 7.0

Introduces closed-set C-Bench and open-set O-Bench for layout-guided diffusion models, a unified semantic-spatial scoring protocol, and ranks six models after generating and evaluating 319,086 images.

GENFIG1: Visual Summaries of Scholarly Work as a Challenge for Vision-Language Models

cs.CV · 2026-04-05 · unverdicted · novelty 7.0

GENFIG1 is a new benchmark that tests whether vision-language models can create effective Figure 1 visuals capturing the central scientific idea from paper text.

FrameDiT: Diffusion Transformer with Matrix Attention for Efficient Video Generation

cs.CV · 2026-03-10 · unverdicted · novelty 7.0

FrameDiT proposes Matrix Attention for DiTs to achieve SOTA video generation with improved temporal coherence and efficiency comparable to local factorized attention.

SVG360: Editable Multiview Vector Graphics from a Single SVG

cs.CV · 2025-11-20 · unverdicted · novelty 7.0

SVG360 lifts a single SVG to a view-conditioned representation, uses spatial memory to propagate consistent parts across views, and applies structure-aware vectorization to produce editable multiview SVGs.

citing papers explorer

Showing 50 of 134 citing papers.

Consistency Models cs.LG · 2023-03-02 · conditional · none · ref 44 · internal anchor
Consistency models achieve fast one-step generation with SOTA FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 by directly mapping noise to data, outperforming prior distillation techniques.
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow cs.LG · 2022-09-07 · unverdicted · none · ref 53 · internal anchor
Rectified flow learns straight-path neural ODEs for distribution transport, yielding efficient generative models and domain transfers that work well even with a single simulation step.
Prompt-to-Prompt Image Editing with Cross Attention Control cs.CV · 2022-08-02 · unverdicted · none · ref 28 · internal anchor
Cross-attention control in text-conditioned models enables localized and global image edits by editing only the input text prompt.
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion cs.CV · 2022-08-02 · unverdicted · none · ref 19 · internal anchor
Textual Inversion learns a single embedding vector from a few images to represent personal concepts inside the text embedding space of a frozen text-to-image model, enabling their composition in natural language prompts.
Adversarial Domain Prompt Tuning and Generation for Single Domain Generalization cs.CV · 2026-06-19 · unverdicted · none · ref 40 · internal anchor
PAPT uses adversarial prompt tuning on diffusion models to generate domain-style images while preserving category features, claiming superior single-domain generalization performance.
FlowBender: Feedback-Aware Training for Self-Correcting Conditional Flows cs.CV · 2026-06-18 · unverdicted · none · ref 44 · internal anchor
FlowBender introduces closed-loop training that lets conditional flow models learn correction policies from their own task-specific alignment errors, outperforming supervised and guidance baselines on fidelity and plausibility.
Towards More General Control of Diffusion Models Using Jeffrey Guidance cs.LG · 2026-06-11 · unverdicted · none · ref 2 · internal anchor
Jeffrey guidance applies Jeffrey's rule of conditioning to diffusion models to target prescribed marginal distributions while preserving conditional structure, demonstrated via embedding matching and fairness enforcement.
Adv-TGD: Adversarial Text-Guided Diffusion for Face Recognition Impersonation Attacks cs.CV · 2026-06-10 · unverdicted · none · ref 8 · internal anchor
Adv-TGD is a text-guided diffusion attack that achieves 85.9% black-box ASR on four face recognition models while preserving PSNR 28.18 dB and SSIM 0.981.
MaskAlign: Token-Subset Representation Alignment for Efficient Diffusion Training cs.CV · 2026-06-07 · unverdicted · none · ref 16 · internal anchor
MaskAlign uses random token-subset alignment and pre-mask mixing to reduce diffusion models' reliance on complete clean-image token sets during representation alignment.
AsyncPatch Diffusion: spatially-flexible image generation cs.CV · 2026-06-05 · unverdicted · none · ref 32 · internal anchor
AsyncPatch Diffusion introduces asynchronous per-region noise levels in diffusion models, proves a valid ELBO, and uses a controlled sampler to support spatially adaptive generation and native inpainting.
DRIFT: From Robustness Gaps to Invariance Manifolds for AI-Generated Image Detection cs.CV · 2026-06-05 · unverdicted · none · ref 29 · internal anchor
DRIFT learns a structured invariance manifold from real images via one-class supervision on decomposed robust and fragile subspaces of a frozen VFM to detect AI-generated images through margin violations.
Reflection Separation from a Single Image via Joint Latent Diffusion cs.CV · 2026-06-02 · unverdicted · none · ref 38 · internal anchor
A joint latent diffusion model with cross-layer self-attention and disjoint sampling separates reflection and transmission layers from single images more effectively than prior methods on real-world benchmarks.
GLENS: Global Search via Learning from Solver Iterates with Diffusion Models cs.LG · 2026-05-29 · unverdicted · none · ref 49 · internal anchor
GLENS uses diffusion models on solver iterates to generate high-quality and diverse initial guesses for multimodal non-convex optimization, leading to faster solver convergence.
DRM: Diffusion-based Reward Model With Step-wise Guidance cs.CV · 2026-05-25 · unverdicted · none · ref 29 · internal anchor
DRM turns a pre-trained diffusion model into a step-wise reward model and uses it for dense RL training (Step-wise GRPO) and guided sampling to improve final image quality.
VDE: Training-Free Accelerating Rectified Flow Model via Velocity Decomposition and Estimation cs.CV · 2026-05-22 · unverdicted · none · ref 32 · internal anchor
VDE accelerates rectified flow models like Flux by 3.22x with LPIPS of 0.069 via velocity decomposition into parallel/orthogonal components plus periodic full-pass anchoring.
Probability-Conserving Flow Guidance cs.CV · 2026-05-19 · unverdicted · none · ref 8 · internal anchor
AdaMaG is a guidance rule for generative models derived from decomposing continuity-equation effects into divergence and score-parallel terms, with a proof that divergence diverges near the manifold and a time-dependent bound that improves realism at no extra cost.
Generating HDR Video from SDR Video cs.CV · 2026-05-14 · unverdicted · none · ref 227 · internal anchor
A multi-exposure video model predicts bracketed linear SDR sequences from single nonlinear SDR input, which a merging model combines into HDR video preserving shadow and highlight detail.
ImageAttributionBench: How Far Are We from Generalizable Attribution? cs.CV · 2026-05-13 · unverdicted · none · ref 50 · internal anchor
ImageAttributionBench is a benchmark dataset demonstrating that state-of-the-art image attribution methods lack robustness to image degradation and fail to generalize to semantically disjoint domains.
From Diffusion to Rectified Flow: Rethinking Text-Based Segmentation cs.CV · 2026-05-06 · unverdicted · none · ref 29 · internal anchor
RLFSeg repurposes pretrained generative models via Rectified Flow for direct latent-space image-to-mask mapping in text-based segmentation, outperforming diffusion-based methods especially in zero-shot cases.
Learning to Theorize the World from Observation cs.LG · 2026-05-05 · unverdicted · none · ref 93 · internal anchor
NEO is a probabilistic neural model that induces compositional programs as a learned Language of Thought from non-textual observations and executes them via a shared transition model to enable explanation-driven generalization.
Benchmarking Layout-Guided Diffusion Models through Unified Semantic-Spatial Evaluation in Closed and Open Settings cs.CV · 2026-04-28 · conditional · none · ref 27 · internal anchor
Introduces closed-set C-Bench and open-set O-Bench for layout-guided diffusion models, a unified semantic-spatial scoring protocol, and ranks six models after generating and evaluating 319,086 images.
GENFIG1: Visual Summaries of Scholarly Work as a Challenge for Vision-Language Models cs.CV · 2026-04-05 · unverdicted · none · ref 19 · internal anchor
GENFIG1 is a new benchmark that tests whether vision-language models can create effective Figure 1 visuals capturing the central scientific idea from paper text.
FrameDiT: Diffusion Transformer with Matrix Attention for Efficient Video Generation cs.CV · 2026-03-10 · unverdicted · none · ref 35 · internal anchor
FrameDiT proposes Matrix Attention for DiTs to achieve SOTA video generation with improved temporal coherence and efficiency comparable to local factorized attention.
SVG360: Editable Multiview Vector Graphics from a Single SVG cs.CV · 2025-11-20 · unverdicted · none · ref 26 · internal anchor
SVG360 lifts a single SVG to a view-conditioned representation, uses spatial memory to propagate consistent parts across views, and applies structure-aware vectorization to produce editable multiview SVGs.
BeyondMimic: From Motion Tracking to Versatile Humanoid Control via Guided Diffusion cs.RO · 2025-08-11 · conditional · none · ref 65 · internal anchor
BeyondMimic combines compact motion tracking with a unified guided latent diffusion model to master diverse agile behaviors from human demos and solve unseen downstream tasks via test-time classifier guidance.
FaSTA$^*$: Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing cs.CV · 2025-06-26 · unverdicted · none · ref 27 · internal anchor
FaSTA* combines LLM fast planning with A* search and inductive subroutine mining to create an efficient agent for multi-turn image editing tasks.
COCO-Inpaint: A Benchmark for Detecting and Localizing Inpainting-Based Image Manipulations cs.CV · 2025-04-25 · unverdicted · none · ref 41 · internal anchor
COCO-Inpaint supplies a large-scale dataset and evaluation protocol focused on inpainting-based image forgeries to benchmark existing detection methods.
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation cs.CV · 2024-10-17 · unverdicted · none · ref 59 · internal anchor
Janus decouples visual encoding into task-specific pathways inside a single autoregressive transformer to unify multimodal understanding and generation while outperforming earlier unified models.
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation cs.CV · 2024-06-10 · conditional · none · ref 22 · internal anchor
Scaled vanilla autoregressive models based on Llama achieve 2.18 FID on ImageNet 256x256 image generation, beating popular diffusion models without visual inductive biases.
ANCHOR: LLM-driven Subject Conditioning for Text-to-Image Synthesis cs.CV · 2024-04-15 · unverdicted · none · ref 7 · internal anchor
ANCHOR dataset exposes T2I model weaknesses on multi-subject abstractive captions; SAFE uses LLMs for subject extraction and embedding enhancement to improve consistency.
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment cs.CV · 2024-03-08 · unverdicted · none · ref 37 · internal anchor
ELLA introduces a timestep-aware semantic connector to link LLMs with diffusion models for improved dense prompt following, validated on a new 1K-prompt benchmark.
Learning Interactive Real-World Simulators cs.AI · 2023-10-09 · conditional · none · ref 98 · internal anchor
UniSim learns a universal real-world simulator from orchestrated diverse datasets, enabling zero-shot deployment of policies trained purely in simulation.
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference cs.CV · 2023-10-06 · unverdicted · none · ref 73 · internal anchor
Latent Consistency Models enable high-fidelity text-to-image generation in 2-4 steps by directly predicting solutions to the probability flow ODE in latent space, distilled from pre-trained LDMs.
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning cs.CV · 2023-07-10 · unverdicted · none · ref 14 · internal anchor
A single motion module trained on videos adds temporally coherent animation to any personalized text-to-image model derived from the same base without additional tuning.
Scalable Diffusion Models with Transformers cs.CV · 2022-12-19 · unverdicted · none · ref 35 · internal anchor
DiTs achieve SOTA FID of 2.27 on ImageNet 256x256 by scaling transformer-based latent diffusion models, with performance improving consistently as Gflops increase.
LAION-5B: An open large-scale dataset for training next generation image-text models cs.CV · 2022-10-16 · accept · none · ref 54 · internal anchor
LAION-5B is an openly released dataset of 5.85 billion CLIP-filtered image-text pairs that enables replication of foundational vision-language models.
Phenaki: Variable Length Video Generation From Open Domain Textual Description cs.CV · 2022-10-05 · unverdicted · none · ref 32 · internal anchor
Phenaki generates arbitrary-length videos from sequences of text prompts by tokenizing videos with causal temporal attention and generating tokens with a text-conditioned masked transformer, trained jointly on images and videos.
Imagen Video: High Definition Video Generation with Diffusion Models cs.CV · 2022-10-05 · unverdicted · none · ref 14 · internal anchor
Imagen Video generates high-definition text-conditional videos via a cascade of base and super-resolution diffusion models, achieving high fidelity and controllability.
Human Motion Diffusion Model cs.CV · 2022-09-29 · unverdicted · none · ref 14 · internal anchor
MDM is a classifier-free diffusion model that generates expressive human motions by predicting clean samples rather than noise, supporting text and action conditioning and outperforming prior methods on standard benchmarks.
Diffusion Posterior Sampling for General Noisy Inverse Problems stat.ML · 2022-09-29 · unverdicted · none · ref 51 · internal anchor
Diffusion models solve noisy (non)linear inverse problems via approximated posterior sampling that blends diffusion steps with manifold gradients without strict consistency projection.
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding cs.CV · 2022-05-23 · accept · none · ref 41 · internal anchor
Imagen achieves state-of-the-art photorealistic text-to-image generation by scaling a text-only pretrained T5 language model within a diffusion framework, reaching FID 7.27 on COCO without training on it.
Hierarchical Text-Conditional Image Generation with CLIP Latents cs.CV · 2022-04-13 · accept · none · ref 36 · internal anchor
A hierarchical prior-decoder model using CLIP latents generates more diverse text-conditional images than direct methods while preserving photorealism and caption fidelity.
Video Diffusion Models cs.CV · 2022-04-07 · unverdicted · none · ref 36 · internal anchor
A diffusion model for video generation extends image architectures with joint image-video training and improved conditional sampling, delivering first large-scale text-to-video results and state-of-the-art performance on video prediction and unconditional generation benchmarks.
High-Resolution Image Synthesis with Latent Diffusion Models cs.CV · 2021-12-20 · conditional · none · ref 59 · internal anchor
Latent diffusion models achieve state-of-the-art inpainting and competitive results on unconditional generation, scene synthesis, and super-resolution by performing the diffusion process in the latent space of pretrained autoencoders with cross-attention conditioning, while cutting computational and
Adversarial Diffusion Across Modalities: A Fusion Survey of Attacks, Defenses, and Evaluation for Text, Vision, and Vision-Language Models cs.CR · 2026-06-25 · unverdicted · none · ref 47 · internal anchor
A narrative survey that catalogs fifty papers on diffusion-based adversarial techniques across text, vision, and vision-language models, proposes a six-class taxonomy of diffusion roles plus a unified five-dimension evaluation framework, and releases a companion catalog.
Flow-based Policy Adaptation without Policy Updates cs.RO · 2026-06-04 · unverdicted · none · ref 54 · internal anchor
GLOVES learns flow models from limited expert demonstrations to selectively correct actions from non-expert policies or operators toward expert distributions using reverse-flow OOD detection as an intervention gate.
Organizational Control Layer: Governance Infrastructure at the Execution Boundary of LLM Agent Systems cs.MA · 2026-06-03 · unverdicted · none · ref 106 · internal anchor
OCL is a governance layer for LLM agents that cuts unsafe executions from 88% to near-zero and raises valid success from 12% to 96% in adversarial buyer-seller negotiations across frontier LLMs.
Equilibrated Diffusion: Frequency-aware Textual Embedding for Equilibrated Image Customization cs.CV · 2026-06-01 · unverdicted · none · ref 27 · internal anchor
Equilibrated Diffusion decomposes concepts in frequency space to independently optimize subject and style embeddings, plus mask-guided diffusion and residual reference attention, for improved subject fidelity and text alignment over baselines.
Guidance Contrastive Token Credit Assignment for Discrete Policy Optimization cs.CV · 2026-05-28 · unverdicted · none · ref 18 · internal anchor
GCPO performs per-token credit assignment in discrete policy optimization by setting token advantages proportional to the difference in model predictions under positive versus negative prompts, outperforming GRPO and DAPO on text-to-image and chain-of-thought tasks.
HydraPrompt: An Adaptive and Asymmetric Framework of Vision-Language Models for Synthetic Image Detection cs.CV · 2026-05-26 · unverdicted · none · ref 50 · internal anchor
HydraPrompt uses an Asymmetric Prompt Adapter with fixed real prompts and adaptive fake prompts plus a Conditional Supervised Contrastive loss to achieve SOTA synthetic image detection on benchmarks.

GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

hub tools

citation-role summary

citation-polarity summary

claims ledger

authors

co-cited works

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer