Feb 28, 2026

Can AI Step Outside Its Training Data? The Principles and Methods of Eliciting Novelty

TL;DR

“AI is just sampling from the probability distribution of its training data, so nothing new can come out” — this naive intuition is roughly right but, taken literally, too strong. Mapped onto Boden’s three categories of creativity, combinational creativity (new combinations of existing elements) and exploratory creativity (discovering unexplored regions in an existing space) are possible in principle. Transformational creativity (changing the conceptual system itself), however, is not guaranteed by the architecture. But this boundary is fuzzy, and human creativity has the same structure. Real AI output feels “average and safe” not because of a principled limit but because of four pressures: frequency bias, conservative decoding, RLHF safety tuning, and Model Collapse. Each of these has corresponding countermeasures, and novelty can be elicited.

Is AI Just a Reproduction Engine for Existing Patterns?

“AI is just sampling from the probability distribution of its training data, so won’t it fail to produce innovative ideas or visuals that aren’t in the training data?”

Anyone who’s used AI recently has probably had this doubt at some point. I myself often feel “this is kind of bland” or “I’ve seen this before” when I have AI write text or generate images. That experience reinforces this doubt.

This view is roughly correct, but taken literally it’s too strong. So in this article I’ll consider two questions in order.

In principle, can AI produce something genuinely new?
If it can, how can we get it to?

As a frame for the discussion, I’ll use cognitive scientist Margaret Boden’s three categories of creativity.

Combinational creativity: making new combinations of existing elements. Things like ukiyo-e × cyberpunk, or jazz × electronic music
Exploratory creativity: finding unexplored regions inside an existing space of rules or styles. Like generating a piece Bach never wrote, but in Bach’s style
Transformational creativity: changing the rules or conceptual system itself. The invention of non-Euclidean geometry, the birth of cubism

This three-way split is a framework widely adopted in computational creativity and applies directly to evaluating AI creativity. From here on I’ll use this axis consistently to map “what AI can and can’t do.”

Part 1: Can It Produce Something New in Principle?

Pattern Learning, Not Rote Memorization

First, the foundation. Models do not simply memorize training data and spit it out.

Interestingly, neural networks have the capacity to perfectly memorize even random labels. And yet, in actual training, generalization happens — meaning they make sensible predictions on data they haven’t seen. Models extract and learn patterns like grammar, concepts, composition, and causality as internal representations, and from those they can probabilistically generate new combinations.

GPT-3 can generate news articles that human evaluators had difficulty telling apart from human-written ones, and those texts are new sequences of tokens rather than verbatim copies of the training data. It’s the same principle that lets a Japanese speaker produce a Japanese sentence they have never seen before.

In other words, “outputs that didn’t appear as training examples” are produced routinely. AI is not a copy machine. This fact is the prerequisite for combinational creativity.

Interpolation and Composition: Combinational and Exploratory Creativity

Now let’s look, along Boden’s three categories, at the creativity AI actually exhibits.

Combinational creativity — new combinations of existing parts

LLMs can combine concepts learned from different contexts in the training data in response to a prompt. Tell it “analyze anime script structure from an economics perspective” and you get a cross-domain piece that doesn’t fit either field on its own. In image generation, “ukiyo-e × cyberpunk” is a typical case.

But there’s an important caveat. Lake & Baroni (2018) showed that the recurrent sequence-to-sequence networks they tested fail at systematic compositional generalization. That is, “apparently new combinations” can be generated, but the human-like ability to systematically generate new combinations from rules has limits. On this evidence, AI’s composition is closer to an extension of pattern matching than to logical compositional ability.

A concrete example clarifies this. In one split of the SCAN benchmark Lake & Baroni used, the model sees “jump” only in isolation during training, while modifiers such as “twice” appear with other verbs (“walk twice”). It is then tested on unseen combinations like “jump twice.” A human, understanding the rule “twice = repeat 2 times,” can apply it instantly to any verb. The networks they tested largely failed at this kind of systematic application. AI’s combinational ability looks more like “interpolation from similar combinations in the training data” than “understanding rules and applying them systematically to unseen inputs.”
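To make “understanding the rule and applying it systematically” concrete, here is a minimal sketch of a rule-based interpreter for a tiny SCAN-like command language. The reduced grammar (one verb plus an optional “twice” or “thrice”) and the function name `interpret` are my own simplifications for illustration, not the benchmark’s full grammar.

```python
# Toy rule-based interpreter for a SCAN-like command language.
# Assumption: a reduced grammar (primitive verb + optional "twice"/"thrice"),
# not the full SCAN grammar.
PRIMITIVES = {"jump": "JUMP", "walk": "WALK", "run": "RUN", "look": "LOOK"}
REPEATS = {"twice": 2, "thrice": 3}


def interpret(command: str) -> list[str]:
    """Map a command such as 'jump twice' to its action sequence."""
    words = command.split()
    if not words or words[0] not in PRIMITIVES:
        raise ValueError(f"unknown command: {command!r}")
    action = PRIMITIVES[words[0]]
    if len(words) == 1:
        return [action]
    if len(words) == 2 and words[1] in REPEATS:
        # The repetition rule is stated once and applies to every verb.
        return [action] * REPEATS[words[1]]
    raise ValueError(f"unsupported command: {command!r}")


print(interpret("jump twice"))  # ['JUMP', 'JUMP']
```

Because the rule for “twice” is written once, it covers “jump twice” even if no example of that pair was ever provided. A network that learns from examples has no such guarantee, which is the gap the benchmark probes.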

Exploratory creativity — discovering unexplored regions of an existing space

In image generation models, VAEs, GANs, and diffusion models learn high-dimensional latent spaces. Because that space is continuous, interpolating between two styles can produce visuals that match none of the training images.

There’s an important distinction here. “Output a human has never seen” and “output outside the model’s distribution” are not the same. Inside the distribution the model learned, but new to human eyes — that is the actual nature of exploratory creativity.
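The interpolation idea can be sketched in a few lines. This is a minimal illustration under my own assumptions: latent vectors are plain lists of floats with non-zero norm, and the helper names `lerp` and `slerp` are illustrative. Decoding the interpolated vector back into an image would need a trained decoder, which is out of scope here.

```python
import math


def lerp(z0, z1, t):
    """Linear interpolation between two latent vectors (t in [0, 1])."""
    return [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]


def slerp(z0, z1, t):
    """Spherical interpolation; falls back to lerp for (anti)parallel vectors.

    Assumes both vectors have non-zero norm.
    """
    norm0 = math.sqrt(sum(a * a for a in z0))
    norm1 = math.sqrt(sum(b * b for b in z1))
    cos_omega = sum(a * b for a, b in zip(z0, z1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding error
    omega = math.acos(cos_omega)
    if math.sin(omega) < 1e-6:
        return lerp(z0, z1, t)
    w0 = math.sin((1.0 - t) * omega) / math.sin(omega)
    w1 = math.sin(t * omega) / math.sin(omega)
    return [w0 * a + w1 * b for a, b in zip(z0, z1)]


style_a = [1.0, 0.0]  # hypothetical latent code for one style
style_b = [0.0, 1.0]  # hypothetical latent code for another style
midpoint = slerp(style_a, style_b, 0.5)
```

The midpoint is a point the model can decode even though no training image was encoded exactly there. It still lies inside the space the model learned, which is the distinction drawn above.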

Interim conclusion of Part 1: Combinational and exploratory creativity are possible from the principles of LLMs and diffusion models. However, compositional ability has limits in systematicity.

The Boundary of Invention: What About Transformational Creativity?

What about the third — transformational creativity, the ability to change the conceptual system itself?

The invention of non-Euclidean geometry. The birth of cubism. The shift to atonal music. These didn’t find something new “inside” existing rules — they rewrote the rules themselves.

Franceschelli & Musolesi (2025) argue that LLMs exhibit combinational and exploratory creativity but cannot achieve transformational creativity under the current training paradigm. The reasoning is straightforward: LLMs have fixed weights and a fixed objective function, with no mechanism to spontaneously rewrite the rule system after training.

But there’s a subtle counterargument. Once composition becomes complex enough, it can be behaviorally indistinguishable from transformation. Suppose an LLM connects 100 existing concepts in unprecedented ways and an observer looking at the output declares “this is a paradigm shift.” Internally, no rule has changed — a combination has merely been generated within fixed weights. But from the output alone, you can’t tell whether it’s the product of combinational or transformational creativity. As Ritchie (2006) points out, looking at the agent’s internal structure you can say “the rules haven’t changed,” but from the observer’s vantage point “the rules appear to have changed.” Whether something is “transformational” isn’t an objective fact — it depends on the perspective from which you evaluate.

And there’s another perspective we shouldn’t forget. Isn’t human creativity also recombination of past parts? Koestler’s bisociation theory defines creativity as “simultaneous activation of two or more unrelated cognitive systems.” Mednick’s remote associates hypothesis likewise frames creativity as “combination of distant elements.” If human creativity is fundamentally combinational too, then the difference between AI and human creativity may be one of “degree” rather than “kind.”

Note also a caveat in the opposite direction. Diffusion models have been observed memorizing training data and regenerating near-identical images, so the risk of “aiming for novelty but landing too close to existing works” is not zero either.

Conclusion of Part 1: Mapped onto Boden’s three categories, combinational and exploratory creativity are yes, in principle. Transformational creativity isn’t guaranteed by the architecture, but its boundary is fuzzy, and human creativity has the same structure.

Part 2: How to Get Something New Out

Why Does It Currently Lean “Average”? — Four Pressures

In Part 1 we confirmed “in principle, it can produce.” So why does real AI output feel average and safe? It’s not a problem of principle but of the following four pressures.

Pressure A: Training-time frequency bias

Generative models tend to output what is most “plausible.” The tendency of Seq2Seq dialogue models to output safe and bland responses like “I don’t know” has been pointed out for years. Maximum likelihood estimation (MLE) is, by design, biased toward frequent patterns in the training data, so frequent expressions, typical compositions, and clichéd phrasings have high probability. This suppresses combinational creativity — rare combinations have low probability and are rarely chosen.
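A toy example shows why maximum likelihood favors frequent patterns. For a simple bigram model, the MLE of the next-token probability is just the relative frequency in the corpus. The corpus and the function name `mle_next_token_probs` below are hypothetical, and a real LLM is far more complex, but the bias has the same origin.

```python
from collections import Counter


def mle_next_token_probs(corpus, context):
    """MLE of P(next | context) for a bigram model, i.e. relative frequency."""
    counts = Counter(
        nxt for prev, nxt in zip(corpus, corpus[1:]) if prev == context
    )
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}


# Hypothetical toy corpus: the cliche appears three times, the rare phrasing once.
corpus = (
    "the sunset was beautiful . the sunset was beautiful . "
    "the sunset was beautiful . the sunset was bruised ."
).split()

probs = mle_next_token_probs(corpus, "was")
print(probs)  # {'beautiful': 0.75, 'bruised': 0.25}
```

The rare phrasing is not impossible, it just carries a quarter of the probability mass. Any decoding rule that prefers the most likely token will pick the cliché every time.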

Pressure B: Conservative decoding

How you sample from the probability distribution also matters a lot. Work on neural text degeneration (Holtzman et al., 2020) has demonstrated that the decoding strategy has a large impact on text quality and diversity.

flowchart LR
    A["Probability distribution"] --> B{"Decoding strategy"}
    B -->|"low temp / greedy"| C["Converges to bland output"]
    B -->|"high temp / nucleus"| D["Diverse, sharper outputs too"]

    style C fill:#fee,stroke:#c66
    style D fill:#efe,stroke:#6a6

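Figure: how temperature reshapes the token probability distribution. The lower the temperature, the more probability concentrates on high-probability tokens; the higher the temperature, the more evenly it spreads. Series shown: T=0.3 (sharp), T=0.7, T=1.0 (default), T=2.0 (flat).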

Low-temperature sampling, greedy decoding, and beam search all concentrate on high-probability candidates, so exploratory creativity is suppressed: they have a hard time reaching the unexplored, low-probability regions.
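The effect of temperature and nucleus (top-p) sampling can be checked numerically. The following is a minimal sketch with made-up logits; the function names `softmax_with_temperature` and `top_p_filter` are illustrative and not taken from any particular library.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose mass reaches p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    mass = sum(probs[i] for i in kept)
    return [probs[i] / mass if i in kept else 0.0 for i in range(len(probs))]


logits = [2.0, 1.0, 0.5, 0.0]  # hypothetical scores for four candidate tokens
for t in (0.3, 0.7, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(x, 3) for x in probs])
```

Running the loop shows the top token’s share shrinking as the temperature rises, which is the behavior the figure describes. `top_p_filter` then trims the low-probability tail, which is how nucleus sampling aims to keep diversity without letting in very unlikely tokens.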

Pressure C: RLHF safety tuning

RLHF has contributed to improved usefulness and reduced toxicity, but in settings where optimization toward “safe/useful” is strong, edgy expressions and bold suggestions become harder to surface. “Don’t fail,” “don’t go viral in the wrong way,” “don’t ruffle feathers” — pressure in this direction can suppress both combination and exploration.

That said, not all alignment methods necessarily suppress creativity to the same degree. Constitutional AI has the model critique and revise its own outputs against a written set of principles, and DPO (Direct Preference Optimization) optimizes preferences directly while controlling divergence from a reference model, which may make diversity easier to preserve. The strength of Pressure C can vary with the choice of alignment method.

Pressure D: Model Collapse — distributional degeneration via AI-generated data

Whereas Pressures A–C bias the output of “the model you have today” toward the average, this pressure is a structural problem that propagates to next-generation models. Shumailov et al. (2023) showed that recursively training on AI-generated content causes distributional tails to vanish — Model Collapse. With AI-generated text and images proliferating across the web, mixing them into future models’ training data is hard to avoid. As a result, rare expressions, minority styles, and extreme ideas — the tails of the distribution — get shaved off generation by generation, and outputs converge further to “average.”
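The tail-vanishing mechanism can be illustrated with a toy simulation. This is a deliberately simplified sketch and not the experimental setup of Shumailov et al.: each “generation” samples from the previous distribution and refits it by relative frequency. The function name, the Zipf-like starting distribution, and the sample sizes are all my own assumptions.

```python
import random
from collections import Counter


def simulate_collapse(probs, generations, samples_per_generation, seed=0):
    """Repeatedly sample from the current distribution and refit by relative frequency.

    Returns the number of categories with non-zero probability after each
    generation (index 0 is the starting distribution).
    """
    rng = random.Random(seed)
    categories = list(range(len(probs)))
    current = list(probs)
    support_sizes = [sum(1 for p in current if p > 0)]
    for _ in range(generations):
        draws = rng.choices(categories, weights=current, k=samples_per_generation)
        counts = Counter(draws)
        # A category that was never drawn gets probability 0 and cannot return.
        current = [counts[c] / samples_per_generation for c in categories]
        support_sizes.append(sum(1 for p in current if p > 0))
    return support_sizes


# Hypothetical long-tailed start: 50 "styles" with Zipf-like weights.
zipf_weights = [1.0 / (rank + 1) for rank in range(50)]
print(simulate_collapse(zipf_weights, generations=20, samples_per_generation=100))
```

In this toy setting the number of surviving categories can only stay the same or shrink, because a style that receives zero samples in one generation is gone from every later one. Real training pipelines mix in human data and are far less extreme, but the direction of the drift is the same.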

The point is this: it’s not that AI “can’t produce something new”; it’s just “not configured to produce it.” Model Collapse is the exception, since no individual setting can fully resolve it; it’s an industry-wide training data quality problem.

Eliciting Novelty: Countermeasures by Pressure

Now that the four pressures are identified, here are the countermeasures for each.

Countermeasure for Pressure A: break frequency bias

Add conditions to narrow the distribution: instead of “write something good,” explicitly state the target reader, style, premises, examples of what to avoid, and the definition of success. The more concrete the conditions, the further the distribution shifts away from “the world’s average,” and the probability advantage of frequent patterns breaks down.
Move the distribution itself with specialized data: use RAG or LoRA to pull the model from “the world’s average” to “the distribution of that specific world.” That said, empirical research showing that these methods directly improve creativity is limited, so it is better to understand them as “distribution shifts.”
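One way to make the first countermeasure habitual is to treat the conditions as required fields rather than something you remember to type. The sketch below is only an illustration: the class `PromptSpec`, its field names, and the output layout are my own assumptions, not a standard format.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Explicit conditions to attach to a request (all field names are illustrative)."""
    task: str
    reader: str
    style: str
    premises: list[str] = field(default_factory=list)
    avoid: list[str] = field(default_factory=list)
    success: str = ""


def build_prompt(spec: PromptSpec) -> str:
    """Render the conditions as a plain-text prompt."""
    lines = [
        f"Task: {spec.task}",
        f"Target reader: {spec.reader}",
        f"Style: {spec.style}",
    ]
    lines += [f"Premise: {p}" for p in spec.premises]
    lines += [f"Avoid: {a}" for a in spec.avoid]
    if spec.success:
        lines.append(f"Success means: {spec.success}")
    return "\n".join(lines)


spec = PromptSpec(
    task="Propose a name for a neighborhood cafe",
    reader="Owner who dislikes trendy English names",
    style="Dry, no exclamation marks",
    avoid=["puns on 'bean'", "the word 'cozy'"],
    success="The owner can say it aloud without embarrassment",
)
print(build_prompt(spec))
```

Leaving a field empty becomes a visible decision, which is a small nudge away from “write something good.”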

Countermeasure for Pressure B: widen the search space

Multi-candidate generation → selection pipeline: instead of demanding the final output in a single shot, generate multiple candidates and then select. Instructions like “produce 7 candidates with mutually different stances, then rank by how sharp they are” are a reasonable way to reach low-probability but high-novelty candidates.
Adjust temperature and top-p: if your environment exposes them, raising sampling parameters directly widens the search range. But this is coarse control; raise temperature too much and quality collapses.
More principled decoding methods: Contrastive Decoding takes the difference between the log-probabilities of a large model (expert) and a small model (amateur) and, according to Li et al. (2022), outperforms nucleus sampling and top-k on open-ended generation without additional training. Typical Sampling is grounded in information theory and selects tokens whose information content is close to the expected information content (the entropy of the distribution), reducing repetition while preserving naturalness. Both are more principled approaches than temperature tweaking.
Multi-Agent Debate: giving multiple LLM agents different initial perspectives and having them debate can elicit diverse outputs that a single agent wouldn’t reach. The key is initial-perspective diversity (diversity-aware initialization); just calling the same model multiple times converges to homogeneous reasoning paths.
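The two decoding methods named above can be sketched at the level of a single next-token distribution. This is a simplified illustration, not the papers’ full procedures: the probabilities are made up, the function names are my own, and the real Contrastive Decoding method searches over whole sequences rather than scoring one step.

```python
import math


def contrastive_scores(expert_probs, amateur_probs, alpha=0.1):
    """Score tokens by log p_expert - log p_amateur among plausible tokens.

    Tokens whose expert probability is below alpha * max(expert_probs) get -inf.
    Assumes all amateur probabilities are strictly positive.
    """
    threshold = alpha * max(expert_probs)
    scores = []
    for p_exp, p_ama in zip(expert_probs, amateur_probs):
        if p_exp >= threshold and p_exp > 0:
            scores.append(math.log(p_exp) - math.log(p_ama))
        else:
            scores.append(float("-inf"))
    return scores


def typical_filter(probs, tau=0.9):
    """Keep tokens whose surprisal is closest to the entropy until mass reaches tau."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    candidates = [i for i, p in enumerate(probs) if p > 0]
    candidates.sort(key=lambda i: abs(-math.log(probs[i]) - entropy))
    kept, mass = set(), 0.0
    for i in candidates:
        kept.add(i)
        mass += probs[i]
        if mass >= tau:
            break
    return [probs[i] / mass if i in kept else 0.0 for i in range(len(probs))]


expert = [0.5, 0.3, 0.18, 0.02]   # hypothetical large-model distribution
amateur = [0.7, 0.1, 0.1, 0.1]    # hypothetical small-model distribution
scores = contrastive_scores(expert, amateur)
print(scores.index(max(scores)))  # 1
```

With these numbers, greedy decoding on the expert would pick token 0, while the contrastive score prefers token 1, the option the expert rates well and the amateur underrates. `typical_filter` likewise ranks token 1 first here because its surprisal is closest to the entropy, so neither method simply follows the highest probability.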

Countermeasure for Pressure C: redefine the evaluation criteria

Replace the evaluation criteria with your own objective: instead of “good for a general audience,” set concrete success conditions like “Company A’s production team can understand it in 1 minute.” Negative constraints like “no general statements” and “no ‘it depends’” are also effective.
Enumerate the conventional options first, then instruct it to avoid them: “first list the typical 5 candidates, then list 5 candidates avoiding those.” This explicitly steers away from the “safe direction” that RLHF has biased the model toward. The quantitative novelty improvement of this technique is unproven, so I present it as a heuristic.
Choose alignment methods that preserve diversity: on the model provider side, alignment methods that control distance from a reference model, like DPO, or methods that train the model against written principles, like Constitutional AI, may be more amenable to preserving diversity than conventional RLHF. As a user, choosing models that adopt these methods is an indirect countermeasure.
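The “list the conventional options, then avoid them” heuristic is easy to wire up as a two-step call. In this sketch, `generate` is a hypothetical callable standing in for whatever model API you use (it takes a prompt string and returns text), and the prompt wording is only an example.

```python
from typing import Callable


def avoid_the_obvious(
    topic: str, generate: Callable[[str], str], n: int = 5
) -> dict[str, str]:
    """Two-step prompting: list typical candidates, then ask for ones that avoid them."""
    conventional = generate(
        f"List the {n} most typical ideas for: {topic}. One per line."
    )
    novel = generate(
        f"Here are the typical ideas for {topic}:\n{conventional}\n"
        f"Now list {n} ideas that avoid all of the above. One per line."
    )
    return {"conventional": conventional, "novel": novel}


def echo_model(prompt: str) -> str:
    """Stand-in for a real model call, used only to show the data flow."""
    return "idea A\nidea B"


result = avoid_the_obvious("a cafe name", echo_model, n=2)
```

Keeping the conventional list in the returned dictionary lets you check afterward whether the second batch really moved away from the first. Given that the technique’s effect is unproven, that check is worth doing by hand.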

Countermeasure for Pressure D: be conscious of training data quality

This pressure isn’t something users can directly resolve. However, using non-AI-generated data — like text humans wrote or photos you took — as input to LoRA or RAG can partially avoid Model Collapse’s effects within your usage scope
In the long run, methods to detect and remove AI-generated content from training data are an industry-wide challenge

Pressure	Countermeasure	Why it works
A: Frequency bias	Concrete conditions / RAG, LoRA	Breaks the probability advantage of frequent patterns
B: Conservative decoding	Multi-candidate + selection / Contrastive Decoding / Typical Sampling / Multi-Agent Debate	Widens search space, secures diversity in a principled way
C: RLHF safety tuning	Redefine objective / diversity-preserving alignment like DPO	Removes the gravitational pull toward “safe”
D: Model Collapse	Use non-AI-generated data / industry-level quality management	Prevents loss of distribution tails

Conclusion

This article examined two questions in sequence.

Question 1: In principle, can AI output something new?

Mapped onto Boden’s three categories of creativity, combinational creativity (new combinations of existing elements) and exploratory creativity (finding unexplored regions of an existing space) are possible in principle. However, compositional ability has limits in systematicity. Transformational creativity (changing the conceptual system itself) isn’t guaranteed by the architecture, but its boundary is fuzzy and human creativity has the same structure.

Question 2: How can we get it to output something new?

The reason real AI output feels “average” isn’t a principled limit but four pressures — frequency bias, conservative decoding, RLHF safety tuning, and Model Collapse. The first three can be addressed by concretizing conditions, widening the search space with Contrastive Decoding or Multi-Agent Debate, and redefining the evaluation criteria. Model Collapse is a structural challenge that individual ingenuity can’t fully solve, but using non-AI-generated data partially avoids its effects.

When AI output feels “average,” I think that’s a signal not of the limits of AI’s creativity but that the settings needed to relieve these pressures aren’t in place yet.

That’s all.

References

Boden, M. A. (2004). The Creative Mind: Myths and Mechanisms (2nd ed.). Routledge.
Franceschelli, G. & Musolesi, M. (2025). On the creativity of large language models. AI & Society
Nair, L. et al. (2024). Creative Problem Solving in Large Language and Vision Models. ACL Findings
Zhang, C. et al. (2017). Understanding deep learning requires rethinking generalization. arXiv:1611.03530
Brown, T. et al. (2020). Language Models are Few-Shot Learners. arXiv:2005.14165
Lake, B. & Baroni, M. (2018). Generalization without Systematicity. arXiv:1711.00350
Kingma, D.P. & Welling, M. (2013). Auto-Encoding Variational Bayes. arXiv:1312.6114
Ho, J. et al. (2020). Denoising Diffusion Probabilistic Models. arXiv:2006.11239
Li, J. et al. (2016). A Diversity-Promoting Objective Function for Neural Conversation Models. arXiv:1510.03055
Holtzman, A. et al. (2020). The Curious Case of Neural Text Degeneration. arXiv:1904.09751
Ouyang, L. et al. (2022). Training language models to follow instructions with human feedback. arXiv:2203.02155
Carlini, N. et al. (2023). Extracting Training Data from Diffusion Models. arXiv:2301.13188
Koestler, A. (1964). The Act of Creation.
Ritchie, G. (2006). The Transformational Creativity Hypothesis. New Generation Computing
Veale, T. & Cardoso, F.A. (2020). Leaps and Bounds. New Generation Computing
Li, X.L. et al. (2022). Contrastive Decoding: Open-ended Text Generation as Optimization. arXiv:2210.15097
Meister, C. et al. (2022). Locally Typical Sampling. arXiv:2202.00666
Bai, Y. et al. (2022). Constitutional AI: Harmlessness from AI Feedback. arXiv:2212.08073
Rafailov, R. et al. (2023). Direct Preference Optimization: Your Language Model is Secretly a Reward Model. arXiv:2305.18290
Shumailov, I. et al. (2023). The Curse of Recursion: Training on Generated Data Makes Models Forget. arXiv:2305.17493
Zhu, X. et al. (2026). Demystifying Multi-Agent Debate: The Role of Confidence and Diversity. arXiv:2601.19921