lilpacy.infoJA
Modular Typographic Scale: Font Size Design Based on Music Theory

Modular Typographic Scale: Font Size Design Based on Music Theory


TL;DR

Web typography has a theory called the “modular scale” for deciding font sizes. Proposed by Tim Brown in 2011, it borrows musical frequency ratios (perfect fourth = 1.333, perfect fifth = 1.5, golden ratio = 1.618, and so on) and computes font sizes via the formula size = base × ratio^n. It’s a redefinition of the harmonic size systems of the metal-type era for the web, and it’s adopted in major frameworks like Tailwind CSS Typography.

Why Does Font Sizing Need a “Theory”?

When deciding font sizes for a website, most developers pick values by intuition. h1 is 32px, h2 is 24px, body is 16px — they line up some round numbers and check the visual balance by eye.

But this method lacks consistency. Why 32px and not 30px? Is the size gap between h2 and h3 in proportion to the gap between h1 and h2? Every time the designer changes, the size system breaks down, and the bigger the project gets, the more the sense of unity is lost.

The solution that proposes a mathematical foundation for this problem is Tim Brown’s “More Meaningful Typography” (A List Apart, 2011).

The Modular Scale Formula

The idea of the modular scale is simple. Decide on a base size and a ratio, and all font sizes are derived automatically.

size(n)=base×ration\text{size}(n) = \text{base} \times \text{ratio}^n

  • n=0n = 0: body text (the base size itself)
  • n=1,2,3,n = 1, 2, 3, \dots: headings and other higher-level elements
  • n=1,2,n = -1, -2, \dots: captions, small text, and other lower-level elements
ベースサイズ: 16px(ドラッグで変更)
長三度 1.25完全四度 1.333完全五度 1.5黄金比 1.618
def modular_scale(base: float, ratio: float, n: int) -> float:
    # size(n) = base × ratio^n
    return base * (ratio ** n)

# Example: base 16px, perfect fourth (1.333)
base, ratio = 16, 1.333
for n in range(4, -3, -1):
    print(f"n={n:+d}: {modular_scale(base, ratio, n):.1f}px")

Output:

n=+4: 50.5px
n=+3: 37.9px
n=+2: 28.4px
n=+1: 21.3px
n=+0: 16.0px
n=-1: 12.0px
n=-2: 9.0px

How to assign each value in the sequence to HTML elements is the implementer’s call, not a matter of theory. One example:

ElementnSize (base=16px, ratio=1.333)
h1+337.9px
h2+228.4px
h3+121.3px
body016.0px
small / caption-112.0px

What modular scale theory provides is “a harmonic numeric sequence of sizes”; which element you place at which level is a design choice. For instance, Tailwind CSS Typography assigns blockquote n=1 (1.333em), while Bootstrap and Normalize.css don’t override blockquote’s font-size at all. Whether to make blockquote larger isn’t a corollary of scale theory — it’s a framework or theme implementation decision.

This blog’s blockquote { font-size: 1.333em } is consistent with the scale in that it uses the perfect fourth ratio, but that itself is the implementer’s choice.

Relation to Music Theory

Why “non-round” numbers like 1.333 and 1.5? They come from frequency ratios of musical intervals.

When two pitches “sound pleasing together,” their frequencies are in integer ratios. For example, the frequency ratio of C and G (perfect fifth) is 3:2, and C and F (perfect fourth) is 4:3. The theory of consonant intervals in music goes back to Pythagoras in ancient Greece.

Tim Brown borrowed this “auditory harmony” and applied it as “visual harmony.” The hypothesis is: ratios that sound pleasing might also look pleasing.

A list of major ratios:

IntervalFrequency RatioDecimalCharacteristic
Minor Second16:151.067Almost uniform, very small steps
Major Second9:81.125Subtle
Major Third5:41.250Soft, gentle
Perfect Fourth4:31.333Calm, balanced
Perfect Fifth3:21.500The most commonly used
Golden Ratio1.618Dynamic, strong contrast

The larger the ratio, the steeper the size jumps; the smaller, the gentler. The perfect fourth (1.333) can be called a moderate choice — “not too prominent, not too buried.”

graph LR
    A["Major Third 1.25"] --> B["Perfect Fourth 1.333"]
    B --> C["Perfect Fifth 1.5"]
    C --> D["Golden Ratio 1.618"]
    style A fill:#e8f5e9
    style B fill:#fff9c4
    style C fill:#e3f2fd
    style D fill:#fce4ec

Double-Stranded Scale

In “More Meaningful Typography,” Tim Brown also proposes a “double-stranded scale” using two base values rather than one.

For example, basing on body text 16px and a large heading 48px lets you build a more flexible scale by integrating the values derived from raising each base to ratio powers.

Important numbers in our design can serve as bases, and the ratio between those numbers, or a ratio meaningful to our content, can serve as our ratio.

Tim Brown, “More Meaningful Typography”

In other words, “important numbers in your design can serve as bases, and the ratio between them — or a ratio meaningful to your content — can serve as the scale’s ratio.”

At the same time, Tim Brown leaves an important caveat.

Math is no substitute for design. When the scale produces a number that doesn’t look right, override it. Just leave a comment in the code.

Math isn’t almighty. If a value derived from the scale doesn’t look right in the design, override it — but leave a comment in the code. It’s practical that he recommends combining theory with visual judgment rather than fixating on the theory.

A History from the Era of Metal Type

The idea of the modular scale isn’t actually Tim Brown’s invention. Even in the metal-type era, font sizes had harmonic systems.

The Point System of Metal Type

In the mid-18th century, French type founder Pierre-Simon Fournier created the prototype of the point system. Later, Firmin Didot refined it; the Didot point (1pt ≈ 0.376mm) became the standard on continental Europe.

In type foundries, standardized size series like the following were established:

5pt, 6pt, 7pt, 8pt, 9pt, 10pt, 11pt, 12pt, 14pt, 16pt, 18pt, 20pt, 24pt, 28pt, 32pt, 36pt...

Look closely at this sequence and you’ll see that from 12pt onward, the increases are roughly at a constant ratio. The flow 12→14→16→18→24→36 isn’t perfectly even, but follows logarithmic scaling: the larger the value, the wider the step.

Through DTP into the Web

flowchart LR
    A["Type founding (18C)"] -->|"Standardization of points"| B["DTP (1980s)"]
    B -->|"PostScript pt = 1/72 inch"| C["Web (1990s-)"]
    C -->|"16px base + em/rem"| D["Modular scale (2011-)"]
  • Type era: Standard size sets emerged naturally from physical constraints
  • DTP era: PostScript (1pt = 1/72 inch ≈ 0.353mm) became the digital standard
  • Web era: Browser default of 16px + relative units like em/rem became mainstream
  • Modular scale: Tim Brown re-defined the system using ratios from music theory

You could say the modular scale is what made it possible to reproduce mathematically the harmony that type-foundry craftsmen had felt empirically — “this combination of sizes is beautiful.”

Adoption in Frameworks

Tailwind CSS Typography

Tailwind CSS Typography adopts values close to the modular scale across the size variants of the prose class.

ClassBase sizeblockquote ratioComputed
prose-sm14px1.286em18px
prose-base16px1.25em20px
prose-lg18px1.333em24px
prose-xl20px1.5em30px
prose-2xl24px1.5em36px

prose-lg’s blockquote is exactly 1.333em (the perfect fourth).

Frameworks That Don’t Adopt It

On the other hand, not every framework adopts the modular scale.

  • Bootstrap: No explicit font-size for blockquote (inherits from parent)
  • Normalize.css: Only normalizes margins, doesn’t touch font-size

These take a more conservative approach — “leave font-size decisions to the designer.” The modular scale is a “recommended best practice” but not the one and only correct answer.

Tools

If you want to actually try the modular scale, the following tools are useful.

  • Type Scale: Pick a ratio visually → automatic CSS generation
  • Modular Scale: A computation tool developed by Tim Brown himself

References

That’s all.