A Polished AI-Built Site Can Still Carry Structural Residue

A website can look finished and still carry traces of how it was made.

That is the main finding from this pilot SiteBlob audit.

The sites that matter most are not the obviously broken ones. They are the ones that pass a quick visual review. The homepage looks clean. The spacing is fine. The product shots feel competent. The site loads. Nothing immediately screams "unfinished."

But when SiteBlob looked below the surface, many confirmed AI-built sites still carried measurable residue.

Nearly 8 in 10 confirmed AI-built sites that reached a basic quality bar still crossed SiteBlob's S24 residue threshold.

In exact terms, among confirmed AI-built websites with conventional quality scores of at least 50, 45 of 57 sites, or 78.9 percent, also had S24 values of at least 25.

That sentence needs unpacking.

A normal quality score asks a familiar question: does the website seem reasonably usable, complete, and technically acceptable?

S24 asks a different question: does the finished website still carry patterns that may point to the way it was assembled?

S24 is SiteBlob's secondary structural-residue composite. It is broader than one technical check. It looks across multiple signal families, including implementation patterns, HTML and CSS structure, metadata behavior, layout regularity, symmetry, stylometric patterns in the copy, visible-template cues, runtime behavior, mobile behavior, accessibility baselines, and other production-pattern signals.

That is why this finding matters.

A site can pass a basic quality review and still carry enough builder-like, AI-like, or workflow-like residue to deserve closer inspection.

The short version

The pilot audit compared 131 publicly accessible websites attributed to five AI-builder families with 10 human-led modern open-source control sites.

The strongest practical result was not that bad AI sites look bad. Everyone already knows that.

The stronger result was this:

Polished AI-built sites often still carried residue.

Among confirmed AI-built sites that scored at least 50 on ordinary quality checks, 78.9 percent crossed the S24 residue threshold.

A broader version of the same pattern also appeared. Among confirmed AI-built sites that avoided an F grade, 56 of 76 sites, or 73.7 percent, crossed the same S24 threshold.

Those two percentages describe slightly different groups.

The 78.9 percent figure looks only at AI-built sites that reached a basic quality score of 50 or higher. In plain English, these were the AI-built sites that looked good enough to avoid being dismissed as obvious failures.

The 73.7 percent figure looks at a wider group: AI-built sites that did not receive an F grade. Because that group is broader, the percentage drops slightly, from 78.9 to 73.7 percent. But the direction stays the same.

In both cuts of the data, most non-failing or reasonably scoring AI-built sites still crossed the S24 ≥ 25 threshold.

The control comparison was much smaller, so it should be read carefully. Still, it gives useful context. Among the human-led open-source controls, 4 sites reached conventional quality scores of at least 50, and 0 of those 4 crossed the S24 ≥ 25 threshold.

This does not prove that S24 can identify unknown AI-built websites. It should not be used that way.

It does suggest something more practical:

Visual polish and structural cleanliness are not the same thing.
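The percentages in this section can be rechecked from the counts alone. The sketch below does that arithmetic in Python. It also prints an approximate 95 percent Wilson score interval for each group. That interval is not part of the SiteBlob audit. It is added here only as an illustration of why a 4-site control group should be read carefully. The function names are hypothetical.

```python
from math import sqrt

def share(hits, total):
    """Return hits / total as a percentage rounded to one decimal place."""
    return round(100 * hits / total, 1)

def wilson_interval(hits, total, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z = 1.96).

    Illustrative only: this interval is an assumption added for context,
    not a statistic reported by the audit.
    """
    p = hits / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, center - half), min(1.0, center + half)

# Counts as reported in the article: (sites crossing S24 >= 25, sites in group).
cuts = {
    "AI-built, quality >= 50": (45, 57),
    "AI-built, no F grade": (56, 76),
    "controls, quality >= 50": (0, 4),
}

for label, (hits, total) in cuts.items():
    low, high = wilson_interval(hits, total)
    print(f"{label}: {share(hits, total)}% (interval {low:.0%} to {high:.0%})")
```

With only 4 control sites, the interval for the control group stays wide even though the observed count is 0. That is the sense in which the control comparison gives context rather than proof.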

What S24 is actually measuring

S24 is not a normal website quality score.

A normal quality score is closer to a checklist. Does the page load? Is it readable? Does it have obvious trust signals? Is the mobile experience usable? Are there serious technical issues?

S24 is different. It is a secondary residue composite.

It asks whether the delivered website carries observable patterns that may reflect the builder, generator, export process, AI-assisted workflow, or production shortcuts behind it.

Those patterns can appear in several places:

the way HTML is structured
the way CSS is concentrated or repeated
the regularity of spacing, sections, and symmetry
the sameness or texture of the page copy
metadata and canonical behavior
visible-template repetition
runtime errors or browser-side noise
mobile layout behavior
accessibility and semantic baselines
trust-surface depth
repeated structural fragments across pages

None of these signals is perfect on its own.

A human-built site can have weak HTML. A hand-coded site can have repetitive layouts. A normal marketing page can have bland copy. A no-code export can be messy without being AI-generated.

That is why S24 should not be treated as proof of authorship.

It is better understood as a residue score. It tells you when a website deserves a deeper browser-rendered review.

The important mismatch: looking good is not the same as being clean underneath

Most website reviews are visual.

Someone opens the homepage, checks the hero section, clicks a few links, maybe scrolls on mobile, and decides whether the site feels ready.

That review can catch obvious problems.

It will not always catch structural residue.

A polished screenshot can hide weak semantics, thin metadata, repeated section patterns, overly regular layouts, stylometric sameness, fragile mobile behavior, runtime noise, or a shallow trust surface.

That is the mismatch this pilot audit surfaced.

Some AI-built websites did not fail ordinary quality checks. They were not disasters. They were not unusable. They were not visually embarrassing.

But they still carried enough residue to stand out on S24.

That matters because the website people approve is not just the screenshot. It is the delivered system: the DOM, the metadata, the CSS, the mobile layout, the browser behavior, the copy structure, and the maintenance surface left behind.

Figure: Scatter plot of conventional website quality against the full S24 structural-residue composite in the pilot cohort. Conventional quality and S24 structural residue behaved as different observed dimensions, and some higher-scoring attributed AI-built websites still showed substantial S24 values.

Why this matters for founders

Founders often approve the page they can see.

That is understandable. When you are moving quickly, the visible site feels like the product. If the homepage looks credible, the headline makes sense, and the page loads, it is tempting to call the job done.

But a fast-shipped AI-built site can look expensive while still feeling oddly incomplete once real users, developers, search engines, or accessibility tools interact with it.

The risk is not always dramatic failure.

Sometimes the risk is quieter:

the site looks generic but nobody can explain why
the copy feels smooth but interchangeable
the mobile layout technically works but feels crowded
the metadata exists but does not line up cleanly
the trust signals are present but shallow
the code is hard to maintain or extend
the browser console shows issues nobody checked
the site feels finished in a screenshot but thin in use

That is why "it looks good" is not enough of a launch review.

Why this matters for agencies

Visual approval is not QA.

This is especially important for agencies using AI website builders, no-code exports, template systems, or AI-assisted production workflows.

A client can approve the design and still receive a site that carries cleanup work below the visible layer.

That cleanup may not show up on day one. It often appears later, when someone tries to add pages, improve SEO, fix accessibility problems, migrate the site, localize the content, or debug mobile behavior.

The polished AI-built sites are sometimes the easiest ones to miss because they do not raise any alarm during the first review.

They look finished enough to pass.

That is exactly why they need a second kind of review.

Why this matters for developers

Generated output still needs review.

That does not mean every AI-assisted site is bad. It means the review target has to be the deployed system, not the promise of the tool that created it.

A useful review should inspect:

rendered HTML
DOM structure
CSS delivery patterns
metadata and canonical behavior
JavaScript runtime behavior
mobile layout behavior
accessibility basics
repeated section structures
copy texture and stylometric sameness
trust surfaces and content depth

The question is not only whether the site works.

The question is what the production workflow left behind.
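One item on the list above, repeated section structures, is easy to sketch. The Python below records the sequence of tags inside each outermost section element and counts how many sections share exactly the same sequence. This is a minimal illustration using the standard library, not SiteBlob's detection logic. The class and function names are hypothetical, and it assumes the page marks its blocks with section elements.

```python
from collections import Counter
from html.parser import HTMLParser

class SectionShapes(HTMLParser):
    """Record the tag sequence inside each outermost <section> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # nesting depth of <section> elements
        self.current = []   # tags seen inside the current outer section
        self.shapes = []    # one tuple of tag names per outer section

    def handle_starttag(self, tag, attrs):
        if tag == "section":
            self.depth += 1
            if self.depth == 1:
                self.current = []  # a new outer section begins
                return
        if self.depth >= 1:
            self.current.append(tag)

    def handle_endtag(self, tag):
        if tag == "section" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.shapes.append(tuple(self.current))

def repeated_shapes(html, min_count=2):
    """Return {tag sequence: count} for shapes used by min_count+ sections."""
    parser = SectionShapes()
    parser.feed(html)
    counts = Counter(parser.shapes)
    return {shape: n for shape, n in counts.items() if n >= min_count}
```

A repeated shape is not a defect by itself. Pricing cards and feature grids repeat on purpose. The count is only a prompt to look at whether the repetition is intentional.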

The SEO angle needs careful wording

Some structural-residue patterns overlap with areas that can matter for technical SEO.

Weak metadata, awkward canonical handling, fragile JavaScript rendering, mobile layout problems, thin semantic structure, and noisy runtime behavior can all intersect with crawlability, rendering, canonicalization, accessibility, and page experience review.

But this pilot audit did not measure ranking loss.

It did not measure Search Console outcomes, indexing outcomes, manual actions, field Core Web Vitals, or traffic changes.

So the conclusion is not "Google penalizes AI websites."

The careful conclusion is this:

Structural residue can overlap with implementation areas that matter for discoverability, maintainability, and technical SEO.

That makes those areas worth reviewing before a site is published.

What structural residue can look like in plain English

You do not need to understand every SiteBlob signal to understand the practical issue.

Structural residue can look like a site where the visible layer is clean, but the underlying system feels oddly thin or repetitive.

It may show up as repeated section formulas, unusually regular spacing, concentrated inline styling, weak semantic HTML, missing accessibility basics, metadata mismatches, shallow trust pages, repeated copy structures, runtime errors, or mobile layouts that barely hold together.

It can also show up in the writing.

Some AI-built pages have copy that is grammatically fine but structurally flat. The sections move in predictable patterns. The claims sound polished but unspecific. The page feels assembled from plausible blocks rather than written from real product knowledge.

Again, none of this proves authorship.

But it does tell you where to look.

Figure: Chart of observed prevalence differences for structural signal families between attributed AI-built websites and human-led controls in the pilot cohort. Several structural signal families showed larger observed differences than simple visible-template cues.

What to check before publishing an AI-built site

Before publishing an AI-built or AI-assisted website, do not stop at the screenshot.

Check the rendered page.

Open the site in the browser and inspect what was actually delivered. Look at the DOM, not only the source. Check whether the page structure is semantic and whether major sections are meaningfully labeled.
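A first pass at this check can be scripted. The sketch below counts landmark elements, h1 headings, and images without an alt attribute. It assumes the html string is the rendered DOM saved from the browser, for example the document's outer HTML copied from developer tools, rather than the raw page source. The names and the choice of landmarks are illustrative assumptions, not SiteBlob rules.

```python
from collections import Counter
from html.parser import HTMLParser

# Landmark elements this sketch looks for (an illustrative choice).
LANDMARKS = ("header", "nav", "main", "footer")

class StructureCounter(HTMLParser):
    """Count start tags and images that have no alt attribute at all."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()
        self.images_without_alt = 0

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1
        if tag == "img" and "alt" not in dict(attrs):
            self.images_without_alt += 1

def structure_report(html):
    """Summarize a few semantic basics of an HTML string."""
    counter = StructureCounter()
    counter.feed(html)
    total = sum(counter.tags.values())
    return {
        "missing_landmarks": [t for t in LANDMARKS if counter.tags[t] == 0],
        "h1_count": counter.tags["h1"],
        "div_share": counter.tags["div"] / total if total else 0.0,
        "images_without_alt": counter.images_without_alt,
    }
```

A report with no main element, several h1 headings, or a very high share of div tags does not prove anything. It tells you which parts of the DOM to open and read.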

Check the metadata.

Review the title, description, canonical tag, Open Graph data, schema, and robots behavior. Make sure the page is not carrying placeholder or mismatched metadata from the builder workflow.
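Part of the metadata review can also be sketched in code. The example below pulls the title, meta tags, and canonical link out of an HTML string and lists items worth a second look. It does not cover schema markup. The placeholder phrases are a hypothetical list, and a canonical that points to another URL can be intentional, so every result is a prompt for review rather than an error.

```python
from html.parser import HTMLParser

# Hypothetical placeholder phrases; adjust to the builder being reviewed.
PLACEHOLDER_HINTS = ("lorem ipsum", "untitled", "my site", "your company")

class HeadMetadata(HTMLParser):
    """Collect the title, meta name/property values, and canonical href."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.meta = {}
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            key = a.get("name") or a.get("property")
            if key:
                self.meta[key.lower()] = a.get("content") or ""
        elif tag == "link" and (a.get("rel") or "").lower() == "canonical":
            self.canonical = a.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data

def metadata_issues(html, page_url):
    """Return a list of metadata items that deserve a manual check."""
    head = HeadMetadata()
    head.feed(html)
    issues = []
    title = head.title.strip()
    description = head.meta.get("description", "").strip()
    if not title:
        issues.append("missing title")
    if not description:
        issues.append("missing meta description")
    for text in (title, description):
        if any(hint in text.lower() for hint in PLACEHOLDER_HINTS):
            issues.append(f"possible placeholder text: {text!r}")
    if head.canonical is None:
        issues.append("missing canonical link")
    elif head.canonical.rstrip("/") != page_url.rstrip("/"):
        issues.append(f"canonical points elsewhere: {head.canonical}")
    og_url = head.meta.get("og:url")
    if og_url and head.canonical and og_url.rstrip("/") != head.canonical.rstrip("/"):
        issues.append("og:url and canonical disagree")
    if "noindex" in head.meta.get("robots", "").lower():
        issues.append("robots meta contains noindex")
    return issues
```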

Check the mobile layout.

Use a real narrow viewport. Look for crowded sections, clipped elements, awkward stacking, oversized images, and tap targets that only look acceptable on desktop.

Check the copy.

Look for vague claims, repeated rhythms, empty benefit statements, and pages that sound smooth but say very little.
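Repeated rhythm is hard to judge by eye on a long page, so a rough count can help. The sketch below splits text into sentences, measures how much sentence length varies, and reports the most common sentence opener. It is a crude heuristic under stated assumptions: the sentence splitter is naive, the two-word opener is an arbitrary choice, and none of this reflects SiteBlob's stylometric signals.

```python
import re
from collections import Counter
from statistics import mean, pstdev

def split_sentences(text):
    """Naive splitter: breaks after ., !, or ? when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def copy_rhythm(text, opener_words=2):
    """Summarize sentence-length spread and the most repeated opener."""
    sentences = split_sentences(text)
    lengths = [len(s.split()) for s in sentences]
    openers = Counter(
        " ".join(s.lower().split()[:opener_words]) for s in sentences
    )
    top_opener, top_count = openers.most_common(1)[0] if openers else ("", 0)
    return {
        "sentences": len(sentences),
        "mean_length": mean(lengths) if lengths else 0,
        "length_spread": pstdev(lengths) if lengths else 0,
        "top_opener": top_opener,
        "top_opener_share": top_count / len(sentences) if sentences else 0,
    }
```

A low length spread or a high opener share only says the copy is regular. Whether it also says very little is still a judgment a reader has to make.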

Check the browser behavior.

Look for runtime errors, broken requests, hydration issues, layout shifts, and assets that are heavier than they need to be.
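Broken requests and heavy assets can be read from a HAR file, the HTTP Archive export that browser developer tools can save from the network panel. The sketch below scans one. It assumes the standard HAR layout of log, entries, request, and response. The 500,000-byte threshold is an arbitrary illustration. Runtime errors, hydration issues, and layout shifts do not appear in a HAR file and still need a look at the console and the page itself.

```python
import json

def load_har(path):
    """Load a HAR file exported from the browser's network panel."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)

def har_findings(har, heavy_bytes=500_000):
    """List failed requests and large responses from a parsed HAR dict.

    heavy_bytes is an illustrative threshold, not a recommended limit.
    """
    failed, heavy = [], []
    for entry in har["log"]["entries"]:
        url = entry["request"]["url"]
        response = entry["response"]
        status = response.get("status", 0)
        size = response.get("content", {}).get("size", 0)
        # Treat HTTP errors as failures. Status 0 is treated the same way,
        # on the assumption that it marks a request that never completed.
        if status == 0 or status >= 400:
            failed.append((status, url))
        if size >= heavy_bytes:
            heavy.append((size, url))
    return {"failed": failed, "heavy": sorted(heavy, reverse=True)}
```

Typical use would be har_findings(load_har("site.har")), where site.har is whatever file name the export was saved under.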

Check the trust surface.

A polished homepage is not enough. Review the about page, contact path, pricing clarity, policy pages, proof points, screenshots, examples, and any claims that a real buyer would want to verify.

Then run a browser-rendered audit.

SiteBlob is designed for that second pass: not just asking whether the screenshot looks good, but reviewing what the browser actually receives, renders, and exposes.

Research artifacts

For readers who want the supporting material behind this pilot audit, the SiteBlob research artifact kit is available for download.

The kit includes figures, captions, table exports, cohort summaries, and disclosure-safe aggregate results. It does not include individual scanned domains, raw scanner reports, scoring code, feature extraction logic, or rule weights.

The takeaway

A polished screenshot is not the same as a clean deployment.

In this pilot audit, many confirmed AI-built sites that looked acceptable by ordinary quality checks still carried measurable S24 residue.

That does not make S24 an authorship verdict.

It does make it useful as a review layer.

The practical lesson is simple:

Before you publish an AI-built site, check what the builder left behind.

SiteBlob helps review what the browser actually receives, renders, and exposes.

Note: This article is based on an exploratory SiteBlob audit. A public preprint DOI and supplementary artifact link will be added when available. The findings should not be used as proof that an unknown website was built with AI.