We Scanned 131 AI-Built Websites, and the 'AI Look' Wasn't the Biggest Tell

People think AI-built websites are obvious.

They imagine the same gradient hero, the same rounded cards, the same vague startup copy, the same oversized headline, and the same polished-but-empty SaaS sections.

That is the popular theory of the "AI website look."

But in this pilot SiteBlob audit, that was not the strongest tell.

The more useful signal was not the screenshot.

It was what the browser received, rendered, and exposed.

SiteBlob scanned 131 publicly accessible websites attributed to five AI-builder families and compared them with 10 human-led modern open-source control sites.

The first surprise was that ordinary quality did not separate the groups very well.

The average conventional quality score was 45.0 for the confirmed AI-built websites and 43.6 for the controls. The empirical AUC for conventional quality was 0.539. An AUC of 0.5 means no separation at all, so ordinary quality was a weak separator in this pilot cohort.
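For readers who have not met the metric before, the empirical AUC can be read as the share of AI-versus-control pairs in which the AI-attributed site has the higher score, with ties counted as half. The sketch below shows that pairwise calculation in Python. The scores are made up for illustration and are not SiteBlob data, and the function name is our own.

```python
from itertools import product

def empirical_auc(positives, negatives):
    """Share of (positive, negative) pairs where the positive scores higher.

    Ties count as half a win, which matches the usual empirical AUC.
    """
    if not positives or not negatives:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p, n in product(positives, negatives):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical scores for illustration only (not from the audit).
ai_scores = [41.0, 47.5, 52.0, 38.0, 45.0]
control_scores = [40.0, 46.0, 44.5, 43.0]

print(round(empirical_auc(ai_scores, control_scores), 3))
```

The pairwise loop is slow for large cohorts, but for groups of this size it keeps the definition visible.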

The second surprise was even simpler.

The visible "AI look" barely separated the groups at all.

Visual template tropes appeared in 50.4 percent of confirmed AI-built websites and 50.0 percent of the controls.

Almost identical.

So if your AI website audit starts and ends with gradients, cards, motion, and generic SaaS sections, you may be looking at the least useful layer.

The stronger difference appeared beneath the interface.

The short version

This was an exploratory browser-rendered audit, not a population study.

The goal was not to build a magic detector that proves whether an unknown website was made with AI.

The goal was narrower and more practical:

When we compare attributed AI-built websites with human-led controls, which observable website patterns actually differ?

The answer was not ordinary visual polish.

It was structural residue.

By structural residue, we mean observable traces in the delivered website that may reflect how the site was assembled. These traces can appear in implementation, layout, metadata, copy, accessibility, browser behavior, mobile behavior, and repeated production patterns.

SiteBlob groups many of these traces into a secondary composite called S24.

S24 is not a normal quality score. It is not just asking whether a site looks nice or loads correctly. It looks across several signal families, including HTML and CSS structure, embedded styling, visible-template cues, layout regularity, symmetry, stylometric patterns in copy, metadata behavior, trust-surface depth, accessibility baselines, runtime behavior, and other production-pattern signals.

That is why the screenshot was not enough.

A site can look modern and still carry residue from a builder, generator, export process, or rushed AI-assisted workflow.

What was scanned

The audit compared 131 successfully scanned websites attributed to five AI-builder families with 10 human-led modern open-source controls.

The unit of observation was the website.

SiteBlob scanned public pages and looked at what the browser could actually observe after the site was delivered.

That included:

public HTML
CSS structure
metadata
rendered DOM
runtime behavior
mobile layout behavior
accessibility and semantic baselines
site structure
visible template cues
copy and stylometric patterns
trust-surface signals

Attribution was established before scanning and independently from the measured outcomes.

That matters because the audit was not trying to guess authorship from the same signals it later measured. It was comparing already attributed groups to see which patterns were actually useful.

The visual story was weaker than expected

A lot of people expect AI-built sites to fail the eye test.

Sometimes they do.

But that was not the strongest pattern in this pilot audit.

Conventional website quality overlapped substantially between confirmed AI-built websites and controls. The AI-built group averaged 45.0. The controls averaged 43.6.

That is not the kind of gap that supports a simple story like "AI sites are obviously worse."

The common visual tropes were even less useful.

Visual template tropes appeared in 50.4 percent of the confirmed AI-built websites and 50.0 percent of the controls.

In plain English: about half of each group showed the tropes, so spotting them told you almost nothing about which group a site belonged to.

That does not mean visual clichés are fake. It means they are not enough.

Gradient backgrounds, rounded cards, generic hero sections, and polished SaaS layouts are now common across the web. Human designers use them. Templates use them. No-code tools use them. AI builders use them too.

So the visible look can be memorable without being very diagnostic.

Figure: Conventional website quality scores and full S24 distributions for attributed AI-built websites and human-led controls. In this pilot cohort, conventional quality scores overlapped substantially, while the secondary full S24 composite showed stronger observed separation.

The stronger tell was structural residue

The clearer difference appeared below the screenshot layer.

One recurring bundle stood out:

embedded styling
limited observable trust surface
semantic or accessibility baseline gaps

That bundle appeared in 108 of 131 confirmed AI-built websites, or 82.4 percent.

It appeared in 1 of 10 controls, or 10.0 percent.

It also appeared across every confirmed builder family in the pilot cohort.
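The counts above are easy to recompute, and it helps to see how much uncertainty a control group of 10 carries. The sketch below reproduces the percentages and adds a Wilson score interval for each group. The interval is our illustrative addition, assuming a simple binomial model; it is not part of the reported results.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half

# Counts reported in the article.
groups = {"AI-attributed": (108, 131), "controls": (1, 10)}

for name, (hits, n) in groups.items():
    low, high = wilson_interval(hits, n)
    print(f"{name}: {hits}/{n} = {hits / n:.1%}, interval {low:.1%} to {high:.1%}")
```

With only 10 controls, the interval for the control group is wide, which is one reason to treat the comparison as exploratory.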

This is much more interesting than saying "AI sites use gradients."

The bundle points to a production pattern. The site may look acceptable, but the delivered system still carries traces of how it was assembled: styling choices, shallow trust surfaces, weak semantic structure, or accessibility gaps that a visual screenshot does not reveal.

That does not make the bundle proof of AI authorship.

A human-built site can have the same weaknesses. A rushed agency site can have them. A no-code export can have them. A template site can have them.

But as an audit signal, the bundle was far more useful than the visible AI-look clichés.

Figure: Recurring structural-residue bundles across attributed AI-built websites and human-led controls. In this pilot cohort, a recurring bundle involving embedded styling, limited observable trust surface, and semantic or accessibility gaps was common across attributed AI-built websites and uncommon in the observed human-led controls.

Why screenshots are weak evidence

A screenshot hides most of the deployment story.

It does not show how the HTML is structured.

It does not show whether the CSS is clean, concentrated, duplicated, or embedded in ways that make the site harder to maintain.

It does not show whether the metadata is coherent.

It does not show whether headings and landmarks are meaningful.

It does not show runtime errors, broken requests, mobile crowding, thin trust surfaces, or copy patterns that repeat across sections.

That is why public conversations about vibe-coded websites can get stuck.

People argue about whether a page "looks AI" when the more useful question is what the finished page exposes to the browser.

The screenshot is only one layer.

The browser sees much more.

What S24 adds to the review

S24 is SiteBlob's secondary structural-residue composite.

It is designed to look beyond ordinary quality and visible polish.

A normal quality score can tell you whether a website seems usable, complete, and technically acceptable.

S24 asks a different question:

Does the finished website carry enough structural, stylistic, metadata, layout, copy, or implementation residue to deserve a deeper review?

That includes things like:

concentrated embedded styling
repeated structural fragments
overly regular section layouts
symmetry and spacing patterns
stylometric sameness in copy
weak semantic HTML
accessibility baseline gaps
thin or shallow trust surfaces
metadata and canonical issues
runtime noise
mobile layout fragility
visible-template cues
other production-pattern signals

This is why S24 should be handled carefully.

It is not an authorship verdict.

It should not be used to claim that an unknown website was definitely built with AI.

It is better understood as a review layer. It helps surface the gap between how a website looks and what its delivered structure suggests.

Why this matters for AI website builders

AI website builders are useful.

They make it easier to get from idea to live page. They help founders move faster. They help non-technical teams test positioning, launch landing pages, and create first versions without waiting weeks.

The problem is not that AI builders exist.

The problem is assuming that a good-looking generated site has already been reviewed as a deployed system.

A builder-mediated workflow can produce a page that looks launch-ready on the first screen. That does not guarantee clean semantic HTML, healthy metadata, strong accessibility basics, stable JavaScript rendering, credible trust surfaces, or a maintainable structure.

The useful question is not:

"Does this look AI-made?"

The useful question is:

"What did the publishing process leave behind?"

Why this matters for agencies and freelancers

For agencies, the risk is not only bad output.

The bigger risk is polished output that passes visual approval too early.

A client reviews the homepage. The hero looks good. The cards are aligned. The colors feel professional. The site goes live.

Three weeks later, the cleanup starts.

Someone notices weak metadata. Someone has to add missing trust pages. Someone finds accessibility gaps. Someone tries to extend the site and realizes the structure is brittle. Someone discovers that the mobile layout only looked fine on the original preview width.

Visual approval is not QA.

If AI-assisted production is part of the workflow, the handoff needs a second pass that reviews the delivered system, not just the visible design.

Why this matters for SEO

Search engines do not evaluate a screenshot.

They process the delivered website.

That means HTML, metadata, canonical tags, mobile behavior, JavaScript rendering, internal structure, and content clarity all matter.

This pilot audit did not measure ranking loss. It did not measure Search Console outcomes, indexing outcomes, manual actions, field Core Web Vitals, or traffic changes.

So the conclusion is not:

"Google penalizes AI websites."

The better conclusion is:

Some structural-residue patterns overlap with implementation areas that can matter for discoverability, maintainability, and technical SEO.

If a site ships with weak metadata, awkward canonical behavior, fragile JavaScript rendering, mobile layout problems, thin semantics, shallow trust surfaces, or runtime noise, those are worth reviewing before publication.

That is true whether the site was made by an AI builder, a no-code tool, a template, or a human team moving too fast.

What to check before trusting the visible layer

Before you approve an AI-built or AI-assisted website, look beyond the screenshot.

Check the rendered HTML.

Check whether headings, landmarks, sections, links, and page structure are meaningful.
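A first pass at this check can be scripted. The sketch below uses Python's standard html.parser to count headings and landmark elements in a saved copy of the page's HTML. It assumes you have already captured the rendered markup, and the landmark list and report fields are our own choices, not SiteBlob's rules.

```python
from html.parser import HTMLParser

# Assumed set of landmark elements worth looking for.
LANDMARKS = {"header", "nav", "main", "footer", "aside"}

class StructureScan(HTMLParser):
    """Collects heading levels and landmark elements in document order."""

    def __init__(self):
        super().__init__()
        self.heading_levels = []
        self.landmarks = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.heading_levels.append(int(tag[1]))
        elif tag in LANDMARKS:
            self.landmarks.append(tag)

def structure_report(html):
    scan = StructureScan()
    scan.feed(html)
    levels = scan.heading_levels
    # A jump such as h1 -> h3 skips a heading level.
    skipped = sum(1 for a, b in zip(levels, levels[1:]) if b - a > 1)
    return {
        "h1_count": levels.count(1),
        "skipped_heading_levels": skipped,
        "has_main": "main" in scan.landmarks,
        "landmarks": sorted(set(scan.landmarks)),
    }

sample = """
<html><body>
<div class="hero"><h1>Launch faster</h1></div>
<div><h3>Features</h3><h3>Pricing</h3></div>
<footer>Contact</footer>
</body></html>
"""
print(structure_report(sample))
```

A script like this only counts elements. Whether the headings are meaningful still needs a human read.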

Check the CSS.

Look for concentrated inline styling, duplicated structure, brittle layout choices, or builder-export patterns that will make later edits painful.
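One rough way to see how concentrated the embedded styling is: count how many elements carry a style attribute and how much CSS sits in style blocks. The sketch below does that with Python's standard html.parser. The field names are hypothetical, and no cut-off is implied; what counts as "too much" is a judgment call for your own project.

```python
from html.parser import HTMLParser

class StyleScan(HTMLParser):
    """Counts elements, inline style attributes, and <style> block size."""

    def __init__(self):
        super().__init__()
        self.elements = 0
        self.inline_styled = 0
        self.style_block_chars = 0
        self._in_style = False

    def handle_starttag(self, tag, attrs):
        self.elements += 1
        # Count only non-empty style="..." attributes.
        if any(name == "style" and value for name, value in attrs):
            self.inline_styled += 1
        if tag == "style":
            self._in_style = True

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False

    def handle_data(self, data):
        if self._in_style:
            self.style_block_chars += len(data)

def embedded_style_summary(html):
    scan = StyleScan()
    scan.feed(html)
    scan.close()
    ratio = scan.inline_styled / scan.elements if scan.elements else 0.0
    return {
        "elements": scan.elements,
        "inline_styled": scan.inline_styled,
        "inline_ratio": round(ratio, 2),
        "style_block_chars": scan.style_block_chars,
    }

sample = (
    "<style>.card{border-radius:12px}</style>"
    '<div style="display:flex;gap:24px">'
    '<p style="color:#111">One</p><p>Two</p></div>'
)
print(embedded_style_summary(sample))
```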

Check the metadata.

Review the title, description, canonical tag, Open Graph data, schema, and robots behavior.
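Part of this review can be automated as a presence check. The sketch below, again using Python's standard html.parser, lists which of a few expected tags are missing and whether a robots meta tag contains noindex. The expected list is an assumption for illustration, schema markup is not covered, and presence alone says nothing about whether the values are good.

```python
from html.parser import HTMLParser

# Assumed checklist; extend it to match your own publishing standards.
EXPECTED = ("title", "description", "canonical", "og:title", "og:image")

class MetaScan(HTMLParser):
    """Collects the title, meta name/property values, and the canonical link."""

    def __init__(self):
        super().__init__()
        self.found = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            key = a.get("name") or a.get("property")
            if key and a.get("content"):
                self.found[key.lower()] = a["content"]
        elif tag == "link" and "canonical" in (a.get("rel") or "").lower().split():
            self.found["canonical"] = a.get("href") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.found["title"] = self.found.get("title", "") + data.strip()

def metadata_report(html):
    scan = MetaScan()
    scan.feed(html)
    missing = [key for key in EXPECTED if not scan.found.get(key)]
    noindex = "noindex" in scan.found.get("robots", "").lower()
    return {"missing": missing, "noindex": noindex}

sample = """
<head>
<title>Acme</title>
<meta name="description" content="Tools for teams">
<meta property="og:title" content="Acme">
<meta name="robots" content="noindex, nofollow">
</head>
"""
print(metadata_report(sample))
```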

Check the mobile layout.

Use a narrow viewport, not just a desktop preview. Look for crowding, clipping, awkward stacking, oversized media, and tap targets that feel cramped.

Check the copy.

Look for vague claims, repeated rhythms, empty benefit statements, and sections that sound polished but say very little.
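Repeated rhythms are easier to spot with a small count. The sketch below is a deliberately crude illustration: it splits copy into sentences and reports any two-word opening that recurs. The function name, the two-word window, and the sample copy are all our own assumptions, and this is far simpler than a real stylometric analysis.

```python
import re
from collections import Counter

def repeated_openers(text, words=2, min_count=2):
    """Return sentence openings (first `words` words) seen at least `min_count` times."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    openers = Counter()
    for sentence in sentences:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        if len(tokens) >= words:
            openers[" ".join(tokens[:words])] += 1
    return {opener: n for opener, n in openers.items() if n >= min_count}

# Made-up marketing copy for illustration.
sample = (
    "Unlock your potential with smarter workflows. "
    "Unlock your team's best work today. "
    "Built for modern teams. "
    "Unlock your growth in minutes."
)
print(repeated_openers(sample))
```

A repeated opener is not a problem by itself. It is a prompt to reread the section and ask whether each sentence says something new.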

Check the browser behavior.

Look for runtime errors, broken requests, hydration issues, layout shifts, and unnecessary asset weight.

Check the trust surface.

A homepage is not enough. Review the about page, contact path, pricing clarity, policies, proof points, examples, screenshots, and any claims a real buyer would want to verify.
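A quick way to start this check is to list which trust-related pages the homepage links to at all. The sketch below scans anchor tags with Python's standard html.parser and matches them against a keyword map. The map is hypothetical and should be adapted to what a real buyer of your product would look for.

```python
from html.parser import HTMLParser

# Hypothetical keyword map; adjust to the pages a real buyer would expect.
TRUST_PAGES = {
    "about": ("about",),
    "contact": ("contact",),
    "pricing": ("pricing", "plans"),
    "privacy": ("privacy",),
    "terms": ("terms",),
}

class LinkScan(HTMLParser):
    """Collects each link as a lower-cased 'href text' string."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href + " " + " ".join(self._text)).lower())
            self._href = None

def missing_trust_links(html):
    scan = LinkScan()
    scan.feed(html)
    return [
        page
        for page, keys in TRUST_PAGES.items()
        if not any(k in link for link in scan.links for k in keys)
    ]

sample = (
    '<nav><a href="/">Home</a><a href="/pricing">Plans</a>'
    '<a href="/contact-us">Talk to us</a></nav>'
    '<footer><a href="/legal/privacy">Privacy Policy</a></footer>'
)
print(missing_trust_links(sample))
```

A link only shows that a page is referenced. Whether the page is credible and complete still needs a manual read.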

Then run a browser-rendered audit.

That second pass is where the more useful signals appear.

Research artifacts

For readers who want the supporting material behind this pilot audit, the SiteBlob research artifact kit is available for download.

The kit includes figures, captions, table exports, cohort summaries, and disclosure-safe aggregate results. It does not include individual scanned domains, raw scanner reports, scoring code, feature extraction logic, or rule weights.

The takeaway

The "AI look" was not the biggest tell in this pilot audit.

Ordinary quality scores overlapped.

Visual template tropes were almost identical between confirmed AI-built sites and controls.

The stronger signal appeared beneath the interface, in structural residue.

That does not make S24 a magic authorship detector.

It does make structural-residue review useful.

A polished screenshot is not the same as a clean deployment.

SiteBlob helps review what the browser actually receives, renders, and exposes.

Scan your site with SiteBlob and look beyond the screenshot

Note: This article is based on an exploratory SiteBlob audit. A public preprint DOI and supplementary artifact link will be added when available. The findings should not be used as proof that an unknown website was built with AI.