Does Your CMS Actually Output Semantic HTML? WordPress and Drupal, Audited


When a business chooses WordPress or Drupal for its website, there is often an implicit assumption: these are professional, well-supported platforms, so the technical foundation must already be handled.

That assumption is partially correct. And partially dangerous.

Both WordPress and Drupal are legitimate enterprise-grade platforms. But neither one guarantees that your finished website outputs clean semantic HTML. The CMS is the factory. What matters for SEO is what the factory produces — and that is determined not by the platform brand, but by the theme, the page builder, and the development decisions sitting on top of it.

Most Malaysian business websites are built on WordPress with a third-party theme and a page builder. On that combination, the probability of clean semantic HTML output is low. Not because of WordPress — but because of what was installed on top of it.


Why This Distinction Matters

A CMS manages your content. It provides the publishing interface, the database, the user management, and the delivery mechanism. What it outputs as HTML is largely determined by its theme — the layer of templates that converts your content into a web page.

Think of it this way: the CMS is the kitchen. The theme is the recipe. And the page builder is the cook who decides to improvise everything.

When developers or agencies choose themes and page builders based on visual output, they are selecting the recipe based on how the food photographs — without checking whether it is nutritionally sound. The website looks good. The HTML it produces may be structurally incoherent.

This is not a theoretical risk. It is the most common technical SEO deficit I find when auditing Malaysian business websites.


WordPress — Audited

WordPress powers an estimated 43% of all websites globally. Its ubiquity makes this audit particularly important.

What WordPress Core Gets Right

The WordPress block editor (Gutenberg) outputs reasonably semantic HTML when you use its native blocks. A paragraph block produces a <p> tag. A heading block produces whichever heading level you select, such as <h2> or <h3>. The default Twenty Twenty-Four theme includes <main>, <article>, <header>, and <nav> in its base templates. On its own, WordPress core is structurally respectable.

Where WordPress Installations Break Down

The problem begins the moment a third-party theme or page builder is installed, which is the case on nearly every commercial WordPress project.

Page builders are the primary offender. Tools like Elementor, WPBakery, and Divi generate pages through a drag-and-drop interface that produces layers of <div> containers, wrapper elements, and inline styling. The visual output can be polished. The structural output is often a cascade of non-semantic tags that carry no meaning for search engines.

A heading styled through a page builder widget may render as a <div> with a custom CSS class instead of an <h2>. A content block may exist inside five nested <div> containers with no <article> or <section> wrapper in sight. The crawler sees the same structural weight assigned to a button label as to your primary article content.
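To make the nesting problem concrete, here is a minimal Python sketch that measures how deeply <div> elements are nested in a piece of HTML. It uses only the standard library. The names DivDepthMeter and max_div_depth are hypothetical, and both markup samples are invented for illustration; they are not the actual output of any particular page builder.

```python
from html.parser import HTMLParser


class DivDepthMeter(HTMLParser):
    """Tracks the deepest level of <div>-inside-<div> nesting."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1
            self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        # Guard against stray closing tags in malformed markup.
        if tag == "div" and self.depth > 0:
            self.depth -= 1


def max_div_depth(html):
    """Return the maximum <div> nesting depth found in the HTML string."""
    meter = DivDepthMeter()
    meter.feed(html)
    meter.close()
    return meter.max_depth


# Invented samples: a heading wrapped in five containers vs. a semantic version.
builder_style = "<div><div><div><div><div>Our Services</div></div></div></div></div>"
semantic = "<main><article><h2>Our Services</h2></article></main>"

print(max_div_depth(builder_style))  # 5
print(max_div_depth(semantic))       # 0
```

A high number is not proof of a problem on its own, but it is a quick signal that a page deserves a closer look.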

Premium themes carry similar risk. The word "premium" describes the pricing, not the code quality. Many popular commercial themes produce non-semantic output, particularly when they include their own bundled page builder or shortcode system.

Plugin conflicts add a further layer. Plugins that inject content — related posts, review widgets, promotional banners — often place that content outside logical structural boundaries, adding topical noise to pages where it does not belong.

WordPress Verdict: WordPress can produce clean semantic HTML. Most WordPress installations in the Malaysian market do not — because of theme and page builder choices made during development.


Drupal — Audited

Drupal is less common in the Malaysian SME market but is often the choice for larger organisations, government entities, and companies with significant content management requirements.

What Drupal Gets Right

Drupal has maintained stronger semantic defaults than WordPress for longer. Since Drupal 8, the platform has used the Twig templating engine, which enforces a cleaner separation between content structure and visual presentation. Developers working in Twig have to think more explicitly about the HTML they are producing, and the workflow does not encourage structural shortcuts the way a drag-and-drop builder does.

Drupal's default content type templates produce appropriate semantic wrappers. The architecture of the platform — modules, content types, display modes — lends itself to structural discipline when a skilled development team is involved.

Where Drupal Installations Break Down

Contributed modules can introduce non-semantic wrappers, particularly older modules that predate Drupal 8's Twig migration. Installing a popular contributed module does not guarantee its output meets current semantic standards.

Custom theme overrides by developers focused on visual delivery can produce the same <div>-heavy output as a WordPress page builder. The platform does not enforce semantic correctness — it enables it. Whether that enablement is used depends on the developer.

Drupal 7 sites — and there are still a meaningful number running in Malaysia — predate the Twig era and frequently output non-semantic markup by default. If your organisation is on Drupal 7, structural HTML quality should be assumed to need remediation.

Drupal Verdict: Drupal is structurally sounder by design, with better defaults and a development workflow that encourages semantic discipline. Enterprise customisation and legacy versions introduce the same risks found in WordPress.


The Real Culprit: The Layer Between You and the Platform

Both audits point to the same conclusion: the CMS brand is not the variable that determines your HTML quality. The development layer is.

Page builders trade structural correctness for ease of use. That is a reasonable trade-off when you understand what you are trading. Most businesses do not know the trade is being made.

When an agency recommends Elementor because "it makes the website easy to update," they are making a structural decision on your behalf without disclosing its SEO implications. The visual convenience is real. So is the semantic cost.

Before your next website project or redesign, the question to ask is not "which CMS should we use?" The question is: "what will this CMS and theme combination actually output as HTML, and who will verify that it is structurally correct?"


How to Audit Your Current Website Without a Developer

You do not need technical expertise to get a preliminary answer.

Step 1 — View Page Source

Right-click any page on your website and select "View Page Source." Press Ctrl+F (Cmd+F on a Mac) and search for <main>, <article>, and <section> in turn. If you find mostly <div> tags and none of these semantic landmarks, the structure likely needs attention.
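If you do have someone technical on hand, the same check can be scripted. The following is a minimal sketch, assuming you have saved the page source as a string; it counts <div> tags against a handful of semantic landmark elements using Python's standard library. The names TagCounter and landmark_report, the choice of which elements count as landmarks, and the sample markup are all illustrative assumptions.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements treated as semantic landmarks in this sketch (an illustrative choice).
LANDMARKS = {"main", "article", "section", "header", "nav", "footer", "aside"}


class TagCounter(HTMLParser):
    """Counts every opening tag seen in an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def landmark_report(html):
    """Return the <div> count and a count for each landmark element present."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    landmarks = {
        tag: parser.counts[tag] for tag in sorted(LANDMARKS) if parser.counts[tag]
    }
    return {"div": parser.counts["div"], "landmarks": landmarks}


# Invented sample page source.
sample = """
<body>
  <div class="wrap"><div class="row"><div class="col">Welcome</div></div></div>
  <main><article><h1>Title</h1><p>Body</p></article></main>
</body>
"""
report = landmark_report(sample)
print(report)  # {'div': 3, 'landmarks': {'article': 1, 'main': 1}}
```

An empty "landmarks" entry alongside a large "div" count is the scripted equivalent of finding nothing when you search the page source by hand.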

Step 2 — Check Your Heading Hierarchy

Open Chrome DevTools (F12), go to the Elements panel, and look at your heading tags. Does the page start with one <h1>? Do <h2> and <h3> follow in logical order? A page with no <h1>, or one that begins with <h3>, has a broken topic signal.
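The heading check can also be sketched in a few lines of Python. This is one possible interpretation of "logical order," not a definitive rule set: it flags a missing <h1>, more than one <h1>, a first heading that is not an <h1>, and any jump that skips a level (for example <h1> straight to <h3>). The names HeadingCollector and heading_issues are hypothetical, and the sample markup is invented.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Records heading levels (1 to 6) in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self.levels.append(int(tag[1]))


def heading_issues(html):
    """Return a list of human-readable problems with the heading outline."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    levels = parser.levels
    issues = []

    h1_count = levels.count(1)
    if h1_count == 0:
        issues.append("no <h1> found")
    elif h1_count > 1:
        issues.append(f"{h1_count} <h1> elements found")

    if levels and levels[0] != 1:
        issues.append(f"first heading is <h{levels[0]}>, not <h1>")

    # Compare each heading with the one before it to catch skipped levels.
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            issues.append(f"<h{previous}> is followed by <h{current}>, skipping a level")

    return issues


print(heading_issues("<h1>Home</h1><h2>Services</h2><h3>SEO</h3>"))  # []
print(heading_issues("<h3>Services</h3><h2>About</h2>"))
# ['no <h1> found', 'first heading is <h3>, not <h1>']
```

An empty list means the outline passes these particular checks; it does not confirm that the headings describe the content well, which still needs a human read.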

Step 3 — Run Google Lighthouse

In Chrome, open DevTools, go to the Lighthouse tab, and run an Accessibility audit. The accessibility score is a useful, though partial, proxy for semantic HTML quality, because accessible HTML and semantic HTML overlap heavily and Lighthouse only runs automated checks. A score below 80 warrants investigation.

Step 4 — Use the W3C Markup Validator

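Visit validator.w3.org, enter your URL, and review the output. The validator flags markup errors such as unclosed tags and invalid nesting, without requiring you to read the code yourself. It checks validity rather than how well semantic elements are used, so read its results alongside the other steps.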


Questions to Ask Your Developer Before the Next Project

These four questions will tell you more about your website's structural quality than any agency presentation:

  1. "Which theme and page builder are you recommending, and do they output semantic HTML by default?"
  2. "Will the heading hierarchy be structurally correct across all page types — homepage, article, service page, contact?"
  3. "Can you show me the Lighthouse accessibility score from your last three delivered projects?"
  4. "How will you verify semantic structure before the site goes live?"

If the developer treats these as unusual questions, treat that as a structural risk signal.


Key Takeaway for Business Owners

The platform name on your website — WordPress, Drupal — tells you almost nothing about the quality of its HTML output. The theme, the page builder, and the developer's discipline determine whether your website communicates clearly to search engines. Most Malaysian business websites are built for visual approval. Structural SEO quality is assumed, not verified.


What to Do Next

A technical audit of your existing website will identify whether your CMS and theme combination is producing semantic HTML. If you are planning a redesign or new website, a structural brief should precede the design brief. The HTML specification should be set before the visual design begins, not retrofitted after launch.

Request a Website Technical Audit →


Bryan Chung
Digital Solutions Strategist
Entertop Sdn Bhd
