Validating the BuildSpec Approach on Magento 2: When AI Code Generation Doesn't Need AI

Florinel Chis · March 2026


I previously built a structured specification pipeline for Laravel PHP code generation that eliminated semantic hallucinations by replacing natural language prompts with validated JSON specs. The obvious question was: does this approach generalize beyond one framework?

I tested it on Magento 2's declarative schema system (db_schema.xml). The result surprised me — not because the approach worked, but because of how it worked.

The generation step didn't need an AI model at all.

The Setup

Magento 2 uses XML files (db_schema.xml) to declare database schemas — tables, columns, constraints, indexes. Unlike Laravel's PHP migrations, which contain imperative logic and admit effectively unlimited patterns, Magento's schema is purely declarative with a fixed XML format.

I built:

  1. A MagentoSchemaSpec JSON format — the structured spec for Magento schemas
  2. A spec compiler — validates column types, constraints, naming conventions, cross-table references
  3. A deterministic generator — transforms validated specs directly to db_schema.xml + db_schema_whitelist.json
  4. A reverse-engineering tool — parses existing Magento core db_schema.xml into specs (for training data)
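
To give a flavor of item 2, the compiler's checks reduce to rule evaluation over the spec. A minimal sketch in Python, with an abbreviated rule set; the spec keys (tables, columns, foreign_keys) are my guess at the format, not the project's actual schema:

# Minimal sketch of the compiler's validation pass. The rule set is
# abbreviated and the spec keys are assumptions, not the actual format.
VALID_TYPES = {"int", "smallint", "bigint", "varchar", "text",
               "decimal", "datetime", "timestamp", "boolean"}

def compile_spec(spec: dict) -> list:
    """Return a list of validation errors; empty means the spec is valid."""
    errors = []
    known_tables = {t["name"] for t in spec["tables"]}
    for table in spec["tables"]:
        for col in table["columns"]:
            if col["type"] not in VALID_TYPES:
                errors.append(f"{table['name']}.{col['name']}: unknown type {col['type']!r}")
        # Cross-table references must point at tables declared in this spec.
        for fk in table.get("foreign_keys", []):
            ref_table = fk["references"].split(".")[0]
            if ref_table not in known_tables:
                errors.append(f"{table['name']}: foreign key references unknown table {ref_table!r}")
    return errors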

The Key Discovery: Deterministic Generation

For Laravel, the LLM generates PHP code from specs — there are many ways to write a model class, and the model must decide formatting, method order, trait placement, etc.

For Magento's db_schema.xml, there's exactly ONE correct XML output for a given spec. The mapping is purely mechanical:

Spec: {"name": "entity_id", "type": "int", "unsigned": true, "identity": true}
         ↓ (deterministic)
XML:  <column xsi:type="int" name="entity_id" unsigned="true" identity="true"/>

No AI needed. No hallucination possible. The generator is 150 lines of Python that produces correct XML every time.
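
A stripped-down sketch of that transformation, using only the standard library. The column attributes follow the example above; the table-level keys and defaults (resource, engine) are my assumptions for illustration:

# Stripped-down sketch of the deterministic generator. Column attributes
# follow the example above; table-level keys and defaults are assumptions.
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
ET.register_namespace("xsi", XSI)

def generate_db_schema(spec: dict) -> str:
    schema = ET.Element("schema")
    for t in spec["tables"]:
        table = ET.SubElement(schema, "table", name=t["name"],
                              resource="default", engine="innodb")
        for col in t["columns"]:
            attrs = {f"{{{XSI}}}type": col["type"], "name": col["name"]}
            # Each boolean flag maps 1:1 onto an XML attribute;
            # there is no formatting or ordering decision to make.
            for flag in ("unsigned", "nullable", "identity"):
                if flag in col:
                    attrs[flag] = "true" if col[flag] else "false"
            ET.SubElement(table, "column", attrs)
    return ET.tostring(schema, encoding="unicode")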

The LLM is only needed for one step: converting natural language to the structured spec. Everything downstream is deterministic.

Reverse-Engineering Real Magento Code

Instead of hand-crafting training data, I reverse-engineered it from actual Magento 2.4.8 core code:

  1. Found 85 db_schema.xml files across Magento vendor modules
  2. Parsed each XML into MagentoSchemaSpec JSON
  3. Hand-wrote natural language descriptions for 43 modules (not auto-generated)
  4. Verified round-trip fidelity: parse → spec → compile → regenerate → compare

Round-trip results: 43/43 modules match (100%). 160 tables, 1,053 columns verified.
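
The comparison at the end of that loop is what backs the fidelity claim. A sketch, assuming a parse_db_schema() counterpart to the generator above (a placeholder name, not the real tool's):

# Sketch of the round-trip check; parse_db_schema is a placeholder for
# the project's actual XML -> spec parser.
import xml.etree.ElementTree as ET
from pathlib import Path

def round_trip_ok(xml_path: Path) -> bool:
    original = xml_path.read_text()
    spec = parse_db_schema(original)        # XML -> MagentoSchemaSpec
    regenerated = generate_db_schema(spec)  # spec -> XML, deterministic
    # Canonicalize both sides so attribute order and whitespace
    # differences don't count as mismatches.
    return ET.canonicalize(original) == ET.canonicalize(regenerated)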

The training data covers real-world Magento patterns:

  - Simple entities (Admin Notifications: 2 tables, 11 columns)
  - Store-scoped content (CMS: 4 tables with link tables)
  - Complex relationships (Newsletter: 6 tables with queue management)
  - Customer/product references (Wishlist, Product Alerts)
  - Payment integration (PayPal: 8 tables)

Training the Planner

I fine-tuned Qwen2.5-Coder-7B-Instruct with LoRA on 33 training examples (filtered down from 52 to those fitting within 1,800 tokens, so training runs on 16 GB of RAM).

After 300 iterations with 8 trainable layers, the model generates:

  - Valid JSON with correct MagentoSchemaSpec structure
  - Correct Magento table naming (acme_productwarranty_warranty)
  - Multi-table specs with foreign key relationships
  - Proper auto-increment primary keys
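
Concretely, for a warranty-module request the planner's output has roughly this shape. The table names follow the post; the structural keys are my assumption about the spec format:

# Representative planner output; table names match the post, the
# structural keys are assumptions about the spec format.
planner_output = {
    "module": "Acme_ProductWarranty",
    "tables": [
        {
            "name": "acme_productwarranty_warranty",
            "columns": [
                {"name": "warranty_id", "type": "int",
                 "unsigned": True, "identity": True},
            ],
        },
        {
            "name": "acme_productwarranty_claim",
            "columns": [
                {"name": "claim_id", "type": "int",
                 "unsigned": True, "identity": True},
                {"name": "warranty_id", "type": "int", "unsigned": True},
            ],
            "foreign_keys": [
                {"column": "warranty_id",
                 "references": "acme_productwarranty_warranty.warranty_id"},
            ],
        },
    ],
}

The sparse column lists in this illustration are deliberate: they mirror the gap described next.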

What it doesn't yet do well: generate complete column sets. When asked for a warranty module with serial numbers, dates, and status fields, it creates the right tables and relationships but outputs minimal columns. This matches the trajectory I saw with Laravel — the model learns format first, then structure, then content detail. More training data and iterations would close this gap.

What This Means for the BuildSpec Thesis

The Magento experiment validates three things:

1. The approach generalizes across frameworks.

The same pattern — structured spec → compiler validation → code generation — works for both Laravel (PHP output) and Magento (XML output). The spec format is different, the compiler rules are different, the output is different, but the architecture is identical.

2. For declarative formats, AI is only needed for intent capture.

This is the strongest finding. The pipeline reduces to:

Stage             Needs AI?   Why
NL → Spec         Yes         Interpreting human intent is genuinely ambiguous
Spec → Validate   No          Deterministic constraint checking
Spec → XML        No          Deterministic transformation

The AI's role is strictly ontology population — converting unstructured human intent into a structured domain ontology. Once the ontology is populated, everything else is mechanical.
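
In code, the whole pipeline collapses to a three-step composition in which only the first call involves a model. The function names reuse the sketches above; plan_with_llm is a placeholder for the fine-tuned planner:

# End-to-end shape of the pipeline; plan_with_llm is a placeholder.
def build(nl_request: str) -> str:
    spec = plan_with_llm(nl_request)  # the only probabilistic step
    errors = compile_spec(spec)       # deterministic validation
    if errors:
        raise ValueError("; ".join(errors))
    return generate_db_schema(spec)   # deterministic transformation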

3. Real codebases are training data goldmines.

Instead of hand-crafting examples, I reverse-engineered 43 modules from actual Magento core code. The specs are guaranteed correct (they come from working software), the patterns are diverse (core modules cover everything from CMS to PayPal), and the effort was a single Python script.
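
That script is little more than a directory walk feeding the parser. A sketch against the standard vendor layout, with parse_db_schema the same placeholder as above:

# Sketch of harvesting specs from an installed Magento 2 tree.
# parse_db_schema is the same placeholder parser as above.
from pathlib import Path

specs = {}
for xml_file in Path("vendor/magento").rglob("etc/db_schema.xml"):
    module = xml_file.parts[-3]  # e.g. "module-cms"
    specs[module] = parse_db_schema(xml_file.read_text())
print(f"parsed {len(specs)} modules")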

Try It

# Generate db_schema.xml from a spec file (no LLM needed)
cd magento2/
python3 magento_schema_compiler.py examples/blog_module.json --generate --output ./out

# Or use the LLM planner (experimental)
python3 magento_planner.py --generate "Create a FAQ module with categories and questions"

Limitations

The planner is the weak link: trained on only 33 examples, it still produces minimal column sets for complex requests, so its output needs review before use. The deterministic pipeline covers exactly one artifact type, db_schema.xml; every other Magento artifact remains future work. And the NL → spec step is still where errors can enter: the compiler verifies that a spec is valid, not that it captures what the user meant.

What's Next

The logical extension is adding more Magento artifact types to the spec format: models, repositories, API interfaces, admin grids, plugins. Each would follow the same pattern — define a spec format, build a compiler, determine whether generation is deterministic or needs an LLM.

For schema generation specifically, the deterministic pipeline is already production-usable. Give it a spec, get correct XML. No model required, no hallucinations possible.


This is part of an ongoing research project on domain ontologies for AI code generation. Previous work: The Ontological Gap in AI Code Generation.