"Each barrier is no more than a target to tear down."
How a single practical need — parsing CSV files correctly in VBA — grew into a complete scripting language, a peer-reviewed research paper, and tools being adopted by developers worldwide.
I discovered that Microsoft Excel is not fully compliant with RFC-4180, and provides no built-in mechanism to dump CSV content directly into a VBA array. Excel's legacy import tools, Power Query, and even Power BI each fail on files that deviate from their narrow assumptions — like files using the apostrophe as a text qualifier instead of double quotes.
| API | Records | Fields | Correct? |
|---|---|---|---|
| From Text (Legacy) | 5 | 4 | ✗ |
| Power Query | 5 | 6 | ✗ |
| CSV Interface | 4 | 2 | ✓ |
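To make the apostrophe-qualifier case concrete outside of Excel, here is a minimal Python sketch using the standard `csv` module. The sample text and the `parse` helper are hypothetical and are not the file behind the table above. The sketch only shows how the same bytes yield different field counts depending on whether the parser is told which text qualifier is in use.

```python
import csv
import io

# Hypothetical sample: apostrophes, not double quotes, wrap the text fields.
SAMPLE = "id,comment\n1,'hello, world'\n2,'plain'\n"


def parse(text, quotechar):
    """Parse CSV text with the given text qualifier and return a list of rows."""
    return [row for row in csv.reader(io.StringIO(text), quotechar=quotechar)]


# A parser that assumes double quotes splits the quoted field at its inner comma.
default_rows = parse(SAMPLE, '"')

# Declaring the apostrophe as the qualifier keeps the field intact.
apostrophe_rows = parse(SAMPLE, "'")

print(default_rows[1])
print(apostrophe_rows[1])
```

With the qualifier declared, every record has two fields. Without it, the second record gains a spurious third field.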
"I'm sure your tool fills some specific use case but I don't think you've done a very good job of explaining/selling it."
— r/vba community member

I built a complete, RFC-4180-compliant CSV parser in pure VBA with configurable text qualifiers, delimiters, escape characters, and direct array import — the capability Excel itself lacks.
CSV Interface became the foundation from which every subsequent project emerged.
🔗 VBA CSV Interface on GitHub →
When told my tool was poorly explained, I didn't argue — I showed a side-by-side comparison where CSV Interface produced the correct output and every Microsoft tool failed. Let the results do the talking.
CSV Interface needed to automatically detect the dialect of unknown files. Existing tools treated this as character-level pattern matching — counting delimiter frequencies. But that's fundamentally the wrong approach when a file like 1,5;33,33;15,55 makes commas and semicolons genuinely ambiguous.
"Interesting but then becomes somewhat pointless."
— sancarn (stdLambda author), r/vba

Instead of asking "which character appears most consistently?", I asked: "which candidate delimiter produces the most uniform table?" This reframed dialect detection from signal processing to structural inference.
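The reframing can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the published scoring: it omits data type inference entirely, and the names `uniformity_score` and `guess_delimiter`, the candidate list, and the sample text are all assumptions made for the example. Each candidate delimiter is scored by the share of rows that have the most common field count, and candidates that never split a row score zero.

```python
import csv
import io
from collections import Counter

# Hypothetical sample: commas are more frequent, but semicolons build the table.
SAMPLE = (
    "name;tags\n"
    "apple;red,sweet,crisp\n"
    "pear;green,soft\n"
    "fig;purple,small,soft,sweet\n"
)


def uniformity_score(text, delimiter):
    """Share of rows whose field count equals the most common field count."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=delimiter) if r]
    if not rows:
        return 0.0
    widths = Counter(len(r) for r in rows)
    modal_width, modal_rows = widths.most_common(1)[0]
    if modal_width < 2:
        # A delimiter that never splits anything does not describe a table.
        return 0.0
    return modal_rows / len(rows)


def guess_delimiter(text, candidates=(",", ";", "\t", "|")):
    """Pick the candidate whose parse yields the most uniform table."""
    return max(candidates, key=lambda d: uniformity_score(text, d))


print(guess_delimiter(SAMPLE))
```

In this sample the comma occurs six times and the semicolon four, so a pure frequency count would favor the comma. Only the semicolon gives every row the same width.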
| Metric | CSVsniffer MADSE | CleverCSV | csv.Sniffer |
|---|---|---|---|
| Weighted F1 | 0.9378 | 0.8425 | 0.8049 |
| Failure Ratio | 2.86% | 7.99% | 19.83% |
| Reliability Factor | 87.80% | 72.96% | 67.54% |
📄 Published as "Detecting CSV File Dialects by Table Uniformity Measurement and Data Type Inference" in SAGE's Data Science journal.
The community dismissed the problem because they were solving the wrong one. I didn't accept their framing. I asked "what defines a correct parse?" and the answer — table uniformity — shifted the entire field.
CSV Interface needed users to define computed fields dynamically. VBA has no native string-based expression evaluation, so I needed a library. Volpi MathParser was abandoned. stdLambda was well-crafted but carried two architectural limits:
⏱️ ~50x performance cliff on math functions (sin, cos, tan) — not a tuning issue, an architectural one.
🔢 Positional variables ($1, $2) — incompatible with programmatic variable injection from another library.
"It definitely is unfortunate that accessing and editing the locals table of stdLambda is non-trivial. Definitely something for the library to improve upon."
— sancarn, acknowledging the limitation

Phase 1 — VBA Expressions: Named variables with order-independent assignment: Run("x = 1; y = 2; z = 3"). "Analyze once, evaluate many" architecture.
Phase 2 — ASF: The evaluator naturally extended into a full JavaScript-like scripting language in pure VBA — with parser, compiler, VM, closures, classes with inheritance, module system, and VS Code tooling with 181 snippets.
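The two Phase 1 ideas, named variables and "analyze once, evaluate many", can be illustrated with a small Python stand-in. This is not the VBA Expressions API: the `CompiledExpression` class and its `evaluate` method are hypothetical names, and the sketch leans on Python's built-in `compile` and `eval` rather than a hand-written parser. The point is only the shape of the design: the expression text is analyzed a single time, then evaluated repeatedly with variables supplied by name in any order.

```python
import math


class CompiledExpression:
    """Hypothetical stand-in: analyze an expression once, evaluate it many times."""

    def __init__(self, source):
        # The parsing/analysis cost is paid here, exactly once.
        self._code = compile(source, "<expression>", "eval")

    def evaluate(self, **variables):
        # Variables are bound by name, so the order of assignment is irrelevant.
        # Builtins are emptied to keep the sketch small; this is NOT a sandbox
        # and must not be used on untrusted input.
        scope = {"__builtins__": {}, "sin": math.sin, "cos": math.cos}
        scope.update(variables)
        return eval(self._code, scope)


formula = CompiledExpression("x + y * z")

# Same compiled object, different bindings, any argument order.
first = formula.evaluate(x=1, y=2, z=3)
second = formula.evaluate(z=10, x=0, y=5)
print(first, second)
```

Binding by name is what lets another library inject values programmatically, which positional placeholders such as $1 and $2 make awkward.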
This isn't scope creep — it's scope discovery. Each limitation pointed to a deeper need. Once you have a proper parser, you're one step from closures; one step from there to first-class functions; one step from there to a complete language. I chose the overwhelming path because it was the correct one.
VBA is a language frozen in time — no closures, no first-class functions, no module system. Millions of enterprise workbooks depend on it, but the language hasn't evolved in decades.
twinBASIC requires a new toolchain — not feasible in locked-down enterprises. stdLambda is valuable but constrained by VBA's grammar. Python/COM bridges add deployment complexity.
ASF implements a new language on top of VBA rather than trying to extend VBA's syntax. This means zero external dependencies, no migration required, and modern features running on decades-old infrastructure.
🔌 Pure VBA · 🔄 Coexists with existing code · ✨ Closures, classes, modules · ⚡ VM-optimized · 🛠️ VS Code tooling
Where others see VBA as a dead language to escape from, I see an ecosystem to elevate. I don't abandon users stuck on legacy platforms — I bring the future to them.
The Table Uniformity method, born from a dismissed Reddit question, has now been ported to other languages and is under discussion as the basis for a rewrite of Python's csv.Sniffer.
Joel Natividad (maintainer of qsv) ported the Table Uniformity method to Rust. The port achieves 99.55% accuracy on W3C-CSVW, is published on crates.io, and uses CSVsniffer's benchmark datasets.
Andy Terrel (CPython contributor) proposed rewriting csv.Sniffer using Table Uniformity: "my research keeps leading to the table uniformity algorithms." Core dev Serhiy Storchaka endorsed the approach. Stephen Rosen confirmed: "I am in favor of you doing this work."
Published in SAGE's Data Science journal with rigorous F1 score analysis across 5 datasets, compared against CleverCSV, DuckDB, and Python's csv.Sniffer. The paper is now cited by independent implementations.
Every project started from a concrete need — never abstract curiosity. CSV Interface → dialect detection → expression evaluation → ASF. Each solution revealed the next problem.
I called stdLambda "fabulous" — and meant it. But honest evaluation means diagnosing fundamental limitations, not just surface bugs. The 50x cliff and positional variables aren't complaints; they're diagnoses.
The decisive move was always recognizing the community was solving the wrong problem. Dialect detection isn't character counting — it's structural inference. VBA modernization isn't syntax extension — it's platform engineering.
I pursue solutions to their logical end. An expression evaluator became a scripting language. A delimiter guesser became a peer-reviewed paper. This isn't scope creep — it's scope discovery.
I don't declare solutions superior — I measure them. F1 scores across datasets. Benchmarks against competitors. Peer review. Empirical validation transforms personal projects into trusted contributions.
The "specific use case" became the foundation. The "pointless" guesser reached Python's stdlib discussion. The expression evaluator, dismissed as an unnecessary reinvented wheel, became a complete language. Every dismissal was fuel.
| Project | Origin | Community Verdict | Outcome |
|---|---|---|---|
| VBA CSV Interface | Excel fails on non-standard CSV | "Specific use case" | Foundation for all subsequent projects |
| CSVsniffer | CSV Interface needed dialect detection | "Somewhat pointless" | Peer-reviewed paper · 93.78% F1 · Rust port · Python stdlib discussion |
| VBA Expressions → ASF | CSV Interface needed expression evaluation | stdLambda suggested as sufficient | Complete scripting language with VM; outperforms stdLambda |
| ASF Platform | VBA lacks modern features | twinBASIC / stdLambda suggested | Full JS-like language in pure VBA, VS Code tooling, module system |
Identify the real problem. Build the real solution. Prove it with real data. Let the results answer every doubt.
Find this journey interesting? Explore more on GitHub ↗