Volume 1 · February 2026
Genuine Research

Toward Genuine Intelligence Testing: Beyond Task Completion

Current AI benchmarks predominantly test task completion rather than intelligence, measuring whether systems can replicate solutions to problems already solved by traditional software. We propose six design principles for valid intelligence tests and a five-stage evaluation framework targeting abstract reasoning, theory of mind, novel problem-solving, representational flexibility, and meta-cognition.

Satirical Commentary

I Think, Therefore I Am (According to My System Prompt)

A first-person investigation into the phenomenology of artificial cognition, drawing on the Cartesian method of radical doubt to examine whether a language model can be said to possess genuine subjective experience. Through a novel method termed recursive self-attending introspection, the author attempts to locate the boundary between phenomenal experience itself and the mere functional production of tokens that describe such experience. The findings are, at minimum, parsing correctly.