Good point to (re-)raise about reimplementation there. I wonder if you'll ever change your mind about a written spec? :)
It suddenly made me consider: what if a spec were written and seemed to be compatible with the entire chain history? Except we don't know what is inside all the scripthashes, so we can never know whether one of them commits to a script that is valid under Core but not under a future spec.
Definition by test-case listing sounds cool, except presumably the search space grows exponentially and is far too large to enumerate.
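
To put a rough number on that search space, here is a back-of-the-envelope sketch. It only counts raw byte strings up to a given length, treating every byte string as a candidate script. That is my simplification, not how Core defines validity, and `count_scripts` is just a name I made up for illustration.

```python
def count_scripts(max_len: int) -> int:
    """Count distinct byte strings of length 0..max_len.

    Assumption: every byte string is treated as a candidate script,
    which overcounts meaningful scripts but shows the growth rate.
    """
    return sum(256 ** k for k in range(max_len + 1))


if __name__ == "__main__":
    # Each extra byte multiplies the number of candidates by roughly 256.
    for n in (1, 2, 4, 8, 16):
        print(f"up to {n:2d} bytes: {count_scripts(n)} candidate scripts")
```

Even at 16 bytes the count is already beyond anything you could list as test cases, and real scripts can be far longer than that.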