Several questions on this site address whether to include main effects in models with interaction terms, for example here, here and here (for the opposite problem, omitting interaction terms when the data-generating process contains interaction effects, see here).
I have found that the general recommendation to include the main-effect terms in interaction models is called the "hierarchy principle", for example here, or the "hierarchical principle", for example here.
Is there a reference for the name of this principle? Who gave it this name, and what is the earliest use of the name for this principle?