Saturday, January 19, 2019

Minimal and intuitive type parameter naming

I would like to suggest an alternative to the standard ways of naming type parameters when nothing is known about the types being identified.

In C#, type parameter names typically start with T and are written in PascalCase.  When nothing (or almost nothing) is known about the type identified by the type parameter, it is common to see the single letter T as the identifier.  When many type parameters show up and nothing is known about any of them, then I usually see one of two conventions for naming them.

One way is to stick with T as the first letter and simply append a number to distinguish each parameter.  A good example of this is the multi-parameter versions of Func.  The problem with this approach is that the initial T becomes noise.  When every type parameter in this densely packed space starts with T, the letter doesn't convey anything.  It is clear from context that we are dealing with type parameters, so the T is redundant information.  Syntactically, it merely makes the name a legal identifier, which must start with a letter or underscore.  Removing each T leaves a bare number, which is not a legal identifier.  The names for the variables of each type are equally bad.  In the Func example, they are named arg1, arg2, etc.  Another common naming convention for these variables is t1, t2, etc.

Another approach to this naming is to stick with single letters like T and then pick surrounding letters such as S, T, U, etc. or T, U, V, etc.  I like this style better.  These single-letter identifiers are getting closer to the essential need here, which is merely to distinguish the type parameters from each other.  The identifiers for the variables follow suit and are the lowercase equivalents.  Both this style and the previous one convey order: the previous one does so explicitly with numbers, while this style does so more intuitively by using a subsequence of letters from the alphabet.  The downside of this convention, though, is that it can be difficult to remember where the sequence starts (is it at S or T?) and to have a good intuition for how far along the sequence a particular letter is.

The approach that I prefer makes an additional improvement that removes the downsides of the previous approach.  Instead of starting with S or T, start with A.  If four type parameters are needed, then in this convention, they would be A, B, C, and D.  Now suppose that you just glanced at the last type parameter without scanning through the whole list.  You would instantly know that there are four type parameters involved; no conscious mental effort is required.

Now that we have sufficiently defined these three type parameter naming conventions, let's give them identifiers of their own.  Let's call the first one the numeric convention, the second one the mid-alphabetic convention, and the last one the alphabetic convention.  (I just made these names up as I am writing this.  If these type parameter naming conventions have, well, conventional names, I would love to hear about that.)
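To make the comparison concrete, here is a minimal sketch of the same method written once in each convention.  The method and class names (PackNumeric, PackMidAlphabetic, PackAlphabetic, NamingConventions) are hypothetical and exist only for illustration; the method just packs four values of unknown types into a tuple.

```csharp
using System;

public static class NamingConventions
{
    // Numeric convention: every name repeats the T prefix,
    // and the arguments are named arg1, arg2, etc.
    public static (T1, T2, T3, T4) PackNumeric<T1, T2, T3, T4>(
        T1 arg1, T2 arg2, T3 arg3, T4 arg4) =>
        (arg1, arg2, arg3, arg4);

    // Mid-alphabetic convention: single letters starting at S (or T).
    public static (S, T, U, V) PackMidAlphabetic<S, T, U, V>(
        S s, T t, U u, V v) =>
        (s, t, u, v);

    // Alphabetic convention: single letters starting at A.
    // Seeing D alone is enough to know there are four type parameters.
    public static (A, B, C, D) PackAlphabetic<A, B, C, D>(
        A a, B b, C c, D d) =>
        (a, b, c, d);

    public static void Main()
    {
        var packed = PackAlphabetic(1, "two", 3.0, '4');
        Console.WriteLine(packed.Item4); // prints 4
    }
}
```

All three methods behave identically; only the names differ, which is the point of the comparison.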

Often there is one crucial difference among the type parameters: one is the return type and the rest are input types.  In the numeric convention, the return type is often called TResult.  In the other two conventions, I usually see the return type identified with R.  This leads to another advantage of the alphabetic convention over the mid-alphabetic convention.  If R is the return type, then there is some cognitive dissonance in seeing input type names that come after R in the alphabet.  Instead, when starting with A in the alphabetic convention, all of the input type names come before R in the alphabet (even in the case of 16 input type parameters, the last name is the letter P, which still leaves a gap of one letter, Q, before R).
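As a rough sketch of how the return type reads in each convention, the following applies a two-argument function to its arguments.  The names ApplyNumeric, ApplyMidAlphabetic, ApplyAlphabetic, and ReturnTypeNaming are made up for this example; only Func itself comes from the framework.

```csharp
using System;

public static class ReturnTypeNaming
{
    // Numeric convention: inputs are T1, T2 and the return type is TResult,
    // mirroring the shape of System.Func<T1, T2, TResult>.
    public static TResult ApplyNumeric<T1, T2, TResult>(
        Func<T1, T2, TResult> func, T1 arg1, T2 arg2) =>
        func(arg1, arg2);

    // Mid-alphabetic convention: the inputs T and U sort after the return type R.
    public static R ApplyMidAlphabetic<T, U, R>(Func<T, U, R> f, T t, U u) =>
        f(t, u);

    // Alphabetic convention: the inputs A and B sort before the return type R.
    public static R ApplyAlphabetic<A, B, R>(Func<A, B, R> f, A a, B b) =>
        f(a, b);

    public static void Main()
    {
        // A = string, B = int, and R = int are all inferred.
        int result = ApplyAlphabetic((string s, int n) => s.Length + n, "abc", 2);
        Console.WriteLine(result); // prints 5
    }
}
```

In the alphabetic version, the type parameter list A, B, R reads in the same direction as the data flows: inputs first, result last.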

Here is an example of me standardizing a file to use the alphabetic type parameter naming convention.  In fact, I first encountered the alphabetic convention in that GitHub project, which is called language-ext.  The author Paul Louth has used multiple type parameter naming conventions while developing that project but currently prefers the alphabetic convention as well.

I am a strong advocate for the alphabetic convention over the other two.  It uses a minimal number of characters to distinguish the types while also being intuitively clear by relying on our deep-seated memory of the alphabet.  I hope to see this convention used more often.
