Call me pedantic, but I can be a stickler for standardized naming conventions. Whether it’s application code, database objects, documentation, or other digital assets, I find a tremendous amount of value in establishing a set pattern for naming and formatting. Especially for organizations with a large digital landscape, having a documented and agreed-upon standard for object names can make developers and data consumers more efficient by reducing guesswork and ambiguity in coding.
Having said that, I’ve encountered some really, um, interesting naming patterns in my decade as a business intelligence consultant. Here are a few of the more frustrating categories of unusual object naming patterns.
- Wild naming inconsistencies. This is, by far, the most common pattern I have encountered. It’s not that the naming convention is bad, it’s that there is no convention whatsoever. Some of the objects use camel case, and others use underscores. Some of the stored procedures have “sp_” in the front of the name, others use “usp_”, and still others have no prefix at all. This inconsistency makes it difficult to code against those objects, since every naming pattern is as unique as a fingerprint.
- Spaces in object names. The bane of my existence. Most relational databases will allow you to use spaces in object names, requiring anyone accessing that object to put brackets around the name. My rule of thumb is that if it’s something that I will interact with programmatically, it doesn’t get a space in the name. Spaces in object names tend to break things, so please stop doing this.
- Using the default names for things. We’ve all done it by mistake: used Table_1 as a table name, neglect to rename [Column1], [Column2], etc. when importing a flat file, or some other naming oversight. At least once I month I find a SQL Server index named “”. And I would estimate that at least 50% of all SSIS data flows are simply named “Data Flow”. Default names are as practical as naming your child “Baby Boy Mitchell”, so let’s agree to spend the time to create distinct names.
- Ambiguous names. If you’re going to go through the the trouble of naming a thing, make sure the name is meaningful. I’ve performed troubleshooting on hundreds of T-SQL scripts with table aliases [A], [B], [C], and so on. Somewhere out there right now, someone is writing a stored procedure named simply “Load Data”. While the brevity is appreciated, it makes it almost impossible to follow without going line-by-line through the code.
- Using version numbers in object names. Many years ago I had a contractor that wanted to use a suffix of “_v1”, “_v2”, etc., to indicate the version of views and stored procedures. What happens when you change versions? You get to go update every single process that touches that object. Keep version numbering in code comments and in source control where it belongs.
- Objects named after a person. This usually starts innocently enough – “I need a separate database as a sandbox” – and three months later, [TimDB] ends up getting promoted to some other purpose. I’ve seen views, tables, databases, and even entire servers named after employees who have long since departed the company. Name your objects for a purpose, not for a person.
- Object names using reserved words. Avoid this like fruitcake after Christmas dinner. Sure, it’s fun to show that you can execute SELECT [SELECT] FROM [FROM] WHERE [WHERE] = ‘WHERE’, but keep that on your demo machine where it belongs.
- Unnecessary verbosity. Object names should be long enough to be descriptive of their purpose, but no longer.
- Allowing bad naming to propagate. I once worked with a data warehouse where some developer years earlier had misspelled a word in one of the columns in a staging table. Bless ‘em for being consistent, but the rest of the team purposefully used the same misspelling throughout the rest of the system so that the names would match.
If you find some of these things in code that you own, don’t beat yourself up – through the course of any developer’s career, he or she will do all of these things. Using good naming conventions takes some time and muscle memory to get used to, and it takes even longer to fix the sins in legacy code. Take it one object at a time, and before long, you’ll be on the way to your naming convention happy place.
Some times at the beginning of projects people don’t understand how important it is to set this right from the start. Just use stuff that makes sense.
The only perculiarity I leave behind is a suffix of _fudge or _bodge. This denotes clearly this is a stored procedure that has been written quickly and dirty to fix an immediate problem and should not in any way be considered production.
Absolutely – it is far easier to do this at the outset than to have to revisit object names after they’ve already been in use for some time.
Awesome. This speaks to me. I even wrote a company standards document. Unfortunately I’m the only one that uses it. Prior to my starting here the company moved off a main frame computer. Phase 1 replace COBOL, JCL, and flat file data with Java, .bat files, and SQL Server. Phase 2, re-write everything to modern, relational standards. Phase 2 never happened. What a nightmare to maintain. I’m not kidding, there are over 600 bat files named ACSS0411, ACSS3005, ACSS3006 … One database has over 50 identical tables with the same prefix name and ending in various people, customer, or date identifiers. I was never a COBOL/Main Frame developer, but even if I was a totally clueless newbie, I would never have done things this way.
I can proudly say the data warehouse and supporting ETL jobs and reports don’t look like this. But, I still have to work with it.
The misspellings are a particular favorite, be they column names or table names. I spent (wasted) several hours once because someone had misspelled ‘effective_date’ to ‘effecitve_date’ for a column name. I was close to going down the ‘Oracle has a bug’ rabbit hole before finally finding it. Same with a table name – tax_exampt_status vs. tax_exempt_status. Good times.
We make use of _v2 procedures in order to keep the system online:
In our method, if a procedure is changing parameters, or resultsets, then we need to have the old version and the new version available at the same time. That means picking a different name for the new version. It is awkward, but it’s the way we get zero downtime for software/database changes.