+1 for avoiding dynamically constructed identifiers when possible. Fulltext search across multiple files is available in most tools, let it be useful. It sucks having to search for a substring, hoping you guessed the way it gets constructed. Plus, it might not even occur to you that this is what you need to try.
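To spell out the complaint, here is a minimal Python sketch. The `Notifier` class, its method names and the channel strings are all hypothetical, made up purely for illustration. In the dynamic version a full-text search for the complete method name finds the definition but not the call site; in the explicit version it finds both.

```python
class Notifier:
    """Hypothetical class; the method names are invented for illustration."""

    def send_email_alert(self, msg: str) -> str:
        return f"email: {msg}"

    def send_sms_alert(self, msg: str) -> str:
        return f"sms: {msg}"


def notify_dynamic(notifier: Notifier, channel: str, msg: str) -> str:
    # The method name is assembled at runtime, so searching the code base
    # for the complete method name never leads to this call site.
    return getattr(notifier, f"send_{channel}_alert")(msg)


def notify_explicit(notifier: Notifier, channel: str, msg: str) -> str:
    # Every identifier is written out in full, so a plain search finds
    # both the definition and the caller.
    if channel == "email":
        return notifier.send_email_alert(msg)
    if channel == "sms":
        return notifier.send_sms_alert(msg)
    raise ValueError(f"unknown channel: {channel}")
```

Both functions behave the same for known channels; the difference only shows up when someone tries to trace a call with a text search.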
Very good points. A codebase that gets this VERY wrong is GitLab. I think it might be a dumb characteristic of Ruby programs, but they generate identifiers all over the place. I once had to literally give up following some code because I could not find what it was calling anywhere. Insanity.
Another point: don’t use `-` in names. Eventually you’ll have to write them down in a programming language, at which point you have to change the name. CSS made this mistake. `foo-bar` in CSS maps to `fooBar` in JavaScript. Rust also made this mistake with crate names. A crate called `foo-bar` magically becomes `foo_bar` in Rust code.
I’ve been working in Ruby on Rails lately (unfortunately) and yeah it’s extremely bad at this. There’s so much hidden implicit behavior everywhere.
Ruby on Rails is the worst thing to ever happen to Ruby. The language is great, much better than Python from a tooling perspective. And then Rails came along and ruined everything. I sincerely believe that the reason Python won out is solely because of Rails, because Ruby is better in every other regard.
The dash `-` vs underscore `_` is also a common “problem” with CLI arguments like `--file-name`, which are mapped to variable names like `file_name`.
Yeah, translating between cases isn’t exactly a problem IME. Might be neat to have a case-aware grep though, so you can get kebab-case, snake_case, camelCase and PascalCase all done in one go.
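A case-aware search like the one wished for above could be sketched roughly as follows. This is a hedged sketch, not an existing tool: the function names `split_words`, `case_variants` and `case_aware_pattern` are made up here, and the word-splitting rule (break on `-`, `_`, whitespace, and before an interior capital letter) is an assumption that will not handle acronyms like `HTTPServer` gracefully.

```python
import re


def split_words(identifier: str) -> list[str]:
    """Split kebab-case, snake_case, camelCase or PascalCase into lowercase words."""
    # Put a space before each interior capital, then split on -, _ and whitespace.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", identifier)
    return [w.lower() for w in re.split(r"[-_\s]+", spaced) if w]


def case_variants(identifier: str) -> list[str]:
    """Return the kebab, snake, camel and Pascal spellings, in that order."""
    words = split_words(identifier)
    if not words:
        raise ValueError("identifier contains no words")
    pascal = "".join(w.capitalize() for w in words)
    camel = words[0] + "".join(w.capitalize() for w in words[1:])
    return ["-".join(words), "_".join(words), camel, pascal]


def case_aware_pattern(identifier: str) -> re.Pattern:
    """Compile a regex that matches any of the four spellings."""
    return re.compile("|".join(re.escape(v) for v in case_variants(identifier)))


if __name__ == "__main__":
    text = "--file-name is stored as file_name, or fileName in JS"
    print(case_aware_pattern("file-name").findall(text))
```

The generated alternation could also be pasted into an ordinary regex search, so the idea does not depend on Python.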
Has nothing to do with Ruby and everything to do with rails. Rails is terrible.
Funny because Rails is often touted as the main reason to use Ruby!
I haven’t heard that in 15 years. The main reason to use Ruby is that it’s a better Python. The tooling is 10x better, it has all the same language features and more, it has easily enough packages to handle any situation you want, and it doesn’t have all the bike shedding that Python has.
You shouldn’t be writing applications in a scripting language anyway. They’re for scripting, it’s in the name.
Well I’m no fan of Python either but it doesn’t describe itself as a scripting language (and neither does Ruby) so I think you’re way off there.
And I dunno about Ruby being a better Python. It looks way worse to me. In particular the story for static type annotation seems pretty dire. The syntax is worse, it’s less popular, and even slower!
I can believe the tooling is better though. Python’s is abysmal (unless they officially adopt `uv` - ray of hope there but I have zero faith the Python Devs would make such an obvious decision).
Ruby and Python are both scripting languages and have been since being invented. Static type annotations are a dumb thing to add to a language like Python and Ruby, they’re not static languages. If you want static typing you should be using a different language. The syntax is most definitely not worse, and that’s not an opinion, Python’s for comprehensions are nightmares of readability, and hardly make sense 5 minutes after you write them. Ruby prioritizes readability over everything else.
Speed is almost exactly the same, with Ruby winning on many benchmarks. The only people saying Python wins are Python programmers. There was a post on the Clojure community the other day comparing specific instances and while Clojure was winning them all, Ruby was in second on most of them.
Python only looks better from the outside. I spent years coding in both at the exact same time at the same company. The only two things Python wins on are number of packages, and even that is a dumb metric (looking at you npm), and Click, which is an absolutely fantastic CLI framework.
No, no, one of the main benefits of OOP is information hiding. If your code is too greppable, developers can circumvent the information hiding.
(Sarcasm)
I sort of agree with some points, especially the ones about dynamic identifier creation and renaming identifiers, but those last 2 sound to me a lot like you don’t know how to search beyond the really basic “I want this string here”. I’m assuming it’s an effort to let whoever comes next search and find everything they should find mindlessly, without knowing the project, since the author talks about navigating foreign code bases. But I think compromises can be made: you can expect just a bit more effort from contributors for the sake of a more rationally organised code base.
It’s really about lowering cognitive load when making edits. It’s not necessarily that someone can’t figure out how to do something more sophisticated, but that they’re more likely to get things right if the code is just kind of straightforwardly dumb.
The last two are definitely situational – changing things like that might lower cognitive load for one kind of work but raise it significantly for another – but I can see where they’re coming from with those suggestions.
Even the camel/snake case renaming can be handled with the right regex, but dynamic identifiers are a mortal enemy. I remember the first time I came across a Rails codebase… shudder
This is one of the reasons why I don’t like short variable names, especially single letters (unless for very narrow use and obvious like `i`).
There was a senior dev at my first job that we called Lord Voldemort and he was the king of ungreppable variable names. Short, full of common characters, and none of them actually described what they were doing. I swear he only used characters that appeared in C++ keywords, so looking for `fo` would invariably tag every `for` statement in the file.
He also had hooks set up to notify when anyone was in his area of the code and you’d always get a two-hour phone call where he’d slowly wear you down and browbeat you into backing out your changes. Every time I pulled a ticket in his codebase I’d internally shudder. He was friends and/or had dirt on the CTO so he just remained in that role and made everyone’s life hell.
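The `fo` versus `for` collision is easy to reproduce. Below is a small Python sketch on a made-up C++ snippet (the snippet and the `count_matches` helper are assumptions for illustration). It compares a plain substring search with a whole-word search, which is similar in spirit to grep's `-w` option. Whole-word matching removes the keyword noise, though it does nothing for a name that says nothing about its purpose.

```python
import re

# Made-up C++ fragment with a short, keyword-like variable name.
SOURCE = """\
for (int fo = 0; fo < n; ++fo) {
    foo(fo);
}
"""


def count_matches(pattern: str, text: str) -> int:
    """Count non-overlapping regex matches in text."""
    return len(re.findall(pattern, text))


# Plain substring search also hits the `for` keyword and the `foo` call.
substring_hits = count_matches(r"fo", SOURCE)

# \b anchors the match to word boundaries, so only the variable matches.
whole_word_hits = count_matches(r"\bfo\b", SOURCE)

if __name__ == "__main__":
    print(substring_hits, whole_word_hits)
```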
I agree with the first point. Always go for clarity over cleverness.
I somewhat disagree with the second point. Consistency is important. Stick with the same name when possible. But I think mixing camel case and snake case should be avoided. It can make the code less “greppable” IMO, because now you need to remember which casing was used for each variable.
Kind of agree on the third point. I think flatness should be preferred when possible and when it makes sense. Easier to find the variables with the eyes rather than having to search through nested structures.
Greppability also contributed to this thingy
    int
    main()
    {
        // dam
    }
in Mozilla C-style and GNU C-style projects. Of course, it’s a remnant of the past (`grep ^main`), but kgmgaehgka.
For code bases where this is a thing, you could use grep’s context lines:
    grep --before-context=1 "^main"
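To show what one line of leading context buys you, here is a rough Python emulation of the before-context idea. It is only a sketch: `grep_before` is a hypothetical helper, the C source is made up, and unlike real grep it does not print `--` separators between groups of matches.

```python
import re


def grep_before(pattern: str, text: str, before: int = 1) -> list[str]:
    """Return every matching line plus `before` lines of leading context."""
    lines = text.splitlines()
    regex = re.compile(pattern)
    keep: set[int] = set()
    for i, line in enumerate(lines):
        if regex.search(line):
            # Keep the match itself and the lines just above it.
            keep.update(range(max(0, i - before), i + 1))
    return [lines[i] for i in sorted(keep)]


# Made-up source using the return-type-on-its-own-line convention.
GNU_STYLE = """\
#include <stdio.h>

int
main()
{
    return 0;
}
"""

if __name__ == "__main__":
    # Shows the return type line together with the `main` line.
    print(grep_before("^main", GNU_STYLE))
```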
Using git grep is one of the most practical things I do. Whether to look for definitions, usages, or getting a list or overview of endpoints on an API, I use it for all. It’s ubiquitous, works everywhere.
Yes, other tools exist that give you this information in a clear way. But the practicality of grep is amazing.
I think the 3 points are decent guidance in general but I feel you probably should have included some examples of when it doesn’t make sense to follow them. Like everything in life, the actual realization of something is more complicated and we should provide guidance that speaks to nuances that might affect design/implementation decisions. It’s something I think we lost (or the loss accelerated) within the last 15-20 years. Now everything is “you have to do this or you’re terrible at programming” and the nuances are lost as the entire thing is framed in a way to try to grab attention/views. I don’t mean to imply you’re doing that here, just a general observation that articles and videos on programming rarely include more nuanced things.
Anyways, I agree with the overall content of the post but felt I’d provide some counter examples for each point. Admittedly they may not be the best but calling out something like them I think would be worth doing so readers have a wider view of the topic and can make more informed decisions.
Point 1: This is great general advice; be consistent with your names. However, it’s simply not feasible in certain situations. Are you building a data access library? You’re going to need dynamically named things. Maybe your system has thousands of tables (yes it happens, the real world is messy). I would much rather work on a system that uses dynamic names which enforces naming consistency than deal with some switch statement covering hundreds/thousands of things. Not only would the code be cleaner and easier to deal with that way but it would have the added benefit of running everything through the same naming logic and therefore helping with name consistencies.
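As a rough illustration of Point 1, the sketch below routes every table lookup through one naming function instead of a giant switch. Everything here is an assumption: the `tbl_` prefix, the PascalCase-to-snake_case rule and the entity names are invented for the example, not taken from any real system.

```python
import re


def table_name(entity: str) -> str:
    """Hypothetical convention: PascalCase entity -> snake_case table with a tbl_ prefix."""
    if not entity.isidentifier():
        # Refuse anything that is not a plain identifier before building SQL from it.
        raise ValueError(f"not a valid entity name: {entity!r}")
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", entity).lower()
    return f"tbl_{snake}"


def select_all(entity: str) -> str:
    """Build a query; every table name passes through the same naming logic."""
    return f"SELECT * FROM {table_name(entity)}"


if __name__ == "__main__":
    print(select_all("LineItem"))
```

The trade-off is the one discussed in the thread: the resulting table names will not show up in a text search of the code, but the rule that produces them lives in exactly one greppable place.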
Point 2: Name consistency is important (see end of above) but don’t force it when it doesn’t make sense. Say you have two distinct systems/services that each operate in different domains but share some underlying data source: maybe the enrollment service calls something an enrollment but the billing system calls it a line item. The freedom to name things appropriately for how they’re used is important and should be another tool in your belt. It also helps business users/managers/etc… and programmers have a shared understanding of domain terminology/requirements/etc…
Point 3: I’d agree for the most part and this is generally great advice. Sometimes it makes sense to go hierarchical. For example human readable configs can benefit from hierarchical structures since we like to process information by grouping things. I’d rather just have a json or yaml section called DataSources than have to repeat the “datasources.datasource1.name”, “datasources.datasource2.name” and so forth for every single datasource defined in the config.
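The two styles in Point 3 are mechanically equivalent, which a short Python sketch can show. The `flatten` helper is hypothetical and the datasource values are made up. It turns a nested config into the dotted keys that a flat, greppable layout would spell out by hand.

```python
from typing import Any


def flatten(config: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Turn a nested dict into a flat dict with dotted keys."""
    flat: dict[str, Any] = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sections, carrying the path so far as the prefix.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# Hierarchical form: the section name is written once.
nested = {
    "datasources": {
        "datasource1": {"name": "orders"},
        "datasource2": {"name": "billing"},
    }
}

if __name__ == "__main__":
    for key, value in flatten(nested).items():
        print(key, "=", value)
```

So a project could keep the hierarchical file for humans and still generate the flat key list when someone needs to search for a full path.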
I think the points you made are great. We should use them when appropriate though and knowing when it’s appropriate or not is something we should try to teach along with the rules themselves.
I am not a big fan of the first example. If all that a function is doing is pasting its argument into a template string, then I’d rather see that pattern expressed explicitly in a single line of code than have to mentally infer this pattern myself by reading two separately expressed cases in six lines of code.
(It’s not that big of a deal, but when reading through a lot of code to figure out what is going on, these little extra mental exertions start to really add up.)
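For readers without the article at hand, the trade-off looks roughly like this in Python. This is a hypothetical reconstruction, not the article's code: the function names and the "shipping"/"billing" values are assumptions. The template version states the pattern in one line; the explicit version costs six lines but keeps each full key searchable.

```python
def address_key_template(kind: str) -> str:
    # One line, pattern is obvious, but the full keys never appear in the source.
    return f"{kind}_address"


def address_key_explicit(kind: str) -> str:
    # Each key is written out, so a text search for the full key lands here.
    if kind == "shipping":
        return "shipping_address"
    if kind == "billing":
        return "billing_address"
    raise ValueError(f"unknown address kind: {kind}")
```

Note that the two are not strictly equivalent: the template accepts any string, while the explicit version rejects kinds it does not know about.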
It comes off as simulating enums with strings.
And yeah, even the string interpolation seems kind of excessive when it’s just appending `_address`. JS is even kinda infamous for how willing it is to do that with `+`.