"Yes, There is Code in Your Codebase" - How to Waste Your Time Testing Nothing

Before delving into the contents of this article, it's important to establish the right perspective. Unit testing has long been a focal point of discussion in the software world. Countless seasoned engineers have written article after article exploring ways to use unit testing to improve their development efficiency and deliver higher-quality products.

As a fourth-year university student who just finished the short internship this article covers, I don't consider myself experienced enough to offer strong takes. But I think it's worth documenting my experience, so that I can revisit it in the future and see whether my opinions have changed.

The incorporation of unit testing into your workflow presents a multitude of approaches, and much like the agile methodology, there is no universal "one-size-fits-all" solution. If you encounter anyone claiming otherwise, I would suggest taking that statement with a whole silo's worth of salt.

I started working for a company as a student intern. The company was building web apps for customers across the globe. During my time working (remotely) with them, I had the chance to work with many talented engineers, and I learned a lot about taking the knowledge we acquired at school and applying it in the real world.

But in this article, I want to share some of my thoughts from working with this team, mainly about how they integrated unit testing into their workflow.

What is Unit Testing?

According to Wikipedia (as of 2023/08), unit testing is defined as:

A software testing method by which individual units of source code—sets of one or more computer program modules together with associated control data, usage procedures, and operating procedures—are tested to determine whether they are fit for use.

The thing that confused me about this definition was: what exactly is a "module" in this context?

And on the same Wikipedia page, it says:

In procedural programming, a unit could be an entire module, but it is more commonly an individual function or procedure. In Object-Oriented Programming, a unit is often an entire interface, such as a class, or an individual method.

And it adds this...

By writing tests first for the smallest testable units, then the compound behaviors between those, one can build up comprehensive tests for complex applications.

So, if my understanding of this concept is correct, I'm supposed to write tests that verify the smallest testable units, and what counts as a "unit" depends on the programming paradigm in use.

The Task I Was Given

The company at that time introduced a new policy that all repos needed to reach a certain percentage of test coverage, so as the intern with nothing better to do, I was tasked with writing unit tests for our codebase.

I guess I need to roughly explain what the team I was working with was building. I'm not sure what I can and cannot share, so I will avoid any details that might get me in trouble.

The application we were working on has some servers that do most of the resource-heavy computing, plus a lot of proprietary internal software packages that process data for display in the client browser.

The UI part of this application is defined in a series of JSON files, detailing which UI component is used, which function in which JavaScript file it binds to, and so on. (Side note: I don't know why, but this framework really gives me a JDSL vibe.)

But for some UI components, we needed to write a component class, compose a series of other UI components to construct it, and then bind them to specific methods.

This framework also has a context service, and most state is updated by calling a function like context.updateContext("contextName", value). I'm not sure whether this service performs any type checking.
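To make the concern concrete, here's a minimal sketch of what such a context service might look like. Everything except the updateContext call shape is hypothetical, since the real service is proprietary:

```js
// Hypothetical sketch of an untyped context service; only the
// updateContext(name, value) call shape comes from the real framework.
const context = {
  _store: {},
  _listeners: {},
  updateContext(name, value) {
    // No validation: any value, of any type, under any name.
    this._store[name] = value;
    (this._listeners[name] || []).forEach((cb) => cb(value));
  },
};

// Nothing catches a typo'd key or a wrong-typed value:
context.updateContext("selectedRowId", 42);
context.updateContext("selectedRowID", "42"); // silently creates a second key
```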

The First Module

After carefully reviewing the provided instructions and consulting various tutorials and online documentation, I embarked on the process of writing tests. This was an exciting moment for me, as it marked the first time in months that I had the opportunity to engage in actual code writing.

The initial module assigned to me was rather straightforward, allowing me to complete it swiftly. Looking back, I can admit that it was a bit messy, but considering it was my first unit test, I am fairly content with the outcome. The requirement was 30%, and if I recall correctly, I achieved approximately 80%.

"Good job", said one of our team member, merging my code without really reading it, "now onto the next one".

In the following weeks, I was assigned more modules to work on, each with varying levels of complexity and file counts. While writing these tests, I began to notice some patterns.

For each test, I needed to create a minimum of four mocks, and not all of them were reusable across runs. This was because the code being tested wasn't exactly what I would call "tidy" or "well-structured". For some tests that needed to reach a specific line three function calls deep, unreachable any other way, I wrote almost ten mocks, each with its own little quirks that had to be handled individually without affecting the other mocks.
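To give a feel for the pattern, here's a simplified, hypothetical reconstruction in Jest. The module paths and function names are all invented; the point is the ratio of scaffolding to assertion:

```js
// Three mocked modules, each existing only so execution survives
// long enough to reach the one branch the test cares about.
jest.mock("../services/dataService", () => ({
  fetchRecords: jest.fn().mockResolvedValue([{ id: 1, dirty: true }]),
}));
jest.mock("../services/permissionService", () => ({
  canEdit: jest.fn().mockReturnValue(true),
}));
jest.mock("../utils/formatter", () => ({
  toDisplayRow: jest.fn((row) => ({ ...row, label: `row-${row.id}` })),
}));

const { refreshPanel } = require("../panels/recordPanel");

test("marks dirty rows for re-render", async () => {
  // The single assertion at the end of all that setup.
  await expect(refreshPanel()).resolves.toBeDefined();
});
```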

Sometimes, I had to mock standard JavaScript APIs and write logic that was more complex than the actual code I was testing, all in the pursuit of achieving adequate test coverage.
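As a hypothetical illustration (not the actual code), even faking something as simple as Date.now can produce more test machinery than the logic under test:

```js
// The function under test: one line of logic.
function isStale(timestamp) {
  return Date.now() - timestamp > 60_000;
}

describe("isStale", () => {
  const realNow = Date.now;

  beforeEach(() => {
    // Pin "now" so the comparison is deterministic.
    Date.now = jest.fn(() => 1_000_000);
  });

  afterEach(() => {
    Date.now = realNow;
  });

  test("flags entries older than a minute", () => {
    expect(isStale(1_000_000 - 60_001)).toBe(true);
    expect(isStale(1_000_000)).toBe(false);
  });
});
```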

As for the non-exported functions, due to our in-house JavaScript framework's highly customized nature and its unique approach to bundling ESM, I couldn't use existing solutions to test non-exported functions directly.
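In an ordinary ESM setup, one common workaround is a test-only export that exposes internals, something like the generic sketch below; in our setup, even patterns along these lines weren't practical:

```js
// helpers.js (generic example, not our framework's code)
function normalizeKey(key) {
  // Not part of the public API, but still worth testing.
  return key.trim().toLowerCase();
}

export function lookup(table, key) {
  return table[normalizeKey(key)];
}

// Test-only escape hatch: expose internals for unit tests.
export const __testables = { normalizeKey };
```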

To address this issue, I often had to set up complex tests just to reach a few lines of code that contributed little value. Furthermore, these setups were often specific to that particular part of the code, making them non-reusable for similar tests.

During this process, I had to terminate tests every now and then, just to ensure the best unit-test library in the world, Jest, wouldn't eat up all the memory and completely freeze my laptop.

Despite these challenges, I somehow still managed to push working tests to the develop branch and hit most of the targets.

"Good job, looks good to me", said the scrum master in our weekly standup meeting. "try to finish all the assigned targets."

So I did, but the code I wrote to make everything work was so terrible that I felt ashamed to show it to my teammates.

"Wonderful job (he actually said this)", said the manager in the iteration review meeting (which took 2hrs+ btw).

So...What went wrong?

As mentioned, the design of the framework itself contributed significantly to the challenges we faced when writing tests. However, I'm unable to provide specific details about its design.

What I can discuss is the overall quality of the repository.

Organizing code and its responsibilities

This repository contains numerous lengthy functions in charge of too many responsibilities, with their details never separated into smaller, easily testable chunks.

It is common to encounter functions controlling two unrelated pieces of logic. Even if the second half of such a function always follows the first half, fusing them together complicates testing the second half on its own. And in cases where the code is far more complex than that, the practice makes testing far harder than it should be.
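A contrived example of the shape (not actual project code): in the first version, you can't test the record-building logic without also mocking the UI, while the split version lets each half be tested alone.

```js
// Before: two unrelated responsibilities fused into one function.
function saveAndRender(form, panel) {
  // Responsibility one: validation and record building.
  if (!form.name || form.name.length > 64) throw new Error("bad name");
  const record = { name: form.name.trim(), savedAt: Date.now() };
  // Responsibility two: rendering, which every validation test
  // now has to mock just to get past this line.
  panel.render(`<li>${record.name}</li>`);
  return record;
}

// After: each half is a small, independently testable unit.
function buildRecord(form) {
  if (!form.name || form.name.length > 64) throw new Error("bad name");
  return { name: form.name.trim(), savedAt: Date.now() };
}

function renderRecord(panel, record) {
  panel.render(`<li>${record.name}</li>`);
}
```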

Failure to separate responsibilities also leads to unnecessary mocks. In a function spanning over 200 lines, I often found myself creating intricate mocks just to trigger a specific line.

Code hoarding

When I joined the team, the project was already two years old. While that doesn't sound like a long time, the small technical debts accumulated over those two years had collectively built up into a monster beyond our control.

Typically, in such cases, the team should perform code reviews and refactoring to maintain the project's quality. Our team followed this practice, but for some reason they seemed hesitant to remove code and didn't really like the idea of rewrites. I think it's because rewrites don't bring any immediate value, and the mindset is often "if it ain't broke, why fix it?". But I would argue they bring a lot of value in the long run by improving maintainability and readability, and by enabling small performance improvements.

And honestly, these rewrites should have happened before the original commit was pushed, followed by thorough reviews from other team members. I know they're capable of writing better code, but they seemed satisfied as long as the code executed.

When they did refactor code, they often left a lot of comments explaining their reasoning. What they failed to do was remove the function they had just refactored, even when the entire function had become nothing more than an empty shell filled with stale comments.
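The result looked something like this (a reconstruction from memory, not the literal code):

```js
// Refactor note: filtering moved into filterRows().
// Refactor note: sorting moved into sortRows().
// All the real work now happens elsewhere; this shell
// survives only as a pass-through wrapped in old comments.
function processRows(rows) {
  return rows;
}
```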

Another example: there was a piece of code that removed event listeners. The problem is, there were no event listeners to remove in the first place. I searched through every file and every package; there is no way that code could ever execute. Maybe it was written as a temporary fix at some point, but now it serves no purpose. It just sits there, doing absolutely nothing, gathering digital dust, snacking on memory.

This practice of code hoarding results in a clutter of dead code and redundant comments in our project. Not only does it make writing tests difficult, it also hinders future changes. Every time someone makes a modification, they need to take this dead code into account.

It really makes me wonder, what is the point of testing if issues that anyone can spot with their eyes aren't fixed first?

The Coverage Requirement

Some of you have probably already spotted the problem with how we wrote tests. We weren't writing tests to test our code; we were writing tests to raise coverage.

Using test coverage as the metric for a codebase's quality makes everyone forget why we test things in the first place. We test to find problems, and to make sure we won't accidentally break anything. If we're just blindly chasing a coverage target, then all those additional 30 minutes of CI tell you is "Yes, there is indeed code in your codebase", and that in a perfectly prepared environment, it does execute.

When writing these tests, I noticed that a significant portion of them covered code that serves no practical purpose, or doesn't really require testing. If a function's only job is to call four other functions, do I really need to test it? Personally, I think it's better covered by some other method of testing, not unit testing. Aren't we supposed to test the "smallest testable units"?
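Here's the kind of function I mean, in hypothetical form. A unit test for it can only re-assert its own mocks:

```js
// Hypothetical collaborators, stubbed so the test can run at all.
const loadUserPreferences = jest.fn();
const registerShortcuts = jest.fn();
const connectWebSocket = jest.fn();
const renderWidgets = jest.fn();

// The function under test: pure orchestration, no logic of its own.
function initDashboard() {
  loadUserPreferences();
  registerShortcuts();
  connectWebSocket();
  renderWidgets();
}

// All this proves is that the four mocks above were called.
test("initDashboard calls its collaborators", () => {
  initDashboard();
  expect(loadUserPreferences).toHaveBeenCalled();
  expect(renderWidgets).toHaveBeenCalled();
});
```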

Often, I found myself crafting overly complex tests just to boost the coverage number. The tests were written to fit the code, not the other way around. And whenever the code changed to meet new requirements, those tests became completely outdated and had to be rewritten.

It was not for lack of trying. I did attempt to consider the various scenarios each function might encounter. But with the code being a tangled mess, there was just no way anyone could write tests that provided real value.

Thoughts

I want to clarify that I don't intend to criticize the team. They didn't initially plan to integrate unit testing into this project, and the requirement wasn't set by them either. Everyone on that team was so nice to me, and happy to answer my dumb questions. I really learned a lot during the short period I worked there. However, what this experience highlighted is that blindly applying unit testing to an existing project without a robust review process can often result in wasted time and resources, with minimal gain.

I'm not opposed to the idea of using unit testing to improve code quality; it's a valuable tool in the right context. However, it's not a one-size-fits-all, magical solution to every problem. In this particular case, unit testing seemed to add more overhead to our already slow CI process, but didn't give us any useful information.

It's possible that my limited experience contributed to my difficulty in seeing the value in this approach. Here's how I would personally apply unit testing:

  • Expect to be tested: even if you aren't actively testing your code, writing it as though it will be tested is beneficial. It encourages you to structure code in a more maintainable way from the start.
  • Coverage isn't everything: instead of writing tests to exercise every possible code path, focus on identifying and testing the critical portions of your codebase that genuinely need thorough testing.
  • Find and address flaws: testing should help you discover flaws and areas for improvement. If writing a test feels overly complex, that's often a sign there's room to improve the code being tested (see the sketch below).
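To illustrate that last point with a generic sketch (not project code): a function that grabs its dependencies directly forces heavy mocking, while one that accepts them as parameters makes the test almost trivial.

```js
// Hard to test: reaches for global fetch and the DOM directly.
async function showUserBadge() {
  const res = await fetch("/api/me");
  const user = await res.json();
  document.getElementById("badge").textContent = user.name;
}

// Easy to test: the dependency is passed in.
async function getBadgeText(fetchUser) {
  const user = await fetchUser();
  return user.name;
}

test("getBadgeText returns the user's name", async () => {
  const fakeFetchUser = async () => ({ name: "Ada" });
  await expect(getBadgeText(fakeFetchUser)).resolves.toBe("Ada");
});
```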

In essence, unit testing should be a tool used thoughtfully and strategically, rather than a checkbox to meet a coverage requirement.