Legacy is, usually, a nice word until you apply it as an adjective to your work. Then it becomes a hideous monster lurking in the shadows behind your compiler waiting to chop off your head. Or the performance of your app.
Let’s take an example: an app that uses a library ported directly from C# to JavaScript using the old method of running the code and patching it whenever the browser complains. Additionally, the code has two different responsibilities, and neither is documented or even explained. The resulting library is slow, especially for certain operations. And the project manager is complaining. Sounds like a dream job.
The purpose of the library, as mentioned in this article but never in the code or the docs, is twofold: first, to load, process and execute configured queries; second, to log every non-select query in a special format so the synchronization mechanism can send it directly to the server. The problem is that, in order to achieve the first purpose, the library uses the format required for the second one. Upon careful examination you find that the second purpose is, in fact, achieved by fewer than five lines of code scattered across more than 1,600 lines in two files.
Let’s rephrase this another way: to use the library to execute a query, we need to create an XML string on our side of the code and pass it to the library, which in turn parses the XML and extracts the data that we’ve so carefully encoded. All this just because a separate piece of functionality needs that XML. No wonder the code is slow. Very slow. For every query, it encodes the parameters into XML and then parses the resulting XML to extract those same parameters and pass them to the real query.
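To make the waste concrete, here is a minimal sketch of that round trip. The article never shows the real XML format, so the `<query>`/`<param>` layout and the `encodeParams`/`decodeParams` names are assumptions made up for illustration, not the legacy library's actual code.

```javascript
// Hypothetical stand-in for the legacy flow. The element layout below is an
// assumption; the real format is not shown in the article.

function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function unescapeXml(value) {
  // "&amp;" is decoded last so already-escaped entities are not decoded twice.
  return value
    .replace(/&quot;/g, '"')
    .replace(/&gt;/g, ">")
    .replace(/&lt;/g, "<")
    .replace(/&amp;/g, "&");
}

// Caller side: encode the parameters only because the library demands XML.
function encodeParams(params) {
  const items = Object.entries(params)
    .map(([name, value]) => `<param name="${escapeXml(name)}">${escapeXml(value)}</param>`)
    .join("");
  return `<query>${items}</query>`;
}

// Library side: parse the XML straight back into the values we started with.
function decodeParams(xml) {
  const params = {};
  const pattern = /<param name="([^"]*)">([^<]*)<\/param>/g;
  let match;
  while ((match = pattern.exec(xml)) !== null) {
    params[unescapeXml(match[1])] = unescapeXml(match[2]);
  }
  return params;
}

const original = { id: "42", name: "Tom & Jerry" };
const roundTripped = decodeParams(encodeParams(original));
```

After all that string building and regex matching, `roundTripped` holds exactly what `original` held. For a select query, which never needs to be synchronized, every one of those steps is pure overhead.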
The sensible way to solve our problem (i.e., this back and forth encoding is killing our performance) is to start isolating pieces of the code. Identifying the different parts can be tricky, and it’s a job best suited to patient people, but in the end you should have three isolated parts:
- the code that stores the synchronization information,
- the part where you read the stored queries from the database and
- the part where you execute the query applying the parameters.
The first part is the most important, since it is the only piece of legacy stuff you need to keep in your code. You can completely refactor the rest of the library to better suit your needs, but you need to maintain compatibility with the rest of the system, no matter its quality or age.
In the original implementation we needed to encode the query parameters on our side of the code. In the new library there’s a class just for creating the XML from our app’s data and making that XML available to the sync mechanism, only if and when needed. Now the class that handles stored queries just needs to do that: load the SQL, apply any needed parameters and send it to the class that executes all the queries, stored or not.
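A class like that could look roughly like the sketch below. The `SyncLog` name, the `<statement>` layout and the way a non-select query is detected are all assumptions for the sake of the example; the real sync format has to match whatever the server expects.

```javascript
// Sketch only: SyncLog, the <statement> layout and the "non-select" check are
// assumptions, not the article's actual code.

function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

class SyncLog {
  constructor() {
    this.entries = []; // XML strings waiting for the sync mechanism
  }

  // Only statements that are not SELECTs need to reach the server.
  static needsSync(sql) {
    return !/^\s*select\b/i.test(sql);
  }

  // Build the XML lazily: nothing is encoded for plain SELECTs.
  record(sql, params = {}) {
    if (!SyncLog.needsSync(sql)) return false;
    const items = Object.entries(params)
      .map(([name, value]) => `<param name="${escapeXml(name)}">${escapeXml(value)}</param>`)
      .join("");
    this.entries.push(`<statement sql="${escapeXml(sql)}">${items}</statement>`);
    return true;
  }
}

const log = new SyncLog();
log.record("SELECT * FROM users WHERE id = :id", { id: 1 });
log.record("UPDATE users SET name = :name WHERE id = :id", { id: 1, name: "Ann" });
```

The select is skipped and only the update ends up in `log.entries`, so the encoding cost is paid only by the queries that actually need synchronizing.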
As for the other two parts, our library already has a class to execute queries, so once the parts are isolated we can delete the query-execution code from the legacy library, leaving us with only the reading and processing of the stored queries. Single Responsibility Principle: a class must be responsible for doing just one thing.
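As a rough illustration of that split, the sketch below keeps loading and parameter binding in one class and hands the finished query to a separate executor. `StoredQueries`, `QueryExecutor`, the `:name` placeholder syntax and the in-memory `Map` standing in for the database table are all hypothetical; the real app's executor and storage will differ.

```javascript
// Stand-in for the executor class the app already has (hypothetical API).
class QueryExecutor {
  constructor() {
    this.executed = []; // records calls instead of hitting a real database
  }

  execute(sql, values) {
    this.executed.push({ sql, values });
    return this.executed.length;
  }
}

// One job only: load a stored query and apply its parameters.
class StoredQueries {
  constructor(source, executor) {
    this.source = source; // Map of name -> SQL; assumed stand-in for the database
    this.executor = executor;
  }

  // Replace :name placeholders with "?" and collect the values in order.
  // Simplification: this would also rewrite a ":word" inside a string literal.
  bind(sql, params) {
    const values = [];
    const text = sql.replace(/:(\w+)/g, (_, name) => {
      if (!(name in params)) throw new Error(`Missing parameter: ${name}`);
      values.push(params[name]);
      return "?";
    });
    return { text, values };
  }

  run(name, params = {}) {
    const sql = this.source.get(name);
    if (sql === undefined) throw new Error(`Unknown stored query: ${name}`);
    const { text, values } = this.bind(sql, params);
    return this.executor.execute(text, values);
  }
}

const executor = new QueryExecutor();
const stored = new StoredQueries(
  new Map([["userById", "SELECT * FROM users WHERE id = :id"]]),
  executor
);
stored.run("userById", { id: 7 });
```

The parameters travel as plain values from the caller to the executor, with no XML anywhere along the way.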
So the final implementation allows us to get rid of most of the code: from almost 1,700 lines of code to 250, without taking into account the classes already implemented and used across the app. Instead of using two big, unreadable, undocumented classes we have three simpler, leaner and cleaner ones that achieve the same goal. Just much faster.