If you’re an analyst or if you need to write quite often SQL code for deep-dive analysis or dashboard building, one thing you might consider is how you need to structure your code.
Think about it. 6 months after starting deep-diving on multiple topics you’ll be literally buried by the amount of files filled up with code and finding what you wrote 3 months ago will be … quite challenging. Stakeholders will come to you or you will need to update a data extraction, an analysis: if you haven’t stored your code, you’ll spend an incredible amount of time just searching for your previous code. No one loves that.
Data analysts need to find a way to structure their multiple analysis and be able to find what they’ve been writing a while ago, in the blink of an eye. This becomes even more challenging if the datamodel turns out to be dense: having to deal with thousands of SQL tables, an incredible amount of filtering possibilities or looking for the correct database, especially if you’ve already been through it, will be driving you up to the wall.
No one wants to lose time and re-do what they’ve already been doing. Getting the code structure right from the start is a big matter and will help you increase your productivity as time goes by. You want do have it right, don’t you?
One – misleading – approach would be to store data analysis into excel or text files, along with Excel or PowerPoint documents and having one folder per topic. While sounding like a good idea, the more you’ll have code and deep-dived on many topics, the more difficult it will be for you to get this information quickly. You’ll have first to remember the exact topic, open the right document and if you need to pick some information from one analysis and cross it with another piece of analysis already done, then this will be a nightmare.
To avoid this situation and better organize your data analytics code, the best solution might be the following. You need to have 1 and only 1 single repository named SQL. Inside it, you can have some high level topics like “customers”, “merchants”, “marketing”, “business unit” – whatever topic you think is relevant to your job perimeter. Then inside each folder, you can write and save “.sql” files and only this kind of files: no excel, no word, nothing, only code. You would have everything at the same place.
Once you’ll have adopted this solution, you need to work with a file editor like “sublime text”or “atom io” which will be able to ease your job and simplify the search of code. Since these code editor softwares have embedded navigation trees and handle the “grep style” functions (CTRL + F but for all files), finding a portion of code will be a piece of cake and navigating to the right file will be smooth.
You will be able also to save it and use Git to save it, share it and push it to a server if needed. You will be able to save versions of your SQL code, its evolution and find easily how you did a previous analysis. Cool, hun?
And you: what’s your method to handle analytics code?