
Those parameters are hyper and they need tuning.


Growing up in the '80s and '90s, watching TV was a luxury compared to what it is today. Not only did the TV station start broadcasting at 3pm and finish at midnight (not that the late finish mattered, as I was never allowed up past 8pm anyway), but receiving a good enough signal was as much a science as it was an art.


In today’s post, I want to bring you a step closer to the world of machine learning by explaining a phrase – Hyperparameter Tuning – that comes up quite frequently in the model development process. To do so, I will compare it to tuning our TV during my childhood.
Back then, every so often, our TV set would require tuning in order to pick up broadcast signals from the stations. For this problem, we can define our objective as identifying the frequency that provides the strongest broadcast signal.


Our TV had a search functionality where it would cycle through different frequencies and identify those with strong enough signals and we could save that frequency down as a channel on our TV. Great!
The problem here is that before we even began the search process, we needed to “set some parameters” to give us the best chance at finding a strong signal. We will call them hyperparameters (since they sit outside of the TV search process). Some examples of hyperparameters we needed to set were:

➡️External aerial height: Our aerial was mounted on a pole outside the house, and we found that varying the height at which it sat on the pole could impact our signal strength.

➡️External aerial direction: Rotating the aerial gave different results.

➡️Presence or absence of a metallic clothes hanger behind the TV: If you know, you know (maybe this only happened in my house).

➡️Search range: We would set the TV to search in either the Ultra High Frequency (UHF) band, the high Very High Frequency (VHF) band or the low VHF band.


In this example, we have four hyperparameters which all require tuning, i.e. choosing the best value for each one so that our objective (identifying the frequency that provides the strongest broadcast signal) has the best chance of success.
Thanks to long experience, we usually had a very good idea of how to set those hyperparameters, so our search process was relatively quick.
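To make the idea concrete, here is a toy sketch of that tuning process in Python. The candidate values and the `signal_strength` scoring function are entirely invented for illustration – the point is simply that tuning means trying combinations of hyperparameter values and keeping the best one.

```python
import itertools

# Candidate values for our four TV "hyperparameters" (all made up).
aerial_heights = [1.0, 2.0, 3.0]        # metres up the pole
aerial_directions = [0, 90, 180, 270]   # degrees of rotation
clothes_hanger = [False, True]          # the secret weapon
search_bands = ["UHF", "VHF"]

def signal_strength(height, direction, hanger, band):
    # Stand-in for actually watching the screen and judging the picture.
    score = height * 2 + (10 if hanger else 0)
    score += {"UHF": 5, "VHF": 3}[band] - abs(direction - 90) / 90
    return score

# Exhaustively try every combination and keep the strongest signal.
best = max(
    itertools.product(aerial_heights, aerial_directions,
                      clothes_hanger, search_bands),
    key=lambda combo: signal_strength(*combo),
)
print(best)
```

Even this tiny grid already contains 3 × 4 × 2 × 2 = 48 combinations, which hints at why experience-based shortcuts made our searches quick.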


In a machine learning problem, the objective is usually to find the “best” model (“best” depends on the problem), but before we start searching for the best model, we need to set some hyperparameters to give the algorithm a good chance of finding an acceptable solution. Sometimes the ideal values of the hyperparameters are not known, so we may have to search for the optimal combination of hyperparameters before fitting the final model. Hyperparameter tuning can be very computationally expensive because you have to run the search process for many different combinations of hyperparameter values, and it takes even longer on large datasets.
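The search described above can be sketched with a small, standard-library-only example: tuning the single hyperparameter k of a k-nearest-neighbours classifier by trying each candidate value and keeping the one with the best validation accuracy. The dataset, the candidate values of k, and the train/validation split are all illustrative.

```python
from collections import Counter

def knn_predict(train_x, train_y, point, k):
    # Rank training points by distance to `point`, then vote among the k nearest.
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - point))
    votes = Counter(train_y[i] for i in nearest[:k])
    return votes.most_common(1)[0][0]

# Toy 1-D data: class 0 clusters near 1, class 1 clusters near 10.
# The point at 1.8 is deliberately mislabelled noise.
train_x = [0.5, 1.0, 1.5, 1.8, 9.0, 9.5, 10.0]
train_y = [0,   0,   0,   1,   1,   1,   1]
val_x = [1.7, 0.8, 9.8, 9.2]
val_y = [0,   0,   1,   1]

def accuracy(k):
    # Validation accuracy for a given setting of the hyperparameter k.
    preds = [knn_predict(train_x, train_y, x, k) for x in val_x]
    return sum(p == t for p, t in zip(preds, val_y)) / len(val_y)

# The hyperparameter search: try each candidate k, keep the best.
best_k = max([1, 3, 5], key=accuracy)
print(best_k, accuracy(best_k))
```

Here k = 1 is misled by the noisy training point, while k = 3 votes past it – exactly the kind of thing the search discovers for us. With several hyperparameters, each extra one multiplies the number of combinations to try.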


Data scientists have to trade off the number of hyperparameters they tune against how long the search process will take to run. So the next time a data scientist says they are tuning the hyperparameters of their model, you will have a better idea of what they mean.
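One common way to make that trade-off is to sample a fixed number of random combinations instead of running the full grid. A minimal sketch, with an invented grid of hyperparameter values:

```python
import itertools
import random

# An illustrative hyperparameter grid (names and values are made up).
grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [16, 32, 64, 128],
    "num_layers": [2, 4, 8],
    "dropout": [0.0, 0.25, 0.5],
}

# The full grid requires one training run per combination.
all_combos = list(itertools.product(*grid.values()))
print(len(all_combos))  # 3 * 4 * 3 * 3 = 108 training runs

# Random search caps the budget: train only 10 randomly chosen combinations.
random.seed(0)
sampled = random.sample(all_combos, 10)
```

Each extra hyperparameter multiplies the size of the full grid, while random search keeps the cost fixed at whatever budget you choose.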
