In our analysis, K-means was implemented using sci-kit-learn library with a maximum iteration of 500 and a threshold of $1 \times 10^{-4}$. Variables used for K-means were the runners’ marathon net timings and the Tdb and Twb experienced by the runners. Since K-means depends on Euclidean distance to perform clustering, it is critical that variables are of the same order of magnitude to ensure that clustering does not rely on variables with larger orders of magnitude. As the runners’ marathon net timings are a few orders of magnitude larger than Tdb and Twb, all variables were standardized to range from 0 and 1 using the Min-Max Scaler prior to K-means algorithm implementation as follows (Equation 1):

$$Z = \frac{X - \min(X)}{\max(X) - \min(X)} \tag{1}$$

where Z is the scaled value based on the Min-Max Scaler and X is the original value of the variable.
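To make sense of the scaling step for myself, I put together a small Python sketch of Equation 1. The passage does not show the authors' code, so this is only my own illustration: the function name `min_max_scale` and all of the numbers are made up and are not data from the study. It compares the Euclidean distance between two runners before and after scaling. In scikit-learn the equivalent steps would presumably be done with `MinMaxScaler` and `KMeans(max_iter=500, tol=1e-4)`, but the version below uses only the standard library.

```python
import math


def min_max_scale(values):
    """Apply Equation 1: Z = (X - min(X)) / (max(X) - min(X))."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot min-max scale a constant column")
    return [(x - lo) / (hi - lo) for x in values]


# Made-up illustrative values, NOT data from the study.
net_time_s = [12600.0, 16200.0, 19800.0]  # marathon net times in seconds
tdb_c = [26.0, 28.0, 30.0]                # Tdb values in degrees Celsius
twb_c = [24.0, 25.0, 26.0]                # Twb values in degrees Celsius

# One row per runner: (net time, Tdb, Twb).
raw_rows = list(zip(net_time_s, tdb_c, twb_c))
scaled_rows = list(
    zip(min_max_scale(net_time_s), min_max_scale(tdb_c), min_max_scale(twb_c))
)

# Euclidean distance between the first and last runner.
raw_distance = math.dist(raw_rows[0], raw_rows[2])
scaled_distance = math.dist(scaled_rows[0], scaled_rows[2])

print(f"raw distance:    {raw_distance:.3f}")
print(f"scaled distance: {scaled_distance:.3f}")
```

Before scaling, the distance is almost entirely the difference in net time, so the temperatures barely affect which runners count as "close". After scaling, each of the three variables contributes on the same 0 to 1 range.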
The statistical analysis employed in this study is diverse and shows a clear understanding of what was needed to narrow the results down to specific findings. My understanding of statistics is very limited compared to what these scientists were able to use and implement in their study. They also used the "sci-kit-learn library" (scikit-learn), which is an open-source machine learning library for Python. This allowed them to run the clustering computationally rather than by hand, with less room for human error.