Reviewer #2 (Public Review):
Ivanov et al. examined how auditory representations may become invariant to reverberation. They illustrate the spectrotemporal smearing caused by reverberation and explain how dereverberation may be achieved through neural tuning properties that adapt to reverberation times. In particular, inhibitory responses are expected to be more delayed for longer reverberation times. Consistently, inhibition should occur earlier for higher frequencies, where reverberation times are naturally shorter. In the manuscript, these two dependent relationships were derived not directly from acoustic signals but from estimated relationships between reverberant and anechoic signal representations after introducing some basic nonlinearity of the auditory periphery. They found patterns consistent with these predictions in the tuning properties of auditory cortical neurons recorded from anesthetized ferrets. The authors conclude that auditory cortical neurons adapt to reverberation by adjusting the delay of neural inhibition in a frequency-specific manner that is consistent with the goal of dereverberation.
Strengths:
This main conclusion is supported by the data. The dynamic nature of the observed changes in neural tuning properties is demonstrated mainly for naturalistic sounds presented in persistent virtual auditory spaces. The use of naturalistic sounds supports the generalization of the findings to real-life scenarios. In addition, three control investigations were conducted to back up the conclusions: the authors investigated the build-up of the adaptation effect in a paradigm that switched the reverberation time every 8 seconds; they analyzed to what degree the observed changes in tuning properties may result from differences in the stimulus sets and unknown nonlinearities; and, most convincingly, they demonstrated after-effects on anechoic probes.
Weaknesses:
1) The strength of neural adaptation appears overestimated in the main body of the text. The effect sizes obtained in control conditions with physically identical stimuli (anechoic probes, Fig. 3-Supp. 3B; build-up after switching, Fig. 3-Supp. 4B-C) are considerably smaller than those obtained in the main analyses with physically different stimuli. In fact, the effect sizes for the control conditions are similar to those attributed to the physical differences alone (Fig. 3-Supp. 2B).
2) All but one of the analyses depend on so-called cochleagrams, which only very roughly approximate the spectrotemporal transfer characteristics of the auditory periphery. Basically, logarithmic power values of a time-frequency transformation with a linear frequency scale are grouped into logarithmically spaced frequency bins. This choice of auditory signal representation appears suboptimal in various contexts:
On the one hand, for the predictions generated from the proposed "normative model" (linear convolution kernels linking anechoic with reverberant cochleagrams), the nonlinearity introduced by the cochleagrams is not necessary. The same predictions can be derived from purely acoustical analyses of the binaural room impulse responses (BRIRs). Perfect dereverberation of a binaural acoustic signal is achieved by deconvolution with the BRIR (the first impulse of the BRIR may be removed before deconvolution in order to maintain the direct path).
On the other hand, the estimation of neural tuning properties (denoted as spectro-temporal receptive fields, STRFs) assumes a linear relationship between the cochleagram and the firing rates of cortical neurons. However, there are well-described nonlinearities and adaptation mechanisms taking place even up to the level of the auditory nerve. Not accounting for those effects likely impairs the STRF fits and makes all subsequent analyses less reliable.
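To make the linearity assumption concrete, the following is a minimal sketch of a linear STRF prediction, in which the firing rate is a weighted sum of the cochleagram over frequency channels and time lags. It is not the authors' fitting procedure; the function name `strf_predict`, the toy cochleagram, and the toy weights are all hypothetical and chosen only for illustration.

```python
def strf_predict(cochleagram, strf, bias=0.0):
    """Predict a firing rate as a linear function of the cochleagram.

    cochleagram: list of frequency channels, each a list over time bins.
    strf: list of frequency channels, each a list of weights over lags
          (lag 0 = current time bin, lag 1 = one bin in the past, ...).
    Returns one predicted rate per time bin.
    """
    n_time = len(cochleagram[0])
    rate = []
    for t in range(n_time):
        total = bias
        for f, weights in enumerate(strf):
            for lag, w in enumerate(weights):
                if t - lag >= 0:  # ignore lags reaching before the stimulus
                    total += w * cochleagram[f][t - lag]
        rate.append(total)
    return rate


# Hypothetical toy data: a single onset in channel 0, silence in channel 1.
toy_cochleagram = [
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
# Excitation at lag 0 and delayed inhibition at lag 2 in channel 0.
toy_strf = [
    [1.0, 0.0, -0.5],
    [0.0, 0.0, 0.0],
]
toy_rate = strf_predict(toy_cochleagram, toy_strf)
```

In this toy case the predicted rate dips below zero two bins after the onset, which a real firing rate cannot do. Any compressive or adaptive stage between the stimulus and the neuron is likewise outside what such a purely linear mapping can express.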
I trust the small but consistent effect observed for the anechoic probes (Fig. 3-Supp. 3B) the most because it does not rely on STRF fits.
Finally, because of its simplistic nature, the cochleagram does not allow the contribution of peripheral adaptation to be partialled out from the adaptation observed at cortical sites.
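As a minimal illustration of the deconvolution argument under point 2, the sketch below convolves a dry signal with a toy impulse response and recovers it by time-domain deconvolution. It assumes a noise-free, single-channel case with a known impulse response whose first sample is nonzero. The impulse response `room_ir` is a hypothetical exponentially decaying sequence, not a measured BRIR, and all names are illustrative.

```python
def convolve(x, h):
    """Full linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def deconvolve(y, h):
    """Recover x from y = x * h by long division (requires h[0] != 0)."""
    n = len(y) - len(h) + 1
    x = []
    for i in range(n):
        acc = y[i]
        # Subtract the contribution of already recovered earlier samples.
        for j in range(1, min(i, len(h) - 1) + 1):
            acc -= h[j] * x[i - j]
        x.append(acc / h[0])
    return x


# Hypothetical toy "room": direct path followed by a decaying tail.
room_ir = [1.0, 0.6, 0.36, 0.216]
dry = [0.0, 1.0, -0.5, 0.25, 0.0, 2.0]

wet = convolve(dry, room_ir)          # reverberant signal
recovered = deconvolve(wet, room_ir)  # dereverberated estimate
```

Under these idealized assumptions the recovered signal matches the dry signal up to floating-point error, without any cochleagram-like nonlinearity being involved.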