On 2016 Nov 24, Lydia Maniatis commented:
As should be evident from the corresponding PubPeer discussion, I have to disagree with all of Guy’s claims. I think logic and evidence are on my side. Not only is there not “considerable evidence that humans acquire knowledge of how depth cues work from experience”; the evidence and logic all point the other way. The naïve use of the term “object,” and the reference to how objects change “as we approach or touch them and learn about how they change in size, aerial perspective, linear perspective etc,” indicate a failure to understand the fundamental problem of perception: how the proximal stimulus, which does not consist of objects of any size or shape, is metamorphosed into a shaped 3D percept. Perceiving 3D shape presupposes depth perception.

As Gilchrist (2003) points out in a critical Nature review of Purves and Lotto’s book “Why we see what we do”: “Infant habituation studies show that size and shape are perceived correctly on the first day of life. The baby regards a small nearby object and a distant larger object as different even when they make the same retinal image. But newborns can recognize an object placed at two different distances as the same object, despite the different retinal size, or the same rectangle placed at different slants. How can the newborn learn something so sophisticated in a matter of hours?”

Gilchrist also addresses the logical problems of the “learning” perspective (caps mine): “In the 18th C, George Berkeley argued that touch educates vision. However, this merely displaces the problem. Tactile stimulation is even more ambiguous than retinal stimulation, and the weight of the evidence shows that vision educates touch, not vice versa. Purves and Lotto speak of what the ambiguous stimulus ‘turned out to signify in past experience.’ But exactly how did it turn out thus? WHAT IS THE SOURCE OF FEEDBACK THAT RESOLVES THE AMBIGUITY?”

“Learning” proponents consistently fail to acknowledge, let alone attempt to answer, this last question. As I point out on PubPeer, if touch helps us learn to see, then the wide use of touchscreens by children should compromise 3D perception, since the tactile feedback is presumably indicative of flatness at all times.
The confusion is evident in Guy’s reference to the “trusted cue – occlusion implying depth.” Again, there is a naïve use of the term “occlusion.” Obviously, the image observers see on the screen isn’t occluded; it’s just a pattern of colored points. With respect to both the screen and the retinal stimulation, there is no occlusion, because there are no objects. As far as the proximal stimulus is concerned, occlusion is a perceptual fact, not a physical one. The cue itself is thus an inferred construct intimately linked to object perception, and we’re forced to ask what cued the cue, and so on, ad infinitum. Ultimately, we’re forced to go back to brass tacks and tackle figure-ground organization via general principles of organization. Even if we accepted that there could (somehow) be unambiguous cues, we would still face the problem that each retinal image is unique, so we would need a different cue for each, and thus an infinite number of cues, to handle all of the ambiguity. This makes the appeal to “cues” redundant.
So the notion that “one might not need much to allow a self-organising system of cue to rapidly ‘boot-strap’ itself into a robust system in which myriad sensory cues are integrated optimally” is clearly untenable if we actually try to work through what it implies. The concept of ‘cue recruitment’ throws up so many concerns precisely because even its provisional acceptance requires that we accept unacceptable assumptions.
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.