584 Matching Annotations
  1. Jul 2021
    1. response relevant

      low frequency stimuli must be response relevant to activate DLPFC!

    2. response

      Here, 'conflict' on unexpected trials depends on the efficiency of prepotent control. Good controllers will elicit more conflict and increase RT. This is opposite to control in Stroop: there, good controllers will decrease / resolve conflict more effectively and decrease RT.

    3. interaction

      Interaction that makes it so ACC becomes more activated for unexpected > expected on refresh than repeat.

    4. greater during the Refresh

      However this is not really a conflict - simply a control effort...

    5. greater during Unexpected versus Expected

      Clear effect that could be interpreted as (response) conflict

    6. Refresh-Expected to Repeat-Expected trials

      Subgoaling is most clearly targeted for refresh vs repeat expected

    7. n ACC

      ACC seems to be involved in the Refresh vs Repeat contrast - This is not a conflict, is it an expectation? I believe it was 50/50 refresh repeat setup.

    8. Repeat-Unexpected to Repeat-Expected

      In the repeat unexpected vs. expected condition here, ACC is not reported as differentially active. Conflict would definitely predict that it should, though!

    9. prepared response

      In the refresh condition, participants were also triggered with a bias cue. It is argued that a subgoal is now present: comparing the maintained bias word with the word in the position cued by the response cue.

      True refresh only happens on Refresh-Unexpected and can thus be compared with Repeat-Unexpected to get a 'pure' refresh response.

      Subgoaling will happen on both refresh trials, unexpected and expected.

    10. need for control beyond

      Important: Questioning how far the response conflict account can take you.

      This seems somewhat orthogonal to the RL-ERN question.


    1. correct feedback stimuli, compared to error feedback stimuli, engaged a distributed network of brain areas consisting of the bilateral caudate nuclei, the right putamen, the right rostral and posterior cingulate cortices, bilateral middle and inferior prefrontal gyri, bilateral superior temporal gyri, and the right lateral and medial occipital cortex

      Much more evidence for positive stimuli generating activity - but also not ACC!

    2. a strong prediction of the reinforcement learning theory is increased activation following error feedback compared to correct feedback, in the absence of conflict related to action planning or expectancy violation

      But this is not consistent with the more modern idea of a Reward Positivity -> increase activity to positive feedback?

    3. a conflict view of ACC functioning would not necessarily predict its activation following error feedback, unless this feedback elicits conflict between conceptual representations, requires a task set reconfiguration, or violates a strong expectancy

      check in with PRO how it explains this?


    1. A neural area within this ROI (area 32: x = 4, y = 18, z = 44) was indeed more activated by error feedback than by correct feedback on trials with random mappings

      But significance when defined as ROI...

    2. a comparison between error feedback and correct feedback on trials with random mappings did not reveal any neural areas in which feedback-related error activity was greater than feedback-related correct activity

      No general SPM activity in dACC...

    1. According to the RL-ERN theory, the ERN is generated when a phasic decrease in mesencephalic dopaminergic activity (indicating that ongoing events are worse than expected) disinhibits the apical dendrites of neurons in anterior cingulate motor cortex;

      So this is a bit cringe these days?

    1. inconsistent with the response inhibition hypothesis: larger inhibitory effort should result in more effective suppression of the erroneous response, thus in a shorter RT.

      ACC for response inhibition and ACC for conflict detection make opposite predictions: if it inhibits responses, ACC activity should go up with better (decreased) RT. But if it detects conflict, it would go up with slower (increased) RT, which is commonly observed.

    2. only in the RI condition would one inhibit an incorrectly primed response

      One ACC theory that had emerged by this point is that it inhibits responses in general (i.e. go/no-go)

  2. citeseerx.ist.psu.edu
    1. (b) a system, involving the ACC, that is responsible only for last-minute conflict resolution.

      'reactive control'

    2. (a) a system coming into play in anticipation of demanding activities and sustained through the course of such activities

      'proactive control'

    3. van Veen, Cohen, Botvinick, Stenger, & Carter, 2000

      Only response conflict triggers univariate ACC?

    4. Specifically, it shares extensive connections with prefrontal cortex

      so it is not PFC?

    5. testable predictions. One of these is that slowdowns in responding should occur not only after errors, but also after correct trials involving a high degree of conflict

      Not explicitly tested yet

    6. focus more effectively

      or predict with more certainty?

    7. control conditions, the presented letter series contained no Xs.

      blocked PET design

    8. pressing a button with each presentation but omitting this response if the presented letter was an X.


    1. error, surprise, posterior, and posterior variance in the ex-post period

      probably it should say 'posterior mean'

    2. To satisfy these conditions at the population level, the constituent single neurons must be tuned to combinations of several cognitive variables at once (“non-linear mixed selectivity”) (Rigotti et al., 2013)

      'Non-linear mixed selectivity'!

    1. normative neural network model that yields event-predictive encodings

      Since it is normative, it does not have to learn it 'without cheating' I guess?

    2. In the real world, we often receive signals from complementary cognitive systems that foreshadow when the current event is about to end (e.g. from the visual system as the perceived distance between one’s hand and a glass of water approaches zero).

      This is their justification for providing information about 'event is about to end' -> consistent with ACC tracking progression through a (sub)task?

    3. This module receives event boundary information in the form of an increased value when a transition is about to happen

      isn't that just cheating?

    1. Pavlovian attraction parameter that had proved important in the behavioral study (Huys et al., 2012). This captured the attraction of states based on their average future consequences, regardless of whether sufficient choices remained on a trial to exploit those consequences.

      This is exactly like the SR!
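
A minimal sketch of the comparison in the note above, assuming the standard successor-representation definition (M as the discounted sum of powers of the transition matrix). The three-state chain, the reward vector, and the function names are hypothetical illustrations and are not taken from Huys et al. (2012).

```python
def successor_representation(transitions, gamma, n_steps=50):
    """Approximate M = sum_k gamma^k T^k by truncating the series at n_steps."""
    n = len(transitions)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    sr = [row[:] for row in identity]      # k = 0 term of the series
    power = [row[:] for row in identity]   # holds gamma^k T^k for the current k
    for _ in range(n_steps):
        power = [
            [gamma * sum(power[i][k] * transitions[k][j] for k in range(n))
             for j in range(n)]
            for i in range(n)
        ]
        for i in range(n):
            for j in range(n):
                sr[i][j] += power[i][j]
    return sr


def state_values(sr, rewards):
    """Value of each state as expected discounted future occupancy times reward."""
    return [sum(m * r for m, r in zip(row, rewards)) for row in sr]


# Hypothetical chain 0 -> 1 -> 2, where state 2 is terminal (all-zero row).
T = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0]]
M = successor_representation(T, gamma=0.5)
V = state_values(M, rewards=[0.0, 0.0, 1.0])
```

In this toy case state 0 already carries value from a reward two steps away, computed from average future occupancy alone. That is one way to read the "attraction regardless of remaining choices" description, though whether the fitted parameter behaves this way is not something the sketch can show.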

  3. Jun 2021
    1. To the extent that participants do not perceive any action–outcome contingencies, fERN amplitude appears to be less sensitive to reward probability

      But then again, in the Zander study they know they are just watching passively, unless the notion of a BCI engages the fERN process where similar passive viewing without BCI wouldn't.

    1. pseudorewards are distinct from primary rewards

      This is I think one of the most fundamental issues. What constitutes a 'primary reward'?

    2. option terminates when a particular subgoal is attained, which generates an option-specific prediction error, referred to as a pseudoreward prediction error (PPE)

      If everything goes as expected, no prediction error will ever be generated, right?
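
The question in the note can be checked with a small sketch, assuming the PPE takes the usual temporal-difference form (pseudo-reward plus discounted next value minus current value). The three-step option, its value table, and the function name are hypothetical.

```python
def pseudo_reward_prediction_error(pseudo_reward, value_next, value_current, gamma=1.0):
    """TD-style error computed on the option's own pseudo-reward (assumed form)."""
    return pseudo_reward + gamma * value_next - value_current


# Hypothetical option with three steps; a pseudo-reward of 1.0 arrives only
# when the subgoal is reached. Values are assumed to be fully learned.
values = [1.0, 1.0, 1.0, 0.0]          # V(s0), V(s1), V(s2), V(subgoal reached)
pseudo_rewards = [0.0, 0.0, 1.0]

errors_expected = [
    pseudo_reward_prediction_error(pseudo_rewards[t], values[t + 1], values[t])
    for t in range(3)
]

# Same option, but the subgoal is unexpectedly missed on the last step.
error_missed = pseudo_reward_prediction_error(0.0, values[3], values[2])
```

Under these assumptions every error is zero when the option unfolds as predicted, and a negative PPE appears only when the subgoal is missed. With stochastic transitions or values that are still being learned the errors would generally not vanish.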


    1. Formalizing this account of choice may require us to reformulate the RL problem as being one of minimizing distance to goals rather than maximizing discounted future reward

      This is something the RewPos seems to be exactly correlated with! dACC for minimizing distance to goals!?

    2. prioritizing locations that will cause a substantial change in the future behavior of the agent


    3. When visual range is reduced, such as in nocturnal vision, plan-based control may only exist for stable environments over a previously established cognitive map

      This is more what we will do in the modular RL experiment - no visual guidance anymore!

    4. ability to detect the structure of a complex, cluttered environment with high temporal and spatial resolution

      The type of 'plan' where everything you want to do and reach is sensorily available to you


  4. May 2021
    1. This may even be of greater importance than for traditional reinforcement learning research



    1. task set representations will be inhibited after an unexpected or surprising event

      Ebitz dACC paper?

    2. Nunez Castellar et al


    3. performed a task set

      so can you 'perform a task set'?

    4. N-2 Repetition Cost

      I like this overview of the different effects laid out very clearly

    5. (Mayr & Keele, 2000)

      Unnecessary reference


  5. Apr 2021
    1. Interestingly, if the density is above zero then there is a slight gain by adding an ISI of 2s but no such benefit exists when the density is 0

      With 'density' the representation of nodes becomes similar, so in a sense 'additive' up until the cluster is left? --> Randomization breaks this and is thus underpowered! ISI adds small advantage but not enough. In that case, temporal limitation is a bad idea (MOTIVATION FOR PRESELECTION OF PATHS!)


  6. Mar 2021
    1. factoring out the negative linear effect of rCV in rule-free trials

      what is really meant by this??

    2. r

      How is this 'updated in the opposite direction' ?


  7. Feb 2021
    1. ‘archive’ of the different states

      Just as HPC-mPFC for event recognition & memory?

    1. only found for words in the backward condition

      But 'scene chunking' without temporal information does not really happen. It does for words.


  8. Jan 2021
    1. no main effects of Community

      meaning size - how many communities are there in the Experiment?


  9. Dec 2020
    1. The PRO model thus accurately accounts for dACC activity over the course of the entire trial in our speeded decision-making task

      This is important - The 'reprogramming' that is sometimes seen dynamically between expected and real outcome etc.

    2. The EVC posits that activity in dACC reflects ‘expected value of control’—a trade-off between cost and benefits resulting in the selection of an optimal control signal.

      So EVC is presented and interpreted as neatly joining together the inhibitory aspects and the motivational aspects of dACC findings!


    1. a number of candidate learning signals measured through fMRI do not reflect learning rate when considering a broader set of statistical contexts

      dis-implicates perhaps dACC? Check out the paper.


  10. Nov 2020
    1. In humans, conflict signals are apparent in the firing rates of single dACC neurons

      is this considered noncontroversial?


    1. only a main effect of phasic versus no-phasic LC response

      pupil correlation fully explained away by accounting for LC phasic

    2. “fake-beep” trials

      phasic response without stimulus does not cause the same ACC effect?

    3. ACC rsc tended to be larger after versus before the beep stimulus, but only for the subset of trials in which it also elicited the characteristic phasic LC response

      ACC more correlated after beep ONLY IF LC system registered it with phasic 'startle' response => Here effect LC->ACC is opposite as in passive fixation condition?

    4. quenching

      decrease of intrinsic variability following stimulus ( or event?) onset == QUENCHING


    1. slower responses to the first than the second or third targets

      after grey gratings


    1. frontoparietal network early in the preparation interval that was activated when the cues validly indicated that the task would change

      so the association of cue with task has to be learned somehow - agency type of thing (control) -> RL through RewPos?

    2. the benefit of updating task-set on switch trials (switch-to vs. switch-away)

      this one could be interesting: you know you will have to switch (disengage?) but you don't know to what (although you could do some kind of 50/50 engage?)

    3. Frontoparietal theta (FPθ) activity has also been reported.

      FPtheta is something else than MFtheta! It either synchronizes to anterior theta (phase?) or co-increases in power.


    1. strategic mPFC dynamics where ongoing activity is modulated in response to changes in context

      which is what you would also expect in the case of option selection?

    2. non-informative cues

      control for informative cue vs novelty (same as in O'Reilly fMRI paper!)


    1. separate regressors were not constructed for each position as in the onsets model above

      so it only has the ability to modulate the strength of encoding of simple/complex (and cue/no cue)


    1. deciding which task to perform in the future


    2. exploration of alternative strategies does not seem to be driven by unreliability, conflict, or errors

      this is where some of the explore/exploit stuff becomes redundant even...?


    1. Do neurally defined event boundaries in a continuous movie, evoked by subtler transitions between related scenes, generate the same kind of hippocampal signature?

      interesting: transition moments in HMM become regressors to evaluate BOLD responses against!

    2. or when our goals change

      seems especially relevant to HRL?


    1. action domain

      consistent with DLPFC as actor in HRL-ACC theory!

    2. OFC neurons encoded evidence associated with each shape in the shape sequence, but only transiently

      consistent with update of cognitive map elsewhere in brain theory!


    1. Functional MRI data, however, suggests the opposite pattern for the mPFC/ACC, which are more active in response to gains compared to losses

      fMRI (and RewPos) actually suggest that positive PEs are transferred/influential/coded in the mPFC/ACC


    1. two possible states of the transition probabilities: top/left/bottom/right and top/right/bottom/left

      obviously this is exhaustive // 80/20 chance of successive state

    2. three possible states of the reward probabilities for the left/right ports

      this is the actual chance of water reward outcome in the second state


    1. interaction between reward and task sequence did not reach significance

      we would expect it to if they argue for it on a neural level?

    2. We were especially interested in the effect of reward on proactive control mechanisms, wherein goal-relevant information is encoded in preparation for upcoming task demands.

      exactly what we see in ACC - beautiful Ebitz human cell recordings


    1. approach behavior

      the center-in moment when rats put nose in middle hole. (this is trial start you could say -> after this wait variable delay until L-R choice has to be made)

    2. needed to assess whether this result truly reflects an aspect of [DA] signaling that is inherently slow (tonic) or could instead be explained by rapidly changing [DA] levels, that signal a rapidly changing decision variable

      the original niv dayan slow DA as integration of reward to track reward rate will not work for changing reward rates, here changing reward rate was found encoded in the 'lingering' DA -> was it actively altered?

    3. reward rate is a key decision variable

      seems also very tightly linked to ACC activity...


    1. ACC-to-VTA 4-Hz communication increased in the vertex of the maze when animals made pre-reversal decisions to the LCLR arm

      because they anticipated a reversal!


    1. re-emergence of prolonged dopamine elevations by the larger than expected reward could be due to the presence of positive prediction errors, but could also result from a unidirectional attention, alerting, or motivating function

      motivating function more in line with DA for energizing?

    2. both prior to and at asymptotic performance the amplitude of dopamine elevation prior to sequence initiation significantly negatively correlated with time to complete the immediately following sequence

      motivational signal?

    3. self-paced (i.e., free-operant) action-sequence task devoid of experimenter-provided initiation cues

      again something that ACC is heavily implicated in - the coffee-tea task originally had the instruction to make self-determined choices of one vs the other at some point, right?

    4. persistent motivation is required to drive behavior from initial actions through a series of events to earn a distal reward

      This is exactly one of the selection functions proposed to come from ACC right? - model-based forward prediction of a reward and thus selection of an entire sequence


    1. it could be argued that good performance in the instructed goal task requires intrinsic motivation

      invigoration indeed

    2. prediction error responses in dopamine neurons are influenced by task representations, which are putatively maintained in the prefrontal cortex

      seems to refer to several OFC papers and a nice review - inspect?

    3. interaction between executive and lower-level instrumental learning circuits

      This has never been explicitly shown but is the focus of the current paper


    1. In the present study, strategic adjustments in behavior were promoted by both small rewards and by surprising ones

      and this might be the reason for the constant difference between large and small reward - not so much the actual outcome value itself but its relatedness to p(switch)

    2. consistent with the idea that monkeys treat ambiguous options as providing less information about outcomes

      so nothing to update behavioural strategy

    3. The activity of 55% of neurons (n = 51/92, 35 in monkey E, 16 in monkey O) signaled the size of the reward delivered following risky choices

      but this could also simply be an identity coding for the size of the reward, not directly a value

    4. Unexpected small rewards promoted larger willingness to choose the redder option

      essentially the same interpretation as unexpected large reward on risky choice - exploratory option (maybe red is now better than blue?). Also an instantiation of win-stay lose-shift behaviour (right?)


    1. non-specific RPE signals did not differ

      between broad and narrow spiking neurons

    2. feature of the chosen stimulus, i.e., they encoded feature-specific RPEs

      so most surprise related signals were feature specific! -> this is good, separate dimensions or channels for reward and expected outcome (identity) for monitoring action-outcome contingencies?


    1. can lead to increases in phasic DA activity

      basically less able to use context from previous states/actions to predict (minimize) DA reward related response

    2. persistently chose one of the goal arms (only LL reward in 14 behavioral recording sessions and only SS reward in 3 sessions) even though they sampled both reward sizes in the preceding forced-choice trials.

      Clearly problems with updating behavioural strategy!


    1. the lOFC may identify in changes in reinforcement contingencies and signal the mOFC to update appraisals concerning actions that may be more profitable.

      So it becomes circular - if lOFC infers something, it can signal through mOFC to signal an update at the relevant map site? -> kind of replay-esque scaffolded map

    2. well trained

      but here learning is probabilistic - OFC might be necessary to keep track of changes in that case

    3. lOFC has been reported to attenuate phasic dopamine


    4. well trained rats

      The rats were already trained on the reversal learning task -> this is not an acquisition study, which OFC seemed to be implicated in more

    5. inactivation of some of the main OFC and medial PFC inputs to the accumbens

      Nacc is relevant for this?

    6. lesions of the dorsolateral PFC (dlPFC) in primates or medial PFC in rats impairs shifts between different strategies or attentional sets

      Note that dlPFC in primates is mPFC in rats? -> different functions!


  11. Oct 2020
    1. It is currently unclear as to why signed prediction errors were observed in the LFPs, but not at the level of the ensembles.

      you would expect strength of encoding to be affected here! maybe something about the structure of the task equalizes reward vs nonreward somehow too much...

    2. The present data were collected using the same task in which a local field potential (LFP) was recorded (Warren et al., 2015) that exhibited properties consistent with the FN observed in the EEG of humans

      Did this show DA effect?

    3. at the ensemble level, responses did not uniformly increase or decrease for unexpectedly ‘‘good’’ versus unexpectedly ‘‘bad’’ outcomes but only conveyed information that the trial was R or NR, regardless of whether or not the particular outcome was unexpected

      fundamentally important - identity over value! - But what about reward-positivity aspect of FN -> unexpectedness of identity is controlled for and still there is an effect. This relates to stronger DA encoding of WM / ACC representations? This paper says nothing about that yet afaik.

    4. while prior history caused eFSRs to progressively resemble FSRs, the precise timing and the direction of change varied across the population

      some neurons keep representing the FSR - tracking of context in general - vs some ramp up at NP - tracking of timing / event relatedness specifically?

    5. eFSR tracked past outcomes in a port-specific manner, a finding that is consistent with the single-unit analysis where only 1% of the total cells had expectancy-related changes in firing for both reversal ports.

      so it is not even about general expectation of reward - there is an understanding that there are different response / feedback channels and these are tracked separately! (course-of-action analogy // also very similar to multiple bandit problems!)

    6. actual outcome feedback scents will be referred to as the FSR

      pattern during feedback scent (not actual reward but 100% predictive)

    7. ‘‘early feedback-like scent response’’ (eFSR)

      pattern during nose poke

    8. only the response to the NP that changed, which appeared to reflect the most commonly encountered outcome at a given port

      What defines the ensemble activity is the actual feedback - pre-feedback activity during nose-poke gets trained to represent similarly

    9. internalized representation of a future event, it is difficult to show clearly that a neuron encodes this information

      Fundamental problem with subjective beliefs - you can access them with (learning) models though!


    1. set of all shortest-path problems within the graph

      Unlike Tomov, who also biases the hierarchical representation with reward functions


    1. explicit decisions that pertained directly to the probabilistic features of the ongoing task modeling volatile environments. The latter characteristics possibly explain why the present results contrast with earlier findings

      behavioural relevance towards reward makes ACC tasks much different...!


    1. Participants were the fastest on random high-probability triplets

      not what you would expect at all? Pattern high prob. should be fastest...

    2. P3 amplitude modulations in both directions could indicate the implicit sensitivity to or the implicit acquisition of predictable relations in a stimulus stream.

      almost opposing results - modulation to what is unexpected (surprise signal?) // also modulation to onset of highly task-relevant, more predictable targets??


    1. minimize the RMS prediction error with respect to each subject’s observed reaction times, $\mathrm{RMSE} = \sqrt{\frac{1}{T}\sum_{t}\left(r(t) - \hat{r}(t)\right)^{2}}$, where $T$ is the number of trials

      so no trial-by-trial update of a?
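
A minimal sketch of the RMS criterion quoted above: observed reaction times r(t) are compared with model predictions r̂(t) across all T trials of one subject. The function name and the toy numbers are hypothetical; the model that produces the predictions is not reproduced here.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted reaction times."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    n_trials = len(observed)
    squared = sum((r - r_hat) ** 2 for r, r_hat in zip(observed, predicted))
    return math.sqrt(squared / n_trials)


# Toy example: every prediction is off by exactly one unit.
toy_error = rmse([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

Because the error is pooled over all trials of a subject, a parameter fitted by minimizing it is a single per-subject value, which is consistent with the reading in the note that it is not updated trial by trial.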

    2. ‘RT~log(Trial)*Stage+Target+Recency+(1+log(Trial)*Stage+Recency|ID)’

      only leave out transition type - this will be predicted by the model

    3. modular-lattice effect

      potentially exactly the pre-activation corresponding to selected context in ACC!

    4. Consider the possible confound of stimulus recency: the tendency for people to respond more quickly to stimuli that have appeared more recently

      This is why Schapiro only analyzed Hamiltonian paths

    5. modular graph with three communities of five densely connected nodes

      The Schapiro organization

    6. 15 different stimuli represented a node in an underlying transition network

      just as in Schapiro, 15 stimuli!
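
The 15-node modular graph mentioned in the last two highlights can be sketched as below. The highlights only state "three communities of five densely connected nodes"; the exact wiring used here (every within-community pair connected except the two boundary nodes, plus one edge from each community to the next) is an assumption based on the layout commonly attributed to Schapiro and colleagues, and the function names are hypothetical.

```python
from itertools import combinations


def build_modular_graph(n_communities=3, size=5):
    """Return the undirected edge set of a ring of densely connected communities."""
    edges = set()
    for c in range(n_communities):
        nodes = list(range(c * size, (c + 1) * size))
        first, last = nodes[0], nodes[-1]          # boundary nodes of community c
        for a, b in combinations(nodes, 2):
            if (a, b) != (first, last):            # boundary nodes are not linked
                edges.add((a, b))
        next_first = ((c + 1) % n_communities) * size
        edges.add(tuple(sorted((last, next_first))))  # bridge to next community
    return edges


def node_degrees(edges, n_nodes):
    """Count how many edges touch each node."""
    degree = [0] * n_nodes
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


graph_edges = build_modular_graph()
degrees = node_degrees(graph_edges, 15)
```

With this wiring every node has exactly four neighbours, so community membership cannot be read off from local transition counts alone.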


    1. no decrease in performance occurred during these first session blocks; in Experiment 1 we even observed a modest performance increase

      So to say general skill learning (overall reduction of RT) takes place offline is misguided - online it tends to increase because of fatigue or inhibition release -> despite that however, there is still performance increase after rest period, showing offline above and beyond for general skill learning (has nothing to do with actual sequential knowledge)

    2. random-high trial

      When a trial coded as r emits a response that matches exactly the response that was elicited two trials earlier - on another r trial - as it would in a pattern trial (hence random-high)

    3. pattern trial

      This is basically every trial where a number is filled in in the sequence example


    1. Third, and perhaps most important, in the Baldwin and Kutas study, participants were required to respond only on those trials on which the target had performed a certain predefined movement, whereas in the present experiment, a response had to be given on each trial.

      Response demands again influence the resulting ERPs!

    2. larger P3b effect in the second experimental half than in the first half

      modified more if expectation is stronger!


    1. if ACC is responsible for computing changes in belief

      how analogous is this to changing context, and how analogous is that to selecting different options?

    2. the relationship between pupil dilation and change in representation strength seems to hold in cases where uncertainty is increasing, but not when it is decreasing

      could also have to do with how much the LC-NE system 'drives' the pupil size - after a while of rewarding there might not be learning/upregulation necessary and LC-NE and consequently pupil size could return slowly to a baseline, while neural effect in mOFC stays 'locked in'?

    3. weightings of representations of specific option

      It is not just generally coding for expected values, or even reward functions (?), it is really the strength of the IDENTITY of the response that is decodable, varying with the entropy

    4. which option was the current high-reward option

      so it is a change in the reward function... is that what OFC codes specifically?

    5. first process is evidence-driven updating of beliefs

      simple small adjustment of different parameters - e.g. transition probabilities, successor representations, whatever


    1. non-adjacent dependencies

      Is the learning of triplets really the acquisition of non-adjacent dependencies though? I can imagine triplets themselves have some special status.

    2. both processes involve the lateral prefrontal cortical regions that subserve several cognitive functions

      both model free and model based somehow rely on lateral PFC?


    1. one of the most potent antecedent conditions for the P300 component

      And that might be also why Mars works so well - there are unique responses to each of the 4 states

    2. In our study both perceptual and motor deviants differed perceptually from the next most likely, regular stimulus. Thus, it makes sense that both bear an effect on the N200.

      Deviation from an internal model yields unique N200 effect! P300 in motor case is something additional?

    3. N200 was affected by both types of deviants and the P300 by motor deviants only,

      general idea of LC-NE not completely compatible? Or N200 gets enhanced by LC-NE like P3, but something extra for P3? But N2 also showed unique elevation for the sequential expectation of stimulus identity, not shared by P3. Which might then be an internal function of the ACC...

    4. all three types of stimuli were processed by the system in the very same manner

      Realization: The P3 effects in other studies were all related to global probabilities, so that does affect saliency in participants, regardless of sequential expectation? In Mars there is not even any sequential expectation you could have... In Squier it does violate a 'template' but only for recently encountered stimuli, not abstract sequential prediction.

    5. hierarchical structure



    1. when a more effortful task was used (semantic categorization), robust learning was observed

      so increasing the difficulty / attention / benefit of the task will probably increase the learning effect!


    1. The task consisted of a number (1, 2, or 3) appearing on a computer screen, and the participant was instructed to press the corresponding key on the number pad with the right hand as quickly and as accurately as possible

      Similar setup to what we would use!

    2. novelty represents a deviation from the expected likelihood of an event on the basis of both previous information and internal estimates of conditional probabilities

      But novelty and surprise are now used for different effects on expectation and lead to differential changes in the P300


    1. irrespective of whether the instruction cues indicate exactly how to respond (i.e. to switch to the opposite response) or to guess the response, so long as the instructions pertain to the present trial rather than to the following trial.

      Segmentation of events into separate trials is important for how the ACC processes strategy and related evaluative information? It could relate to Working Memory, if the next instruction cue is presented BEFORE the evaluation of the last instruction cue, perhaps it is overwritten or simply forgotten (too much going on in short time span). But it can't explain the actual pattern of significant findings.

    2. infrequently occurring stimulus

      infrequency causes the response competition (automatic v controlled activation?)


    1. (despite the fact that both regions are less active overall at this time),

      Actual level of activity less relevant than the content of information being processed!


    1. fundamental difference in how the hidden variable context is represented in HPC compared to PFC brain areas

      ACC and DLPFC both do reflect processing affected (hence decodable) by context -> but HPC encodes it in an abstract way: Making more generalization possible!

    2. SRO mappings changed simultaneously for all 4 conditions

      This is like in the Collins & Frank paper


    1. PE signals are generated by the dACC and then converge toward the brainstem nuclei

      but all PE signals are reward related in this model right? nothing else

    2. not useful in classical conditioning

      because the system does not have to DO anything to obtain the reward!


    1. ACC neural activity has also been shown to covary with pupil modulations under passive viewing conditions, over both short (single spike) and longer (several seconds) timescales [38].

      Very relevant!


    1. facilitates integrating representations

      like into schemata? - very mPFC related


    1. bottom-up excitation might be conceived of as determining a probability distribution over higher-level event schemata

      this is crucial and exactly what the CRP implements
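
A minimal sketch of the Chinese restaurant process (CRP) idea the note refers to, assuming the standard CRP prior: an existing schema is favoured in proportion to how often it has been used, and a new schema gets weight alpha. Combining this prior with per-schema likelihoods of the current input gives a distribution over schemata. Function names, counts, and likelihood values are hypothetical.

```python
def crp_prior(counts, alpha):
    """CRP prior over existing schemata plus one slot for a new schema."""
    total = sum(counts) + alpha
    probs = [n / total for n in counts]
    probs.append(alpha / total)        # probability of opening a new schema
    return probs


def schema_posterior(prior, likelihoods):
    """Normalize prior times likelihood into a distribution over schemata."""
    joint = [p * l for p, l in zip(prior, likelihoods)]
    evidence = sum(joint)
    return [j / evidence for j in joint]


# Two known schemata used 3 and 1 times, concentration alpha = 1.
prior = crp_prior([3, 1], alpha=1.0)

# Hypothetical likelihoods of the current bottom-up input under each schema.
posterior = schema_posterior(prior, likelihoods=[0.1, 0.8, 0.1])
```

In this toy case the less frequently used second schema ends up most probable, because the bottom-up evidence outweighs the usage-based prior.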


    1. instead of updating our beliefs about representations that we have already learned (inference), we must update our beliefs about the parameters of the generative model itself

      critical difference!

    2. it would be wasteful to invest resources in pre-activating upcoming information that is irrelevant to our current comprehension goals


    3. Progressively higher levels of the generative model would have increasingly larger (longer) temporal “receptive fields.”

      Just like we see in the ACC (in terms of reward)?

    4. a large prediction error that leads us to infer that the statistical structure of the environment has fundamentally changed is known as unexpected surprise

      A special class for types of prediction errors that fit with a shift in context

    5. can still be explained by this overall goal

      Explainability is key in switching event models?

    6. At any given time, this event model, in turn, provides new information that is passed up to the third level of the hierarchy,

      Continuous ramping ACC activity?

    7. schema-relevant information is latent within long-term memory, but linked to goal representations so that it can be proactively retrieved

      LTM would be cognitive map in the HPC ?! - Goal representations that actively retrieve the map would be in the MCC?

    8. Goal end states are usually conceptualized as the desired future state of affairs that is associated with a goal’s fulfillment

      The exact same as a subgoal!

    9. The change that is induced by each incoming event can be conceptualized as the change in the probability distribution induced by this new event (the Kullback–Leibler divergence), and it can be thought of as an implicit prediction error.

      This is exactly what is found in the O'Reilly study?!

    10. two-level generative model is often insufficient to explain the complex and multidimensional structure of our environmental observations

      So something like the CRP idea is NOT enough for natural event segmentation?

    11. probabilistic dependencies

      This is easy for discrete state-to-state transitioning, but for continuous, dynamic, events it gets very messy I presume...

    12. most uncertain

      another good cue for segmentation. Novelty cannot work I guess, that relates to exploration, something else cues the compression into events

    13. If we were able to observe all the events together as a batch, detecting the boundaries between event models would be less challenging

      Inspecting e.g. a full graph allows you to calculate any statistic you want - That would make it relatively easy to find a good (optimal?) metric for clustering

    14. encoded probabilistically such that the events and event sequences that are most likely to occur within a given schema cluster together in representational space

      Schemas as probabilistic clusters in representational space - By definition this would require 'bottlenecks' as segregation?

    15. end state of even a single event functions as a precondition

      cues other likely events to happen!
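The highlight above about the Kullback–Leibler divergence as an implicit prediction error can be made concrete with a small sketch. It assumes a discrete set of candidate event models, a Bayesian belief update, and KL(posterior || prior) in nats as the "change induced by the event". The function names and numbers are hypothetical and not taken from the annotated paper.

```python
import math


def bayes_update(prior, likelihood):
    """Posterior over event models after one incoming event."""
    unnormalised = [p * l for p, l in zip(prior, likelihood)]
    z = sum(unnormalised)
    return [u / z for u in unnormalised]


def kl_divergence(p, q):
    """KL(p || q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Belief over two event models before the event (assumed values).
prior = [0.5, 0.5]

# An event that is much more likely under model 0 shifts the belief strongly...
informative = bayes_update(prior, likelihood=[0.9, 0.1])
# ...whereas an event that is equally likely under both leaves it unchanged.
uninformative = bayes_update(prior, likelihood=[0.4, 0.4])

shift_informative = kl_divergence(informative, prior)
shift_uninformative = kl_divergence(uninformative, prior)
```

On this reading, an event that does not change the belief distribution carries no implicit prediction error, however improbable the event itself was.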


    1. Surprise?and Novelty?contribute each separately to the ERP components at around 300ms

      P300 is not split into different components! Oh no how about correlation with the LC-NE system now...

    2. we hypothesize that participants estimate the probabilities of transitions from a given state to another state when performing a given action

      So this is action-based, not passive as in Mars study

    3. surprise independent of novelty

      So 2 contrasts?

      • Early vs late trap states (novelty)
      • Surprising transitions after state replacement (surprise)
    4. find the shortest path to the goal image


    5. 10 states with 4 possible actions per state plus one goal state

      So we might learn from this experimental setup and modify it to include sub-goal states??

    6. Surprise is triggered by the violation of expectations and manifests itself in pupil dilation [21] and EEG signals

      This is the typical P3 response / comes through LC-NE?
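To make the action-based transition estimate in this group of notes concrete, here is a minimal sketch that counts observed (state, action, next state) triples and scores each transition by its Shannon surprise, -log P. The class name, the smoothing pseudocount, and the use of Shannon surprise are assumptions for illustration; the annotated paper may define surprise and novelty differently. The 11 states follow the highlighted setup (10 states plus one goal state).

```python
import math
from collections import defaultdict


class TransitionModel:
    """Count-based estimate of P(next_state | state, action) with smoothing."""

    def __init__(self, n_states, pseudocount=1.0):
        self.n_states = n_states
        self.pseudocount = pseudocount  # assumed prior count per possible next state
        self.counts = defaultdict(lambda: defaultdict(int))

    def prob(self, state, action, next_state):
        seen = self.counts[(state, action)]
        total = sum(seen.values()) + self.pseudocount * self.n_states
        return (seen.get(next_state, 0) + self.pseudocount) / total

    def surprise(self, state, action, next_state):
        """Shannon surprise (in nats) of an observed transition."""
        return -math.log(self.prob(state, action, next_state))

    def update(self, state, action, next_state):
        self.counts[(state, action)][next_state] += 1


model = TransitionModel(n_states=11)
p_before = model.prob(0, 0, 1)

# Action 0 in state 0 leads to state 1 five times in a row.
for _ in range(5):
    model.update(0, 0, 1)

p_expected = model.prob(0, 0, 1)
surprise_expected = model.surprise(0, 0, 1)
# A state replacement would show up as a transition to an unexpected state.
surprise_unexpected = model.surprise(0, 0, 2)
```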


    1. in studies that separately assess encoding of outcome identity versus outcome value, activity in brain areas that are often considered to be emblematic of economic value (in particular, the OFC) turns out to correlate with outcome identity instead

      coding for outcome identity is a pretty cool (maybe even cooler) result too!

    2. Indeed, the idea that value exists on a single scale, also called a “common currency,” has been extended to encompass not only goods, but also effort costs and time delays

      This is part of the ACC theory to some extent?

    1. neural network associated with surprise is largely distinct from that of valence

      Except for the cingulate cortex...!

    2. aMCC, dMCC, the pre-SMA

      the typical culprit!

    3. h vary categorically along positive-negative axes,

      This is how they define Valence - based on trial condition

    4. , in additio

      They do acknowledge the full RPE representation in the midbrain DA system

    5. These later valence and surprise signals appeared in spatially distinct but temporally overlapping neural signatures

      So they are encoded separately and potentially combined later?

    6. human electroencephalography (EEG) studies, attempting to offer a temporal account of the cortical dynamics associated with RPE processing, did not find a systematic monotonic response profile consistent with a single RPE representation but instead offered evidence suggestive of separate representations for valence and surprise at the macroscopic level of responses recorded on the scalp.

      But how does this overrule single-cell recordings...?


    1. suppresses N200 amplitude while concomitantly reducing evoked theta power (together with total theta power, to which it contributes), but less so for induced theta power, with which the dopamine signal is relatively inconsistent in phase across trials

      Reward positivity suppresses the N200, but it also suppresses the phase coherent theta power (which IS N200!)

      It does not suppress incoherent theta power

    2. N200 amplitude relative to total theta power provides a more sensitive index of a reinforcement learning signal

      So total theta power is NOT what we should use if there is some sort of reward-based learning going on!

    3. the effect of probability on N200 amplitude can be predicted by total theta power whereas the effect of valence on total theta power can be predicted by N200 amplitude

      This is a cool relationship to acknowledge

    4. when the effect of N200 amplitude is statistically controlled, the evidence for sensitivity of total theta power to outcome probability compared to outcome valence increases from “weak” to “strong.”

      Controlling for N200 should remove phase-coherent parts of the theta signal -> makes sense it is probability sensitive, since induced theta is also?

    5. sensitivity of N200 amplitude to outcome probability is mostly driven by its correlation with ongoing theta power, as statistically controlling for this correlation reduced the effect of probability

      So the N200 effect here, which is actually the classic FRN component and yields reward positivity when calculated as difference wave - has good predictive validity for outcome valence (error vs correct). When theta power is controlled for, it does not show much evidence for being sensitive to frequent v infrequent feedback

    6. very strongly support Probability over Valence, pBIC ª .99

      Even though both are significant - can't it be both?

    7. induced theta power was greater for the infrequent oddball condition compared to the frequent oddball condition

      Similar to evoked theta power!

    8. Valence over Probability, pBIC ª .9

      So evoked theta is claimed to be sensitive to the valence - because it codes error vs correct in the time estimation task. HOWEVER it also significantly codes infrequent vs frequent oddball, though this frequency effect is dropped for the time-estimation task feedbacks?

    9. calculated the correlation between N200 amplitude and total, evoked, and induced theta power.

      wasn't evoked theta power directly calculated from N200 (and thus induced theta power as 1 - that => heavy correlation? )

    10. Evoked theta power was determined directly from the averaged ERP

      How does this affect the consistency? Is it inconceivable that anything else contributes to the ERP?

    11. induced portion of frontal midline theta would be relatively more sensitive to outcome probability

      Crucial - addition of the time-frequency method? And one that seems relevant to possible state-prediction error!

    12. ERP technique assumes such information to be noise

      Any oscillations with low phase-coherence are noise according to ERP - is this necessarily true?

    13. oscillatory activity that is consistent in phase to an event across trials contributes to the generation of an ERP to that event

      This is an important consideration!

    14. sensitivity to important cognitive events in general rather than to errors in particular

      So it is much more 'general' than when the fERN is observed? Is that a good thing?

    15. sensitive to the valence of feedback

      Both frontal midline theta and the fERN can be used to identify the valence of feedback - reward vs no reward - correct vs incorrect/error
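The total / evoked / induced distinction running through the notes above can be sketched numerically. The example assumes that each trial has already been reduced to one complex theta-band coefficient (for instance from a wavelet transform), that evoked power is the power of the trial average, and that induced power is total minus evoked. These are common conventions, but they are assumptions here and not necessarily the annotated paper's exact pipeline; the function name is hypothetical.

```python
import cmath
import math


def theta_power_components(trial_coeffs):
    """Return (total, evoked, induced) power from per-trial complex coefficients."""
    n = len(trial_coeffs)
    # Total power: average the power of each single trial.
    total = sum(abs(z) ** 2 for z in trial_coeffs) / n
    # Evoked power: average the trials first (as an ERP does), then take power.
    evoked = abs(sum(trial_coeffs) / n) ** 2
    # Induced power: what remains after removing the phase-locked part.
    induced = total - evoked
    return total, evoked, induced


# Same phase on every trial: all theta power survives averaging.
locked = [cmath.rect(1.0, 0.3) for _ in range(8)]
# Phases spread evenly around the circle: averaging cancels everything.
jittered = [cmath.rect(1.0, 2 * math.pi * k / 8) for k in range(8)]

locked_total, locked_evoked, locked_induced = theta_power_components(locked)
jit_total, jit_evoked, jit_induced = theta_power_components(jittered)
```

Both trial sets have identical total power, which shows why total theta alone cannot separate a phase-locked (N200-like) contribution from a non-phase-locked one.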


    1. Intertrial phase coherence

      Check across trials if estimate of phase is similar -> would imply it has been 'set'!
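A minimal sketch of the phase-consistency check described in this note: intertrial phase coherence is commonly computed as the length of the mean unit phase vector across trials, giving 1 for identical phases and values near 0 for phases scattered around the circle. The function name and the example phase values are hypothetical, and the per-trial phases are assumed to have been estimated already at one time-frequency point.

```python
import cmath
import math


def intertrial_phase_coherence(phases):
    """Length of the mean unit phase vector across trials (0 to 1)."""
    n = len(phases)
    return abs(sum(cmath.exp(1j * p) for p in phases) / n)


# Phase is 'set' by the event: every trial shows the same phase.
reset_phases = [0.3] * 10
# No phase reset: phases are spread evenly around the circle.
scattered_phases = [2 * math.pi * k / 10 for k in range(10)]

itpc_reset = intertrial_phase_coherence(reset_phases)
itpc_scattered = intertrial_phase_coherence(scattered_phases)
```

Because only the phase angle enters the measure, it is insensitive to trial-by-trial differences in amplitude.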